Formal concept analysis

From Wikipedia, the free encyclopedia

In information science, formal concept analysis (FCA) is a principled way of deriving a concept hierarchy or formal ontology from a collection of objects and their properties. Each concept in the hierarchy represents the objects sharing some set of properties; and each sub-concept in the hierarchy represents a subset of the objects (as well as a superset of the properties) in the concepts above it. The term was introduced by Rudolf Wille in 1981, and builds on the mathematical theory of lattices and ordered sets that was developed by Garrett Birkhoff and others in the 1930s.

Formal concept analysis finds practical application in fields including data mining, text mining, machine learning, knowledge management, semantic web, software development, chemistry and biology.

Overview and history[]

The original motivation of formal concept analysis was the search for real-world meaning of mathematical order theory. One such possibility of very general nature is that data tables can be transformed into algebraic structures called complete lattices, and that these can be utilized for data visualization and interpretation. A data table that represents a heterogeneous relation between objects and attributes, tabulating pairs of the form "object g has attribute m", is considered as a basic data type. It is referred to as a formal context. In this theory, a formal concept is defined to be a pair (A, B), where A is a set of objects (called the extent) and B is a set of attributes (the intent) such that

  • the extent A consists of all objects that share the attributes in B, and dually
  • the intent B consists of all attributes shared by the objects in A.

In this way, formal concept analysis formalizes the semantic notions of extension and intension.

The formal concepts of any formal context can—as explained below—be ordered in a hierarchy called more formally the context's "concept lattice." The concept lattice can be graphically visualized as a "line diagram", which then may be helpful for understanding the data. Often however these lattices get too large for visualization. Then the mathematical theory of formal concept analysis may be helpful, e.g., for decomposing the lattice into smaller pieces without information loss, or for embedding it into another structure which is easier to interpret.

The theory in its present form goes back to the early 1980s and a research group led by Rudolf Wille, Bernhard Ganter and Peter Burmeister at the Technische Universität Darmstadt. Its basic mathematical definitions, however, were already introduced in the 1930s by Garrett Birkhoff as part of general lattice theory. Other previous approaches to the same idea arose from various French research groups, but the Darmstadt group normalised the field and systematically worked out both its mathematical theory and its philosophical foundations. The latter refer in particular to Charles S. Peirce, but also to the Port-Royal Logic.

Motivation and philosophical background[]

In his article "Restructuring Lattice Theory" (1982),[1] initiating formal concept analysis as a mathematical discipline, Wille starts from a discontent with the current lattice theory and pure mathematics in general: The production of theoretical results—often achieved by "elaborate mental gymnastics"—were impressive, but the connections between neighboring domains, even parts of a theory were getting weaker.

Restructuring lattice theory is an attempt to reinvigorate connections with our general culture by interpreting the theory as concretely as possible, and in this way to promote better communication between lattice theorists and potential users of lattice theory

— Rudolf Wille, [1]

This aim traces back to the educationalist Hartmut von Hentig, who in 1972 pleaded for restructuring sciences in view of better teaching and in order to make sciences mutually available and more generally (i.e. also without specialized knowledge) critiqueable.[2] Hence, by its origins formal concept analysis aims at interdisciplinarity and democratic control of research.[3]

It corrects the starting point of lattice theory during the development of formal logic in the 19th century. Then—and later in model theory—a concept as unary predicate had been reduced to its extent. Now again, the philosophy of concepts should become less abstract by considering the intent. Hence, formal concept analysis is oriented towards the categories extension and intension of linguistics and classical conceptual logic.[4]

Formal concept analysis aims at the clarity of concepts according to Charles S. Peirce's pragmatic maxim by unfolding observable, elementary properties of the subsumed objects.[3] In his late philosophy, Peirce assumed that logical thinking aims at perceiving reality, by the triade concept, judgement and conclusion. Mathematics is an abstraction of logic, develops patterns of possible realities and therefore may support rational communication. On this background, Wille defines:

The aim and meaning of Formal Concept Analysis as mathematical theory of concepts and concept hierarchies is to support the rational communication of humans by mathematically developing appropriate conceptual structures which can be logically activated.

— Rudolf Wille, [5]

Example[]

The data in the example is taken from a semantic field study, where different kinds of bodies of water were systematically categorized by their attributes.[6] For the purpose here it has been simplified.

The data table represents a formal context, the line diagram next to it shows its concept lattice. Formal definitions follow below.

Example for a formal context: "bodies of water"
bodies of water attributes
temporary running natural stagnant constant maritime
objects
canal Yes Yes
channel Yes Yes
lagoon Yes Yes Yes Yes
lake Yes Yes Yes
maar Yes Yes Yes
puddle Yes Yes Yes
pond Yes Yes Yes
pool Yes Yes Yes
reservoir Yes Yes
river Yes Yes Yes
rivulet Yes Yes Yes
runnel Yes Yes Yes
sea Yes Yes Yes Yes
stream Yes Yes Yes
tarn Yes Yes Yes
torrent Yes Yes Yes
trickle Yes Yes Yes

 

Line diagram corresponding to the formal context bodies of water on the left

The above line diagram consists of circles, connecting line segments, and labels. Circles represent formal concepts. The lines allow to read off the subconcept-superconcept hierarchy. Each object and attribute name is used as a label exactly once in the diagram, with objects below and attributes above concept circles. This is done in a way that an attribute can be reached from an object via an ascending path if and only if the object has the attribute.

In the diagram shown, e.g. the object reservoir has the attributes stagnant and constant, but not the attributes temporary, running, natural, maritime. Accordingly, puddle has exactly the characteristics temporary, stagnant and natural.

The original formal context can be reconstructed from the labelled diagram, as well as the formal concepts. The extent of a concept consists of those objects from which an ascending path leads to the circle representing the concept. The intent consists of those attributes to which there is an ascending path from that concept circle (in the diagram). In this diagram the concept immediately to the left of the label reservoir has the intent stagnant and natural and the extent puddle, maar, lake, pond, tarn, pool, lagoon, and sea.

Formal contexts and concepts[]

A formal context is a triple K = (G, M, I), where G is a set of objects, M is a set of attributes, and IG × M is a binary relation called incidence that expresses which objects have which attributes.[4] For subsets AG of objects and subsets BM of attributes, one defines two derivation operators as follows:

A′ = {mM | (g,m) ∈ I for all gA}, i.e., a set of all attributes shared by all objects from A, and dually
B′ = {gG | (g,m) ∈ I for all mB}, i.e., a set of all objects sharing all attributes from B.

Applying either derivation operator and then the other constitutes two closure operators:

A   ↦  A′′ = (A′)′   for A ⊆ G   (extent closure), and
B   ↦  B′′ = (B′)′   for B ⊆ M   (intent closure).

The derivation operators define a Galois connection between sets of objects and of attributes. This is why in French a concept lattice is sometimes called a treillis de Galois (Galois lattice).

With these derivation operators, Wille gave an elegant definition of a formal concept: a pair (A,B) is a formal concept of a context (G, M, I) provided that:

AG,   BM,   A′ = B,   and  B′ = A.

Equivalently and more intuitively, (A,B) is a formal concept precisely when:

  • every object in A has every attribute in B,
  • for every object in G that is not in A, there is some attribute in B that the object does not have,
  • for every attribute in M that is not in B, there is some object in A that does not have that attribute.

For computing purposes, a formal context may be naturally represented as a (0,1)-matrix K in which the rows correspond to the objects, the columns correspond to the attributes, and each entry ki,j equals to 1 if "object i has attribute j." In this matrix representation, each formal concept corresponds to a maximal submatrix (not necessarily contiguous) all of whose elements equal 1. It is however misleading to consider a formal context as boolean, because the negated incidence ("object g does not have attribute m") is not concept forming in the same way as defined above. For this reason, the values 1 and 0 or TRUE and FALSE are usually avoided when representing formal contexts, and a symbol like × is used to express incidence.

Concept lattice of a formal context[]

The concepts (Ai, Bi) of a context K can be (partially) ordered by the inclusion of extents, or, equivalently, by the dual inclusion of intents. An order ≤ on the concepts is defined as follows: for any two concepts (A1, B1) and (A2, B2) of K, we say that (A1, B1) ≤ (A2, B2) precisely when A1A2. Equivalently, (A1, B1) ≤ (A2, B2) whenever B1B2.

In this order, every set of formal concepts has a greatest common subconcept, or meet. Its extent consists of those objects that are common to all extents of the set. Dually, every set of formal concepts has a least common superconcept, the intent of which comprises all attributes which all objects of that set of concepts have.

These meet and join operations satisfy the axioms defining a lattice, in fact a complete lattice. Conversely, it can be shown that every complete lattice is the concept lattice of some formal context (up to isomorphism).

Attribute values and negation[]

Real-world data is often given in the form of an object-attribute table, where the attributes have "values". Formal concept analysis handles such data by transforming them into the basic type of a ("one-valued") formal context. The method is called conceptual scaling.

The negation of an attribute m is an attribute ¬m, the extent of which is just the complement of the extent of m, i.e., with (¬m)′ = G \ m′. It is in general not assumed that negated attributes are available for concept formation. But pairs of attributes which are negations of each other often naturally occur, for example in contexts derived from conceptual scaling.

For possible negations of formal concepts see the section concept algebras below.

Implications[]

An implication AB relates two sets A and B of attributes and expresses that every object possessing each attribute from A also has each attribute from B. When (G,M,I) is a formal context and A, B are subsets of the set M of attributes (i.e., A,BM), then the implication AB is valid if A′B′. For each finite formal context, the set of all valid implications has a canonical basis,[7] an irredundant set of implications from which all valid implications can be derived by the natural inference (Armstrong rules). This is used in attribute exploration, a knowledge acquisition method based on implications.[8]

Arrow relations[]

Formal concept analysis has elaborate mathematical foundations,[4] making the field versatile. As a basic example we mention the arrow relations, which are simple and easy to compute, but very useful. They are defined as follows: For gG and mM let

gm  ⇔  (g, m) ∉ I and if mn′ and m′ ≠ n′ , then (g, n) ∈ I,

and dually

gm  ⇔  (g, m) ∉ I and if g′h′ and g′ ≠ h′ , then (h, m) ∈ I.

Since only non-incident object-attribute pairs can be related, these relations can conveniently be recorded in the table representing a formal context. Many lattice properties can be read off from the arrow relations, including distributivity and several of its generalizations. They also reveal structural information and can be used for determining, e.g., the congruence relations of the lattice.

Extensions of the theory[]

  • Triadic concept analysis replaces the binary incidence relation between objects and attributes by a ternary relation between objects, attributes, and conditions. An incidence then expresses that the object g has the attribute m under the condition c. Although triadic concepts can be defined in analogy to the formal concepts above, the theory of the trilattices formed by them is much less developed than that of concept lattices, and seems to be difficult.[9] Voutsadakis has studied the n-ary case.[10]
  • Fuzzy concept analysis: Extensive work has been done on a fuzzy version of formal concept analysis.[11]
  • Concept algebras: Modelling negation of formal concepts is somewhat problematic because the complement (G \ A, M \ B) of a formal concept (A, B) is in general not a concept. However, since the concept lattice is complete one can consider the join (A, B)Δ of all concepts (C, D) that satisfy CG \ A; or dually the meet (A, B)