Naive Set Theory

Paradoxes, Cantor's original theory, and the need for axiomatization.


Naive set theory is the original, intuitive approach to the mathematics of collections — the idea that any clearly defined property determines a set of objects satisfying that property. Developed by Georg Cantor in the 1870s, it provided the first rigorous language for talking about infinity, functions, and mathematical structures in a unified way. Yet this very generality harbors contradictions, most famously Russell’s paradox, which revealed that unrestricted set formation leads to logical inconsistency and spurred the development of the axiomatic set theories explored later in this branch.

Sets, Elements, and Membership

The concept of a set is among the most fundamental in all of mathematics. In his 1895 Beiträge zur Begründung der transfiniten Mengenlehre, Georg Cantor defined a set as “a gathering together into a whole of definite, distinct objects of our perception or our thought.” Those objects are called the elements or members of the set. The relationship between an element and a set is called membership, and it is written with the symbol $\in$: if $x$ is a member of the set $A$, we write $x \in A$, and if $x$ is not a member, we write $x \notin A$. Membership is the only primitive relation in set theory — everything else is built from it.

There are two standard ways to specify which elements belong to a set. Roster notation (also called enumeration) simply lists the elements inside curly braces:

$$A = \{1, 2, 3, 4, 5\}$$

This works well for small or finite sets, but it becomes impractical — or impossible — for large or infinite collections. The more powerful alternative is set-builder notation, which defines a set by a property that its members must satisfy:

$$B = \{x : x \text{ is a prime number}\}$$

The colon (or vertical bar) is read “such that,” so $B$ is “the set of all $x$ such that $x$ is a prime number.” Set-builder notation is where the real expressive power lies, and also where the danger lurks — as we shall see shortly.
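Set-builder notation has a close computational analogue. As an illustrative sketch in Python (the names `is_prime` and `B_finite` are ours, not from the text), a set comprehension plays the role of $\{x : P(x)\}$, except that it must range over a finite domain:

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

# Roster notation: list the elements explicitly.
A = {1, 2, 3, 4, 5}

# Set-builder notation: {x : x is prime}, truncated at an arbitrary
# bound, since a comprehension cannot range over all integers.
B_finite = {x for x in range(30) if is_prime(x)}

print(sorted(B_finite))  # the primes below 30
```

The truncation is exactly the gap between the finite world of computation and the unrestricted comprehension discussed below.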

Two special sets deserve immediate mention. The empty set, denoted $\emptyset$ or $\{\}$, is the set with no elements at all. It is unique: there is only one set with no members, a fact that follows from the extensionality principle discussed in the next section. At the other extreme, naive set theory entertains the notion of a universal set $U$, the set of all objects under discussion in a given context. Venn diagrams, the overlapping-circle illustrations introduced by John Venn in 1880, typically depict sets as regions inside a rectangle representing $U$. While Venn diagrams are invaluable as an intuitive aid, they should be treated as illustrations rather than proofs — the visual metaphor can mislead when sets have complex or infinite structure.

Cantor’s original vision — that every well-defined property determines a set — is sometimes called the unrestricted comprehension principle: for any property $P(x)$, there exists a set $\{x : P(x)\}$. This principle is wonderfully intuitive and, for most everyday mathematical purposes, entirely safe. But in 1901, Bertrand Russell demonstrated that it is logically self-contradictory. Consider the property “does not contain itself,” and form the set

$$R = \{x : x \notin x\}.$$

Now ask: is $R$ a member of itself? If $R \in R$, then by the defining property, $R \notin R$ — a contradiction. If $R \notin R$, then $R$ satisfies the defining property, so $R \in R$ — again a contradiction. This is Russell’s paradox, and it shows that unrestricted comprehension is inconsistent. The paradox is not an exotic curiosity; it strikes at the very foundation of the system. Cantor himself had encountered related difficulties, including the paradox of the “set of all sets” (if such a set $V$ exists, its power set $\mathcal{P}(V)$ must be both a subset of $V$ and strictly larger than $V$ — a contradiction by Cantor’s own theorem). These paradoxes are precisely what motivated the development of axiomatic set theories such as Zermelo-Fraenkel set theory (ZFC), which replace unrestricted comprehension with carefully controlled axioms that avoid self-referential constructions while preserving the vast majority of useful set-theoretic reasoning.

Despite its logical flaws, naive set theory remains the working language of most mathematicians. The intuition it provides is indispensable, and its constructions are sound whenever we restrict attention to sets built from well-understood starting materials — which is exactly what the axioms of ZFC formalize, as explored in Axiomatic Set Theory (ZFC).

Subsets, Power Sets, and Equality

Given two sets $A$ and $B$, we say that $A$ is a subset of $B$, written $A \subseteq B$, if every element of $A$ is also an element of $B$:

$$A \subseteq B \;\;\text{if and only if}\;\; \text{for every } x,\; x \in A \implies x \in B.$$

If $A \subseteq B$ and $A \neq B$ — that is, $B$ contains at least one element not in $A$ — we say $A$ is a proper subset of $B$ and write $A \subset B$ (though some authors use $A \subsetneq B$ to avoid ambiguity). The empty set is a subset of every set: since $\emptyset$ has no elements, the condition “every element of $\emptyset$ is in $B$” is vacuously true for any $B$. Every set is also a subset of itself, so the subset relation is reflexive.
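These subset facts can be traced on small finite sets. A minimal sketch using Python’s built-in comparison operators (the variable names are ours):

```python
A = {1, 2}
B = {1, 2, 3}
empty = set()

# A ⊆ B: every element of A is in B.
assert A <= B            # <= is Python's subset test
assert A < B             # < tests proper subset (A ⊆ B and A ≠ B)

# The empty set is vacuously a subset of every set.
assert empty <= A and empty <= B

# Reflexivity: every set is a subset of itself...
assert B <= B
# ...but never a *proper* subset of itself.
assert not (B < B)
```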

The concept of set equality rests on the axiom of extensionality, one of the most important principles in all of set theory. It states that two sets are equal if and only if they have exactly the same elements:

$$A = B \;\;\text{if and only if}\;\; \text{for every } x,\; x \in A \iff x \in B.$$

Equivalently, $A = B$ if and only if $A \subseteq B$ and $B \subseteq A$. This “double containment” argument is the standard technique for proving two sets are equal: show that each is a subset of the other. Extensionality also tells us that a set is determined entirely by its members, not by how it is described. The sets $\{1, 2, 3\}$, $\{3, 1, 2\}$, and $\{x : x \text{ is a positive integer less than } 4\}$ are all the same set, because they contain exactly the same elements. Order and repetition in roster notation do not matter.
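Python’s set equality behaves extensionally on finite sets: only membership matters, so order and repetition vanish on construction. A tiny demonstration (names ours):

```python
# Three descriptions, one set.
X = {1, 2, 3}
Y = {3, 1, 2}          # different order
Z = {1, 1, 2, 2, 3}    # repetition collapses when the set is built
W = {x for x in range(1, 4)}   # {x : x is a positive integer less than 4}

assert X == Y == Z == W

# The double-containment criterion: X = W iff X ⊆ W and W ⊆ X.
assert X <= W and W <= X
```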

The power set of a set $A$, denoted $\mathcal{P}(A)$, is the set of all subsets of $A$:

$$\mathcal{P}(A) = \{S : S \subseteq A\}.$$

For a concrete example, if $A = \{1, 2\}$, then $\mathcal{P}(A) = \{\emptyset, \{1\}, \{2\}, \{1, 2\}\}$. The power set always includes both $\emptyset$ (which is a subset of every set) and $A$ itself. For a finite set with $n$ elements, the power set has exactly $2^n$ elements:

$$|\mathcal{P}(A)| = 2^{|A|}$$

The reason is combinatorial: each element of $A$ is either included or excluded from a given subset, giving two independent choices per element. This formula also explains the common notation $2^A$ for the power set.
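The in-or-out counting argument also yields a direct construction. A sketch in Python (the helper `power_set` is our name), enumerating subsets by size; `frozenset` is used because it is hashable, so the subsets can themselves be collected into a set:

```python
from itertools import chain, combinations

def power_set(s):
    """All subsets of s, as frozensets."""
    elems = list(s)
    return {frozenset(c)
            for c in chain.from_iterable(
                combinations(elems, r) for r in range(len(elems) + 1))}

A = {1, 2}
P = power_set(A)       # the four subsets of {1, 2}

# P(A) always contains the empty set and A itself.
assert frozenset() in P and frozenset(A) in P

# |P(A)| = 2^|A|
assert len(P) == 2 ** len(A)
```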

The power set construction is far more than a bookkeeping device — it is the engine behind some of the deepest results in set theory. Cantor’s theorem, which we will encounter in Functions and Cardinality, states that for any set $A$ (finite or infinite), there is no surjection from $A$ onto $\mathcal{P}(A)$. In other words, the power set is always strictly larger than the original set. For finite sets, this is the obvious fact that $2^n > n$. For infinite sets, it is a profound result: it means there is no single “largest” infinity, but rather an endless hierarchy of ever-larger infinite cardinalities. This hierarchy, discovered by Cantor, was one of the most revolutionary ideas in the history of mathematics and remains at the heart of modern set theory.

Set Operations and Algebraic Laws

Just as arithmetic provides operations for combining numbers, set theory provides operations for combining sets. The union of two sets $A$ and $B$ is the set of all elements belonging to at least one of them:

$$A \cup B = \{x : x \in A \text{ or } x \in B\}.$$

The intersection of $A$ and $B$ is the set of elements belonging to both:

$$A \cap B = \{x : x \in A \text{ and } x \in B\}.$$

When $A \cap B = \emptyset$ — the two sets share no elements — we say $A$ and $B$ are disjoint. The complement of a set $A$ relative to a universal set $U$ is the set of all elements in $U$ that are not in $A$:

$$A^c = \{x \in U : x \notin A\}.$$

More generally, the relative complement (or set difference) of $B$ in $A$ is defined without reference to a universal set:

$$A \setminus B = \{x : x \in A \text{ and } x \notin B\}.$$

Finally, the symmetric difference of $A$ and $B$ collects those elements belonging to exactly one of the two sets:

$$A \mathbin{\triangle} B = (A \setminus B) \cup (B \setminus A) = (A \cup B) \setminus (A \cap B).$$
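Python’s built-in set type supports all five operations as operators, so each definition above can be traced on small sets. A minimal sketch (the sets chosen here are ours):

```python
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
U = set(range(10))          # a universal set for this example

assert A | B == {1, 2, 3, 4, 5, 6}    # union
assert A & B == {3, 4}                # intersection
assert A - B == {1, 2}                # difference A \ B

# Symmetric difference, both ways the text defines it:
# A △ B = (A \ B) ∪ (B \ A) = (A ∪ B) \ (A ∩ B)
assert A ^ B == (A - B) | (B - A) == (A | B) - (A & B) == {1, 2, 5, 6}

# Complement relative to U.
A_c = U - A
assert A_c == {0, 5, 6, 7, 8, 9}
```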

These operations obey a rich collection of algebraic laws that mirror — and historically inspired — the laws of Boolean algebra from propositional logic. The parallel is not accidental: union corresponds to disjunction ($\lor$), intersection to conjunction ($\land$), and complementation to negation ($\neg$). The key identities are summarized below.

Commutativity: $A \cup B = B \cup A$ and $A \cap B = B \cap A$
Associativity: $(A \cup B) \cup C = A \cup (B \cup C)$, and likewise for $\cap$
Distributivity: $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, and dually
Idempotence: $A \cup A = A$ and $A \cap A = A$
Identity: $A \cup \emptyset = A$ and $A \cap U = A$
Annihilation: $A \cup U = U$ and $A \cap \emptyset = \emptyset$
Complement: $A \cup A^c = U$ and $A \cap A^c = \emptyset$
Double complement: $(A^c)^c = A$
De Morgan’s laws: $(A \cup B)^c = A^c \cap B^c$ and $(A \cap B)^c = A^c \cup B^c$

De Morgan’s laws for sets, named after Augustus De Morgan who stated the logical versions in the 1840s, are particularly important. They express a fundamental duality: complementing a union yields the intersection of the complements, and vice versa. These laws generalize to arbitrary (even infinite) unions and intersections:

$$\left(\bigcup_{i \in I} A_i\right)^c = \bigcap_{i \in I} A_i^c \qquad \text{and} \qquad \left(\bigcap_{i \in I} A_i\right)^c = \bigcup_{i \in I} A_i^c.$$
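Because the finite universe is small, De Morgan’s laws can be verified exhaustively rather than proved. A brute-force sketch over every pair of subsets of a four-element universe (the helper names are ours):

```python
from itertools import chain, combinations

U = frozenset(range(4))

def subsets(s):
    """All subsets of s, as frozensets."""
    elems = list(s)
    return [frozenset(c)
            for c in chain.from_iterable(
                combinations(elems, r) for r in range(len(elems) + 1))]

def complement(a):
    """Complement relative to the universe U."""
    return U - a

# Check both De Morgan laws for every pair of subsets of U.
for A in subsets(U):
    for B in subsets(U):
        assert complement(A | B) == complement(A) & complement(B)
        assert complement(A & B) == complement(A) | complement(B)

print("verified on all", len(subsets(U)) ** 2, "pairs")  # 16^2 = 256 pairs
```

An exhaustive check over one finite universe is evidence, not a proof, but it makes the duality concrete.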

The algebraic structure formed by a collection of sets under $\cup$, $\cap$, and complementation (with $\emptyset$ and $U$ as identity elements) is a Boolean algebra — the very same structure that Boole’s propositional calculus gives rise to. This correspondence between logic and set theory is one of the most elegant features of the foundations of mathematics: the algebra of truth values and the algebra of collections are two manifestations of the same abstract structure. The connection runs even deeper through Stone’s representation theorem, which shows that every Boolean algebra is isomorphic to a field of sets — a topic explored further in the Lattices and Boolean Algebras section of axiomatic set theory.

Ordered Pairs and Relations

The sets we have discussed so far are unordered — the set $\{a, b\}$ is the same as $\{b, a\}$. But much of mathematics depends on order: coordinates in the plane, the input-output behavior of functions, and the structure of relations all require us to distinguish $(a, b)$ from $(b, a)$. The challenge is to define the notion of an ordered pair purely in terms of sets, so that the entire apparatus of set theory can be brought to bear.

The standard solution, due to Kazimierz Kuratowski in 1921, defines the ordered pair (a,b)(a, b) as:

$$(a, b) = \{\{a\}, \{a, b\}\}.$$

This encoding looks peculiar, but it has the one property that matters: $(a, b) = (c, d)$ if and only if $a = c$ and $b = d$. The proof is a short exercise in extensionality — one verifies by case analysis that equality of the outer sets forces equality of the components. Other definitions of ordered pair have been proposed (notably by Norbert Wiener in 1914), but Kuratowski’s is the one adopted as standard because of its simplicity.
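The Kuratowski encoding can be modeled with nested frozensets, and the characteristic property checked exhaustively on a small domain. A sketch (the name `kpair` is ours):

```python
def kpair(a, b):
    """Kuratowski ordered pair (a, b) = {{a}, {a, b}} as frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

# Order matters, unlike for the underlying unordered sets.
assert kpair(1, 2) != kpair(2, 1)

# Edge case a = b: (a, a) = {{a}}, since {a} = {a, a}.
assert kpair(3, 3) == frozenset({frozenset({3})})

# The characteristic property, checked over a small domain:
# (a, b) = (c, d) iff a = c and b = d.
for a in range(3):
    for b in range(3):
        for c in range(3):
            for d in range(3):
                assert (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
```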

With ordered pairs in hand, we define the Cartesian product of two sets $A$ and $B$ as the set of all ordered pairs whose first component comes from $A$ and whose second comes from $B$:

$$A \times B = \{(a, b) : a \in A \text{ and } b \in B\}.$$

The name honors René Descartes, whose coordinate geometry identified points in the plane with pairs of real numbers — the plane being $\mathbb{R} \times \mathbb{R} = \mathbb{R}^2$. For finite sets, $|A \times B| = |A| \cdot |B|$, which is why the operation is called a “product.” The construction extends to $n$-tuples: $A_1 \times A_2 \times \cdots \times A_n$ is the set of all ordered $n$-tuples $(a_1, a_2, \ldots, a_n)$ with each $a_i \in A_i$.
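The product construction and its cardinality formula are easy to confirm on finite sets; `itertools.product` enumerates exactly the ordered pairs in the definition. A sketch (sets ours):

```python
from itertools import product

A = {1, 2, 3}
B = {'x', 'y'}

AxB = set(product(A, B))                    # all ordered pairs (a, b)
assert AxB == {(a, b) for a in A for b in B}

# |A × B| = |A| · |B|
assert len(AxB) == len(A) * len(B)          # 3 · 2 = 6

# Order matters: A × B ≠ B × A when A ≠ B.
assert set(product(B, A)) != AxB
```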

A relation from $A$ to $B$ is any subset $R \subseteq A \times B$. When $(a, b) \in R$, we say that $a$ is related to $b$ by $R$, often written $a \mathrel{R} b$. The domain of $R$ is the set of all first components, $\text{dom}(R) = \{a \in A : (a, b) \in R \text{ for some } b\}$, and the range (or image) is the set of all second components, $\text{ran}(R) = \{b \in B : (a, b) \in R \text{ for some } a\}$. A relation on a single set $A$ — that is, a subset of $A \times A$ — is where the most interesting structural properties emerge.

Three properties of a relation $R$ on $A$ are especially important. $R$ is reflexive if $a \mathrel{R} a$ for every $a \in A$ — every element is related to itself. $R$ is symmetric if $a \mathrel{R} b$ implies $b \mathrel{R} a$ — the relation goes both ways. $R$ is transitive if $a \mathrel{R} b$ and $b \mathrel{R} c$ together imply $a \mathrel{R} c$ — the relation can be chained. A relation that is reflexive, symmetric, and transitive is called an equivalence relation, and equivalence relations are ubiquitous in mathematics. Equality itself is the simplest example, but the concept extends to congruence modulo $n$ in number theory, similarity of geometric figures, isomorphism of algebraic structures, and countless other settings.

Every equivalence relation $\sim$ on a set $A$ partitions $A$ into disjoint equivalence classes: for each element $a \in A$, its equivalence class is $[a] = \{x \in A : x \sim a\}$, and the collection of all equivalence classes forms a partition of $A$ — a family of nonempty, pairwise disjoint subsets whose union is all of $A$. Conversely, every partition of $A$ determines an equivalence relation (declare two elements equivalent if they belong to the same block). This bijection between equivalence relations and partitions is one of the most useful structural results in naive set theory.
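Congruence modulo $n$ makes all of this concrete. A sketch checking the three defining properties and the partition they induce, using congruence mod 3 on $\{0, \ldots, 11\}$ (names ours):

```python
A = set(range(12))
n = 3

def equiv(x, y):
    """x ~ y iff n divides x - y (congruence modulo n)."""
    return (x - y) % n == 0

# Verify that ~ is an equivalence relation on A.
assert all(equiv(a, a) for a in A)                              # reflexive
assert all(equiv(b, a) for a in A for b in A if equiv(a, b))    # symmetric
assert all(equiv(a, c) for a in A for b in A for c in A
           if equiv(a, b) and equiv(b, c))                      # transitive

# Equivalence classes [a] = {x in A : x ~ a}; duplicates collapse.
classes = {frozenset(x for x in A if equiv(x, a)) for a in A}

# The classes form a partition: nonempty, pairwise disjoint, union is A.
assert all(len(c) > 0 for c in classes)
assert all(c == d or not (c & d) for c in classes for d in classes)
assert frozenset().union(*classes) == A
assert len(classes) == n   # the three residue classes mod 3
```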

When a relation is reflexive and transitive but replaces symmetry with antisymmetry — the condition that $a \mathrel{R} b$ and $b \mathrel{R} a$ together imply $a = b$ — the result is a partial order. Partial orders formalize the idea of ranking or hierarchy: the subset relation $\subseteq$ on the power set of any set is a partial order, as is the divisibility relation on positive integers. If, in addition, every two elements are comparable (for any $a, b$, either $a \mathrel{R} b$ or $b \mathrel{R} a$), the partial order is a total order (or linear order) — the usual ordering $\leq$ on the real numbers is the prototypical example.
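Divisibility on a finite set of positive integers illustrates a partial order that is not total. A sketch (the set of divisors of 12 is our choice):

```python
A = {1, 2, 3, 4, 6, 12}

def divides(a, b):
    """a | b for positive integers."""
    return b % a == 0

# Reflexive: every a divides itself.
assert all(divides(a, a) for a in A)
# Antisymmetric: a | b and b | a force a = b (for positive integers).
assert all(a == b for a in A for b in A
           if divides(a, b) and divides(b, a))
# Transitive: a | b and b | c imply a | c.
assert all(divides(a, c) for a in A for b in A for c in A
           if divides(a, b) and divides(b, c))

# Not a total order: 4 and 6 are incomparable under divisibility.
assert not divides(4, 6) and not divides(6, 4)
```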

Finally, a function from $A$ to $B$ is a special kind of relation: it is a subset $f \subseteq A \times B$ such that for every $a \in A$, there is exactly one $b \in B$ with $(a, b) \in f$. This definition — a function as a set of ordered pairs satisfying a uniqueness condition — is one of the great conceptual achievements of the set-theoretic approach to mathematics. It frees the notion of function from any dependence on formulas, rules, or algorithms, and allows us to reason about functions with the full power of set theory. The deep study of functions, their properties, and their role in comparing the sizes of infinite sets is the subject of Functions and Cardinality.
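The “exactly one” condition can be tested directly when a relation is given as a finite set of pairs. A sketch (the helper `is_function` and the example relations are ours):

```python
def is_function(rel, domain):
    """True iff rel pairs each element of domain with exactly one value."""
    return all(sum(1 for (a, b) in rel if a == x) == 1 for x in domain)

A = {1, 2, 3}

f = {(1, 'a'), (2, 'b'), (3, 'a')}   # a function from A (values may repeat)
g = {(1, 'a'), (1, 'b'), (2, 'c')}   # 1 has two images, 3 has none
h = {(1, 'a'), (2, 'b')}             # 3 has no image at all

assert is_function(f, A)
assert not is_function(g, A)
assert not is_function(h, A)
```

Note that repetition among the second components is allowed; uniqueness constrains only how many pairs each domain element appears in.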