Axiomatic Set Theory

The formal foundation of mathematics — ZFC axioms, ordinals, cardinals, and the continuum hypothesis.


Axiomatic set theory is the formal framework in which virtually all of modern mathematics can be expressed and proved. Where naive set theory relies on intuitive notions of “collection” and runs headlong into paradoxes, the axiomatic approach replaces intuition with a precise list of first-order axioms — the Zermelo-Fraenkel axioms with Choice (ZFC) — that govern what sets exist and how they behave. This topic examines the logical structure of ZFC, the models and hierarchies it gives rise to, and the remarkable independence phenomena that reveal the inherent limits of any single axiom system for mathematics.

Set-Theoretic Axioms and Foundations

The need to axiomatize set theory became urgent in 1901, when Bertrand Russell discovered a devastating contradiction in Frege’s system. Consider the set $R = \{x : x \notin x\}$. If $R \in R$, then by definition $R \notin R$; if $R \notin R$, then $R \in R$. This is Russell’s paradox, and it showed that unrestricted set comprehension — the principle that any property determines a set — is inconsistent. The resolution required abandoning the naive “a set is any collection” and replacing it with carefully controlled axioms that specify which sets can be formed and how.

Ernst Zermelo published the first axiomatization in 1908, motivated both by Russell’s paradox and by the desire to place his 1904 proof of the well-ordering theorem on rigorous foundations. Abraham Fraenkel (1922) and Thoralf Skolem (1922) independently identified the need for the axiom schema of replacement, and John von Neumann contributed the axiom of foundation and the modern treatment of ordinals. The resulting system, ZFC, consists of the following axioms, all expressed in the first-order language of set theory with a single binary relation symbol $\in$:

  1. Extensionality. Two sets are equal if and only if they have the same elements: $\forall x\, \forall y\, [\forall z\, (z \in x \leftrightarrow z \in y) \to x = y]$

  2. Foundation (Regularity). Every nonempty set $x$ contains an element disjoint from $x$: $\forall x\, [x \neq \emptyset \to \exists y\, (y \in x \land y \cap x = \emptyset)]$. This prevents pathological structures like $x \in x$ or infinite descending $\in$-chains.

  3. Comprehension (Separation) Schema. For any formula $\varphi(z)$ with parameters, and any set $a$, the collection $\{z \in a : \varphi(z)\}$ is a set: $\forall a\, \exists b\, \forall z\, [z \in b \leftrightarrow (z \in a \land \varphi(z))]$. This is the safe replacement for unrestricted comprehension: you can only separate elements from an already-existing set, not conjure a set from an arbitrary property.

  4. Pairing. For any two sets $a$ and $b$, the pair $\{a, b\}$ exists: $\forall a\, \forall b\, \exists c\, \forall z\, [z \in c \leftrightarrow (z = a \lor z = b)]$

  5. Union. For any set $a$, the union $\bigcup a = \{z : \exists y\, (z \in y \land y \in a)\}$ exists.

  6. Power Set. For any set $a$, the power set $\mathcal{P}(a) = \{z : z \subseteq a\}$ exists.

  7. Infinity. There exists an infinite set. Specifically, there exists a set $\omega$ containing $\emptyset$ and closed under the successor operation $x \mapsto x \cup \{x\}$: $\exists \omega\, [\emptyset \in \omega \;\land\; \forall x\, (x \in \omega \to x \cup \{x\} \in \omega)]$

  8. Replacement Schema. If $\varphi(x, y)$ is a functional formula (for every $x$ in a set $a$ there is a unique $y$ with $\varphi(x, y)$), then the image $\{y : \exists x \in a\, \varphi(x, y)\}$ is a set. This is strictly stronger than separation and is needed to construct sets of high rank, such as $V_{\omega + \omega}$.

  9. Choice (AC). For every family of nonempty sets, there exists a function selecting one element from each: $\forall a\, [\emptyset \notin a \to \exists f : a \to \bigcup a\;\; \forall x \in a\, (f(x) \in x)]$

Each axiom addresses a specific need. Extensionality ensures that sets are determined by their elements, not by how they are described. Foundation gives the universe of sets its well-founded, hierarchical character. Comprehension and replacement control which new sets can be formed. Pairing, union, and power set provide basic closure operations. Infinity guarantees the existence of the natural numbers. Choice, the most controversial axiom historically, is equivalent to Zorn’s lemma and the well-ordering theorem, and is indispensable in algebra, analysis, and topology.
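The successor operation $x \mapsto x \cup \{x\}$ from the Infinity axiom can be tried out concretely. A minimal sketch in Python, assuming frozensets stand in for hereditarily finite pure sets; the names `succ` and `ordinal` are illustrative, not standard:

```python
# Von Neumann naturals built from the empty set, a finite sketch:
# 0 = {} and succ(x) = x ∪ {x}, so n = {0, 1, ..., n-1}.

def succ(x: frozenset) -> frozenset:
    """The successor operation x ↦ x ∪ {x} from the Axiom of Infinity."""
    return frozenset(x | {x})

def ordinal(n: int) -> frozenset:
    """The n-th von Neumann natural, obtained by iterating succ from ∅."""
    x = frozenset()
    for _ in range(n):
        x = succ(x)
    return x

# Each natural n has exactly n elements, and m < n holds iff m ∈ n.
three = ordinal(3)
print(len(three))           # 3
print(ordinal(1) in three)  # True
```

Note how the order relation is literally membership: no separate “<” needs to be defined, which is exactly why von Neumann’s treatment of ordinals became standard.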

The axioms of ZFC organize the set-theoretic universe into the cumulative hierarchy. Define by transfinite recursion:

$$V_0 = \emptyset, \qquad V_{\alpha + 1} = \mathcal{P}(V_\alpha), \qquad V_\lambda = \bigcup_{\alpha < \lambda} V_\alpha \quad \text{for limit } \lambda$$

The universe of all sets is then $V = \bigcup_{\alpha \in \text{Ord}} V_\alpha$, where $\text{Ord}$ denotes the class of all ordinals. Foundation guarantees that every set appears at some level: for every set $x$, there is a least ordinal $\alpha$ such that $x \in V_{\alpha + 1}$, called the rank of $x$. The cumulative hierarchy is not just a visualization tool — it is the structural backbone of ZFC and the starting point for understanding inner models and forcing extensions.
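The first few levels of this hierarchy are small enough to compute literally. A sketch, again assuming Python frozensets model pure sets; the sizes grow as iterated powers of two, so the computation stops quickly:

```python
# The finite levels V_0, ..., V_4 of the cumulative hierarchy, computed
# with frozensets. |V_{n+1}| = 2^|V_n|, giving sizes 0, 1, 2, 4, 16, ...
from itertools import chain, combinations

def power_set(s: frozenset) -> frozenset:
    """P(s): the set of all subsets of s, each again a frozenset."""
    elems = list(s)
    return frozenset(
        frozenset(c)
        for c in chain.from_iterable(
            combinations(elems, r) for r in range(len(elems) + 1)))

V = [frozenset()]                 # V_0 = ∅
for _ in range(4):
    V.append(power_set(V[-1]))    # V_{n+1} = P(V_n)

print([len(level) for level in V])   # [0, 1, 2, 4, 16]
```

Already $|V_5| = 2^{16} = 65536$ and $|V_6| = 2^{65536}$, a first hint of why the power set operation is the engine of the universe’s growth.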

An important metatheoretic tool is the reflection principle: for any finite list of first-order sentences true in $V$, there exists an ordinal $\alpha$ such that $V_\alpha$ satisfies those same sentences. Reflection provides a way to “localize” arguments about the entire universe to arguments about set-sized initial segments, and it underlies many constructions in large cardinal theory.

Constructible Universe and Inner Models

In 1938, Kurt Gödel achieved one of the landmark results of twentieth-century mathematics by constructing a model of set theory in which both the Axiom of Choice and the Generalized Continuum Hypothesis hold. His tool was the constructible universe $L$, a “thin” inner model of $V$ in which every set is explicitly definable from those that came before.

The constructible hierarchy is defined by transfinite recursion, paralleling the cumulative hierarchy but replacing the power set with the definable power set:

$$L_0 = \emptyset, \qquad L_{\alpha + 1} = \text{Def}(L_\alpha), \qquad L_\lambda = \bigcup_{\alpha < \lambda} L_\alpha \quad \text{for limit } \lambda$$

Here $\text{Def}(M)$ denotes the set of all subsets of $M$ that are first-order definable over $M$ with parameters from $M$. The constructible universe is then $L = \bigcup_{\alpha \in \text{Ord}} L_\alpha$. Since $\text{Def}(M) \subseteq \mathcal{P}(M)$, we always have $L_\alpha \subseteq V_\alpha$ and hence $L \subseteq V$. The axiom $V = L$ asserts that every set is constructible — that the universe is as “thin” as possible.

The key structural result about $L$ is the condensation lemma: if $X$ is an elementary substructure of some $L_\alpha$ (written $X \prec L_\alpha$), then the Mostowski collapse of $X$ is isomorphic to some $L_\beta$ with $\beta \leq \alpha$. This lemma is the engine behind most of the important properties of $L$. It implies, for instance, that $L$ satisfies the Generalized Continuum Hypothesis (GCH): $2^{\aleph_\alpha} = \aleph_{\alpha + 1}$ for all ordinals $\alpha$. The argument proceeds by showing that every subset of $\omega$ that appears in $L$ is definable at a countable level, so there are at most $\aleph_1$ subsets of $\omega$ in $L$.
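The counting behind this argument can be displayed as a single chain, sketched here for the case $\alpha = 0$: by Löwenheim–Skolem and condensation, every real of $L$ already appears in some countable level $L_\beta$, so

```latex
\mathcal{P}(\omega) \cap L \;\subseteq\; L_{\omega_1}
\quad\Longrightarrow\quad
\bigl(2^{\aleph_0}\bigr)^{L} \;\le\; |L_{\omega_1}| \;=\; \aleph_1 ,
```

since each $L_\beta$ with $\beta$ countable is itself countable, and there are only $\aleph_1$ countable levels. The same pattern, one cardinal up at a time, yields the full GCH in $L$.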

A central concept in the study of $L$ is absoluteness. A formula $\varphi$ is absolute between two transitive models $M \subseteq N$ if for all parameters from $M$, $\varphi$ holds in $M$ if and only if it holds in $N$. The $\Delta_1$ formulas — those that are both $\Sigma_1$ and $\Pi_1$ in the Levy hierarchy — are absolute between transitive models that satisfy enough of ZFC. Absoluteness is what allows results proved in $L$ to be transferred to $V$: if $\varphi$ is absolute and provable in $L$, then it holds in $V$ as well.

Gödel’s achievement was a relative consistency result: if ZF is consistent, then so is ZF + $V = L$, and hence ZFC + GCH. The proof works by showing that $L$ is a model of ZF that additionally satisfies $V = L$. Since $V = L$ implies both AC and GCH, the consistency of these principles follows. This was the first half of the independence of CH; the second half — showing that $\neg$CH is also consistent — would have to wait twenty-five years for Cohen’s forcing.

Inner model theory extends Gödel’s ideas by constructing models between $L$ and $V$ that accommodate large cardinal axioms. The model $L[U]$, built from a normal measure $U$ on a measurable cardinal, was studied in depth by Kenneth Kunen around 1970. More sophisticated constructions — the core model $K$ developed by Anthony Dodd and Ronald Jensen, and later extended by John Steel and others — provide canonical inner models for increasingly large cardinals. The covering lemma of Jensen states that if $0^\#$ does not exist (a condition related to the non-existence of certain large cardinals), then $V$ is “close to” $L$ in a precise sense: every uncountable set of ordinals in $V$ can be covered by a constructible set of the same cardinality.

Descriptive Set Theory

Descriptive set theory studies the structure of “definable” subsets of Polish spaces — complete, separable metric spaces such as $\mathbb{R}$, the Cantor space $2^\omega$, and the Baire space $\omega^\omega$. The central question is: which regularity properties (Lebesgue measurability, the Baire property, the perfect set property) do definable sets possess, and how does the answer depend on the complexity of the definition?

The starting point is the Borel hierarchy. A subset of a Polish space $X$ is Borel if it belongs to the $\sigma$-algebra generated by the open sets. The Borel sets are stratified into levels indexed by countable ordinals. The open sets form the class $\Sigma^0_1$ and the closed sets form $\Pi^0_1$. At the next level, $\Sigma^0_2$ consists of countable unions of closed sets ($F_\sigma$ sets) and $\Pi^0_2$ consists of countable intersections of open sets ($G_\delta$ sets). In general, for a countable ordinal $\alpha > 0$:

$$\Sigma^0_\alpha = \left\{\bigcup_{n \in \omega} A_n : A_n \in \Pi^0_{\beta_n}, \; \beta_n < \alpha \right\}$$

$$\Pi^0_\alpha = \{X \setminus A : A \in \Sigma^0_\alpha\}$$

The hierarchy is proper: at each level $\alpha < \omega_1$, there exist sets in $\Sigma^0_\alpha$ that are not in $\Pi^0_\alpha$ and vice versa. All Borel sets are Lebesgue measurable, have the Baire property, and satisfy the perfect set property (every uncountable Borel set contains a perfect subset and hence has cardinality $2^{\aleph_0}$).
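A classical example already witnesses properness at the second level: the rationals are $\Sigma^0_2$ but not $\Pi^0_2$. As a countable union of closed singletons,

```latex
\mathbb{Q} \;=\; \bigcup_{q \in \mathbb{Q}} \{q\} \;\in\; \Sigma^0_2 \ (F_\sigma),
\qquad
\mathbb{R} \setminus \mathbb{Q} \;=\; \bigcap_{q \in \mathbb{Q}} \bigl(\mathbb{R} \setminus \{q\}\bigr) \;\in\; \Pi^0_2 \ (G_\delta).
```

By the Baire category theorem, a dense $G_\delta$ subset of $\mathbb{R}$ is comeager, while $\mathbb{Q}$ is meager; hence $\mathbb{Q} \notin \Pi^0_2$.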

Beyond the Borel sets lies the projective hierarchy. A set $A \subseteq X$ is analytic (or $\Sigma^1_1$) if it is the continuous image of a Borel set, equivalently the projection of a Borel subset of $X \times \omega^\omega$. The coanalytic sets ($\Pi^1_1$) are the complements of analytic sets. The projective hierarchy continues: $\Sigma^1_{n+1}$ sets are projections of $\Pi^1_n$ sets, and $\Pi^1_{n+1}$ sets are their complements. Every Borel set is both $\Sigma^1_1$ and $\Pi^1_1$. The classical theorem of Mikhail Suslin (1917) establishes that the Borel sets are exactly the sets that are simultaneously analytic and coanalytic: $\mathbf{\Delta}^1_1 = \text{Borel}$.

Analytic sets retain many regularity properties: they are Lebesgue measurable, have the Baire property, and satisfy the perfect set property (a result due to Suslin). However, the situation at the $\Sigma^1_2$ level and beyond becomes sensitive to the axioms. In Gödel’s $L$, there exist $\Sigma^1_2$ sets without the perfect set property, and there exist $\Delta^1_2$ sets that are not Lebesgue measurable. These results show that ZFC alone cannot decide the regularity of projective sets — resolving the question requires either large cardinal axioms or determinacy hypotheses.

Effective descriptive set theory refines the boldface hierarchy by restricting to computable or arithmetically definable operations. The lightface classes $\Sigma^0_n$ and $\Pi^0_n$ correspond to the arithmetic hierarchy from computability theory: a set is lightface $\Sigma^0_1$ if it is computably enumerable, and so on. At the analytic level, the lightface $\Sigma^1_1$ sets are exactly the sets definable by a $\Sigma^1_1$ formula without parameters, and the lightface $\Delta^1_1$ sets coincide with the hyperarithmetic sets (the Suslin-Kleene theorem, the effective analogue of Suslin’s $\mathbf{\Delta}^1_1 = \text{Borel}$). The effective theory provides a bridge between descriptive set theory and computability theory, connecting the structure of definable sets of reals to the recursion-theoretic hierarchy.

Determinacy and Large Cardinals in Logic

Consider a two-player infinite game on natural numbers associated to a set $A \subseteq \omega^\omega$: Players I and II alternately choose natural numbers $a_0, b_0, a_1, b_1, \ldots$, producing a sequence $x = (a_0, b_0, a_1, b_1, \ldots) \in \omega^\omega$. Player I wins if $x \in A$; otherwise Player II wins. The set $A$ is determined if one of the two players has a winning strategy — a function that tells that player how to move at each turn, guaranteeing victory regardless of the opponent’s play.
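For games of finite length, determinacy is elementary (Zermelo-style backward induction), and small cases can be checked by brute force. A sketch, assuming moves are restricted to $\{0, 1\}$ so the game tree is finite; the function name is illustrative. The infinite games in the text are precisely where this exhaustive search ceases to exist:

```python
# Backward induction over a finite game tree: play stops after 2n moves,
# Player I moves at even positions, Player II at odd positions, and
# Player I wins iff the resulting sequence lies in the payoff set A.

def player_I_wins(A, n, prefix=()):
    """True iff Player I has a winning strategy from this position,
    where A is a predicate on length-2n move sequences."""
    if len(prefix) == 2 * n:
        return A(prefix)
    outcomes = [player_I_wins(A, n, prefix + (m,)) for m in (0, 1)]
    if len(prefix) % 2 == 0:
        return any(outcomes)   # Player I needs some winning move
    else:
        return all(outcomes)   # Player I must survive every reply

# Player I wins iff both of her own moves (positions 0 and 2) are 1:
# she can simply play 1 twice, so she has a winning strategy.
print(player_I_wins(lambda x: x[0] == 1 and x[2] == 1, 2))   # True

# Player I wins iff Player II ever plays 1: II always plays 0 and wins.
print(player_I_wins(lambda x: 1 in (x[1], x[3]), 2))         # False
```

The `any`/`all` alternation is exactly the quantifier alternation “there exists a move for I such that for all moves of II …”, which in the infinite setting becomes the (generally unprovable) assertion of determinacy.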

The Axiom of Determinacy (AD) asserts that every subset of $\omega^\omega$ is determined. AD is incompatible with the full Axiom of Choice (using AC, one can construct non-determined sets via a well-ordering of the reals), but it has remarkably strong consequences for the regularity of all sets of reals: under AD, every set of reals is Lebesgue measurable, has the Baire property, and satisfies the perfect set property. AD thus provides an alternative foundational framework in which all sets of reals are “well-behaved.”

The first breakthrough connecting determinacy to ZFC was Donald Martin’s proof (1975) that Borel determinacy is a theorem of ZFC. This is a difficult result — Martin showed that the proof requires the power set axiom applied $\alpha$ times for Borel sets of rank $\alpha$, and Harvey Friedman proved that Borel determinacy cannot be established in Zermelo set theory (ZFC without replacement), demonstrating that the replacement schema is genuinely needed. Borel determinacy is thus a natural mathematical theorem whose proof requires the full strength of ZFC.

Beyond the Borel level, determinacy requires large cardinal hypotheses. Large cardinal axioms assert the existence of cardinals with strong closure, reflection, or embedding properties that cannot be proved to exist in ZFC alone. The hierarchy includes:

  • Inaccessible cardinals: $\kappa$ is inaccessible if it is uncountable, regular, and a strong limit ($2^\lambda < \kappa$ for all $\lambda < \kappa$). The existence of an inaccessible cardinal implies the consistency of ZFC, so by the second incompleteness theorem, ZFC cannot prove that inaccessible cardinals exist.
  • Measurable cardinals: $\kappa$ is measurable if there exists a non-principal $\kappa$-complete ultrafilter on $\kappa$, equivalently if there is a non-trivial elementary embedding $j : V \to M$ with critical point $\kappa$. Measurable cardinals imply $\Sigma^1_1$-determinacy.
  • Woodin cardinals: $\delta$ is Woodin if for every function $f : \delta \to \delta$, there exist a cardinal $\kappa < \delta$ closed under $f$ and an elementary embedding $j : V \to M$ with critical point $\kappa$ such that $V_{j(f)(\kappa)} \subseteq M$. Woodin cardinals are the key to projective determinacy.
  • Supercompact cardinals: $\kappa$ is supercompact if for every $\lambda \geq \kappa$, there exists an elementary embedding $j : V \to M$ with critical point $\kappa$ such that $M$ is closed under $\lambda$-sequences.

The deep connection between large cardinals and determinacy is one of the central themes of modern set theory. Woodin’s theorem (building on work of Martin and Steel) establishes that if there exist infinitely many Woodin cardinals with a measurable cardinal above them all, then Projective Determinacy (PD) holds — every projective set is determined. PD in turn implies that all projective sets are Lebesgue measurable and have the perfect set property, resolving questions that ZFC alone leaves open. The large cardinal hierarchy thus provides a natural calibration of the logical strength needed to establish determinacy at each level of the projective hierarchy: $\Sigma^1_1$-determinacy follows from the existence of a measurable cardinal, $\Pi^1_2$-determinacy from a Woodin cardinal with a measurable above it, and full PD from infinitely many Woodin cardinals.

Independence Results

The independence of the Continuum Hypothesis from ZFC is one of the most significant achievements in the history of mathematics, resolved through two complementary results separated by twenty-five years.

The first half, as discussed above, was Gödel’s 1938 result: if ZFC is consistent, then ZFC + CH is consistent. The proof proceeds by showing that the constructible universe $L$ is a model of ZFC in which GCH holds. Since every model of ZFC contains $L$ as an inner model, the consistency of ZFC implies the consistency of ZFC + GCH. Gödel himself believed that CH was false and that new axioms would eventually settle the question, but his consistency proof showed that any such refutation could not be carried out within ZFC.

The second half came in 1963, when Paul Cohen invented the method of forcing and used it to construct a model of ZFC in which CH fails. Cohen’s work earned him the Fields Medal in 1966 — the only Fields Medal ever awarded for a result in mathematical logic. The forcing method is arguably the most powerful technique in all of set theory, and its invention transformed the field.

The basic idea of forcing is to extend a countable transitive model $M$ of ZFC (a “ground model”) by adjoining a new set $G$ — called a generic filter — to obtain a larger model $M[G]$ that still satisfies ZFC but may satisfy new statements that $M$ does not. The construction proceeds as follows. Choose a partial order $(\mathbb{P}, \leq)$ in $M$, whose elements are called forcing conditions. A filter $G \subseteq \mathbb{P}$ is $M$-generic if $G$ meets every dense set $D \subseteq \mathbb{P}$ that belongs to $M$. A set $D$ is dense in $\mathbb{P}$ if every condition $p \in \mathbb{P}$ has an extension in $D$. The Rasiowa-Sikorski lemma guarantees that $M$-generic filters exist whenever $M$ is countable.
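The Rasiowa-Sikorski construction is simple enough to animate for finitely many dense sets. A sketch, assuming Cohen-style conditions (finite partial functions $\omega \to \{0,1\}$, stored as Python dicts) and dense sets presented as extension procedures; all names here are illustrative:

```python
# Rasiowa-Sikorski, finitely truncated: thread a condition through a
# list of dense sets, extending it into each one in turn. Conditions
# are finite partial functions ω → {0,1}; extension = adding entries
# (the partial order is reverse inclusion).

def meet_dense_sets(dense_sets, p=None):
    """Return a condition extending p that lies in every listed dense set."""
    p = dict(p or {})               # start from the empty condition
    for extend_into in dense_sets:
        p = extend_into(p)          # each step lands inside one dense set
    return p

def defined_at(n):
    """The dense set D_n = {q : n ∈ dom(q)}: any condition extends into
    D_n by choosing a value at n (here 0) if none is present yet."""
    def extend(p):
        q = dict(p)
        q.setdefault(n, 0)
        return q
    return extend

# Meeting D_0, ..., D_4 decides the first five bits of the "generic" real.
p = meet_dense_sets([defined_at(n) for n in range(5)])
print(sorted(p.items()))   # [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
```

Running through all of the countably many dense sets of a countable ground model in this fashion is exactly the proof of the Rasiowa-Sikorski lemma: the generic filter is the set of conditions extended by some member of the resulting chain.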

The forcing relation $p \Vdash \varphi$ (“$p$ forces $\varphi$”) is defined by recursion on the complexity of $\varphi$ and has the crucial property that a sentence $\varphi$ holds in $M[G]$ if and only if some condition $p \in G$ forces $\varphi$. The forcing relation is definable within $M$, even though the generic filter $G$ is not an element of $M$. This allows properties of the extension $M[G]$ to be analyzed entirely within the ground model.

To prove the independence of CH, Cohen used Cohen forcing, where $\mathbb{P}$ consists of finite partial functions from $\omega_2 \times \omega$ to $\{0, 1\}$, ordered by reverse inclusion. Each generic filter $G$ determines $\aleph_2$ new subsets of $\omega$ (new reals), and the resulting model satisfies $2^{\aleph_0} \geq \aleph_2$, which contradicts CH. A careful argument (using the countable chain condition of Cohen forcing) shows that the construction preserves all cardinals, so $\aleph_2$ in the extension is the same as $\aleph_2$ in the ground model.
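Where the new reals come from can be written down directly; a sketch, reading the union of the generic filter as a single total function $F = \bigcup G : \omega_2 \times \omega \to \{0, 1\}$:

```latex
x_\alpha \;=\; \{\, n \in \omega : F(\alpha, n) = 1 \,\}
\qquad (\alpha < \omega_2).
```

For $\alpha \neq \beta$, the set of conditions forcing $x_\alpha$ and $x_\beta$ to disagree at some $n$ is dense, so genericity makes all $\aleph_2$ of these reals pairwise distinct.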

An elegant reformulation of forcing uses Boolean-valued models. Instead of working with a partial order and a generic filter, one replaces the two-valued truth values $\{0, 1\}$ with a complete Boolean algebra $\mathbb{B}$ and constructs a model $V^{\mathbb{B}}$ in which every sentence receives a truth value in $\mathbb{B}$. The sentence $\varphi$ holds in the forcing extension if and only if its Boolean value $\|\varphi\|$ is nonzero. This approach, developed by Dana Scott and Robert Solovay shortly after Cohen’s work, eliminates the need to work with generic filters and makes the algebraic structure of forcing more transparent.

The forcing method has yielded a vast landscape of independence results beyond CH. Suslin’s hypothesis (that every complete dense linear order without endpoints satisfying the countable chain condition is isomorphic to $\mathbb{R}$) was shown to be independent of ZFC: Robert Solovay and Stanley Tennenbaum (1971) forced a model in which it holds, while Ronald Jensen showed that it fails in $L$. Martin’s axiom (MA), a combinatorial principle that holds in models obtained by certain iterated forcings, has become a standard tool; it implies many consequences of CH for sets of reals without implying CH itself. The current frontier includes the study of forcing axioms such as the Proper Forcing Axiom (PFA) and Martin’s Maximum (MM), introduced by Saharon Shelah and by Matthew Foreman, Menachem Magidor, and Shelah in the 1980s. These axioms, which assert that generic filters exist for broad classes of forcing notions, have strong structural consequences — for instance, MM implies $2^{\aleph_0} = \aleph_2$ — and they represent candidates for natural extensions of ZFC that might eventually settle open problems about the combinatorics of the continuum.

The independence phenomena revealed by Gödel and Cohen fundamentally reshaped the philosophy of mathematics. They showed that the ZFC axioms, while sufficient to formalize virtually all of ordinary mathematics, leave deep questions about infinite sets genuinely undetermined. The search for new axioms — whether large cardinals, determinacy principles, or forcing axioms — that might resolve these questions in a natural and convincing way remains one of the central programs of contemporary set theory and mathematical logic.