Extremal Combinatorics
Ramsey theory, Turán-type problems, and the Erdős-Ko-Rado theorem.
Extremal combinatorics asks a deceptively simple question: how large or small can a combinatorial structure be if it must avoid some forbidden configuration? The field sits at the intersection of combinatorics, graph theory, and probability, producing theorems of striking precision — sharp numerical bounds that reveal the hidden order in discrete systems. From the pigeonhole principle to some of the deepest results of twentieth-century mathematics, extremal combinatorics has become one of the richest and most active areas in all of discrete mathematics.
Ramsey Theory
The starting point for Ramsey theory is an observation that feels almost philosophical: in any sufficiently large system, complete disorder is impossible. No matter how chaotically you color or label the elements of a large enough structure, some organized substructure must emerge. This idea, formalized by the British mathematician Frank Plumpton Ramsey in a 1930 paper on mathematical logic, has grown into an entire branch of combinatorics.
The simplest version is the party problem, or the computation of the Ramsey number . This number is defined as the smallest integer such that any red-blue coloring of the edges of the complete graph must contain either a red or a blue . The question is: how big must a party be before it is guaranteed to contain either mutual friends or mutual strangers? Ramsey proved that these numbers are always finite — the existence theorem alone was his contribution; computing the actual values turns out to be extraordinarily hard.
The first non-trivial case is . In a group of six people, there must always exist three who all know each other or three who are all strangers. The proof is a clean application of the pigeonhole principle: fix any person ; among the five others, by pigeonhole, either at least three are friends of or at least three are strangers. In either case one can show a monochromatic triangle is forced. Checking that five people are not sufficient requires exhibiting a 2-coloring of with no monochromatic triangle, which is possible using a cycle-based construction. Beyond the small cases, exact values are known only for for a handful of , and the general diagonal numbers remain one of the great open problems in combinatorics. The current state of knowledge places and somewhere between and .
To prove that is always finite, Ramsey’s original argument gives the recursive upper bound
from which one derives . For the diagonal numbers this gives . The matching lower bound — showing that is at least exponential in — was established by Paul Erdős in 1947 using the probabilistic method. Color each edge of red or blue independently and uniformly at random. The probability that any fixed set of vertices spans a monochromatic clique is . Since there are such sets, the expected number of monochromatic -cliques is less than one when . This non-constructive argument — existence proven through expectation — launched the entire probabilistic method as a technique in combinatorics, a tool that would prove indispensable for decades to come.
Ramsey theory extends far beyond graphs. Van der Waerden’s theorem (1927) asserts that for any positive integers and , there is a number such that any -coloring of must contain a monochromatic arithmetic progression of length . The infinite version of Ramsey’s theorem, also due to Ramsey, states that for any coloring of all -element subsets of the natural numbers with finitely many colors, there is an infinite monochromatic subset — a set all of whose -element subsets receive the same color. Ramsey theory thus encompasses a family of results with a unified moral: large enough structures are never truly chaotic.
Extremal Graph Theory and Turan’s Theorem
While Ramsey theory asks when an ordered substructure must appear, extremal graph theory asks a more quantitative question: what is the maximum number of edges a graph on vertices can have without containing a particular subgraph ? This quantity is denoted and called the Turán number of .
The foundational result is Turán’s theorem, proved by the Hungarian mathematician Pál Turán in 1941 while he was interned in a forced labor camp. The theorem gives a complete answer when the forbidden subgraph is a complete graph . The unique -free graph on vertices with the most edges is the Turán graph , which is the complete -partite graph with part sizes as equal as possible. The number of edges in is
and more precisely up to lower-order terms. Turán’s proof combined a clever double-counting argument with an optimization over partitions, and the theorem has since been re-proved in many elegant ways, including a beautiful two-line proof using a probabilistic reweighting technique.
The case deserves special attention: Turán’s theorem says that a triangle-free graph on vertices can have at most edges, achieved uniquely by the complete bipartite graph . This is Mantel’s theorem, first proved by Willem Mantel in 1907, and it is one of the oldest results in extremal graph theory.
When the forbidden subgraph is bipartite, Turán numbers behave differently — they grow sub-quadratically. The Kővári-Sós-Turán theorem (1954) addresses the case of the complete bipartite graph with . It states that
This is one of the key tools for bounding edges in graphs that avoid certain dense bipartite subgraphs, with applications throughout incidence geometry and additive combinatorics.
A central modern tool in extremal graph theory is Szemerédi’s regularity lemma, proved by Endre Szemerédi in 1975. It says that the vertex set of any large graph can be partitioned into a bounded number of roughly equal parts such that most pairs of parts behave “pseudorandomly” — the edge density between them is approximately uniform across subsets. Formally, a pair of vertex sets is -regular if for all and with and , the edge density satisfies . The lemma guarantees that for any , the vertices can be partitioned into at most parts with all but of pairs being regular. Although the bound is a tower function and thus enormous, the lemma is extraordinarily powerful as a structural tool, enabling the so-called graph removal lemma and serving as the engine behind Szemerédi’s theorem on arithmetic progressions.
Turán density captures the asymptotic fraction of edges a graph can have while remaining -free. For complete graphs , Turán’s theorem gives . For bipartite graphs , the Kővári-Sós-Turán theorem implies . Computing or even estimating Turán density for specific hypergraphs remains one of the hardest open problems in combinatorics.
Extremal Set Theory
Extremal set theory asks analogous questions for families of sets rather than graphs: given a constraint on how sets in a family may intersect (or avoid intersecting), how large can the family be? This field was shaped in large part by the extraordinary insight and prolific output of Paul Erdős, who collaborated with researchers across the world over more than six decades.
The first landmark result is Sperner’s theorem (1928), proved by the German mathematician Emanuel Sperner. A family of subsets of is an antichain (also called a Sperner family) if no set in is a subset of another. Sperner proved that the maximum size of an antichain is , achieved by taking all subsets of size . The proof uses the beautiful LYM inequality (named after Lubell, Yamamoto, and Meshalkin, who found it independently in the 1950s and 60s): for any antichain ,
This inequality follows from counting maximal chains in the Boolean lattice . Each maximal chain has orderings, there are maximal chains in total (corresponding to permutations of ), and a chain can pass through at most one member of an antichain. The bound on antichain size then follows immediately from the LYM inequality by noting that .
The Erdős-Ko-Rado theorem (1961), proved by Erdős, Ko, and Rado after more than twenty years of effort on the problem, addresses intersecting families — collections of -element subsets of such that every two sets in the family share at least one element. The theorem states that if , then any such family has size at most . Equality holds for the family of all -element subsets containing a fixed element, called a star. The condition is necessary: when any two -subsets of automatically intersect (by pigeonhole), and the entire family of all subsets is intersecting.
The original proof by Erdős, Ko, and Rado used a beautiful cyclic argument: arrange the elements of in a random cyclic order, and count how many members of can form consecutive arcs of length in that order. Each cyclic order contributes at most arcs in , and the averaging argument gives the bound. A later proof by Katona (1972) is even more elegant and is widely regarded as one of the most beautiful arguments in combinatorics. The EKR theorem has spawned an enormous industry of intersection theorems: versions for -intersecting families (where any two sets share at least elements), versions for vector spaces and permutations, and analytical generalizations using spectral methods.
The concept of shifting (also called compression) is the key technical tool underlying many extremal set theory arguments. Given a family of sets and indices , the shift operation replaces each set containing but not with , provided this new set is not already in . Shifting reduces the family toward the “initial segment” of the lexicographic order without decreasing its size or destroying the intersection property, allowing one to assume without loss of generality that the extremal family has a particularly clean structure.
Fisher’s inequality, another landmark of this area, states that if is a family of subsets of such that every two distinct sets in have the same intersection size , and all sets in have the same size , then . The proof is a linear algebra argument: consider the incidence vectors of the sets as vectors in and show they are linearly independent.
Forbidden Substructures and Stability
Once we know the maximum size of a combinatorial structure that avoids some forbidden configuration, the next natural question is: what does the extremal structure look like? And more subtly, are all near-extremal structures structurally close to the actual extremal ones? The theory that addresses this is called stability theory in extremal combinatorics.
The prototype is the stability version of Turán’s theorem, due to Erdős and Simonovits. The Erdős-Simonovits stability theorem says that if is a -free graph on vertices with edges, then can be made into the Turán graph by adding and deleting edges. In other words, any graph that is nearly as dense as possible while remaining -free must be nearly -partite. The proof uses the regularity lemma: apply Szemerédi’s lemma to , observe that most pairs of parts are regular with density close to , then use the extremality to deduce the approximate multipartite structure.
Stability results transform qualitative existence theorems into structural characterizations, and they have been obtained for many other forbidden subgraphs and hypergraphs. The general pattern: an extremal configuration is unique (or belongs to a small explicit class), and near-extremal configurations are structurally close to it. This is analogous to stability results in functional analysis, where the only functions achieving equality in an inequality are exactly the functions one expects, and nearby functions have a similar form.
The sunflower lemma, proved by Erdős and Rado in 1960, is another fundamental result about forbidden substructures in set families. A sunflower (or -system) with petals is a family of sets such that all pairwise intersections equal a common core for . The sunflower lemma states: if is a family of more than sets each of size , then contains a sunflower with petals. Equivalently, any sufficiently large uniform set family must contain a structured “flower.” The lemma is a powerful tool in combinatorics, complexity theory, and the theory of algorithms, and the question of whether the factorial factor in the bound can be replaced by something smaller is the sunflower conjecture — one of the major open problems in combinatorics, attracting renewed attention from computer scientists in recent years due to its implications for circuit lower bounds.
For hypergraph Turán problems, almost nothing is known. The hypergraph Turán density for a -uniform hypergraph is defined analogously to the graph case, but determining even a single value for and a non-trivial is open in most cases. Turán himself conjectured in 1941 that , where is the complete 3-uniform hypergraph on 4 vertices (i.e., all triples). Despite enormous effort by many researchers over eight decades, this conjecture remains open. The difficulty of hypergraph Turán problems reflects a genuine gap in our understanding: the techniques that work beautifully for graphs — the regularity lemma, stability arguments, algebraic methods — do not transfer cleanly to the hypergraph setting.
The flag algebra method, introduced by Alexander Razborov in 2007, represents the most significant methodological advance in extremal combinatorics in many years. The idea is to express extremal inequalities as positive semidefinite conditions on algebras of “typed” combinatorial objects, then use semidefinite programming to find certificates for those inequalities. The method has resolved a number of previously intractable problems, including sharp bounds on hypergraph Turán densities in special cases and new results in Ramsey multiplicity. It represents a striking confluence of combinatorics, linear algebra, and optimization, and it continues to be an active area of development. Flag algebras, together with the classical regularity method, stability theory, and the probabilistic method, constitute the modern toolkit with which extremal combinatorics confronts its hardest open questions — questions that lie at the boundary between what is known and what can be imagined.