Probabilistic & Algebraic Methods

The probabilistic method, Lovász local lemma, and symmetric functions.


Probabilistic combinatorics is the art of proving that certain mathematical objects must exist — not by constructing them explicitly, but by demonstrating that a randomly chosen object has the desired properties with positive probability. This counter-intuitive approach, developed in the mid-twentieth century largely by Paul Erdős, transformed combinatorics from a discipline of direct construction into one where randomness itself becomes a proof technique of extraordinary power. The methods that emerged — concentration inequalities, local lemmas, threshold functions, and derandomization — now permeate theoretical computer science, information theory, and statistical physics.

The Probabilistic Method

The probabilistic method rests on a single observation: if a random object from some probability space has a positive probability of satisfying a desired property, then at least one such object exists. In the same spirit, if the expected value of some quantity is less than a threshold, then there exists a choice where the quantity is below that threshold. This sounds almost trivially obvious, yet it unlocks existence proofs for objects that are utterly inaccessible to explicit construction.

The method was essentially born in 1947 when Erdős used it to establish a lower bound on Ramsey numbers. Recall that $R(k,k)$ denotes the smallest $n$ such that every red-blue coloring of the edges of the complete graph $K_n$ contains a monochromatic $K_k$. Erdős asked: how large can $n$ be while still allowing a coloring with no monochromatic $K_k$? He colored the edges of $K_n$ uniformly at random, each edge independently red or blue with equal probability. For a fixed set $S$ of $k$ vertices, the probability that all $\binom{k}{2}$ edges among $S$ are the same color is $2 \cdot (1/2)^{\binom{k}{2}} = 2^{1-\binom{k}{2}}$. By the union bound over all $\binom{n}{k}$ such sets:

$$\Pr[\text{some monochromatic } K_k] \le \binom{n}{k} \cdot 2^{1-\binom{k}{2}}.$$

When this expression is less than 1, a valid coloring exists. Choosing $n = \lfloor 2^{k/2} \rfloor$ drives the right-hand side below 1, giving the lower bound

$$R(k,k) > 2^{k/2}.$$

This is the first moment method: bound the expected number of “bad” events and show it is less than 1. When the expected number of violations is less than 1, the probability that no violation occurs is positive, so a good configuration exists.
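The union bound computation is easy to carry out numerically. A minimal sketch (the function name is mine) that evaluates the bound at $n = \lfloor 2^{k/2} \rfloor$ for a few values of $k$:

```python
from math import comb

def ramsey_union_bound(n: int, k: int) -> float:
    """Upper bound on Pr[some monochromatic K_k] under a uniformly
    random 2-coloring of K_n: C(n, k) * 2^(1 - C(k, 2))."""
    return comb(n, k) * 2.0 ** (1 - comb(k, 2))

# For n = floor(2^(k/2)) the bound drops below 1, so a coloring with
# no monochromatic K_k exists, i.e. R(k, k) > 2^(k/2).
for k in (6, 10, 16):
    n = int(2 ** (k / 2))
    print(k, n, ramsey_union_bound(n, k))
```

The printed values shrink rapidly with $k$, reflecting how much slack the union bound leaves at this choice of $n$.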

The second moment method strengthens this. Let $X$ count the number of desired structures in a random object. If $\mathbb{E}[X] \to \infty$ but one suspects $X$ concentrates, Chebyshev's inequality provides the key tool. Since $\Pr[X = 0] \le \Pr[|X - \mathbb{E}[X]| \ge \mathbb{E}[X]] \le \mathrm{Var}(X)/(\mathbb{E}[X])^2$, it suffices to show that the variance is small relative to the square of the mean. This approach was crucial in establishing when random graphs acquire specific properties: for instance, showing that the expected number of triangles in $G(n,p)$ grows while their variance stays comparably small reveals that triangles are ubiquitous rather than concentrated in rare, anomalous graphs.
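The triangle example can be checked by simulation. A hedged sketch (function and parameter names are mine): once $\mathbb{E}[X] = \binom{n}{3}p^3$ is large, essentially every sample of $G(n,p)$ contains a triangle.

```python
import random
from itertools import combinations

def has_triangle(n: int, p: float, rng: random.Random) -> bool:
    """Sample G(n, p) and report whether it contains a triangle."""
    adj = [[False] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i][j] = adj[j][i] = True
    return any(adj[i][j] and adj[j][k] and adj[i][k]
               for i, j, k in combinations(range(n), 3))

rng = random.Random(0)
n, trials = 60, 50
p = 5 / n                                      # well above the threshold p = 1/n
mean = (n * (n - 1) * (n - 2) / 6) * p ** 3    # E[X] = C(n, 3) * p^3
frac = sum(has_triangle(n, p, rng) for _ in range(trials)) / trials
print(f"E[X] = {mean:.1f}, fraction of samples with a triangle = {frac}")
```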

The probabilistic method also applies in the alteration style: start with a random object, compute the expected number of defects, and then alter or delete a few elements to eliminate all defects at a controllable cost. This technique yields tight bounds in problems ranging from independent sets in triangle-free graphs to discrepancy theory.
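A standard instance of the alteration method is finding a large independent set: keep each vertex with probability $p$, then delete one endpoint of every surviving edge, leaving an independent set of expected size at least $np - mp^2$. A minimal sketch under those assumptions (function name and example graph are mine):

```python
import random

def alteration_independent_set(n, edges, p, rng):
    """Alteration method: keep each vertex with probability p, then
    delete one endpoint of every surviving edge. The result is an
    independent set of expected size >= n*p - m*p^2."""
    kept = {v for v in range(n) if rng.random() < p}
    for u, v in edges:
        if u in kept and v in kept:
            kept.discard(u)          # remove one endpoint per bad edge
    return kept

# Hypothetical example: a cycle on 30 vertices (m = n), so p = 1/2
# optimizes n*p - m*p^2.
n = 30
edges = [(i, (i + 1) % n) for i in range(n)]
s = alteration_independent_set(n, edges, 0.5, random.Random(5))
ok = all(not (u in s and v in s) for u, v in edges)
print(len(s), ok)
```

Since removals only shrink the kept set, the final set is guaranteed independent; only its size is random.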

Lovász Local Lemma and Concentration Inequalities

The union bound used in the basic probabilistic method is often too crude. When there are many bad events $A_1, \ldots, A_m$, each with small probability $p$, the union bound gives $\Pr[\bigcup A_i] \le mp$, which can exceed 1 even when the events are nearly independent and the probability of avoiding all of them is actually close to 1. The Lovász Local Lemma, discovered by Paul Erdős and László Lovász in 1975, precisely handles this scenario.

The Local Lemma requires that each bad event depends on only a few others. Formally, given events $A_1, \ldots, A_m$ in a probability space, construct a dependency graph $G$ on vertex set $\{A_1, \ldots, A_m\}$ in which $A_i$ and $A_j$ are joined by an edge if they are not independent of each other. If each event has probability at most $p$, each vertex in the dependency graph has degree at most $d$, and the condition

$$ep(d+1) \le 1$$

holds (where $e$ is Euler's number), then with positive probability none of the bad events occur:

$$\Pr\!\left[\bigcap_{i=1}^m \overline{A_i}\right] > 0.$$

The remarkable feature is that this holds regardless of how many events there are, as long as the local dependence structure stays sparse. The proof proceeds by induction, carefully tracking conditional probabilities. The general (asymmetric) version replaces the uniform bound $p$ with event-specific probabilities $p_i$ and allows a more refined dependency condition: if there exist $x_i \in (0,1)$ such that $\Pr[A_i] \le x_i \prod_{j \sim i} (1 - x_j)$ for all $i$, then $\Pr[\bigcap \overline{A_i}] \ge \prod_i (1-x_i) > 0$.

The Local Lemma has stunning applications in combinatorics. For hypergraph 2-coloring: a $k$-uniform hypergraph is 2-colorable (its vertices can be colored red and blue so that no edge is monochromatic) whenever every edge shares vertices with at most $2^{k-3}$ other edges. This follows from the symmetric LLL applied to the events "edge $e$ is monochromatic" under a uniformly random 2-coloring, since $\Pr[\text{edge monochromatic}] = 2^{1-k}$ and $ep(d+1) = e \cdot 2^{1-k} \cdot (2^{k-3}+1) \le 1$ for $k \ge 5$. For $k$-SAT: any $k$-CNF formula in which every clause shares variables with at most $2^{k-3}$ other clauses is satisfiable.
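The symmetric condition is a one-line check. A small sketch (the helper name is mine) evaluating $ep(d+1) \le 1$ for the hypergraph-coloring events, with $p = 2^{1-k}$ and $d = 2^{k-3}$:

```python
from math import e

def lll_symmetric_ok(p: float, d: int) -> bool:
    """Symmetric Lovász Local Lemma condition: e * p * (d + 1) <= 1."""
    return e * p * (d + 1) <= 1

# A monochromatic k-edge has probability 2^(1-k); if each edge meets
# at most d = 2^(k-3) others, the condition holds once k >= 5.
for k in range(3, 12):
    p = 2.0 ** (1 - k)
    d = 2 ** (k - 3)
    print(k, lll_symmetric_ok(p, d))
```

The condition fails for $k = 3, 4$ and holds from $k = 5$ on, since $e \cdot 2^{1-k}(2^{k-3}+1)$ tends to $e/4$ from above.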

Alongside the LLL, concentration inequalities are the workhorses of modern probabilistic combinatorics. The Chernoff bounds quantify how tightly a sum of independent Bernoulli random variables concentrates around its mean. If $X = X_1 + \cdots + X_n$ with each $X_i \in \{0,1\}$ independent and $\mu = \mathbb{E}[X]$, then for any $\delta > 0$:

$$\Pr[X \ge (1+\delta)\mu] \le \left(\frac{e^\delta}{(1+\delta)^{1+\delta}}\right)^\mu.$$

In many settings the simpler bound $\Pr[X \ge (1+\delta)\mu] \le e^{-\mu\delta^2/3}$ suffices for $0 < \delta \le 1$. The Azuma-Hoeffding inequality extends concentration beyond sums to martingales with bounded increments; in its bounded-differences form (due to McDiarmid), if a function of independent variables changes by at most $c_i$ when the $i$-th variable changes, then:

$$\Pr[|X - \mathbb{E}[X]| \ge t] \le 2\exp\!\left(-\frac{2t^2}{\sum_i c_i^2}\right).$$

This martingale method, exposing variables one at a time and tracking how the conditional expectation drifts, underlies proofs of concentration in random graphs, random colorings, and random permutations. The method of bounded differences (McDiarmid's inequality) is the principal tool when independence is present but the structure is not simply additive.
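The Chernoff bound above can be compared against simulation. A hedged sketch (names and parameters are mine) for a sum of $n = 200$ fair coins with $\delta = 0.3$:

```python
import random
from math import exp

def chernoff_upper(mu: float, delta: float) -> float:
    """Chernoff bound: Pr[X >= (1+d)mu] <= (e^d / (1+d)^(1+d))^mu."""
    return (exp(delta) / (1 + delta) ** (1 + delta)) ** mu

rng = random.Random(1)
n, p, delta, trials = 200, 0.5, 0.3, 2000
mu = n * p
threshold = (1 + delta) * mu          # = 130 heads out of 200
tail = sum(sum(rng.random() < p for _ in range(n)) >= threshold
           for _ in range(trials)) / trials
bound = chernoff_upper(mu, delta)
simple = exp(-mu * delta ** 2 / 3)    # weaker e^{-mu d^2 / 3} form
print(f"empirical {tail:.4f} <= chernoff {bound:.4f} <= simple {simple:.4f}")
```

The empirical tail sits far below both bounds, and the full Chernoff expression is tighter than the simplified exponential form.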

Random Structures and Threshold Functions

One of the most beautiful phenomena in probabilistic combinatorics is the emergence of sharp threshold functions: as a parameter varies continuously, a graph property transitions from almost certainly absent to almost certainly present in an interval of vanishing relative width. This was formalized by Erdős and Rényi in their landmark 1960 paper on random graphs, which launched the systematic study of $G(n,p)$, the Erdős-Rényi random graph in which each of the $\binom{n}{2}$ potential edges is included independently with probability $p$.

A monotone property $\mathcal{P}$ (one closed under the addition of edges) has a threshold $p^*(n)$ if

$$\Pr[G(n,p) \in \mathcal{P}] \to \begin{cases} 0 & \text{if } p/p^*(n) \to 0 \\ 1 & \text{if } p/p^*(n) \to \infty \end{cases}$$

as $n \to \infty$. The celebrated result of Bollobás and Thomason (1987) is that every monotone graph property has a threshold. Many thresholds are explicitly known. The threshold for containing a triangle is $p = n^{-1}$, meaning triangles appear around the time the average degree reaches 1. The threshold for connectivity is $p = (\log n)/n$: below this, isolated vertices remain; above it, the graph is almost surely connected.
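The connectivity threshold is easy to see empirically. A minimal sketch (function and parameter names are mine) comparing $p = c(\log n)/n$ for $c$ below and above 1:

```python
import random
from math import log

def is_connected_gnp(n: int, p: float, rng: random.Random) -> bool:
    """Sample G(n, p) and check connectivity with a DFS from vertex 0."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

rng = random.Random(2)
n, trials = 300, 40
fracs = {}
for c in (0.5, 2.0):                  # p = c * log(n) / n, below vs above
    p = c * log(n) / n
    fracs[c] = sum(is_connected_gnp(n, p, rng) for _ in range(trials)) / trials
    print(f"c = {c}: fraction of samples connected = {fracs[c]}")
```

At $c = 0.5$ isolated vertices are plentiful and essentially no sample is connected; at $c = 2$ almost every sample is.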

The most dramatic threshold is the giant component phenomenon, the centerpiece of Erdős-Rényi theory. When $p = c/n$ with $c < 1$, all connected components of $G(n,p)$ have size $O(\log n)$ with high probability. At $c = 1$ the largest component has size $\Theta(n^{2/3})$. When $c > 1$, a unique giant component of size $\Theta(n)$ emerges, containing a $(1 - \beta(c) + o(1))$ fraction of all vertices, where $\beta(c)$ is determined by $\beta = e^{-c(1-\beta)}$, while all other components remain logarithmically small. This transition at $c = 1$ is a genuine phase transition, with direct analogies to percolation in statistical physics.
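The equation $\beta = e^{-c(1-\beta)}$ can be solved by fixed-point iteration and checked against a sampled graph. A hedged sketch (function names and the choice $c = 2$ are mine):

```python
import random
from math import exp

def survival_beta(c: float, iters: int = 200) -> float:
    """Iterate beta <- exp(-c * (1 - beta)); for p = c/n with c > 1,
    1 - beta(c) is the limiting giant-component fraction."""
    beta = 0.0
    for _ in range(iters):
        beta = exp(-c * (1 - beta))
    return beta

def largest_component_fraction(n: int, c: float, rng: random.Random) -> float:
    """Sample G(n, c/n); return largest component size over n (union-find)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    p = c / n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                parent[find(i)] = find(j)
    sizes = {}
    for v in range(n):
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values()) / n

c = 2.0
beta = survival_beta(c)
frac = largest_component_fraction(1500, c, random.Random(3))
print(f"predicted giant fraction {1 - beta:.3f}, simulated {frac:.3f}")
```

For $c = 2$ the fixed point gives $\beta \approx 0.203$, so roughly 80% of the vertices should fall in the giant component, in close agreement with the sample.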

Random regular graphs present a richer structure: in $G(n,d)$, the uniform distribution over $d$-regular graphs on $n$ vertices (with $nd$ even), properties like Hamiltonicity, chromatic number concentration, and expansion are more accessible than in $G(n,p)$. It is known that for fixed $d \ge 3$, $G(n,d)$ is almost surely Hamiltonian and has chromatic number approximately $d/(2\log d)$. The configuration model, a generalization that allows any degree sequence, enables analysis of scale-free and power-law random graphs, which better model real-world networks.

Sharp thresholds occur when the transition window is truly narrow: $\Pr[\mathcal{P}]$ jumps from near 0 to near 1 within a $p$-interval of width $o(p^*)$. This is related to the influential KKL theorem (Kahn, Kalai, Linial, 1988), which shows that every balanced Boolean function has a variable with high influence, meaning that changing that coordinate significantly affects the outcome. Sharp thresholds then follow from hypercontractive inequalities applied to the Fourier expansion of the indicator of $\mathcal{P}$, a deep connection between combinatorics, harmonic analysis, and complexity theory.

Derandomization Techniques

The probabilistic method is phenomenally effective at proving existence, but it rarely produces explicit objects. Derandomization transforms probabilistic proofs into deterministic constructions, yielding explicit combinatorial objects with the properties guaranteed by the probabilistic argument.

The most fundamental derandomization tool is the method of conditional expectations, introduced by Erdős and Selfridge in the context of positional games and refined by many others. The idea is to expose the random choices one at a time: fix the first variable to the value that keeps the conditional expected number of violations at or below its starting value, then fix the second variable, and so on. At each step, at least one choice maintains or improves the quantity being tracked. After fixing all variables, the resulting deterministic object satisfies the required property. This method applies whenever the probabilistic argument uses only the first moment: if $\mathbb{E}[X] < 1$ where $X$ counts violations, the conditional expectation framework produces an explicit zero-violation configuration in polynomial time, provided that conditional expectations can be computed efficiently.
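A textbook instance is derandomizing the random bipartition for MAX-CUT: a uniform random cut has expected size $m/2$, and fixing vertices greedily so the conditional expectation never drops yields a deterministic cut of at least that size. A minimal sketch (function name and example graph are mine):

```python
from itertools import combinations

def derandomized_cut(n, edges):
    """Method of conditional expectations for MAX-CUT: fix each vertex's
    side so the conditional expected cut never decreases, guaranteeing a
    final cut of size >= len(edges) / 2."""
    side = [None] * n
    for v in range(n):
        # Only edges to already-placed neighbours change their conditional
        # cut probability (from 1/2 to 0 or 1) when we fix v's side.
        cut_if_0 = sum(1 for a, b in edges
                       if (a == v and side[b] == 1) or (b == v and side[a] == 1))
        cut_if_1 = sum(1 for a, b in edges
                       if (a == v and side[b] == 0) or (b == v and side[a] == 0))
        side[v] = 0 if cut_if_0 >= cut_if_1 else 1
    return side, sum(1 for a, b in edges if side[a] != side[b])

# Hypothetical example: K_5 has 10 edges, so the guarantee is a cut >= 5.
edges = list(combinations(range(5), 2))
side, cut = derandomized_cut(5, edges)
print(cut)   # prints 6, above the m/2 = 5 guarantee
```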

When the Local Lemma is involved, derandomization is harder. Moser and Tardos (2010) gave a breakthrough algorithmic version of the LLL that runs in expected polynomial time. The algorithm is strikingly simple: while some constraint is violated, pick a violated constraint and resample all variables in its scope uniformly at random. The Moser-Tardos analysis proves that this terminates quickly using an entropy argument, a clever counting of "witness trees" that would have to exist if the algorithm ran for too long. The constructive LLL has been applied to $k$-SAT, graph coloring, and hypergraph coloring, turning existential guarantees into polynomial-time algorithms.
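The resampling loop fits in a few lines. A hedged sketch for $k$-SAT (function name, clause encoding, and the toy instance are mine; the round cap is a safety net the analysis makes unnecessary for sparse formulas):

```python
import random

def moser_tardos_sat(n_vars, clauses, rng, max_rounds=100000):
    """Moser-Tardos resampling: start from a uniform random assignment
    and, while some clause is violated, resample all of its variables.
    Clauses are lists of nonzero ints: +v is variable v, -v its negation."""
    assign = [rng.random() < 0.5 for _ in range(n_vars + 1)]  # index 0 unused
    def violated(clause):
        return all(assign[abs(l)] != (l > 0) for l in clause)
    for _ in range(max_rounds):
        bad = next((c for c in clauses if violated(c)), None)
        if bad is None:
            return assign
        for l in bad:                 # resample the clause's scope
            assign[abs(l)] = rng.random() < 0.5
    raise RuntimeError("did not converge")

# A small sparse 3-CNF instance (hypothetical example).
clauses = [[1, 2, 3], [-1, 4, 5], [2, -4, 6], [-3, -5, -6], [1, -2, 4]]
assign = moser_tardos_sat(6, clauses, random.Random(4))
print(all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses))
```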

Pairwise independent hash families and $\epsilon$-biased sets provide another route to derandomization. Many probabilistic arguments use only pairwise or $k$-wise independence rather than full independence; replacing a fully random sequence with a $k$-wise independent one obtained from an algebraic construction reduces the randomness requirement from $n$ bits to $O(k \log n)$ bits. Pseudorandom generators (PRGs) formalize this: a deterministic function $G: \{0,1\}^s \to \{0,1\}^n$ is a PRG for a class of tests if no test in the class can distinguish $G$'s output from a truly random string. Small-bias generators (Naor-Naor), $\epsilon$-nets, and sample-and-modify approaches all give explicit objects matching probabilistic bounds.
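The classic pairwise independent family is $h_{a,b}(x) = (ax + b) \bmod p$ with $a, b$ uniform in $\mathbb{Z}_p$: for $x_1 \ne x_2$ the linear system forces exactly one seed per target pair, so each pair of outputs occurs with probability exactly $1/p^2$. A small exhaustive check (function name is mine):

```python
from itertools import product

def pair_counts(p, x1, x2):
    """For h_{a,b}(x) = (a*x + b) mod p with a, b uniform in Z_p, count
    how many of the p^2 seeds map (x1, x2) to each target pair."""
    counts = {}
    for a, b in product(range(p), repeat=2):
        key = ((a * x1 + b) % p, (a * x2 + b) % p)
        counts[key] = counts.get(key, 0) + 1
    return counts

# Pairwise independence: every target pair is hit by exactly one seed,
# so two field elements (2 log2 p random bits) stand in for a fully
# random function on this pair of inputs.
p = 5
counts = pair_counts(p, 0, 1)
print(sorted(set(counts.values())), len(counts))   # [1] 25
```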

The discrepancy method connects derandomization to a beautiful geometric picture. The discrepancy of a set system $(X, \mathcal{F})$ under a $\pm 1$ coloring $\chi: X \to \{-1,+1\}$ is $\max_{S \in \mathcal{F}} \left|\sum_{x \in S} \chi(x)\right|$; the combinatorial discrepancy is the minimum over all colorings. Spencer's theorem (1985) uses the probabilistic method via partial coloring and entropy to show that any $n$ sets on $n$ points have discrepancy at most $6\sqrt{n}$, beating the $O(\sqrt{n \log n})$ guarantee of a random coloring, and the proof is entirely non-constructive. The spectacular 2010 result by Bansal gave a polynomial-time algorithm achieving a bound of the same order by following a random walk guided by solutions to semidefinite programs, converting the probabilistic existence proof into an efficient construction via convex optimization.

Symmetrization and algebraic constructions also play a central role: the algebraic geometry codes of Goppa, explicit Ramsey graphs from finite geometry, and Paley graphs (defined via quadratic residues in finite fields) all provide explicit combinatorial objects whose properties match or approach what the probabilistic method guarantees. The Paley graph on $\mathbb{F}_q$ for a prime power $q \equiv 1 \pmod{4}$, where vertices are field elements and edges connect $x$ and $y$ if $x - y$ is a nonzero square, is a $(q-1)/2$-regular graph known to be strongly pseudo-random. Its clique and independence numbers are both at most $\sqrt{q}$ by character sum estimates, which is still polynomial in the number of vertices rather than logarithmic as in a random graph. Constructing an explicit graph achieving the probabilistic lower bound $R(k,k) > 2^{k/2}$ remains one of the great open problems in combinatorics.
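The Paley construction is concrete enough to build and check at small sizes. A sketch for $q = 13$ (function names are mine; the clique search is brute force and only meant for tiny $q$):

```python
from itertools import combinations

def paley_graph(q):
    """Paley graph on Z_q (q prime, q = 1 mod 4): x ~ y iff x - y is a
    nonzero quadratic residue; the residue set is closed under negation,
    so the relation is symmetric."""
    squares = {(x * x) % q for x in range(1, q)}
    return {frozenset((x, y)) for x, y in combinations(range(q), 2)
            if (x - y) % q in squares}

def clique_number(q, edges, max_size=6):
    """Brute-force clique number: grow r while an r-clique exists."""
    best = 1
    for r in range(2, max_size + 1):
        if any(all(frozenset((u, v)) in edges for u, v in combinations(s, 2))
               for s in combinations(range(q), r)):
            best = r
        else:
            break
    return best

edges = paley_graph(13)
print(len(edges), clique_number(13, edges))   # 39 edges, clique number 3
```

The graph is $6$-regular on 13 vertices (hence 39 edges), and its clique number 3 respects the $\sqrt{q} \approx 3.6$ character-sum bound.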

The confluence of the probabilistic method, concentration inequalities, and derandomization has become one of the richest and most active corners of modern mathematics — a field where elegant non-constructive existence arguments and hard-won explicit constructions illuminate each other, and where combinatorics, probability theory, harmonic analysis, and theoretical computer science have developed a common language.