Enumerative Combinatorics

Counting principles, generating functions, and the twelvefold way.


Enumerative combinatorics is the branch of mathematics devoted to the rigorous counting of finite or countably infinite collections of objects that satisfy given criteria. Rather than listing objects one by one, it seeks closed-form expressions, recurrence relations, or generating functions that capture the size of a collection in terms of its parameters. The subject sits at an extraordinary crossroads: it draws on set theory for its language, algebra for its techniques, and analysis for the asymptotics that make results computable — and in turn it feeds ideas into probability theory, computer science, and statistical physics.

Fundamental Counting Principles

Every edifice in enumerative combinatorics rests on two elementary principles that most students first encounter informally. The multiplication principle (or product rule) says: if a task can be decomposed into a sequence of kk independent steps where step ii can be performed in nin_i ways, then the total number of ways to complete the entire task is n1×n2××nkn_1 \times n_2 \times \cdots \times n_k. The addition principle (or sum rule) says: if a task can be accomplished by exactly one of kk mutually exclusive methods and method ii can be done in nin_i ways, then the total number of ways is n1+n2++nkn_1 + n_2 + \cdots + n_k. These two rules are not theorems that need proof so much as definitions of what it means to count — they encode the structure of Cartesian products and disjoint unions in terms of cardinality.

From the multiplication principle, the count of ordered arrangements of nn distinct objects taken rr at a time — called rr-permutations — follows immediately. There are nn choices for the first position, n1n-1 for the second (since one object is now used), and so on, giving:

P(n,r)=n(n1)(n2)(nr+1)=n!(nr)!P(n, r) = n \cdot (n-1) \cdot (n-2) \cdots (n - r + 1) = \frac{n!}{(n-r)!}

When r=nr = n, this is just n!n!, the total number of ways to arrange all nn objects. The factorial function, ubiquitous in combinatorics, grows so fast that even modest values of nn produce astronomically large counts — 10!=3,628,80010! = 3{,}628{,}800 and 20!2.4×101820! \approx 2.4 \times 10^{18} — a fact that has deep implications for algorithm design and the difficulty of brute-force enumeration.

When order does not matter, we count combinations. The number of rr-element subsets of an nn-element set is the binomial coefficient, written (nr)\binom{n}{r} (read “n choose r”):

(nr)=n!r!(nr)!\binom{n}{r} = \frac{n!}{r!\,(n-r)!}

The name comes from the binomial theorem, which Isaac Newton extended to non-integer exponents in the 1660s, building on earlier work by Blaise Pascal and Yang Hui: (x+y)n=r=0n(nr)xrynr(x + y)^n = \sum_{r=0}^{n} \binom{n}{r} x^r y^{n-r}. Pascal’s triangle — in which each interior entry is the sum of the two entries above it — is the oldest known systematic tabulation of binomial coefficients; it appeared in Chinese and Persian mathematical manuscripts centuries before Pascal, though his 1654 treatise Traité du Triangle Arithmétique established it firmly in the European tradition.

The binomial coefficients satisfy several fundamental identities. The symmetry identity (nr)=(nnr)\binom{n}{r} = \binom{n}{n-r} reflects the bijection between rr-subsets and their complements. Pascal’s rule (nr)=(n1r1)+(n1r)\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r} follows from conditioning on whether a distinguished element is included in the subset. The Vandermonde convolution k=0r(mk)(nrk)=(m+nr)\sum_{k=0}^{r} \binom{m}{k}\binom{n}{r-k} = \binom{m+n}{r} counts rr-subsets of an (m+n)(m+n)-element set by splitting into how many come from each half. The hockey-stick identity k=0r(kr)=(r+1r+1)\sum_{k=0}^{r} \binom{k}{r} = \binom{r+1}{r+1} gives a cumulative sum along a diagonal of Pascal’s triangle.

A third class of problems concerns distributing nn identical objects into kk distinct bins, where empty bins are allowed. The stars and bars technique, attributed in its modern combinatorial form to William Feller, models this as placing k1k-1 dividers among nn stars arranged in a row. The count is (n+k1k1)\binom{n+k-1}{k-1}, a formula that reappears constantly in problems about integer solutions, multisets, and polynomial coefficient extraction.

Permutations and Special Sequences

Going deeper into permutation structure reveals a wealth of beautiful combinatorial objects. A derangement is a permutation of {1,2,,n}\{1, 2, \ldots, n\} that has no fixed point — no element maps to itself. The problème des rencontres, posed in 1708 by Pierre de Montmort, asks: in how many ways can nn letters be placed in nn envelopes so that no letter ends up in its correct envelope? The answer, denoted DnD_n, is computed by inclusion-exclusion:

Dn=n!k=0n(1)kk!D_n = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}

As nn grows, Dn/n!e10.3679D_n / n! \to e^{-1} \approx 0.3679, so roughly 37%37\% of all permutations are derangements — a startling and counterintuitive fact. The sequence begins D1=0D_1 = 0, D2=1D_2 = 1, D3=2D_3 = 2, D4=9D_4 = 9, and satisfies the recurrence Dn=(n1)(Dn1+Dn2)D_n = (n-1)(D_{n-1} + D_{n-2}).

Permutations can be decomposed into cycles: a permutation σ\sigma is written as a product of disjoint cycles, and the cycle type records how many cycles of each length appear. The Stirling numbers of the first kind, denoted c(n,k)c(n, k) (unsigned) or s(n,k)s(n, k) (signed), count permutations of nn elements with exactly kk cycles. They are defined by the generating identity:

x(x+1)(x+2)(x+n1)=k=0nc(n,k)xkx(x+1)(x+2)\cdots(x+n-1) = \sum_{k=0}^{n} c(n,k)\, x^k

The Stirling numbers of the second kind, S(n,k)S(n, k), count partitions of an nn-element set into exactly kk non-empty blocks. They satisfy S(n,k)=kS(n1,k)+S(n1,k1)S(n, k) = k \cdot S(n-1, k) + S(n-1, k-1), reflecting whether the nn-th element forms a singleton block or joins an existing one. The Bell numbers Bn=k=0nS(n,k)B_n = \sum_{k=0}^{n} S(n,k) count all set partitions of [n][n]: B0=1B_0 = 1, B1=1B_1 = 1, B2=2B_2 = 2, B3=5B_3 = 5, B4=15B_4 = 15, B5=52B_5 = 52. They were studied by Eric Temple Bell in the 1930s, though the sequence had appeared earlier in the work of Ramanujan.

Among the most celebrated sequences in all of combinatorics are the Catalan numbers, defined by:

Cn=1n+1(2nn)C_n = \frac{1}{n+1}\binom{2n}{n}

giving C0=1C_0 = 1, C1=1C_1 = 1, C2=2C_2 = 2, C3=5C_3 = 5, C4=14C_4 = 14, C5=42C_5 = 42. The Catalan numbers count an astonishing variety of objects: the number of distinct ways to fully parenthesize a product of n+1n+1 factors, the number of full binary trees with nn internal nodes, the number of Dyck paths (lattice paths from (0,0)(0,0) to (2n,0)(2n, 0) that step +1+1 or 1-1 and never go below zero), and the number of triangulations of a convex (n+2)(n+2)-gon. This universality is not coincidental — all of these structures encode the same underlying recursive decomposition, captured by the recurrence Cn+1=k=0nCkCnkC_{n+1} = \sum_{k=0}^{n} C_k C_{n-k}. The sequence is named for Eugène Catalan, who studied the parenthesization problem in 1838, though Leonhard Euler had considered the triangulation problem decades earlier.

The multinomial coefficient (nn1,n2,,nk)=n!n1!n2!nk!\binom{n}{n_1, n_2, \ldots, n_k} = \frac{n!}{n_1!\, n_2! \cdots n_k!}, where n1+n2++nk=nn_1 + n_2 + \cdots + n_k = n, counts the number of ways to partition nn distinct objects into kk ordered groups of sizes n1,,nkn_1, \ldots, n_k. It appears as the coefficient of x1n1x2n2xknkx_1^{n_1} x_2^{n_2} \cdots x_k^{n_k} in the expansion of (x1+x2++xk)n(x_1 + x_2 + \cdots + x_k)^n.

Inclusion-Exclusion and Möbius Inversion

The principle of inclusion-exclusion is one of the most versatile tools in combinatorics. Its motivation is simple: when we want to count elements that avoid certain “bad” properties, direct counting is often impossible, but counting those that have each property — and then correcting for overcounting — is tractable. If A1,A2,,AmA_1, A_2, \ldots, A_m are finite sets, then:

A1A2Am=iAii<jAiAj+i<j<kAiAjAk+(1)m+1A1Am\left| A_1 \cup A_2 \cup \cdots \cup A_m \right| = \sum_{i} |A_i| - \sum_{i < j} |A_i \cap A_j| + \sum_{i < j < k} |A_i \cap A_j \cap A_k| - \cdots + (-1)^{m+1} |A_1 \cap \cdots \cap A_m|

In compact notation, with II ranging over non-empty subsets of [m][m]:

i=1mAi=I[m](1)I+1iIAi\left| \bigcup_{i=1}^{m} A_i \right| = \sum_{\emptyset \neq I \subseteq [m]} (-1)^{|I|+1} \left| \bigcap_{i \in I} A_i \right|

The formula is proved by showing that any element belonging to exactly rr of the sets is counted (r1)(r2)++(1)r+1(rr)=1\binom{r}{1} - \binom{r}{2} + \cdots + (-1)^{r+1}\binom{r}{r} = 1 time on the right-hand side — a calculation that follows from the binomial theorem applied to (11)r(1-1)^r. To count elements with none of the properties, one subtracts from the total: if NN is the size of the universe, then NA1AmN - |A_1 \cup \cdots \cup A_m| counts elements in none of the AiA_i.

A canonical application is counting surjective functions from an nn-element set to a kk-element set — functions where every output value is achieved. Let AiA_i be the set of functions that miss the ii-th output value. Then by inclusion-exclusion, the number of surjections is:

j=0k(1)j(kj)(kj)n\sum_{j=0}^{k} (-1)^j \binom{k}{j} (k-j)^n

This formula also gives k!S(n,k)k! \cdot S(n,k), linking inclusion-exclusion directly to Stirling numbers. Another application is Euler’s totient function ϕ(n)\phi(n), which counts integers in {1,,n}\{1, \ldots, n\} coprime to nn. If p1,,prp_1, \ldots, p_r are the distinct prime divisors of nn, then:

ϕ(n)=ni=1r(11pi)\phi(n) = n \prod_{i=1}^{r} \left(1 - \frac{1}{p_i}\right)

a formula most elegantly derived by applying inclusion-exclusion to the sets of multiples of each pip_i within {1,,n}\{1, \ldots, n\}.

Inclusion-exclusion has a far-reaching generalization in the Möbius inversion formula on partially ordered sets. Given a locally finite poset (P,)(P, \leq), the Möbius function μ:P×PZ\mu: P \times P \to \mathbb{Z} is defined recursively by μ(x,x)=1\mu(x, x) = 1 and xzyμ(x,z)=0\sum_{x \leq z \leq y} \mu(x, z) = 0 for x<yx < y. The Möbius inversion theorem states: if f,g:PZf, g: P \to \mathbb{Z} satisfy g(y)=xyf(x)g(y) = \sum_{x \leq y} f(x), then f(y)=xyμ(x,y)g(x)f(y) = \sum_{x \leq y} \mu(x, y)\, g(x). On the divisibility poset of positive integers, the Möbius function reduces to the classical number-theoretic Möbius function μ(n)\mu(n), recovering the inversion formula from analytic number theory. On the Boolean lattice 2[n]2^{[n]}, Möbius inversion specializes precisely to inclusion-exclusion. This unification, developed by Gian-Carlo Rota in his landmark 1964 paper “On the Foundations of Combinatorial Theory,” gave combinatorics much of its modern algebraic character.

Ordinary Generating Functions

A generating function transforms a combinatorial counting problem into an algebraic problem about formal power series. The ordinary generating function (OGF) of a sequence a0,a1,a2,a_0, a_1, a_2, \ldots is the formal power series:

A(x)=n0anxnA(x) = \sum_{n \geq 0} a_n x^n

We treat xx not as a variable ranging over the real numbers but as a “place-holder” that keeps track of the index — convergence is irrelevant for most combinatorial purposes, though when it does converge, analytic properties give powerful asymptotics. The key insight is that many operations on sequences correspond to simple algebraic operations on their generating functions, allowing us to encode recurrences, symmetries, and structural constraints as equations.

The most fundamental OGF is the geometric series: n0xn=11x\sum_{n \geq 0} x^n = \frac{1}{1-x}. More generally, n0(n+k1k1)xn=1(1x)k\sum_{n \geq 0} \binom{n+k-1}{k-1} x^n = \frac{1}{(1-x)^k}, encoding the stars-and-bars formula for distributing objects into kk bins. The convolution of two sequences is reflected in multiplication of generating functions: if A(x)=anxnA(x) = \sum a_n x^n and B(x)=bnxnB(x) = \sum b_n x^n, then A(x)B(x)=n(k=0nakbnk)xnA(x) \cdot B(x) = \sum_n \left(\sum_{k=0}^{n} a_k b_{n-k}\right) x^n. This algebraic fact, which corresponds to the combinatorial operation of choosing a pair of objects (one counted by AA, one by BB), makes generating functions the natural tool for analyzing composite structures.

Linear recurrences with constant coefficients yield particularly cleanly to the generating function method. The Fibonacci sequence F0=0F_0 = 0, F1=1F_1 = 1, Fn=Fn1+Fn2F_n = F_{n-1} + F_{n-2} has OGF:

F(x)=x1xx2F(x) = \frac{x}{1 - x - x^2}

Partial fraction decomposition gives F(x)=15(11ϕx11ϕ^x)F(x) = \frac{1}{\sqrt{5}}\left(\frac{1}{1-\phi x} - \frac{1}{1-\hat\phi x}\right), where ϕ=(1+5)/2\phi = (1+\sqrt{5})/2 is the golden ratio and ϕ^=(15)/2\hat\phi = (1-\sqrt{5})/2. Extracting coefficients yields Binet’s formula:

Fn=ϕnϕ^n5F_n = \frac{\phi^n - \hat\phi^n}{\sqrt{5}}

an exact closed form for a sequence defined by a simple recurrence. The method generalizes: any linear recurrence of order kk with constant coefficients has a rational OGF, and partial fractions decompose it into a linear combination of geometric series, yielding a closed form as a sum of exponentials.

Generating functions also encode combinatorial interpretations directly. The OGF for the number of integer partitions of nn (ways to write nn as an unordered sum of positive integers) is the infinite product:

n0p(n)xn=k=111xk\sum_{n \geq 0} p(n) x^n = \prod_{k=1}^{\infty} \frac{1}{1-x^k}

This identity, observed by Euler in the 18th century, encodes the fact that each part size kk can appear with multiplicity 0,1,2,0, 1, 2, \ldots, contributing a factor 1+xk+x2k+=11xk1 + x^k + x^{2k} + \cdots = \frac{1}{1-x^k}. Euler also discovered the remarkable pentagonal number theorem, an identity relating the partition function to a sparse generating function:

k=1(1xk)=n=(1)nxn(3n1)/2=1xx2+x5+x7x12\prod_{k=1}^{\infty}(1-x^k) = \sum_{n=-\infty}^{\infty} (-1)^n x^{n(3n-1)/2} = 1 - x - x^2 + x^5 + x^7 - x^{12} - \cdots

where the exponents 1,2,5,7,12,15,1, 2, 5, 7, 12, 15, \ldots are the pentagonal numbers. This identity gives a fast recurrence for p(n)p(n) and was one of Euler’s deepest combinatorial discoveries.

Exponential Generating Functions

When the objects being counted carry internal labels — distinguishable elements arranged in structures — the right tool is the exponential generating function (EGF). The EGF of a sequence a0,a1,a2,a_0, a_1, a_2, \ldots is:

A^(x)=n0ann!xn\hat A(x) = \sum_{n \geq 0} \frac{a_n}{n!} x^n

The factor of n!n! in the denominator is the key feature: it accounts for the fact that in labeled combinatorics, the n!n! ways to assign labels to positions are not all distinct structures but rather different labellings of the same structure. The EGF of the sequence an=n!a_n = n! (all permutations) is 11x\frac{1}{1-x}; the EGF of the sequence an=1a_n = 1 (one structure for each nn, the unique “empty” structure) is exe^x.

The fundamental product lemma for EGFs states: if a labeled structure of size nn is formed by partitioning [n][n] into two ordered groups and placing an AA-structure on the first group and a BB-structure on the second, then the EGF for the combined structures is A^(x)B^(x)\hat A(x) \cdot \hat B(x). The n!n! factor precisely cancels the (nk)\binom{n}{k} factors that arise from choosing the partition. This makes EGFs the natural language for labeled species — a concept formalized by André Joyal in 1981, building on earlier work by Gilbert Labelle and others.

The EGF for the Bell numbers BnB_n (counting set partitions) is:

n0Bnn!xn=eex1\sum_{n \geq 0} \frac{B_n}{n!} x^n = e^{e^x - 1}

This elegant identity has a direct combinatorial explanation: a set partition is a set of non-empty blocks, and the EGF for non-empty sets is ex1e^x - 1 (since there is one non-empty set of each positive size), so the EGF for sets of non-empty sets is eex1e^{e^x - 1}. The exponential formula generalizes this: if the EGF for connected structures is C(x)C(x), then the EGF for all structures (viewed as sets of connected components) is eC(x)e^{C(x)}. Applied to trees, this yields Cayley’s formula: the number of labeled trees on nn vertices is nn2n^{n-2}, with EGF T(x)=xeT(x)T(x) = x e^{T(x)}, a functional equation solved by Arthur Cayley in 1889 using the Lagrange inversion theorem.

The EGF framework also illuminates the structure of permutations. Since every permutation is a set of cycles and the EGF for cycles is log11x\log\frac{1}{1-x}, the exponential formula gives:

n0n!n!xn=11x=elog11x\sum_{n \geq 0} \frac{n!}{n!} x^n = \frac{1}{1-x} = e^{\log \frac{1}{1-x}}

a tautology that nonetheless encodes deep structure: specializing to cycles of restricted lengths (e.g., allowing only odd cycles, or cycles of length at most kk) immediately yields the EGF for permutations with those restrictions. This modularity is a defining feature of the EGF framework, making it far more compositional than direct combinatorial argument.

Differentiation of EGFs has a clean interpretation: if A^(x)\hat A(x) is the EGF for structures of type AA, then ddxA^(x)\frac{d}{dx}\hat A(x) is the EGF for “AA-structures with a marked element.” Integration adds a new unlabeled element. These operations connect EGFs to the theory of combinatorial species, providing a unified categorical framework in which all the classical sequences — Stirling numbers, Bell numbers, Catalan numbers, tree counts — emerge as natural instances of a small set of compositional operations. This perspective, championed by Joyal and Bergeron, transformed enumerative combinatorics from a collection of clever tricks into a coherent structural theory.