Algorithms & Complexity
The design and analysis of algorithms and the theory of computational complexity — from sorting and searching to P vs NP.
An algorithm is a finite, unambiguous sequence of well-defined instructions for solving a computational problem. The study of algorithms asks two intertwined questions: how do we design effective procedures for important problems, and what are the fundamental limits on the efficiency of any procedure? These questions sit at the very heart of computer science, connecting the practical art of programming to the deepest reaches of mathematical logic.
Asymptotic Analysis and Algorithm Efficiency
The first task in the study of algorithms is to develop a vocabulary for measuring efficiency. Rather than timing a program on a particular machine, which conflates the algorithm with the hardware, we count the number of elementary operations as a function of the input size and study how that function grows. Asymptotic notation captures this growth while ignoring constant factors and lower-order terms. We say if there exist constants and such that for all . The companion notations (lower bound) and (tight bound) complete the picture: means grows at exactly the same rate as , up to constant factors. The notations little-o and little-omega provide strict asymptotic comparisons: means grows strictly slower than .
For iterative algorithms, analysis typically involves counting loop iterations and the work done in each. Nested loops multiply: a doubly nested loop over an -element array performs operations. For recursive algorithms, the running time satisfies a recurrence relation. The master theorem provides a powerful shortcut for divide-and-conquer recurrences of the form : depending on how compares to , the solution falls into one of three cases. When the theorem does not apply, one resorts to the expansion method (unrolling the recurrence) or the substitution method (guessing a form and verifying by induction).
Amortized analysis addresses situations where individual operations have varying costs but the total cost over a sequence of operations is predictable. The aggregate method divides the total cost by the number of operations. The accounting method assigns differing charges to operations, saving credit for expensive future operations. The potential method defines a potential function on the data structure’s state, so that the amortized cost of an operation equals its actual cost plus the change in potential. The classic example is the dynamic array: most insertions take time, but occasional resizing costs . Amortized analysis shows that the average cost per insertion is still . Robert Tarjan introduced amortized analysis in the 1980s, and it has since become an essential tool for analyzing self-adjusting data structures and incremental algorithms.
Divide-and-Conquer Algorithms
The divide-and-conquer paradigm solves a problem by breaking it into smaller subproblems of the same type, solving each recursively, and combining the results. Its elegance lies in the way complex problems decompose into manageable pieces, and its efficiency often arises from the logarithmic depth of the recursion. Merge sort, invented by John von Neumann in 1945, is the paradigmatic example: it divides an array in half, recursively sorts each half, and merges the two sorted halves in time, yielding an overall running time of . This is optimal among comparison-based sorting algorithms, a fact established by the information-theoretic lower bound: since there are possible permutations and each comparison yields one bit of information, at least comparisons are necessary.
Quick sort, developed by Tony Hoare in 1959, partitions the array around a pivot element and recursively sorts the two partitions. Its worst-case time is (when the pivot is always the smallest or largest element), but with a randomly chosen pivot, the expected time is , and in practice quick sort often outperforms merge sort due to better cache behavior and smaller constant factors. The median-of-medians algorithm, discovered by Manuel Blum, Robert Floyd, Vaughan Pratt, Ron Rivest, and Robert Tarjan in 1973, selects the median of an array in guaranteed time using a clever recursive strategy that ensures a balanced partition at each level.
Divide and conquer extends far beyond sorting. Strassen’s algorithm (1969) multiplies two matrices in time instead of the naive , by reducing the number of recursive multiplications from eight to seven. The fast Fourier transform (FFT), attributed to James Cooley and John Tukey in 1965 but with roots in Gauss’s unpublished work from 1805, computes the discrete Fourier transform of points in time, enabling efficient polynomial multiplication, signal processing, and the multiplication of large integers. The closest pair of points algorithm finds the two nearest points among points in the plane in time by dividing the plane into halves and carefully handling the boundary strip. Each of these algorithms illustrates the central insight of divide and conquer: by investing a small amount of work in decomposition and recombination, one can achieve dramatic improvements over brute force.
Dynamic Programming
Dynamic programming (DP) is a design paradigm for problems that exhibit two properties: optimal substructure (an optimal solution to the whole problem contains optimal solutions to subproblems) and overlapping subproblems (the same subproblems are solved many times in a naive recursive approach). The key idea is to compute and store the solution to each subproblem exactly once, avoiding redundant work. The term was coined by Richard Bellman in the 1950s, who developed the technique for optimization problems in operations research and control theory.
DP can be implemented in two styles. Memoization (top-down) starts with the original problem, recurses into subproblems, and caches their results. Tabulation (bottom-up) fills a table of subproblem solutions in a systematic order, from smallest to largest. The two approaches compute the same values; the choice between them is often a matter of convenience or of the subproblem dependency structure. Classic DP problems include the Fibonacci sequence (illustrating the basic memoization idea), the longest common subsequence (comparing two strings by building a two-dimensional table), the edit distance (also known as Levenshtein distance, measuring the minimum number of single-character edits to transform one string into another), and the knapsack problem (selecting items of given weights and values to maximize total value within a weight limit).
The algorithmic landscape of dynamic programming is vast. Matrix chain multiplication determines the optimal order in which to multiply a sequence of matrices, minimizing the total number of scalar multiplications. Optimal binary search trees arrange keys to minimize the expected search cost given known access frequencies, a problem solved by a DP due to Donald Knuth. The Bellman-Ford algorithm computes shortest paths from a single source in a graph with possibly negative edge weights, and Floyd-Warshall computes all-pairs shortest paths, both using DP over graph structure. Advanced DP techniques include tree DP (solving problems on trees by rooting and processing bottom-up), bitmask DP (using bit vectors to represent subsets of elements), and the convex hull trick (an optimization that accelerates certain DP transitions from to by exploiting a geometric structure in the cost function).
Greedy Algorithms
A greedy algorithm builds a solution incrementally, making the locally optimal choice at each step in the hope that these local optima aggregate into a global optimum. The appeal of greedy algorithms is their simplicity and efficiency; the challenge is proving that the greedy strategy actually yields an optimal solution. Two conditions justify the greedy approach: the greedy choice property (a locally optimal choice can be extended to a globally optimal solution) and optimal substructure (the remaining subproblem after making the greedy choice is itself an optimization problem of the same type). Correctness proofs typically use an exchange argument, showing that any optimal solution can be transformed into the greedy solution without increasing cost.
Minimum spanning tree algorithms are among the finest examples of greedy reasoning. Kruskal’s algorithm (1956) sorts edges by weight and adds each edge to the tree unless it would create a cycle, using a disjoint-set data structure for cycle detection. Prim’s algorithm (1957, based on earlier work by Vojtech Jarnik in 1930) grows the tree from a starting vertex, repeatedly adding the lightest edge connecting the tree to a non-tree vertex, using a priority queue for efficiency. Both run in time on a graph with edges and vertices. Dijkstra’s algorithm (1959) computes single-source shortest paths in graphs with nonnegative edge weights using a similar greedy expansion with a priority queue, running in with a binary heap or with a Fibonacci heap.
Huffman coding, invented by David Huffman in 1952 as a term paper for a course taught by Robert Fano, constructs an optimal prefix-free binary code for a set of symbols with given frequencies. The algorithm repeatedly merges the two least-frequent symbols, building a binary tree from the bottom up; the resulting code assigns shorter codewords to more frequent symbols. Huffman coding achieves the minimum expected codeword length among all prefix codes and is a cornerstone of data compression, underlying formats like DEFLATE (used in ZIP files and PNG images). Greedy algorithms also solve interval scheduling (selecting the maximum number of non-overlapping intervals by always choosing the one that finishes earliest) and activity selection problems. When greedy fails — as it does for many optimization problems — it often still provides useful approximation algorithms with provable performance guarantees, such as the 2-approximation for vertex cover or the -approximation for set cover.
Graph Algorithms
Graphs are among the most versatile structures in computer science, and graph algorithms form one of the richest chapters in the field. The two fundamental traversal strategies are depth-first search (DFS) and breadth-first search (BFS). DFS explores as far as possible along each branch before backtracking, naturally implemented with a stack (or recursion), and runs in time. It yields a DFS tree whose edge classification into tree edges, back edges, forward edges, and cross edges reveals deep structural information: back edges detect cycles, and the absence of back edges proves acyclicity. BFS explores vertices level by level using a queue, also in time, and computes shortest paths in unweighted graphs. Both traversals are building blocks for more complex algorithms.
Topological sorting produces a linear ordering of the vertices of a directed acyclic graph (DAG) such that for every directed edge , vertex appears before . This ordering, computable in time via DFS or Kahn’s algorithm, is essential for scheduling tasks with dependencies. Strongly connected components (SCCs) — maximal subsets of vertices in a directed graph where every vertex is reachable from every other — can be found in time by Tarjan’s algorithm (1972) or Kosaraju’s algorithm (1978). Condensing each SCC into a single node yields a DAG, the condensation graph, which captures the high-level dependency structure. The 2-SAT problem, which asks whether a Boolean formula in a restricted conjunctive normal form is satisfiable, can be solved in linear time by analyzing the implication graph’s SCCs.
Network flow algorithms solve problems involving the movement of a commodity through a network with capacity constraints. The max-flow min-cut theorem, proved by L. R. Ford and D. R. Fulkerson in 1956, states that the maximum flow from source to sink equals the minimum capacity of a cut separating them — a deep duality result. The Ford-Fulkerson method finds augmenting paths in the residual graph and pushes flow along them. The Edmonds-Karp variant (1972) uses BFS to find shortest augmenting paths, guaranteeing time. Dinic’s algorithm (1970) achieves by working with blocking flows in layered graphs. Push-relabel algorithms, introduced by Andrew Goldberg and Robert Tarjan in 1988, achieve by maintaining a preflow and relabeling vertices. Network flow has applications to bipartite matching, project selection, image segmentation, and many other areas.
NP-Completeness and Computational Hardness
Not all computational problems yield to efficient algorithms. Complexity theory classifies problems by the resources needed to solve them. The class P contains decision problems solvable in polynomial time by a deterministic Turing machine. The class NP contains problems for which a proposed solution (a certificate) can be verified in polynomial time. Every problem in P is also in NP (a polynomial-time algorithm can both find and verify solutions), but whether P equals NP is the most important open question in theoretical computer science, listed among the Clay Millennium Prize Problems.
In 1971, Stephen Cook proved that the Boolean satisfiability problem (SAT) is NP-complete — it is in NP and every problem in NP can be reduced to it in polynomial time. Leonid Levin independently obtained an equivalent result. The Cook-Levin theorem was the first NP-completeness proof, and it opened the floodgates. In 1972, Richard Karp demonstrated that 21 well-known combinatorial problems — including clique, vertex cover, Hamiltonian cycle, graph coloring, subset sum, and the traveling salesman problem — are all NP-complete by exhibiting chains of polynomial-time reductions from SAT. Today, thousands of problems are known to be NP-complete, and for none of them has a polynomial-time algorithm been found. The practical consequence is stark: when faced with an NP-complete problem, one must generally settle for approximation algorithms, heuristics, parameterized algorithms, or restrictions to special cases.
Beyond NP lie even more demanding complexity classes. PSPACE contains problems solvable with polynomial space (but potentially exponential time); quantified Boolean formulas (QBF) are PSPACE-complete. EXPTIME contains problems requiring exponential time. The polynomial hierarchy organizes problems by alternating quantifiers, generalizing both P and NP. At the extreme, undecidable problems — like the halting problem, proved unsolvable by Alan Turing in 1936 — cannot be solved by any algorithm at all, regardless of the time or space allowed.
Approximation and Randomized Algorithms
When exact solutions to NP-hard problems are out of reach, approximation algorithms offer polynomial-time solutions with provable quality guarantees. The approximation ratio measures how close the approximate solution is to the optimum. A 2-approximation for vertex cover, for instance, always finds a cover at most twice the minimum size. The greedy algorithm for set cover achieves an approximation ratio, and this is essentially the best possible unless P = NP. Linear programming relaxation and rounding techniques yield some of the most powerful approximation results: the Goemans-Williamson algorithm (1995) for MAX-CUT, based on semidefinite programming, achieves an approximation ratio of approximately 0.878. The PCP theorem (proved in 1992 by Arora, Lund, Motwani, Sudan, and Szegedy) established that certain problems cannot be approximated beyond specific thresholds unless P = NP, providing a deep theoretical foundation for inapproximability results.
Randomized algorithms use random coin flips as part of their computation. A Las Vegas algorithm always produces a correct answer but has random running time (like randomized quicksort). A Monte Carlo algorithm runs in guaranteed time but may produce an incorrect answer with bounded probability (like the Miller-Rabin primality test). The complexity class BPP contains problems solvable in polynomial time with error probability at most on each input; it is widely believed that BPP = P, meaning that randomness does not fundamentally increase computational power. Derandomization techniques, such as the Nisan-Wigderson generator, aim to make this belief rigorous by constructing pseudorandom sources from hardness assumptions.
Online algorithms face the challenge of making irrevocable decisions without knowledge of future inputs. Competitive analysis, introduced by Daniel Sleator and Robert Tarjan in 1985, compares an online algorithm’s performance to the best offline algorithm. The LRU (least recently used) caching policy, for example, is -competitive for caches of size . Streaming algorithms process data in a single pass with limited memory, enabling the analysis of massive datasets. The HyperLogLog algorithm estimates the number of distinct elements in a stream using space, and the Count-Min sketch estimates element frequencies using sublinear memory. These modern algorithmic paradigms — approximation, randomization, online, and streaming — reflect the reality that much of contemporary computation involves massive, dynamic, or uncertain data where exact polynomial-time solutions may be unavailable or unnecessary.