Functional Analysis

Banach and Hilbert spaces, bounded operators, and spectral theory.


Functional analysis is the branch of mathematics that studies infinite-dimensional vector spaces of functions together with the operators that act on them, forging a deep alliance between linear algebra and the analytic machinery of limits and continuity. Born out of the study of integral equations and differential equations in the early twentieth century, it quickly revealed that the right setting for such problems is not the finite-dimensional space $\mathbb{R}^n$ but an abstract space whose points are themselves functions. The resulting theory, built on the foundational contributions of Stefan Banach, David Hilbert, John von Neumann, and Hermann Weyl, now underpins quantum mechanics, the modern theory of partial differential equations, signal processing, and a substantial portion of pure mathematics.

Normed and Banach Spaces

The first step in functional analysis is to endow a vector space of functions with a notion of size. In finite dimensions all norms are equivalent and every normed space is complete; in infinite dimensions neither of these facts holds, and choosing the right norm becomes a genuinely consequential act.

A norm on a real or complex vector space $X$ is a function $\|\cdot\| : X \to [0, \infty)$ satisfying three axioms: positive definiteness ($\|x\| = 0$ if and only if $x = 0$), homogeneity ($\|\lambda x\| = |\lambda| \|x\|$), and the triangle inequality ($\|x + y\| \leq \|x\| + \|y\|$). The pair $(X, \|\cdot\|)$ is called a normed space, and the norm induces a metric $d(x, y) = \|x - y\|$, making every normed space a metric space.

The most important normed spaces in analysis are the $\ell^p$ sequence spaces and the $L^p$ function spaces. For $1 \leq p < \infty$, the space $\ell^p$ consists of all sequences $(x_1, x_2, x_3, \ldots)$ of real or complex numbers for which

$$\|(x_n)\|_{\ell^p} = \left(\sum_{n=1}^{\infty} |x_n|^p\right)^{1/p} < \infty,$$

while $\ell^\infty$ is the space of bounded sequences with $\|(x_n)\|_{\ell^\infty} = \sup_n |x_n|$. The analogous $L^p(\Omega)$ spaces replace sequences by measurable functions on a measure space $\Omega$, with the norm

$$\|f\|_{L^p} = \left(\int_\Omega |f|^p \, d\mu\right)^{1/p}.$$

Both families are made rigorous by the theory of Lebesgue integration, and both satisfy Hölder’s inequality: for conjugate exponents $p$ and $q$ with $\frac{1}{p} + \frac{1}{q} = 1$,

$$\int |fg| \, d\mu \leq \|f\|_{L^p} \|g\|_{L^q}.$$
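
Hölder’s inequality can be checked numerically in the simplest setting, finite sequences with counting measure. The following is a minimal sketch; the helper names `lp_norm` and `holder_sides` and the sample vectors are illustrative choices, not part of the theory above.

```python
def lp_norm(x, p):
    """p-norm of a finite sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def holder_sides(x, y, p):
    """Return (lhs, rhs) of Hoelder's inequality for sequences x, y and p > 1."""
    q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    rhs = lp_norm(x, p) * lp_norm(y, q)
    return lhs, rhs


# Hypothetical sample data, chosen only for illustration.
x = [1.0, -2.0, 3.0, 0.5]
y = [0.25, 4.0, -1.0, 2.0]

for p in (1.5, 2.0, 3.0):
    lhs, rhs = holder_sides(x, y, p)
    print(f"p={p}: sum |x_n y_n| = {lhs:.4f} <= {rhs:.4f}")
```

For $p = 2$ the inequality reduces to the Cauchy-Schwarz inequality, which is why the same check works for the inner product on $\ell^2$.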

A normed space is called a Banach space if it is complete, that is, if every Cauchy sequence converges to a limit within the space. Completeness is not automatic: the space of continuous functions $C[0,1]$ with the $L^1$ norm is not complete (a sequence of continuous functions can converge in $L^1$ to a discontinuous limit), but $C[0,1]$ with the supremum norm $\|f\|_\infty = \sup_{t \in [0,1]} |f(t)|$ is. The spaces $\ell^p$ and $L^p$ are all Banach spaces for $1 \leq p \leq \infty$, a fact whose proof for $p < \infty$ relies on the convergence theorems of Lebesgue integration (monotone and dominated convergence). The key difference from finite dimensions is revealed by the Riesz lemma: in an infinite-dimensional normed space, the closed unit ball is never compact in the norm topology, a fact with profound implications for operator theory.
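
One way to see the failure of completeness for $C[0,1]$ under the $L^1$ norm is with a sequence of steepening ramps. The particular functions $f_n$ below are an illustrative choice, not the only possible one.

```latex
\begin{align*}
f_n(t) &= \min\bigl\{1,\ \max\{0,\ n(t - \tfrac{1}{2})\}\bigr\},
  \qquad t \in [0,1],\ n \geq 2, \\
\|f_n - f_m\|_{L^1} &= \frac{1}{2n} - \frac{1}{2m} \longrightarrow 0
  \qquad (m \geq n \to \infty), \\
\|f_n - \mathbf{1}_{(1/2,\,1]}\|_{L^1} &= \frac{1}{2n} \longrightarrow 0 .
\end{align*}
```

The sequence is Cauchy in $L^1$, but its $L^1$ limit is a step function, which agrees almost everywhere with no continuous function, so the sequence has no limit inside $C[0,1]$.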

The dual space $X^*$ of a Banach space $X$ is the Banach space of all continuous linear functionals $f: X \to \mathbb{R}$ (or $\mathbb{C}$), equipped with the operator norm $\|f\|_{X^*} = \sup_{\|x\| \leq 1} |f(x)|$. Identifying dual spaces is one of the central tasks of the subject: the dual of $\ell^p$ is $\ell^q$ for $1 \leq p < \infty$, and the dual of $L^p(\Omega)$ is $L^q(\Omega)$ for $1 < p < \infty$, where $\frac{1}{p} + \frac{1}{q} = 1$.

Bounded Linear Operators and Hilbert Spaces

With a class of spaces in hand, we turn to the maps between them. A linear operator $T: X \to Y$ between normed spaces is bounded if there exists a constant $C \geq 0$ such that $\|Tx\|_Y \leq C\|x\|_X$ for all $x \in X$. A fundamental theorem establishes that for linear operators between normed spaces, boundedness and continuity are equivalent. The collection of all bounded linear operators $T: X \to Y$ forms a normed space $\mathcal{L}(X, Y)$ under the operator norm

$$\|T\| = \sup_{\|x\|_X \leq 1} \|Tx\|_Y,$$

and if $Y$ is a Banach space, then $\mathcal{L}(X, Y)$ is itself a Banach space.

Among all normed spaces, Hilbert spaces occupy a privileged position because of their additional geometric structure. A Hilbert space is a Banach space whose norm comes from an inner product $\langle \cdot, \cdot \rangle : H \times H \to \mathbb{C}$, a sesquilinear form that is conjugate-symmetric ($\langle x, y \rangle = \overline{\langle y, x \rangle}$), linear in the first argument, and positive definite ($\langle x, x \rangle > 0$ for $x \neq 0$), with $\|x\| = \sqrt{\langle x, x \rangle}$. The quintessential example is $L^2(\Omega)$ with the inner product $\langle f, g \rangle = \int_\Omega f \bar{g} \, d\mu$.

The inner product allows a notion of orthogonality: $x \perp y$ when $\langle x, y \rangle = 0$. Given a closed subspace $M \subseteq H$, every vector $h \in H$ decomposes uniquely as $h = m + m^\perp$ with $m \in M$ and $m^\perp \in M^\perp = \{x \in H : \langle x, y \rangle = 0 \text{ for all } y \in M\}$, a fact known as the orthogonal projection theorem. The projection $P_M : H \to M$ defined by $P_M h = m$ is the best approximation of $h$ within $M$: for every $m_0 \in M$, $\|h - P_M h\| \leq \|h - m_0\|$.
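
The decomposition and the best-approximation property can be illustrated in the finite-dimensional Hilbert space $\mathbb{R}^3$. This is a minimal sketch under the assumption that the subspace is given by an orthonormal list of vectors; the names `project`, `dist`, and the sample data are hypothetical.

```python
def dot(u, v):
    """Real inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def project(h, basis):
    """Orthogonal projection of h onto span(basis); basis is assumed orthonormal."""
    p = [0.0] * len(h)
    for e in basis:
        c = dot(h, e)  # coefficient <h, e>
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p


def dist(u, v):
    """Euclidean distance ||u - v||."""
    diff = [a - b for a, b in zip(u, v)]
    return dot(diff, diff) ** 0.5


# M = the xy-plane in R^3, spanned by an orthonormal pair (illustrative choice).
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
h = [2.0, -1.0, 3.0]

m = project(h, basis)                    # component of h in M
m_perp = [a - b for a, b in zip(h, m)]   # component of h in M-perp
m0 = [1.0, 1.0, 0.0]                     # some other point of M

print("m =", m, " m_perp =", m_perp)
print("||h - P_M h|| =", dist(h, m), " ||h - m0|| =", dist(h, m0))
```

The residual `m_perp` is orthogonal to every basis vector of $M$, and no other point of $M$ is closer to `h` than `m`.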

A complete orthonormal system (or orthonormal basis) for $H$ is a maximal orthonormal set $\{e_\alpha\}$. For separable Hilbert spaces, this is always a countable set $\{e_1, e_2, e_3, \ldots\}$, and Parseval’s identity holds:

$$\|h\|^2 = \sum_{n=1}^\infty |\langle h, e_n \rangle|^2 \quad \text{for every } h \in H.$$

The coefficients $\langle h, e_n \rangle$ are the Fourier coefficients of $h$ with respect to the basis. For $H = L^2([-\pi, \pi])$, the functions $e_n(t) = \frac{1}{\sqrt{2\pi}} e^{int}$, indexed by $n \in \mathbb{Z}$, form a complete orthonormal system, and Parseval’s identity becomes the classical energy identity for Fourier series.
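
Parseval’s identity can be watched converging for a concrete function. The sketch below takes $f(t) = t$ on $[-\pi, \pi]$ (an illustrative choice) and uses the closed form of its Fourier coefficients obtained by integration by parts; the function names are hypothetical.

```python
import math


def fourier_coeff_sq(n):
    """|<f, e_n>|^2 for f(t) = t on [-pi, pi], e_n(t) = exp(i n t) / sqrt(2 pi).

    Integration by parts gives <f, e_n> = sqrt(2 pi) * i * (-1)^n / n for n != 0,
    and <f, e_0> = 0 because f is odd.
    """
    return 0.0 if n == 0 else 2.0 * math.pi / n ** 2


# ||f||^2 is the integral of t^2 over [-pi, pi].
norm_sq = 2.0 * math.pi ** 3 / 3.0


def parseval_partial_sum(N):
    """Sum of |<f, e_n>|^2 over -N <= n <= N."""
    return sum(fourier_coeff_sq(n) for n in range(-N, N + 1))


for N in (10, 100, 10000):
    print(N, parseval_partial_sum(N), norm_sq)
```

The partial sums increase toward $\|f\|^2$ from below, as Bessel’s inequality requires, and the gap shrinks roughly like $4\pi/N$.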

On a Hilbert space, every bounded linear operator $T: H \to H$ has an adjoint $T^*$, the unique bounded operator satisfying $\langle Tx, y \rangle = \langle x, T^* y \rangle$ for all $x, y \in H$. An operator with $T = T^*$ is self-adjoint (or Hermitian), and one with $T^* T = T T^*$ is normal. The Riesz representation theorem connects Hilbert space geometry to duality: every continuous linear functional $f: H \to \mathbb{C}$ is of the form $f(x) = \langle x, y \rangle$ for a unique $y \in H$, establishing a conjugate-linear isometric isomorphism $H \cong H^*$.

The Hahn-Banach Theorem and Consequences

The Hahn-Banach theorem is the cornerstone of duality in functional analysis. It asserts that continuous linear functionals can always be extended without loss, and this seemingly simple fact has far-reaching consequences for the geometry of Banach spaces.

The Hahn-Banach extension theorem states: if $p: X \to \mathbb{R}$ is a sublinear functional on a real vector space $X$, $M \subseteq X$ is a subspace, and $f: M \to \mathbb{R}$ is a linear functional with $f(x) \leq p(x)$ for all $x \in M$, then there exists a linear extension $F: X \to \mathbb{R}$ with $F\big|_M = f$ and $F(x) \leq p(x)$ for all $x \in X$. In the normed space setting, this immediately implies that any bounded linear functional on a subspace extends to a bounded linear functional on the whole space with the same norm:

$$\exists F \in X^*, \quad F\big|_M = f, \quad \|F\|_{X^*} = \|f\|_{M^*}.$$

The proof, completed by Banach in 1929 following an earlier result of Hahn, extends the functional one dimension at a time and, in its modern form, uses Zorn’s lemma to pass from these one-dimensional extensions to the full space. The theorem has a geometric counterpart, the separation theorem, which asserts that two disjoint nonempty convex sets, one of which is open, can be separated by a closed hyperplane: there exist a nonzero $f \in X^*$ and $c \in \mathbb{R}$ such that $f(x) \leq c \leq f(y)$ for all $x$ in one set and $y$ in the other.

The Hahn-Banach theorem establishes the abundance of functionals: the dual space $X^*$ is large enough to separate points, so that if $x \neq y$, there exists $f \in X^*$ with $f(x) \neq f(y)$. It also underlies the canonical embedding of $X$ into its bidual $X^{**} = (X^*)^*$. The map $J: X \to X^{**}$ defined by $J(x)(f) = f(x)$ is an isometric linear injection, so every Banach space embeds isometrically into its bidual. A space is called reflexive if $J$ is surjective, so that $X$ is identified with $X^{**}$ through $J$. The spaces $L^p$ for $1 < p < \infty$ are reflexive; $L^1$ and $L^\infty$ are not, except in finite-dimensional cases.

Three foundational theorems, all consequences of completeness and Baire category theory, round out the basic theory. The uniform boundedness principle (Banach-Steinhaus theorem) asserts that a family of bounded operators on a Banach space that is pointwise bounded must be uniformly bounded in operator norm. The open mapping theorem states that a surjective bounded linear operator between Banach spaces is an open map. The closed graph theorem says that a linear operator $T: X \to Y$ between Banach spaces whose graph $\{(x, Tx)\}$ is closed in $X \times Y$ must be bounded. Together these form a powerful toolkit: the open mapping theorem, for instance, implies that a bijective bounded linear operator between Banach spaces has a bounded inverse.

The weak topology on a Banach space $X$ is the coarsest topology making all functionals in $X^*$ continuous; a sequence $(x_n)$ converges weakly to $x$, written $x_n \rightharpoonup x$, if $f(x_n) \to f(x)$ for every $f \in X^*$. Norm convergence implies weak convergence, but in infinite dimensions the converse generally fails. The Banach-Alaoglu theorem asserts that the closed unit ball of $X^*$ is compact in the weak-* topology (the topology of pointwise convergence on $X$), a result of central importance in optimization, PDE theory, and geometric functional analysis.
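
A standard illustration of the gap between weak and norm convergence is the sequence of unit vectors $e_n$ in $\ell^2$. By the Riesz representation theorem every functional on $\ell^2$ has the form $f(x) = \langle x, y \rangle$ for some $y = (y_k) \in \ell^2$, so the following short computation suffices:

```latex
\begin{align*}
f(e_n) &= \langle e_n, y \rangle = \overline{y_n} \longrightarrow 0
  \qquad \text{since } \sum_{k} |y_k|^2 < \infty, \\
\|e_n - 0\| &= 1 \quad \text{for all } n, \qquad
\|e_n - e_m\| = \sqrt{2} \quad (n \neq m).
\end{align*}
```

Thus $e_n \rightharpoonup 0$ weakly, yet $(e_n)$ has no norm-convergent subsequence, which also shows directly that the closed unit ball of $\ell^2$ is not norm compact.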

Spectral Theory of Bounded Operators

In finite dimensions, the spectral theorem says that every Hermitian matrix can be diagonalized by a unitary change of basis. Functional analysis extends this idea to infinite-dimensional operators, but the story becomes considerably richer: operators may have no eigenvalues at all, yet still possess a highly structured “spectrum.”

For a bounded operator $T \in \mathcal{L}(H)$ on a complex Banach or Hilbert space, the resolvent set $\rho(T)$ consists of those $\lambda \in \mathbb{C}$ for which $(\lambda I - T)$ is bijective with a bounded inverse, called the resolvent $R(\lambda, T) = (\lambda I - T)^{-1}$. The spectrum $\sigma(T) = \mathbb{C} \setminus \rho(T)$ is always a nonempty compact subset of $\mathbb{C}$ contained in the disk $|\lambda| \leq \|T\|$. The spectral radius $r(T) = \sup_{\lambda \in \sigma(T)} |\lambda|$ satisfies the elegant formula

$$r(T) = \lim_{n \to \infty} \|T^n\|^{1/n} = \inf_{n \geq 1} \|T^n\|^{1/n}.$$
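
The spectral radius formula can be observed in a finite-dimensional toy case. The sketch below uses a non-normal $2 \times 2$ matrix (an illustrative choice) whose only eigenvalue is $0.5$ but whose operator norm exceeds $1$; the helper names are hypothetical, and the $2 \times 2$ operator norm is computed from the largest eigenvalue of $A^{T}A$ in closed form.

```python
import math


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def op_norm(A):
    """Operator norm of a real 2x2 matrix: sqrt of the largest eigenvalue of A^T A."""
    (a, b), (c, d) = A
    g11, g22, g12 = a * a + c * c, b * b + d * d, a * b + c * d
    tr, det = g11 + g22, g11 * g22 - g12 * g12
    disc = max(tr * tr - 4.0 * det, 0.0)
    return math.sqrt((tr + math.sqrt(disc)) / 2.0)


def gelfand_estimates(T, n_max):
    """Return the list of ||T^n||^(1/n) for n = 1, ..., n_max."""
    out, P = [], [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, n_max + 1):
        P = matmul(P, T)
        out.append(op_norm(P) ** (1.0 / n))
    return out


# Upper triangular, so the spectrum is {0.5} and r(T) = 0.5.
T = [[0.5, 1.0], [0.0, 0.5]]
est = gelfand_estimates(T, 100)
print("n=1:", est[0], " n=10:", est[9], " n=100:", est[99])
```

The first estimate is just $\|T\|$, which overshoots the spectral radius considerably; the later estimates approach $0.5$ from above, slowly, because $\|T^n\|$ carries a polynomial factor in $n$.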

The spectrum decomposes into three disjoint parts. The point spectrum $\sigma_p(T)$ consists of eigenvalues: values $\lambda$ for which $(\lambda I - T)$ is not injective, meaning $Tx = \lambda x$ for some nonzero $x$. The continuous spectrum $\sigma_c(T)$ consists of values where $(\lambda I - T)$ is injective with dense but not closed range. The residual spectrum $\sigma_r(T)$ accounts for the remainder: values where $(\lambda I - T)$ is injective but its range is not dense. For self-adjoint operators on a Hilbert space, the residual spectrum is empty, and every spectral value is an approximate eigenvalue: there are unit vectors $x_n$ with $\|(T - \lambda I) x_n\| \to 0$.

Compact operators are those sending bounded sets to precompact sets; on a Hilbert space these are exactly the operators that can be approximated in operator norm by finite-rank operators. The canonical examples are integral operators of the form

$$(Kf)(x) = \int_a^b k(x, y) f(y) \, dy$$

for a continuous or square-integrable kernel $k$. The spectral theory of compact operators is governed by the Fredholm alternative: for a compact operator $K$, the kernel and cokernel of $I - K$ have the same finite dimension, so $I - K$ is invertible exactly when it is injective, mirroring the situation in finite dimensions. The spectrum of a compact operator on an infinite-dimensional space consists of $0$ together with a (possibly finite or empty) sequence of nonzero eigenvalues of finite multiplicity that can accumulate only at zero.

The crown jewel is the spectral theorem for compact self-adjoint operators: if $T$ is a compact self-adjoint operator on a separable Hilbert space $H$, then $H$ has an orthonormal basis of eigenvectors $\{e_n\}$ of $T$, with corresponding real eigenvalues $\lambda_n \to 0$, and

$$T = \sum_{n=1}^\infty \lambda_n \langle \cdot, e_n \rangle e_n.$$

This is the infinite-dimensional generalization of the finite-dimensional spectral theorem for Hermitian matrices, and it is the theoretical foundation of Fourier series, Sturm-Liouville theory, and principal component analysis. For non-compact self-adjoint operators, including those arising from differential operators, a more general spectral theorem is formulated in terms of spectral measures and applies even to operators with purely continuous spectrum.

Unbounded Operators

Differential operators, the objects of greatest interest in physics and PDE theory, are typically not bounded. For the differentiation operator $\frac{d}{dx}$ on $L^2[0,1]$, for instance, $\|f'\|_{L^2}$ can be arbitrarily large relative to $\|f\|_{L^2}$. To handle such operators, functional analysis introduces a careful treatment of domains.
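
The unboundedness of differentiation can be made concrete with the functions $f_n(x) = \sin(n\pi x)$, an illustrative family for which $\|f_n'\|_{L^2} / \|f_n\|_{L^2} = n\pi$. The sketch below approximates the norms with a midpoint rule; the helper names and the step count are assumptions made for the example.

```python
import math


def l2_norm(g, a=0.0, b=1.0, steps=20000):
    """Midpoint-rule approximation of the L^2 norm of g on [a, b]."""
    h = (b - a) / steps
    return math.sqrt(sum(g(a + (k + 0.5) * h) ** 2 for k in range(steps)) * h)


def ratio(n):
    """||f_n'|| / ||f_n|| for f_n(x) = sin(n pi x), with f_n'(x) = n pi cos(n pi x)."""
    f = lambda x: math.sin(n * math.pi * x)
    df = lambda x: n * math.pi * math.cos(n * math.pi * x)
    return l2_norm(df) / l2_norm(f)


for n in (1, 10, 100):
    print(n, ratio(n), n * math.pi)
```

Since the ratio grows without bound in $n$, no constant $C$ can satisfy $\|f'\|_{L^2} \leq C \|f\|_{L^2}$ for all smooth $f$.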

An unbounded operator $T$ on a Hilbert space $H$ is a linear map defined on a dense subspace $\mathcal{D}(T) \subseteq H$, called its domain, which we do not assume to be all of $H$. The graph of $T$ is the subspace $\mathcal{G}(T) = \{(x, Tx) : x \in \mathcal{D}(T)\} \subseteq H \times H$. The operator is closed if its graph is a closed subspace, a condition that replaces boundedness as the right regularity assumption for unbounded operators. The closed graph theorem tells us that a closed operator defined on all of $H$ must be bounded, so a closed operator that fails to be bounded can only be defined on a proper dense subspace.

The adjoint of an unbounded operator $T$ requires more care. The domain $\mathcal{D}(T^*)$ consists of those $y \in H$ for which the map $x \mapsto \langle Tx, y \rangle$ extends to a bounded functional on all of $H$; for such $y$, the Riesz theorem yields a unique $z$ with $\langle Tx, y \rangle = \langle x, z \rangle$ for all $x \in \mathcal{D}(T)$, and we set $T^* y = z$. An operator is symmetric if $\langle Tx, y \rangle = \langle x, Ty \rangle$ for all $x, y \in \mathcal{D}(T)$, or equivalently $T \subseteq T^*$. An operator is self-adjoint if $T = T^*$, meaning both the action and the domain of $T$ and $T^*$ coincide.

The distinction between symmetric and self-adjoint is not pedantry: a symmetric operator may have many self-adjoint extensions, each with a different spectrum. The theory of self-adjoint extensions is due to von Neumann and uses the deficiency indices $(n_+, n_-)$, where $n_+ = \dim \ker(T^* - iI)$ and $n_- = \dim \ker(T^* + iI)$. A closed symmetric operator has self-adjoint extensions if and only if $n_+ = n_-$, and the extensions are parametrized by the unitary maps from $\ker(T^* - iI)$ to $\ker(T^* + iI)$. For the momentum operator $-i\frac{d}{dx}$ on a bounded interval $[a, b]$, defined initially on smooth functions vanishing near the endpoints, the deficiency indices are $(1, 1)$ and the self-adjoint extensions correspond to the twisted periodic boundary conditions $f(b) = e^{i\theta} f(a)$ with $\theta \in [0, 2\pi)$, each with its own spectrum $\{(\theta + 2\pi k)/(b - a) : k \in \mathbb{Z}\}$. Dirichlet and Neumann conditions play the analogous role for second-order operators such as the Laplacian.

A fundamental example is the Laplacian $\Delta = \sum_{j=1}^n \frac{\partial^2}{\partial x_j^2}$ on $L^2(\Omega)$. On the full space $\mathbb{R}^n$, the Laplacian is essentially self-adjoint on the Schwartz functions, and its spectrum is the half-line $(-\infty, 0]$, a purely continuous spectrum. On a bounded domain $\Omega$ with Dirichlet boundary conditions, it becomes an operator with compact resolvent and a discrete spectrum of negative eigenvalues $0 > \lambda_1 > \lambda_2 \geq \lambda_3 \geq \cdots \to -\infty$ (the first eigenvalue is simple when $\Omega$ is connected), and the eigenfunctions form an orthonormal basis for $L^2(\Omega)$.
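
The simplest instance of the bounded-domain case is one-dimensional. Taking $\Omega = (0, \pi)$, an illustrative special case, the Dirichlet eigenfunctions and eigenvalues can be written down explicitly:

```latex
\begin{align*}
e_k(x) &= \sqrt{\tfrac{2}{\pi}}\, \sin(kx), \qquad e_k(0) = e_k(\pi) = 0, \\
\Delta e_k &= e_k'' = -k^2 e_k, \qquad \lambda_k = -k^2, \quad k = 1, 2, 3, \ldots
\end{align*}
```

The eigenvalues are negative and tend to $-\infty$, and $\{e_k\}$ is the classical sine basis of $L^2(0, \pi)$.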

C*-Algebras and Von Neumann Algebras

The algebraic structure of operators on a Hilbert space leads naturally to operator algebras, which encode spectral theory in an abstract, intrinsic framework that has found striking applications in quantum mechanics and noncommutative geometry.

A Banach algebra is a Banach space $A$ equipped with an associative multiplication satisfying $\|ab\| \leq \|a\|\|b\|$. The bounded operators $\mathcal{L}(H)$ form a Banach algebra under composition. A C*-algebra is a Banach algebra with an involution $a \mapsto a^*$ satisfying $(ab)^* = b^* a^*$, $(a^*)^* = a$, $\|a^*\| = \|a\|$, and the critical C*-identity:

$$\|a^* a\| = \|a\|^2.$$

The C*-identity is far more powerful than it looks: it forces the norm to be uniquely determined by the algebraic structure. The Gelfand-Naimark theorem (1943) asserts that every abstract C*-algebra is isometrically *-isomorphic to a closed *-subalgebra of $\mathcal{L}(H)$ for some Hilbert space $H$, so the abstract and concrete viewpoints are equivalent. For commutative C*-algebras, the Gelfand representation theorem gives an even sharper result: every commutative C*-algebra is isometrically *-isomorphic to $C_0(X)$, the algebra of continuous functions vanishing at infinity on a locally compact Hausdorff space $X$, where $X$ is the spectrum (character space) of the algebra. This is the precise sense in which noncommutative C*-algebras represent “noncommutative topological spaces.”

Von Neumann algebras are a more rigid variant: they are unital *-subalgebras of $\mathcal{L}(H)$ that are closed in the weak operator topology, the coarsest topology making all matrix coefficient maps $T \mapsto \langle Tx, y \rangle$ continuous. Von Neumann’s bicommutant theorem provides an elegant characterization: a unital *-subalgebra $M$ is a von Neumann algebra if and only if $M = M''$, where $M' = \{T \in \mathcal{L}(H) : TS = ST \text{ for all } S \in M\}$ is the commutant of $M$. Taking the double commutant is thus the algebraic analog of topological closure. Von Neumann algebras are classified by their type (I, II, III), with type I corresponding to algebras of bounded operators on $\ell^2$ (or direct sums thereof), type II carrying a trace, and type III arising in quantum field theory. The classification program, advanced decisively by the work of Connes on injective factors in the 1970s, is one of the deepest achievements of twentieth-century mathematics.

The passage from C*-algebras to von Neumann algebras mirrors the passage from topology to measure theory. In this sense, von Neumann algebras are the noncommutative analog of measure spaces, while C*-algebras are the noncommutative analog of topological spaces — a perspective systematized by Alain Connes in noncommutative geometry.

Distributions and Sobolev Spaces

The last great thread of functional analysis is the theory that extends calculus to objects that are not classically differentiable, providing the language in which the modern theory of partial differential equations is written.

The idea of a distribution (or generalized function) originated with Paul Dirac, who introduced the delta function $\delta(x)$ in 1927 as an object that “picks out” the value of a function at zero: $\int \delta(x) f(x) \, dx = f(0)$. No ordinary function has this property, but the expression makes sense if we think of it as a continuous linear functional on a suitable space of test functions. Laurent Schwartz formalized this intuition in the mid-1940s, creating the theory of distributions that earned him the Fields Medal in 1950.

The space of test functions $\mathcal{D}(\Omega)$ consists of all smooth functions $\phi: \Omega \to \mathbb{R}$ with compact support, equipped with a topology in which $\phi_n \to \phi$ means: there exists a fixed compact set containing all supports, and every derivative $\partial^\alpha \phi_n$ converges uniformly to $\partial^\alpha \phi$. A distribution is a continuous linear functional $u: \mathcal{D}(\Omega) \to \mathbb{R}$. Every locally integrable function $f$ defines a distribution $u_f(\phi) = \int f \phi \, dx$, but distributions also include singular objects like the Dirac delta $\delta_a(\phi) = \phi(a)$ or the principal value $\text{p.v.}(1/x)$.

The key operation is differentiation: for any distribution $u$ and any multi-index $\alpha$, we define $\partial^\alpha u$ by

$$(\partial^\alpha u)(\phi) = (-1)^{|\alpha|} u(\partial^\alpha \phi).$$

This extends classical differentiation (when $u = u_f$ for a smooth $f$, we recover the classical derivative via integration by parts) but applies to every distribution, without any smoothness assumption. In particular, the Heaviside step function $H(x) = \mathbf{1}_{x > 0}$, which is not differentiable in the classical sense, has distributional derivative $H' = \delta_0$.
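
The Heaviside computation is short enough to carry out in full. For any test function $\phi$ on $\mathbb{R}$, the definition of the distributional derivative with $|\alpha| = 1$ gives:

```latex
\begin{align*}
H'(\phi) &= -H(\phi') = -\int_{-\infty}^{\infty} H(x)\, \phi'(x) \, dx
          = -\int_{0}^{\infty} \phi'(x) \, dx \\
         &= -\bigl[\phi(x)\bigr]_{0}^{\infty} = \phi(0) = \delta_0(\phi).
\end{align*}
```

The boundary term at infinity vanishes because $\phi$ has compact support, which is exactly where the choice of test function space enters.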

Sobolev spaces make distributions quantitative by measuring how many “distributional derivatives” a function has and how regular those derivatives are. For a domain $\Omega \subseteq \mathbb{R}^n$, a nonnegative integer $k$, and $1 \leq p \leq \infty$, the Sobolev space $W^{k,p}(\Omega)$ consists of all $L^p$ functions $f$ whose distributional derivatives $\partial^\alpha f$ up to order $|\alpha| \leq k$ all lie in $L^p(\Omega)$, with norm

$$\|f\|_{W^{k,p}} = \left(\sum_{|\alpha| \leq k} \|\partial^\alpha f\|_{L^p}^p\right)^{1/p}$$

(for $p = \infty$, the sum is replaced by $\max_{|\alpha| \leq k} \|\partial^\alpha f\|_{L^\infty}$).

The spaces $H^k(\Omega) = W^{k,2}(\Omega)$ are Hilbert spaces and are particularly important. The Sobolev embedding theorem describes how the regularity in $W^{k,p}$ forces classical regularity: if $kp > n$ and $\Omega$ is sufficiently regular (for example, bounded with Lipschitz boundary, or all of $\mathbb{R}^n$), then every function in $W^{k,p}(\Omega)$ is actually continuous (after modification on a null set), with the quantitative bound $\|f\|_{L^\infty} \leq C\|f\|_{W^{k,p}}$. More generally, on such domains $W^{k,p}$ embeds continuously into $W^{j,q}$ whenever $j \leq k$, $p \leq q < \infty$, and $k - \frac{n}{p} \geq j - \frac{n}{q}$.

The Lax-Milgram lemma converts PDE problems into operator problems in Sobolev spaces. A bounded coercive bilinear form $a: H \times H \to \mathbb{R}$ on a Hilbert space $H$ (bounded means $|a(u,v)| \leq C\|u\|\|v\|$; coercive means $a(u,u) \geq \alpha\|u\|^2$ for some $\alpha > 0$) gives rise, via Lax-Milgram, to a unique solution $u \in H$ of $a(u,v) = F(v)$ for all $v \in H$, for any bounded functional $F \in H^*$. Applied to the Dirichlet problem for the Laplacian, with $H = H^1_0(\Omega)$ and $a(u,v) = \int \nabla u \cdot \nabla v \, dx$, the lemma yields the existence and uniqueness of a weak solution: a function $u \in H^1_0(\Omega)$ satisfying the equation in the distributional sense, without assuming classical differentiability. Regularity theory then determines under what conditions the weak solution is in fact a classical solution.
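
The weak formulation also suggests how such problems are solved numerically: restrict $a(u,v) = F(v)$ to a finite-dimensional subspace of $H^1_0$ (the Galerkin method). The following is a minimal sketch, not a general solver, for the one-dimensional model problem $-u'' = 1$ on $(0,1)$ with $u(0) = u(1) = 0$, using piecewise-linear hat functions on a uniform grid. The function names and the choice of right-hand side are assumptions made for illustration.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system.

    lower[i] multiplies x[i-1], upper[i] multiplies x[i+1];
    lower[0] and upper[-1] do not affect the solution.
    """
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def galerkin_poisson_1d(n_interior):
    """Galerkin solution of a(u, v) = F(v) with a(u, v) = int u'v' dx, F(v) = int v dx,
    on the span of hat functions phi_1, ..., phi_n over a uniform grid of (0, 1)."""
    h = 1.0 / (n_interior + 1)
    diag = [2.0 / h] * n_interior    # a(phi_i, phi_i)
    off = [-1.0 / h] * n_interior    # a(phi_i, phi_{i+1}) = a(phi_{i+1}, phi_i)
    load = [h] * n_interior          # F(phi_i) = int phi_i dx
    nodes = [(i + 1) * h for i in range(n_interior)]
    return nodes, solve_tridiagonal(off, diag, off, load)


nodes, u_h = galerkin_poisson_1d(9)
# Classical solution of -u'' = 1 with u(0) = u(1) = 0, used only for comparison.
exact = [x * (1.0 - x) / 2.0 for x in nodes]
print(max(abs(a - b) for a, b in zip(u_h, exact)))
```

The stiffness matrix is symmetric positive definite, the discrete counterpart of coercivity, which is what guarantees that the linear system has a unique solution. In this particular one-dimensional problem the Galerkin values at the grid nodes coincide with the exact solution up to rounding.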

Together, distributions and Sobolev spaces transform the study of PDEs from a case-by-case craft into a systematic analytic theory. They are the language in which the Clay Millennium problem on Navier-Stokes regularity is formulated, and they remain one of the most active interfaces between pure mathematics and its applications in physics and engineering.