Computer Networks
The principles and protocols of networked communication — from physical links to application-layer protocols and the internet.
Computer networking is the study of how autonomous computing devices communicate with one another — the protocols they speak, the physical media that carry their signals, the routing algorithms that find paths through complex topologies, and the application-layer services that make the internet useful. From the first packet-switched message sent across the ARPANET on October 29, 1969, to the modern internet carrying exabytes of data daily across billions of devices, networking is the connective tissue of all distributed computation and the infrastructure upon which the digital economy is built.
Network Architecture and the Layered Model
The fundamental organizing principle of computer networks is layering: decomposing the immensely complex problem of networked communication into a stack of manageable, well-defined layers, each providing services to the layer above and consuming services from the layer below. The most influential layered model is the OSI Reference Model, proposed by the International Organization for Standardization in 1984, which defines seven layers: Physical, Data Link, Network, Transport, Session, Presentation, and Application. In practice, the internet uses the simpler TCP/IP model (also called the Internet protocol suite), which effectively collapses the OSI layers into four: Link, Internet, Transport, and Application.
Layering provides several crucial engineering benefits. It allows each layer to be designed, implemented, and replaced independently, so long as it honors the interface contract with its neighbors. It enables interoperability — any compliant transport protocol can run over any compliant network layer, and any application protocol can use any compliant transport. And it creates clean conceptual boundaries for reasoning about network behavior. The cost is a degree of overhead and redundancy (error checking at multiple layers, for instance) and occasional violations of strict layering when performance demands it — a phenomenon known as cross-layer optimization.
Networks are classified by their geographic scope: a Local Area Network (LAN) spans a building or campus, a Wide Area Network (WAN) spans cities or continents, and a Metropolitan Area Network (MAN) falls between. Topology describes the physical or logical arrangement of nodes — bus, star, ring, mesh, or tree. Performance is characterized by bandwidth (the raw data rate of a link, measured in bits per second), throughput (the actual data rate achieved by an application), latency (the time for a bit to traverse the network), and jitter (variability in latency). The relationship between bandwidth and latency is captured by the bandwidth-delay product, which gives the amount of data “in flight” in a fully utilized link — a quantity that deeply influences protocol design, particularly for congestion control.
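The bandwidth-delay product mentioned above is a simple multiplication; a quick sketch with illustrative link parameters (the function name and the numbers are invented for the example):

```python
# Bandwidth-delay product: the amount of data "in flight" on a fully
# utilized link. Illustrative numbers: a 100 Mbit/s link, 50 ms RTT.

def bandwidth_delay_product(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Return the bandwidth-delay product in bytes."""
    return bandwidth_bps * rtt_seconds / 8  # bits -> bytes

bdp = bandwidth_delay_product(100e6, 0.050)
print(f"{bdp:,.0f} bytes in flight")  # 625,000 bytes
```

A sender whose window is smaller than this value cannot keep the link full, which is why window sizing is central to congestion control.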
Physical and Data Link Layers
The physical layer concerns itself with the transmission of raw bits over a communication channel. Signals may be analog or digital, and the choice of modulation technique — Amplitude Shift Keying (ASK), Frequency Shift Keying (FSK), Phase Shift Keying (PSK), or Quadrature Amplitude Modulation (QAM) — determines how bits are encoded into waveforms. The theoretical upper bound on the rate at which information can be transmitted over a noisy channel was established by Claude Shannon in his landmark 1948 paper A Mathematical Theory of Communication:
C = B log2(1 + S/N)

where C is the channel capacity in bits per second, B is the bandwidth in hertz, and S/N is the signal-to-noise ratio. Shannon’s theorem is one of the most profound results in information theory, simultaneously guaranteeing that error-free communication is possible up to the capacity limit and proving that no scheme can exceed it.
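Plugging illustrative numbers into Shannon's formula makes the bound concrete (the `shannon_capacity` helper and the telephone-channel parameters below are examples, not from the text):

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Channel capacity C = B * log2(1 + S/N), in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Illustrative example: a ~3 kHz telephone channel with an SNR of 30 dB.
snr_db = 30
snr = 10 ** (snr_db / 10)            # 30 dB is a power ratio of 1000
c = shannon_capacity(3000, snr)
print(f"capacity is about {c:,.0f} bit/s")  # roughly 30 kbit/s
```

Note that the SNR enters as a linear power ratio, not in decibels; forgetting the conversion is a classic mistake when applying the formula.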
Transmission media include twisted-pair copper wire (used in Ethernet and telephone lines), coaxial cable (largely superseded but still used in cable television networks), fiber optic cable (which transmits light pulses and offers enormous bandwidth and immunity to electromagnetic interference), and various wireless media (radio waves, microwaves, infrared). The choice of medium affects bandwidth, attenuation, susceptibility to interference, and deployment cost.
The data link layer is responsible for reliable communication across a single physical link. It organizes raw bits into frames, adds physical (MAC) addresses for local delivery, and provides error detection through mechanisms like the Cyclic Redundancy Check (CRC) — a polynomial-based checksum that detects a wide class of errors with high probability. Medium Access Control (MAC) sublayer protocols govern how multiple devices share a common channel. CSMA/CD (Carrier Sense Multiple Access with Collision Detection), the original Ethernet access method, listens for an idle channel before transmitting and detects collisions during transmission. CSMA/CA (Collision Avoidance), used in Wi-Fi (IEEE 802.11), avoids collisions in the wireless medium — where they cannot be reliably detected — by using a randomized backoff protocol and optional RTS/CTS (Request-to-Send / Clear-to-Send) handshaking.
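CRC-based error detection can be demonstrated with Python's standard library: `zlib.crc32` implements CRC-32, the same polynomial used by Ethernet's frame check sequence (the payload bytes below are invented):

```python
import zlib

# CRC as used for frame error detection: the sender appends the checksum,
# and the receiver recomputes it over the received bytes and compares.

frame_payload = b"example frame payload"
fcs = zlib.crc32(frame_payload)              # frame check sequence

received = frame_payload
assert zlib.crc32(received) == fcs           # intact frame passes

corrupted = b"exAmple frame payload"         # a single flipped character
assert zlib.crc32(corrupted) != fcs          # corruption is detected
```

A CRC detects errors with high probability but does not correct them; at the link layer, a failed check simply causes the frame to be discarded.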
Ethernet switching operates at the data link layer. A switch maintains a MAC address table that maps each known address to the port on which it was last seen, learns new mappings by examining incoming frames, and forwards frames only to the appropriate port — dramatically improving performance over the shared-medium hubs that preceded switches. The Spanning Tree Protocol (STP), designed by Radia Perlman in 1985, prevents loops in networks with redundant links by disabling selected paths to create a logical tree topology. Virtual LANs (VLANs) allow a single physical switch to be partitioned into multiple logical broadcast domains, providing isolation and flexibility.
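The learning behavior described above can be sketched in a few lines (a toy four-port switch with no aging timers or VLANs; the class and MAC strings are invented for the example):

```python
# A minimal sketch of MAC learning in an Ethernet switch: record which
# port each source MAC was seen on, forward to the known port for the
# destination, and flood to all other ports when the destination is unknown.

class LearningSwitch:
    NUM_PORTS = 4

    def __init__(self):
        self.mac_table: dict[str, int] = {}   # MAC address -> port

    def handle_frame(self, src_mac: str, dst_mac: str, in_port: int):
        self.mac_table[src_mac] = in_port     # learn (or refresh) the mapping
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]  # forward to the known port
        # Unknown destination: flood to every port except the ingress port.
        return [p for p in range(self.NUM_PORTS) if p != in_port]

sw = LearningSwitch()
print(sw.handle_frame("aa:aa", "bb:bb", 0))   # unknown dst: flood [1, 2, 3]
print(sw.handle_frame("bb:bb", "aa:aa", 2))   # aa:aa was learned: [0]
```

Real switches also age out stale entries, which is what lets a host move between ports without manual intervention.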
Network Layer and IP Routing
The network layer provides end-to-end delivery of packets across multiple hops, handling addressing, routing, and forwarding. The Internet Protocol (IP) is the network layer of the internet. IPv4, specified in RFC 791 (1981), uses 32-bit addresses, providing approximately 4.3 billion unique addresses — a number that seemed vast in 1981 but has been exhausted by the explosive growth of the internet. Classless Inter-Domain Routing (CIDR), introduced in 1993, replaces the original class-based addressing scheme with variable-length prefixes (written as, e.g., 192.168.1.0/24), enabling more efficient allocation and hierarchical aggregation of routing information.
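CIDR prefixes and aggregation can be explored with Python's standard `ipaddress` module; a short sketch using the usual private-range addresses:

```python
import ipaddress

# A /24 prefix: 24 network bits, leaving 8 host bits (256 addresses).
net = ipaddress.ip_network("192.168.1.0/24")
print(net.num_addresses)                             # 256
print(net.netmask)                                   # 255.255.255.0
print(ipaddress.ip_address("192.168.1.42") in net)   # True

# Aggregation: two adjacent /25s collapse into a single /24 route entry.
agg = list(ipaddress.collapse_addresses([
    ipaddress.ip_network("192.168.1.0/25"),
    ipaddress.ip_network("192.168.1.128/25"),
]))
print(agg)                                           # [IPv4Network('192.168.1.0/24')]
```

This aggregation is exactly what keeps global routing tables tractable: one announced prefix can stand in for thousands of more specific ones.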
IPv6, standardized in 1998, extends the address space to 128 bits (about 3.4 × 10^38 addresses), provides a simplified fixed-length header, eliminates the need for NAT (Network Address Translation) in principle, and includes mandatory support for IPsec. Despite its technical superiority, IPv6 adoption has been gradual, and the internet continues to operate on a dual-stack basis.
Routing is the process of determining the path a packet should take from source to destination. Each router maintains a routing table that maps destination prefixes to next-hop routers, and forwarding is the per-packet operation of looking up the destination in the table and sending the packet to the appropriate output port. Longest prefix matching resolves ambiguity when multiple table entries match a destination address.
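Longest prefix matching can be sketched as a linear scan over the table (real routers use trie-based lookup structures for speed; the table entries and interface names here are invented):

```python
import ipaddress

# A toy routing table: destination prefix -> next-hop interface.
routing_table = {
    ipaddress.ip_network("0.0.0.0/0"): "default-gw",   # default route
    ipaddress.ip_network("10.0.0.0/8"): "eth1",
    ipaddress.ip_network("10.1.0.0/16"): "eth2",
}

def lookup(dst: str) -> str:
    addr = ipaddress.ip_address(dst)
    matches = [n for n in routing_table if addr in n]
    best = max(matches, key=lambda n: n.prefixlen)     # longest prefix wins
    return routing_table[best]

print(lookup("10.1.2.3"))   # eth2  (the /16 beats the /8 and the default)
print(lookup("10.9.9.9"))   # eth1
print(lookup("8.8.8.8"))    # default-gw
```

The default route 0.0.0.0/0 matches every address, so the lookup always succeeds; it is simply the least specific match available.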
Within an autonomous system (a network under a single administrative authority), Interior Gateway Protocols (IGPs) compute routes. RIP (Routing Information Protocol), based on the Bellman-Ford algorithm, is simple but limited to small networks. OSPF (Open Shortest Path First), based on Dijkstra’s shortest-path algorithm applied to a link-state database, scales to large networks by organizing them into areas. Between autonomous systems, the Border Gateway Protocol (BGP) is the sole inter-domain routing protocol of the internet. BGP is a path-vector protocol that selects routes based on policy (business relationships between networks) as much as on shortest path, and its convergence behavior, stability, and security remain active areas of research. Vint Cerf and Bob Kahn, who designed the original TCP/IP architecture, are widely regarded as the “fathers of the internet,” and the protocol stack they envisioned remains the internet’s foundation more than four decades later.
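The shortest-path computation at the heart of OSPF can be sketched with Dijkstra's algorithm over a toy link-state database (the topology and link costs below are invented):

```python
import heapq

# A toy link-state database: each router's view of its neighbors and costs.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

def dijkstra(source: str) -> dict[str, int]:
    """Shortest-path costs from source to every reachable router."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                          # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (dist[v], v))
    return dist

print(dijkstra("A"))  # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```

Because every OSPF router floods its link states to all others, each router runs this same computation over the same database and arrives at consistent forwarding decisions.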
Transport Layer Protocols
The transport layer provides end-to-end communication services to applications, multiplexing data from multiple applications onto the network layer using port numbers and providing varying levels of reliability and ordering.
UDP (User Datagram Protocol) is the simpler of the two main transport protocols. It provides a connectionless, best-effort datagram service with minimal overhead — just an 8-byte header containing source and destination ports, length, and checksum. UDP offers no ordering, no retransmission, and no congestion control. Its simplicity makes it ideal for applications where low latency matters more than reliability — DNS queries, real-time voice and video (RTP), and online gaming.
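A minimal UDP exchange with Python's standard `socket` module, run entirely over loopback in a single process (binding to port 0 asks the OS for any free port):

```python
import socket

# UDP needs no handshake: each sendto() is an independent, best-effort
# datagram. Both endpoints run in one process over the loopback interface.

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))            # port 0: let the OS pick a free port
server.settimeout(5)                     # avoid blocking forever on error
port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"ping", ("127.0.0.1", port))

data, addr = server.recvfrom(2048)       # one datagram, delivered whole
print(data)                              # b'ping'
client.close()
server.close()
```

Note what is missing compared to TCP: no connection setup, no acknowledgment, and no guarantee the datagram arrives at all; on a real network the application must tolerate loss.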
TCP (Transmission Control Protocol) provides a reliable, ordered, byte-stream service. Connection establishment uses a three-way handshake (SYN, SYN-ACK, ACK). TCP ensures reliability through sequence numbers, acknowledgments, and retransmission of lost segments. Flow control uses a sliding window mechanism, where the receiver advertises its available buffer space (the receive window), and the sender limits its transmission rate accordingly.
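The handshake and byte-stream semantics are visible even in a toy loopback example: with the standard `socket` module, the three-way handshake happens inside `connect()` and `accept()`:

```python
import socket

# By the time accept() returns, SYN, SYN-ACK, and ACK have been exchanged
# and a reliable, ordered byte stream exists in both directions.

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # port 0: OS picks a free port
listener.listen(1)
port = listener.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))      # SYN -> SYN-ACK -> ACK
conn, _ = listener.accept()

client.sendall(b"hello")                 # reliable, ordered byte stream
data = conn.recv(1024)
print(data)                              # b'hello'
for s in (client, conn, listener):
    s.close()
```

Everything TCP adds over UDP (sequence numbers, retransmission, flow control, congestion control) happens inside the kernel behind this simple API.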
Congestion control is TCP’s mechanism for adapting its sending rate to the capacity of the network, avoiding the collapse that would result if all senders transmitted at full speed. The classic TCP Tahoe algorithm uses slow start (exponentially increasing the congestion window from one segment until a loss occurs), congestion avoidance (linearly increasing the window after reaching a threshold), and fast retransmit (retransmitting immediately upon receiving three duplicate acknowledgments rather than waiting for a timeout). TCP Reno added fast recovery, which avoids resetting the window to one after a fast retransmit. Modern variants include Cubic (the default in Linux, which uses a cubic function for window growth), and BBR (Bottleneck Bandwidth and Round-trip propagation time), developed by Google, which models the network path to achieve higher throughput and lower latency than loss-based algorithms. The evolution of TCP congestion control is a story of the internet’s growth: each new algorithm addressed performance problems that emerged as link speeds, round-trip times, and traffic volumes changed.
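The slow-start and congestion-avoidance phases can be traced with a toy simulation (window measured in segments, no losses; the function is invented for illustration):

```python
# A toy trace of TCP-style window growth: slow start doubles the
# congestion window each round trip until it reaches ssthresh, after
# which congestion avoidance grows it by one segment per RTT.

def cwnd_trace(rtts: int, ssthresh: int) -> list[int]:
    cwnd, trace = 1, []
    for _ in range(rtts):
        trace.append(cwnd)
        if cwnd < ssthresh:
            cwnd *= 2           # slow start: exponential growth
        else:
            cwnd += 1           # congestion avoidance: linear growth
    return trace

print(cwnd_trace(8, ssthresh=8))  # [1, 2, 4, 8, 9, 10, 11, 12]
```

On a real connection a loss would cut the window (and ssthresh) back down, producing the characteristic sawtooth pattern of loss-based congestion control.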
QUIC, standardized as RFC 9000 in 2021, is a transport protocol built on top of UDP that incorporates encryption (TLS 1.3), multiplexed streams without head-of-line blocking, and connection migration (a connection can survive a change of IP address). QUIC reduces connection establishment latency to as little as zero round trips for previously visited servers, and its multiplexed streams allow independent recovery from loss on each stream — a significant improvement over TCP, where a single lost segment delays all data on the connection.
Application Layer and the Domain Name System
The application layer encompasses the protocols that end users interact with, built on top of the transport layer to provide specific services. The Domain Name System (DNS), designed by Paul Mockapetris in 1983, translates human-readable domain names (such as example.com) into IP addresses through a distributed hierarchical database. DNS operates primarily over UDP (for speed), with TCP as a fallback for large responses. A DNS query traverses the hierarchy from root servers to top-level domain (TLD) servers to authoritative servers, with extensive caching at every level to reduce load and latency. DNSSEC adds cryptographic signatures to DNS responses to prevent spoofing and cache poisoning. More recently, DNS over HTTPS (DoH) and DNS over TLS (DoT) encrypt DNS traffic to protect user privacy.
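The shape of a DNS query on the wire (per RFC 1035) can be sketched by hand: a 12-byte header followed by the name encoded as length-prefixed labels. The helper name and query ID below are arbitrary choices for the example:

```python
import struct

def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a standard A-record DNS query message."""
    flags = 0x0100                       # standard query, recursion desired
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack(">HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Name as length-prefixed labels, terminated by the zero-length root label.
    qname = b"".join(
        bytes([len(label)]) + label.encode() for label in name.split(".")
    ) + b"\x00"
    question = qname + struct.pack(">HH", 1, 1)   # QTYPE=A, QCLASS=IN
    return header + question

query = build_dns_query("example.com")
print(len(query))   # 12-byte header + 13-byte name + 4 bytes of Q fields = 29
```

Sent as a UDP datagram to a resolver's port 53, this message would elicit a response carrying the A record; the compact binary format is part of why DNS over UDP is so fast.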
The Hypertext Transfer Protocol (HTTP) is the protocol of the World Wide Web, governing how clients (browsers) request and servers deliver resources. HTTP/1.0 (1996) used a separate TCP connection for each request; HTTP/1.1 (1997) introduced persistent connections and pipelining. HTTP/2 (2015) introduced binary framing, multiplexing of multiple requests over a single TCP connection, header compression, and server push. HTTP/3, based on QUIC rather than TCP, eliminates head-of-line blocking at the transport layer and further reduces latency. HTTPS wraps HTTP in TLS (Transport Layer Security), providing confidentiality, integrity, and server authentication through public-key cryptography and symmetric encryption.
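A complete HTTP/1.1 exchange can be run in one process with only the standard library; a sketch with a throwaway handler serving a single fixed body:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# A background server answers a single GET over loopback.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        payload = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):        # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)   # port 0: OS picks a port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/")
resp = conn.getresponse()
body = resp.read()
print(resp.status, body)                 # 200 b'hello'
conn.close()
server.shutdown()
```

Underneath, this is exactly the layering described earlier: an application-layer request/response riding on a TCP byte stream, which in turn rides on IP.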
Email relies on SMTP (Simple Mail Transfer Protocol) for sending, and IMAP or POP3 for retrieval. SSH (Secure Shell) provides encrypted remote terminal access and file transfer. DHCP (Dynamic Host Configuration Protocol) automatically assigns IP addresses to devices joining a network. NTP (Network Time Protocol) synchronizes clocks across the internet to within milliseconds. Each of these protocols embodies design trade-offs between simplicity, efficiency, extensibility, and security that have been refined over decades of deployment.
Network Security and Cryptographic Protocols
Network security protects communication from eavesdropping, tampering, and impersonation. The foundational protocols are TLS (and its predecessor SSL), which secure most internet traffic. A TLS handshake authenticates the server (and optionally the client) using X.509 certificates issued by Certificate Authorities (CAs), negotiates a cipher suite specifying the key exchange, encryption, and integrity algorithms, and derives symmetric session keys for efficient bulk encryption. TLS 1.3 (2018) simplified the handshake to a single round trip, removed insecure legacy algorithms, and mandated perfect forward secrecy — the property that compromising a server’s long-term private key does not compromise past session keys.
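On the client side, Python's `ssl` module exposes the verification defaults described above; a small sketch of the configuration (pinning the minimum version to TLS 1.3 is a choice made for the example, not a library default):

```python
import ssl

# The default context verifies the server certificate against the system
# CA store and checks the hostname, as a browser would.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_3    # refuse pre-1.3 handshakes

print(ctx.verify_mode == ssl.CERT_REQUIRED)     # True: certificate required
print(ctx.check_hostname)                       # True: name must match cert
```

Wrapping a TCP socket with `ctx.wrap_socket(sock, server_hostname=...)` would then perform the handshake: certificate validation, cipher-suite negotiation, and session-key derivation all happen inside that one call.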
Firewalls filter traffic based on rules applied to packet headers (stateless) or connection state (stateful). Intrusion Detection Systems (IDS) monitor network traffic for suspicious patterns, using either signature-based methods (matching known attack patterns) or anomaly-based methods (flagging deviations from baseline behavior). Distributed Denial of Service (DDoS) attacks overwhelm a target with traffic from many sources; mitigation techniques include rate limiting, traffic scrubbing, geographic distribution, and anycast routing (directing traffic to the nearest of several servers sharing the same IP address).
Virtual Private Networks (VPNs) create encrypted tunnels over public networks. IPsec (IP Security) operates at the network layer, providing either transport mode (encrypting only the payload) or tunnel mode (encrypting the entire IP packet and encapsulating it in a new header). IPsec uses the Internet Key Exchange (IKE) protocol to establish shared keys, and its two sub-protocols — Authentication Header (AH) for integrity and Encapsulating Security Payload (ESP) for confidentiality — can be used independently or together. TLS-based VPNs operate at the transport layer and are often easier to deploy through firewalls.
Software-Defined Networking and Modern Architectures
Traditional networks embed both the control plane (which computes routing tables and makes forwarding decisions) and the data plane (which forwards packets based on those decisions) in each individual switch and router. Software-Defined Networking (SDN) separates these planes: a centralized SDN controller runs the control logic and programs the forwarding tables of simple, commodity switches via protocols like OpenFlow. This separation provides a global view of the network, enables rapid reconfiguration, and allows network behavior to be defined in software rather than in the firmware of proprietary hardware.
OpenFlow, proposed by Nick McKeown and colleagues at Stanford in 2008, defines a standardized interface between the controller and switches. A switch maintains one or more flow tables, each containing rules that match packet headers and specify actions (forward, drop, modify, send to controller). The controller installs, modifies, and removes flow entries in response to network events, traffic engineering objectives, or security policies. SDN has found wide adoption in data centers (Google’s B4 WAN, for example, uses SDN to achieve near-optimal link utilization) and in network function virtualization (NFV), which replaces dedicated network appliances (firewalls, load balancers, intrusion detection systems) with software running on commodity hardware.
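The match-action model can be sketched as a first-match rule list (the field names and action strings here are illustrative, not the OpenFlow wire format):

```python
# A toy model of an OpenFlow-style flow table: the first matching rule's
# action applies, and packets matching no rule go to the controller.

flow_table = [
    ({"ip_dst": "10.0.0.5", "tcp_dst": 80}, "forward:port2"),
    ({"ip_dst": "10.0.0.5"}, "forward:port1"),
    ({"tcp_dst": 23}, "drop"),                      # block telnet anywhere
]

def apply_rules(packet: dict) -> str:
    for match, action in flow_table:
        if all(packet.get(k) == v for k, v in match.items()):
            return action
    return "send-to-controller"                     # table miss

print(apply_rules({"ip_dst": "10.0.0.5", "tcp_dst": 80}))   # forward:port2
print(apply_rules({"ip_dst": "10.0.0.5", "tcp_dst": 22}))   # forward:port1
print(apply_rules({"ip_dst": "10.0.0.9", "tcp_dst": 23}))   # drop
print(apply_rules({"ip_dst": "10.0.0.9", "tcp_dst": 443}))  # send-to-controller
```

The "send to controller" default is what gives SDN its reactive flavor: the controller sees the first packet of a new flow, decides policy, and installs a rule so subsequent packets are handled entirely in the switch.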
Content Delivery Networks (CDNs) are a complementary approach to improving performance and reliability. A CDN distributes copies of content to edge servers located close to end users, reducing latency and offloading traffic from origin servers. Request routing algorithms direct users to the optimal edge server based on geographic proximity, server load, and network conditions. CDNs now serve the majority of internet traffic, including video streaming, web pages, and software downloads, and they play a critical role in DDoS mitigation by absorbing attack traffic at the edge.
Looking forward, the field continues to evolve rapidly. 5G and beyond cellular networks promise lower latency and higher bandwidth for mobile devices. Intent-based networking aims to translate high-level business policies into network configurations automatically. Deterministic networking seeks to provide guaranteed latency and reliability for time-critical applications like industrial automation and autonomous vehicles. And the increasing integration of machine learning into network management — for anomaly detection, traffic prediction, and automated optimization — promises to make networks smarter, more efficient, and more adaptive than ever before.