Graphics & Vision
Visual computing — computer graphics, computer vision, and human-computer interaction.
The Graphics and Vision branch sits at the intersection of mathematics, perception, and engineering, concerned with how computers create, interpret, and mediate visual information. From the earliest oscilloscope displays of the 1950s to today’s real-time photorealistic renderers and neural vision models, this branch has transformed entertainment, science, medicine, and everyday communication. Its intellectual core draws on linear algebra, optics, signal processing, and cognitive science, weaving these into systems that either synthesize images from mathematical descriptions or extract structured understanding from raw pixel data. Where other branches of computer science deal primarily in symbols, numbers, or networks, Graphics and Vision deals in light, shape, motion, and human perception — making it both deeply mathematical and profoundly grounded in physical reality.
The field’s origins trace to Ivan Sutherland, whose 1963 Sketchpad system demonstrated that computers could serve as interactive visual instruments, not merely numerical calculators. Sutherland’s insight — that the display screen was a window into a mathematical world — launched computer graphics as a discipline and, with his later work on head-mounted displays, anticipated virtual and augmented reality by decades. In parallel, the study of how machines might interpret images emerged from artificial intelligence research in the 1960s, when Lawrence Roberts and others at MIT attempted to recover three-dimensional structure from photographs. David Marr’s computational theory of vision, published in the late 1970s, provided the field’s first unifying framework by arguing that vision should be understood at three levels: computational theory, algorithm, and implementation. On the human side, researchers like Douglas Engelbart — who demonstrated the mouse, hypertext, and collaborative computing in his landmark 1968 “Mother of All Demos” — and Ben Shneiderman, who formalized principles of direct manipulation, established human-computer interaction as a rigorous discipline in its own right.
The three sub-topics in this branch form a natural progression. Computer Graphics begins with the mathematical machinery of transformations, rasterization, and shading, then ascends through ray tracing and global illumination to physically-based rendering, animation, and procedural generation. It is the art of turning geometry and physics into pixels. Computer Vision reverses that process: given pixels, it recovers geometry, identity, motion, and meaning. From classical feature detection and epipolar geometry through the deep learning revolution that began with AlexNet in 2012, computer vision now powers autonomous vehicles, medical imaging, and the generative models reshaping creative industries. Human-Computer Interaction bridges the gap between computational systems and human users, drawing on cognitive psychology and design methodology to create interfaces that are usable, accessible, and increasingly intelligent. Together, these three areas define a loop — graphics produces images, vision interprets them, and interaction design ensures that the humans in the middle can effectively communicate with the machines on both sides.
The mathematical threads running through the branch are remarkably consistent. Homogeneous coordinates and projective geometry appear in both the graphics pipeline and multi-view vision. Fourier analysis underpins texture filtering, image processing, and signal reconstruction. Optimization — from least-squares fitting in camera calibration to gradient descent in neural networks — recurs in every sub-topic. Probability and statistics govern Monte Carlo rendering, Bayesian inference in vision, and the experimental methods of HCI research. A student who masters the linear algebra of transformations and the calculus of light transport will find those tools reappearing throughout the branch, connecting rendering equations to epipolar constraints to the statistical models of user behavior.
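The shared role of homogeneous coordinates can be made concrete with a toy sketch. The point, rotation angle, and focal length below are illustrative choices, not values from the text; the point is that the same 4×4 matrix machinery drives both a rendering transform in the graphics pipeline and a pinhole camera projection in multi-view vision:

```python
import numpy as np

# A 3D point in homogeneous coordinates (x, y, z, w).
point = np.array([1.0, 2.0, 5.0, 1.0])

# Rigid-body transform: rotate 90 degrees about the y-axis, then
# translate by 3 along z (illustrative values).
c, s = np.cos(np.pi / 2), np.sin(np.pi / 2)
model = np.array([
    [  c, 0.0,   s, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [ -s, 0.0,   c, 3.0],
    [0.0, 0.0, 0.0, 1.0],
])

# Pinhole projection with focal length f = 1: a graphics pipeline and a
# calibrated camera in vision both apply a matrix of this shape.
f = 1.0
proj = np.array([
    [  f, 0.0, 0.0, 0.0],
    [0.0,   f, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])

# Transforms compose by matrix multiplication; the perspective divide
# by the last homogeneous coordinate recovers 2D image coordinates.
projected = proj @ model @ point
image_xy = projected[:2] / projected[2]
print(image_xy)  # the rotated, translated point lands at (2.5, 1.0)
```

The same pattern runs in reverse in vision: given `image_xy` observations from several cameras, one solves for the `model` and `proj` matrices, which is exactly the camera-calibration and multi-view geometry problem mentioned above.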
Today the boundaries between these sub-topics are dissolving. Neural radiance fields blur the line between rendering and reconstruction. Differentiable rendering allows vision systems to optimize graphics parameters through gradient descent. Large vision-language models connect visual perception to natural language understanding. And the rise of spatial computing — augmented and virtual reality — demands simultaneous advances in real-time graphics, robust computer vision, and intuitive interaction design. The Graphics and Vision branch, once three separate communities, is converging into a unified discipline of visual computing whose ambition is nothing less than seamless interplay between the physical world, its digital representation, and the humans who navigate both.
Explore
- 01 · Computer Graphics — The generation of visual content by computer — rendering, modeling, animation, and real-time graphics pipelines.
- 02 · Computer Vision — Teaching machines to interpret visual information — image processing, object detection, scene understanding, and deep learning for vision.
- 03 · Human-Computer Interaction — The design and evaluation of user interfaces — usability, interaction paradigms, accessibility, and user experience research.