Machine Learning

Statistical learning from data.


foundation tier

Machine learning is the study of methods that learn statistical regularities from data. It belongs to the AI and Machine Learning area and inherits that area’s core questions about correctness, scale, and tractability. This page surveys the topic’s conceptual axes and points to the references that frame current research and teaching. It is meant both as an entry point for newcomers and as an index for practitioners checking their mental model against the field’s primary sources.

Work on machine learning can be organised around a few interlocking concerns: the formal objects under study, the algorithms or systems that compute over them, the resource trade-offs (time, memory, communication, statistical efficiency), and the empirical or theoretical guarantees that practitioners rely on. The sources cited below approach the topic from a mix of these angles.
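As a small illustration of how these concerns fit together, the sketch below uses a one-dimensional linear model as the formal object, closed-form least squares as the algorithm, and held-out mean squared error as the empirical check. The synthetic data, the names `fit_line` and `mean_squared_error`, and the 150/50 split are assumptions chosen for the example and are not drawn from the cited references.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y ~ w * x + b in one dimension."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def mean_squared_error(xs, ys, w, b):
    """Average squared prediction error of the fitted line."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data (an assumption for illustration): y = 2x - 1 plus Gaussian noise.
rng = random.Random(0)
true_w, true_b = 2.0, -1.0
xs = [rng.uniform(-5.0, 5.0) for _ in range(200)]
ys = [true_w * x + true_b + rng.gauss(0.0, 0.5) for x in xs]

# Fit on the first 150 points and evaluate on the remaining 50.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

w, b = fit_line(train_x, train_y)
test_mse = mean_squared_error(test_x, test_y, w, b)
print(f"w={w:.3f} b={b:.3f} held-out MSE={test_mse:.3f}")
```

Evaluating on points the model never saw during fitting is the simplest form of the empirical guarantee mentioned above. The fit itself takes time linear in the number of training points, which is one instance of the resource trade-offs.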

Foundational references

Three textbooks are standard references for this material, each used both as a curriculum anchor and as a long-form survey of techniques: Bishop, Pattern Recognition and Machine Learning (2006); Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning (2009); and Murphy, Probabilistic Machine Learning: An Introduction (2022).

Open methodological questions in machine learning cluster around how to compose its techniques under realistic constraints such as scale, adversarial inputs, partial observability, and shifting workloads. The cited references give the precise statements, proofs, and empirical evaluations that this overview only sketches; downstream topic pages drill into specific subfields.

Explore

  1. Supervised Learning

    Regression and classification from labeled data.

  2. Linear Models

    Linear regression, logistic regression, and regularization.

  3. Support Vector Machines

    Max-margin classifiers and kernel methods.

  4. Graph Neural Networks

    Neural architectures that operate on graph-structured data through message passing, attention, and equivariant operations. They are the dominant approach to learning on networks, molecules, and relational data.

  5. Kernel Methods

    Reproducing kernel Hilbert spaces and kernel-based learning.

  6. Decision Trees and Ensembles

    CART, random forests, and gradient-boosted trees.

  7. Federated Learning

    A distributed learning paradigm in which many clients collaboratively train a shared model under privacy and communication constraints, exchanging only model updates rather than raw data.

  8. Clustering

    k-means, hierarchical, density-based, and spectral clustering.

  9. Dimensionality Reduction

    PCA, t-SNE, UMAP, and manifold learning.

  10. Probabilistic Graphical Models

    Bayesian and Markov networks and inference algorithms.

  11. Bayesian Machine Learning

    Priors, posteriors, and Bayesian inference for learning.

  12. Gaussian Processes

    Nonparametric Bayesian regression and classification.

  13. Variational Inference

    Mean-field, stochastic, and amortized variational methods.

  14. MCMC Methods

    Metropolis-Hastings, Gibbs, and Hamiltonian Monte Carlo.

  15. Optimization for ML

    SGD, Adam, second-order methods, and ML-specific optimization.

  16. Regularization and Generalization

    Weight decay, dropout, double descent, and generalization theory.

  17. Feature Learning and Representation

    Self-supervised and contrastive representation learning.

  18. Contrastive Learning

    SimCLR, MoCo, and CLIP-style contrastive representation methods.

  19. Self-Supervised Learning

    Pretext tasks and masked-prediction objectives.

  20. Semi-Supervised Learning

    Learning with limited labeled and abundant unlabeled data.

  21. Active Learning

    Query strategies for sample-efficient labeling.

  22. Transfer Learning

    Pretraining, fine-tuning, and domain adaptation.

  23. Domain Adaptation

    Aligning source and target distributions for transfer.

  24. Meta-Learning

    Learning to learn, MAML, and few-shot adaptation.

  25. Few-Shot Learning

    Generalizing from a handful of examples.

  26. Continual Learning

    Learning sequentially without catastrophic forgetting.

  27. Multi-Task Learning

    Joint learning of related tasks for shared representations.

  28. Curriculum Learning

    Ordering training examples by difficulty.

  29. Imbalanced Learning

    Class-imbalanced classification and rare-event prediction.

  30. Anomaly and Novelty Detection

    Outlier and novelty detection methods.

  31. Structured Prediction

    Learning over structured outputs: sequences, trees, graphs.

  32. Uncertainty Quantification in ML

    Calibration, conformal prediction, and predictive uncertainty.

  33. Conformal Prediction

    Distribution-free predictive intervals with finite-sample guarantees.

  34. Fairness in Machine Learning

    Group and individual fairness definitions and mitigations.

  35. Privacy-Preserving ML

    DP-SGD, secure aggregation, and privacy in learning.

  36. Adversarial Robustness

    Adversarial examples and certified defenses.

  37. Distribution Shift and OOD Generalization

    Robustness to covariate and label shift.

  38. AutoML

    Automated model and pipeline search.

  39. Neural Architecture Search

    Search-based and gradient-based architecture discovery.

  40. Hyperparameter Optimization

    Bayesian optimization, Hyperband, and population-based search.

  41. Recommendation ML

    Embedding-based and sequential recommendation models.

  42. Tabular Deep Learning

    Deep models for tabular data and comparisons to GBDTs.

  43. Geometric Deep Learning

    Equivariance, group-invariant networks, and manifold methods.

  44. Equivariant Neural Networks

    Networks invariant or equivariant to symmetry groups.

  45. Scientific Machine Learning

    Physics-informed networks, neural ODEs, and ML for science.

  46. Physics-Informed Neural Networks

    Networks regularized by PDE residuals.

  47. Neural Operators

    Learning solution operators of PDEs (FNO, DeepONet).

  48. Efficient Machine Learning

    Pruning, quantization, distillation, and on-device inference.

  49. Model Compression

    Quantization, pruning, and low-rank factorization.

  50. Knowledge Distillation

    Training compact students from large teachers.

  51. MLOps

    Operationalizing ML: pipelines, monitoring, and lifecycle.

  52. ML Systems

    Distributed training and inference systems for ML.

  53. Mechanistic Interpretability

    Reverse-engineering neural network internals and circuits.

  54. Representation Engineering

    Probing, steering, and editing model representations.

  55. Causal Machine Learning

    Causal inference combined with ML for treatment effects.

  56. Neural Networks and Deep Learning

    Multilayer perceptrons and the deep-learning paradigm.

  57. Generative Models

    Models that generate samples from learned data distributions.

  58. Reinforcement Learning

    Learning by trial and error from reward signals.


Review this topic

This page was drafted by an agent and is waiting on expert review. Spotted a wrong prerequisite, a missing concept, a misattributed source, or a factual slip? Tell us — your review opens a tracked issue maintainers act on.