Machine Learning
Statistical learning from data.
Machine learning is the study of how systems learn statistical patterns from data. It sits within the broader AI and Machine Learning area and inherits that area’s core questions about correctness, scale, and tractability. This page surveys the conceptual axes of the topic and points to the references that frame ongoing research and teaching. It is meant to serve both as an entry point for newcomers and as an index for practitioners checking their mental model against the field’s primary sources.
Work on machine learning can be organised around a few interlocking concerns: the formal objects under study, the algorithms or systems that compute over them, the resource trade-offs (time, memory, communication, statistical efficiency), and the empirical or theoretical guarantees that practitioners rely on. The sources cited below approach the topic from a mix of these angles.
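As a concrete illustration of these concerns, the sketch below fits a one-feature linear model by ordinary least squares and checks it on held-out data. It is a minimal example under stated assumptions, not a method taken from the cited references: the synthetic data (y = 2x + 1 plus Gaussian noise), the 150/50 split, and the helper names `fit_line` and `mean_squared_error` are all hypothetical choices made for illustration.

```python
import random


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y ~ w * x + b (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def mean_squared_error(w, b, xs, ys):
    """Average squared prediction error of the line on (xs, ys)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Assumed synthetic data: y = 2x + 1 with Gaussian noise (std 0.1).
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]

# Hold out the last 50 points to estimate generalization error.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

w, b = fit_line(train_x, train_y)
test_mse = mean_squared_error(w, b, test_x, test_y)
print(f"w={w:.3f} b={b:.3f} held-out MSE={test_mse:.4f}")
```

Here the formal object is the linear hypothesis class, the algorithm is the closed-form least-squares solution, the cost is linear in the number of samples, and the held-out error serves as an empirical stand-in for a generalization guarantee.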
Foundational references
Three textbooks serve as standard references for this material, each used both as a curriculum anchor and as a long-form survey of techniques: Bishop, Pattern Recognition and Machine Learning (2006); Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning (2009); and Murphy, Probabilistic Machine Learning: An Introduction (2022).
Open methodological questions in machine learning cluster around how to compose the techniques above under realistic constraints — scale, adversarial inputs, partial observability, and shifting workloads. The cited references give the precise statements, proofs, and empirical evaluations that this overview only sketches; downstream topic pages drill into specific subfields.
Sources
- textbook · primary · 2006 · Pattern Recognition and Machine Learning · bishop-2006
Explore
- 01
Supervised Learning
Regression and classification from labeled data.
- 02
Linear Models
Linear regression, logistic regression, and regularization.
- 03
Support Vector Machines
Max-margin classifiers and kernel methods.
- 04
Graph Neural Networks
Neural architectures that operate on graph-structured data through message passing, attention, and equivariant operations — the dominant approach to learning on networks, molecules, and relational data.
- 05
Kernel Methods
Reproducing kernel Hilbert spaces and kernel-based learning.
- 06
Decision Trees and Ensembles
CART, random forests, and gradient-boosted trees.
- 07
Federated Learning
A distributed learning paradigm in which many clients collaboratively train a shared model under privacy and communication constraints, exchanging only model updates rather than raw data.
- 08
Clustering
k-means, hierarchical, density-based, and spectral clustering.
- 09
Dimensionality Reduction
PCA, t-SNE, UMAP, and manifold learning.
- 10
Probabilistic Graphical Models
Bayesian and Markov networks and inference algorithms.
- 11
Bayesian Machine Learning
Priors, posteriors, and Bayesian inference for learning.
- 12
Gaussian Processes
Nonparametric Bayesian regression and classification.
- 13
Variational Inference
Mean-field, stochastic, and amortized variational methods.
- 14
MCMC Methods
Metropolis-Hastings, Gibbs, and Hamiltonian Monte Carlo.
- 15
Optimization for ML
SGD, Adam, second-order methods, and ML-specific optimization.
- 16
Regularization and Generalization
Weight decay, dropout, double descent, and generalization theory.
- 17
Feature Learning and Representation
Self-supervised and contrastive representation learning.
- 18
Contrastive Learning
SimCLR, MoCo, CLIP-style contrastive representation methods.
- 19
Self-Supervised Learning
Pretext tasks and masked-prediction objectives.
- 20
Semi-Supervised Learning
Learning with limited labeled and abundant unlabeled data.
- 21
Active Learning
Query strategies for sample-efficient labeling.
- 22
Transfer Learning
Pretraining, fine-tuning, and domain adaptation.
- 23
Domain Adaptation
Aligning source and target distributions for transfer.
- 24
Meta-Learning
Learning to learn, MAML, and few-shot adaptation.
- 25
Few-Shot Learning
Generalizing from a handful of examples.
- 26
Continual Learning
Learning sequentially without catastrophic forgetting.
- 27
Multi-Task Learning
Joint learning of related tasks for shared representations.
- 28
Curriculum Learning
Ordering training examples by difficulty.
- 29
Imbalanced Learning
Class-imbalanced classification and rare-event prediction.
- 30
Anomaly and Novelty Detection
Outlier and novelty detection methods.
- 31
Structured Prediction
Learning over structured outputs: sequences, trees, graphs.
- 32
Uncertainty Quantification in ML
Calibration, conformal prediction, and predictive uncertainty.
- 33
Conformal Prediction
Distribution-free predictive intervals with finite-sample guarantees.
- 34
Fairness in Machine Learning
Group and individual fairness definitions and mitigations.
- 35
Privacy-Preserving ML
DP-SGD, secure aggregation, and privacy in learning.
- 36
Adversarial Robustness
Adversarial examples and certified defenses.
- 37
Distribution Shift and OOD Generalization
Robustness to covariate and label shift.
- 38
AutoML
Automated model and pipeline search.
- 39
Neural Architecture Search
Search-based and gradient-based architecture discovery.
- 40
Hyperparameter Optimization
Bayesian optimization, Hyperband, and population-based search.
- 41
Recommendation ML
Embedding-based and sequential recommendation models.
- 42
Tabular Deep Learning
Deep models for tabular data and comparisons to GBDTs.
- 43
Geometric Deep Learning
Equivariance, group-invariant networks, and manifold methods.
- 44
Equivariant Neural Networks
Networks invariant or equivariant to symmetry groups.
- 45
Scientific Machine Learning
Physics-informed networks, neural ODEs, and ML for science.
- 46
Physics-Informed Neural Networks
Networks regularized by PDE residuals.
- 47
Neural Operators
Learning solution operators of PDEs (FNO, DeepONet).
- 48
Efficient Machine Learning
Pruning, quantization, distillation, and on-device inference.
- 49
Model Compression
Quantization, pruning, and low-rank factorization.
- 50
Knowledge Distillation
Training compact students from large teachers.
- 51
MLOps
Operationalizing ML — pipelines, monitoring, and lifecycle.
- 52
ML Systems
Distributed training and inference systems for ML.
- 53
Mechanistic Interpretability
Reverse-engineering neural network internals and circuits.
- 54
Representation Engineering
Probing, steering, and editing model representations.
- 55
Causal Machine Learning
Causal inference combined with ML for treatment effects.
- 56
Neural Networks and Deep Learning
Multilayer perceptrons and the deep-learning paradigm.
- 57
Generative Models
Models that generate samples from learned data distributions.
- 58
Reinforcement Learning
Learning by trial and error from reward signals.
This page was drafted by an agent and is waiting on expert review.