Neural Networks and Deep Learning

Multilayer perceptrons and the deep-learning paradigm.


foundation tier

Neural Networks and Deep Learning covers multilayer perceptrons and the deep-learning paradigm built on them. The topic sits within Machine Learning and shares that area’s core questions about correctness, scale, and tractability. This page surveys the main conceptual axes of the topic and points to the references that frame current research and teaching. It is meant both as an entry point for newcomers and as an index for practitioners checking their mental model against the field’s primary sources.

Work on neural networks and deep learning can be organised around a few interlocking concerns: the formal objects under study, the algorithms or systems that compute over them, the resource trade-offs (time, memory, communication, statistical efficiency), and the empirical or theoretical guarantees that practitioners rely on. The sources cited below approach the topic from a mix of these angles.
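As a concrete instance of the "formal objects" and "algorithms that compute over them" mentioned above, the following is a minimal sketch of a multilayer perceptron forward pass in plain Python. The helper names (`init_layer`, `dense`, `relu`, `mlp_forward`), the layer sizes, the uniform initialisation, and the choice of ReLU are all illustrative assumptions, not something the cited references prescribe.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer; weights is n_out x n_in.

    The 1/sqrt(n_in) uniform range is an illustrative choice only.
    """
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(weights, biases, x):
    """Affine map: one dot product plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(v):
    """Elementwise rectified linear unit."""
    return [max(0.0, a) for a in v]


def mlp_forward(layers, x):
    """Apply dense layers in order, with ReLU after every layer except the last."""
    h = x
    for i, (weights, biases) in enumerate(layers):
        h = dense(weights, biases, h)
        if i < len(layers) - 1:
            h = relu(h)
    return h


# A randomly initialised 3 -> 4 -> 2 network (sizes are arbitrary).
rng = random.Random(0)
sizes = [3, 4, 2]
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
output = mlp_forward(layers, [0.5, -1.0, 2.0])

# A hand-built two-layer network that computes |x| as relu(x) + relu(-x).
abs_net = [([[1.0], [-1.0]], [0.0, 0.0]), ([[1.0, 1.0]], [0.0])]
```

Here `mlp_forward(abs_net, [-2.0])` evaluates to `[2.0]`: the hidden layer produces `relu(-2)` and `relu(2)`, and the linear output layer adds them. The same composition of affine maps and nonlinearities is what deeper networks stack many times over.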

Foundational references

Goodfellow, Bengio, and Courville, Deep Learning (2016) is a standard reference for this material, used both as a curriculum anchor and as a long-form survey of techniques.

Historical context

Learning representations by back-propagating errors (Rumelhart, Hinton, and Williams, 1986) places the topic in its historical trajectory; revisiting it clarifies which ideas in current practice are recent and which trace back to the field’s founding texts.
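The core idea associated with that paper, computing gradients by applying the chain rule from the output back toward the input, can be sketched for a network with one hidden layer. This is an illustrative reconstruction under stated assumptions (a tanh hidden layer, a linear scalar output, squared-error loss, hand-picked weights), not the paper's own notation or experiments. The function names are hypothetical, and the finite-difference helper is included only to check the analytic gradient.

```python
import math


def forward(params, x):
    """One hidden tanh layer followed by a linear scalar output."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    y = sum(w * h for w, h in zip(w2, hidden)) + b2
    return hidden, y


def loss(params, x, target):
    """Squared error 0.5 * (y - target)^2 for a single example."""
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2


def backward(params, x, target):
    """Gradients of the loss, propagated from the output back to the first layer."""
    w1, b1, w2, b2 = params
    hidden, y = forward(params, x)
    dy = y - target                          # dL/dy
    grad_w2 = [dy * h for h in hidden]       # dL/dw2
    grad_b2 = dy                             # dL/db2
    # Error signal at each hidden pre-activation, using tanh'(z) = 1 - tanh(z)^2.
    dz = [dy * w * (1.0 - h * h) for w, h in zip(w2, hidden)]
    grad_w1 = [[d * xi for xi in x] for d in dz]
    grad_b1 = dz
    return grad_w1, grad_b1, grad_w2, grad_b2


def numeric_grad_w1(params, x, target, i, j, eps=1e-6):
    """Central finite difference for one first-layer weight (check only)."""
    w1, b1, w2, b2 = params

    def shifted(delta):
        w1_new = [row[:] for row in w1]
        w1_new[i][j] += delta
        return loss((w1_new, b1, w2, b2), x, target)

    return (shifted(eps) - shifted(-eps)) / (2 * eps)


# Hand-picked illustrative parameters: 2 inputs, 2 hidden units, 1 output.
params = (
    [[0.1, -0.2], [0.4, 0.3]],  # w1
    [0.0, 0.1],                 # b1
    [0.5, -0.6],                # w2
    0.2,                        # b2
)
x, target = [1.0, 2.0], 1.0
grad_w1, grad_b1, grad_w2, grad_b2 = backward(params, x, target)
```

Comparing `grad_w1[i][j]` with `numeric_grad_w1(params, x, target, i, j)` for each weight is a common sanity check: the two should agree to several decimal places if the backward pass is correct.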

Open methodological questions in neural networks and deep learning cluster around how to combine the techniques these references describe under realistic constraints: scale, adversarial inputs, partial observability, and shifting workloads. The cited references give the precise statements, proofs, and empirical evaluations that this overview only sketches; downstream topic pages cover specific subfields in more detail.

Explore

  1. Feedforward Networks

    MLPs, activation functions, and universal approximation.

  2. Convolutional Neural Networks

    CNNs and their variants for grid-structured data.

  3. Recurrent Neural Networks

    RNNs, LSTMs, GRUs, and sequence modeling.

  4. Transformer Networks

    Self-attention architectures and the transformer family.

  5. Attention Mechanisms

    Soft, hard, and sparse attention variants.

  6. State-Space Models

    S4, Mamba, and structured state-space sequence models.

  7. Mixture-of-Experts Networks

    Sparsely-gated experts and routing in large models.

  8. Normalization Layers

    Batch, layer, group, and RMS normalization.

  9. Residual Networks

    Skip connections, ResNets, and very deep training.

  10. Neural ODEs and Continuous Models

    ODE-, SDE-, and CDE-based continuous-time neural networks.

  11. Spiking Neural Networks

    Event-driven SNNs and neuromorphic computation.


This page was drafted by an agent and is waiting on expert review.