Bioinformatics
Computational methods for analyzing biological sequence, structure, and omics data.
Bioinformatics is a subfield of biology concerned with the computational analysis of biological sequence, structure, and omics data. This page sketches the conceptual scope of the area, the methodological tools it relies on, and the reference literature that anchors it.
The area organises around a small number of recurring axes: scope (what biological scales the work spans), method (the dominant experimental or computational tools), data regime (what kinds of measurements are now routine vs. still frontier), and open questions (what the field cannot yet do reliably). The sources below cover different combinations of these axes.
Foundational references
Three textbooks serve as standard references for the foundations covered here. They are used across the field to anchor terminology, canonical models, and the relationships between sub-areas of bioinformatics. Treat them as the entry point to which the more specialised work below adds frontier detail.
- Durbin, Eddy, Krogh, and Mitchison, Biological Sequence Analysis (1998): probabilistic models of proteins and nucleic acids.
- Compeau and Pevzner, Bioinformatics Algorithms: An Active Learning Approach (2014): an algorithm-centred introduction to the field.
- Mount, Bioinformatics: Sequence and Genome Analysis (2004): a broad survey of sequence and genome analysis methods.
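Among the canonical models these textbooks introduce is dynamic-programming sequence alignment. The following is a minimal sketch of global (Needleman-Wunsch style) alignment scoring with a linear gap penalty. The function name `global_alignment_score` and the default scores (match +1, mismatch -1, gap -2) are illustrative assumptions, not values taken from any of the books; practical tools use substitution matrices and affine gap penalties.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -2) -> int:
    """Return the optimal global alignment score of a and b.

    Scoring parameters are illustrative assumptions (linear gap penalty).
    """
    # prev[j] is the best score for aligning the processed prefix of a
    # against b[:j]; initially that prefix is empty, so only gaps are possible.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Aligning a[:i] against an empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Three ways to reach cell (i, j):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned to a gap
            left = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

With the default scores, `global_alignment_score("ACGT", "AGT")` evaluates to 1: three matches and one gap. Only two rows of the table are kept because the sketch returns the score alone; recovering the alignment itself would require storing the full table or traceback pointers.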
Open questions
Open questions in bioinformatics cluster around scaling current methods to larger systems, integrating measurements across modalities, and producing predictive rather than descriptive models. The references above mark the work that the next iteration of this page should engage with in more specific detail.
Sources
- textbook · primary · 1998 · Biological Sequence Analysis · Durbin, Eddy, Krogh, Mitchison
- textbook · primary · 2014 · Bioinformatics Algorithms: An Active Learning Approach · Compeau, Pevzner
- textbook · primary · 2004 · Bioinformatics: Sequence and Genome Analysis · Mount
Explore
- 01 · Sequence Alignment · Pairwise and multiple sequence alignment: algorithms and substitution models.
- 02 · Homology Search · BLAST, HMMER, and profile-based search across sequence databases.
- 03 · Sequence Clustering and Databases · Building and querying clustered sequence resources at scale.
- 04 · Protein Language Models · Transformer-based models trained on protein sequences for representation and design.
- 05 · Genomic Language Models · Foundation models trained on DNA sequences for variant and regulatory-element prediction.
- 06 · Single-Cell Computational Analysis · Clustering, trajectory inference, and batch correction for single-cell omics.
- 07 · Biological Network Inference · Statistical and ML methods for inferring regulatory and interaction networks.
- 08 · Variant Interpretation · ML-based annotation of variant pathogenicity and effect.
- 09 · Cryo-EM Image Processing · Computational pipelines for particle picking, classification, and reconstruction.
- 10 · Cell and Tissue Image Segmentation · Deep-learning-based segmentation of cells and structures in microscopy.
- 11 · Drug–Target Prediction · ML methods for predicting drug–protein interactions and target identification.
- 12 · Foundation Models for Cell State · Pretrained transformer models on single-cell data for cell-type prediction and perturbation.
Review this topic
This page was drafted by an agent and is waiting on expert review.