Scientific | Intermediate

Volcano Plot

A scatter plot that combines fold change with statistical significance — instantly revealing which genes or proteins are both meaningfully changed and statistically reliable.

// 01 — The chart

What it looks like

Example: Differential gene expression, treated vs. control (RNA-seq)
[Figure: volcano plot. X-axis: log2(fold change); Y-axis: −log10(p-value); labeled points: TP53, BRCA1.]

A volcano plot showing differential gene expression. Red dots (upper-right) are significantly upregulated genes; green dots (upper-left) are significantly downregulated. Gray dots are not significant.

// 02 — Definition

What is a volcano plot?

A volcano plot is a scatter plot that simultaneously displays the magnitude of change (fold change) and the statistical significance (p-value) for thousands of features measured in an experiment. The name comes from the plot’s characteristic shape — most points cluster near the center at the base, while significant features fan outward and upward like a volcanic eruption.

The X-axis shows log2(fold change) between two conditions (e.g., treated vs. control). Features that are upregulated appear on the right side, while downregulated features appear on the left. The Y-axis shows −log10(p-value), pushing the most statistically significant features to the top of the plot.
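The two axis transforms can be sketched in a few lines. This is a minimal illustration, not a full pipeline: the function name `volcano_coordinates` and the input values are hypothetical, and it assumes you already have a mean expression value per condition and a p-value for the feature.

```python
import math


def volcano_coordinates(mean_treated, mean_control, p_value):
    """Return (x, y) = (log2 fold change, -log10 p-value) for one feature."""
    log2_fc = math.log2(mean_treated / mean_control)
    neg_log10_p = -math.log10(p_value)
    return log2_fc, neg_log10_p


# Hypothetical feature: expression quadruples (fold change 4), p = 0.001.
x, y = volcano_coordinates(mean_treated=400.0, mean_control=100.0, p_value=0.001)
print(f"x = {x:.2f}, y = {y:.2f}")

# Hypothetical feature: expression halves, so x is negative (left side).
x_down, _ = volcano_coordinates(mean_treated=50.0, mean_control=100.0, p_value=0.5)
print(f"x = {x_down:.2f}")
```

The log2 transform makes doubling and halving symmetric around zero (+1 and −1), and the −log10 transform turns small p-values into large Y values.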

This dual-axis design solves a critical problem in high-throughput biology: some features may be statistically significant but biologically trivial (tiny fold change), while others may show large changes but fail statistical testing. The volcano plot’s quadrant system highlights features that pass both thresholds — the truly interesting hits.
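The two-threshold logic can be sketched as follows. This is an illustration under stated assumptions: adjusted p-values are already available, the function name `classify_feature` and the example values are hypothetical, and the default cutoffs mirror the commonly used adjusted p < 0.05 and |log2FC| > 1.

```python
def classify_feature(log2_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Return 'up', 'down', or 'background' for one feature.

    A feature is a hit only if it passes BOTH the significance cutoff
    and the fold-change cutoff.
    """
    if adj_p < p_cutoff and log2_fc > fc_cutoff:
        return "up"
    if adj_p < p_cutoff and log2_fc < -fc_cutoff:
        return "down"
    return "background"


# Hypothetical features: (description, log2 fold change, adjusted p-value)
examples = [
    ("large and reliable increase", 2.3, 0.001),
    ("large and reliable decrease", -1.7, 0.004),
    ("significant but tiny change", 0.2, 1e-6),
    ("large change, not significant", 3.1, 0.4),
]
for description, log2_fc, adj_p in examples:
    print(f"{description}: {classify_feature(log2_fc, adj_p)}")
```

The last two examples are the cases the plot is designed to separate out: each passes one threshold but not the other, so both stay in the background.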

Origin: The volcano plot was introduced in the early 2000s as microarray technology made it possible to measure gene expression across the entire genome simultaneously. It became a standard visualization in bioinformatics alongside MA plots and heatmaps for differential expression analysis.

// 03 — Anatomy

Parts of a volcano plot

A — Y-axis (−log₁₀ p-value): Vertical axis showing transformed p-values; higher values indicate stronger statistical significance
B — X-axis (log₂ fold change): Horizontal axis showing the magnitude and direction of change between conditions (left = down, right = up)
C — Downregulated (significant): Features in the upper-left quadrant that show a substantial decrease and pass the significance threshold
D — Upregulated (significant): Features in the upper-right quadrant that show a substantial increase and pass the significance threshold
E — Significance threshold: Horizontal line marking the p-value cutoff (e.g., 0.05 or FDR-adjusted) above which features are considered significant

// 04 — Usage

When to use it — and when not to

✓ Use a volcano plot when…
  • Analyzing differential gene expression from RNA-seq or microarray experiments
  • You need to identify features that are both statistically significant and biologically meaningful
  • Comparing two experimental conditions (treated vs. control, disease vs. healthy)
  • Visualizing results from proteomics, metabolomics, or other -omics experiments
  • You want to highlight specific genes of interest in the context of the full dataset
  • Presenting results where both magnitude and confidence matter
× Avoid a volcano plot when…
  • You have more than two conditions — consider a heatmap or parallel coordinates instead
  • Your data doesn’t have both fold change and p-value information
  • You want to show genomic position — use a Manhattan plot instead
  • Your audience is unfamiliar with log-transformed axes and statistical concepts
  • You want to see how fold change depends on overall expression level; an MA plot may be more appropriate
  • You only care about significance without effect size — a simple ranked list may suffice

// 05 — Reading guide

How to read a volcano plot

Follow these steps to extract insights from a volcano plot.

1. Identify the threshold lines

Locate the horizontal significance threshold (p-value cutoff, often shown as a dashed line) and the vertical fold-change cutoffs. Together these define the quadrants that separate significant hits from background noise.
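To see where these lines fall on the transformed axes, the cutoffs can be converted as below. This is a small sketch; the function name `threshold_lines` and its default cutoffs (p = 0.05, fold change = 2) are assumptions chosen to match common practice, not fixed rules.

```python
import math


def threshold_lines(p_cutoff=0.05, fold_change_cutoff=2.0):
    """Convert raw cutoffs into positions on the volcano plot axes."""
    horizontal_y = -math.log10(p_cutoff)        # height of the significance line
    vertical_x = math.log2(fold_change_cutoff)  # distance of each vertical line from 0
    return {"horizontal_y": horizontal_y, "vertical_x": (-vertical_x, vertical_x)}


lines = threshold_lines()
print(f"significance line at y = {lines['horizontal_y']:.3f}")
print(f"fold-change lines at x = {lines['vertical_x'][0]:.1f} and {lines['vertical_x'][1]:.1f}")
```

With these defaults, the horizontal line sits at roughly y = 1.3 and the vertical lines at x = −1 and x = +1.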

2. Focus on the upper corners

The most interesting features sit in the upper-left (significantly downregulated) and upper-right (significantly upregulated) corners. These passed both the statistical and biological significance filters.

3. Check which genes are labeled

Look for labeled points — these are usually the top hits by significance or fold change, or genes of known biological interest. They represent the primary findings of the analysis.

4. Assess the symmetry of the plot

A roughly symmetrical volcano suggests balanced up- and downregulation. Strong asymmetry (many more hits on one side) may indicate a directional biological response or a technical artifact.

5. Note the density of the base

The cluster of gray points near the origin represents features with neither a large fold change nor a small p-value. A very wide base suggests high biological or technical variability in the dataset.

// 06 — Common mistakes

Mistakes to watch out for

Using unadjusted p-values

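When testing thousands of genes simultaneously, raw p-values produce large numbers of false positives: at p < 0.05, testing 20,000 genes with no true differences would still be expected to flag about 1,000 of them. Base significance calls on multiple-testing-corrected values (Benjamini-Hochberg FDR or Bonferroni), whether you plot the adjusted values on the Y-axis or plot raw p-values and color points by the adjusted threshold. This is the most common and consequential mistake in volcano plot construction.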
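For illustration, here is a minimal pure-Python sketch of the Benjamini-Hochberg adjustment. The function name `benjamini_hochberg` and the example p-values are hypothetical; in a real analysis you would normally rely on the adjusted values reported by your differential expression tool rather than a hand-rolled version.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"raw p = {p:.3f} -> adjusted p = {q:.3f}")
```

Each raw p-value is scaled by the number of tests divided by its rank, which is why adjusted values are never smaller than the raw ones.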

Setting arbitrary fold-change thresholds

A fold-change cutoff of 2 (log₂FC = 1) is a common convention, but it’s not always appropriate. For low-expression genes, a 2-fold change may be noise; for highly expressed genes, even a 1.5-fold change (log₂FC ≈ 0.58) can be biologically meaningful. Choose the cutoff based on the biology of your system rather than applying the default.

Overplotting obscuring density

With 20,000+ genes, many points overlap in the base of the volcano. Without transparency or density-based coloring, you can’t tell if there are 10 or 10,000 points in a region. Use alpha transparency or hexbin densities.
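Both remedies can be sketched with matplotlib. The example below is a hedged illustration: the data are simulated (uniform p-values and normally distributed fold changes, not real measurements), the function names are hypothetical, and the point size, alpha, and grid size are starting values to tune.

```python
import math
import random


def simulate_null_points(n=20_000, seed=0):
    """Simulate volcano coordinates for features with no true effect."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        xs.append(rng.gauss(0.0, 0.6))     # simulated log2 fold change
        p_value = 1.0 - rng.random()       # uniform in (0, 1], never exactly 0
        ys.append(-math.log10(p_value))    # -log10(p), always >= 0
    return xs, ys


def draw_density_views(xs, ys, path="volcano_density.png"):
    """Save a two-panel figure: alpha-blended scatter and hexbin density."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, (ax_alpha, ax_hex) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

    # Panel 1: semi-transparent points, so dense regions appear darker.
    ax_alpha.scatter(xs, ys, s=4, alpha=0.2, color="gray")
    ax_alpha.set_title("Alpha transparency")

    # Panel 2: hexagonal bins colored by point count on a log scale.
    hexes = ax_hex.hexbin(xs, ys, gridsize=50, bins="log", mincnt=1)
    ax_hex.set_title("Hexbin density")
    fig.colorbar(hexes, ax=ax_hex, label="points per bin (log scale)")

    for ax in (ax_alpha, ax_hex):
        ax.set_xlabel("log2(fold change)")
    ax_alpha.set_ylabel("-log10(p-value)")
    fig.savefig(path, dpi=150)
    plt.close(fig)


xs, ys = simulate_null_points()
```

Calling `draw_density_views(xs, ys)` writes the comparison figure to disk. The hexbin panel makes the point count in the base explicit, while the alpha panel keeps individual outliers visible.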

Labeling too many or too few genes

Labeling every significant gene creates an unreadable mess. Labeling none forces readers to consult supplementary tables. Strike a balance — label the top 10–20 hits plus any genes of known biological interest.
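One way to apply this balance is to pick labels programmatically. The sketch below is an assumption-laden example: `select_labels`, the tuple layout, and the placeholder gene names are all hypothetical, and ranking by adjusted p-value is only one reasonable choice.

```python
def select_labels(results, top_n=10, genes_of_interest=(),
                  fc_cutoff=1.0, p_cutoff=0.05):
    """Pick gene names to label: the top hits plus requested genes.

    results: iterable of (gene, log2_fc, adj_p) tuples.
    """
    results = list(results)
    hits = [r for r in results if r[2] < p_cutoff and abs(r[1]) > fc_cutoff]
    hits.sort(key=lambda r: r[2])  # most significant first
    labels = [gene for gene, _, _ in hits[:top_n]]

    # Add genes of known interest even if they are not top hits,
    # as long as they were actually measured.
    measured = {gene for gene, _, _ in results}
    for gene in genes_of_interest:
        if gene in measured and gene not in labels:
            labels.append(gene)
    return labels


# Hypothetical results: (gene, log2 fold change, adjusted p-value)
results = [
    ("GENE_A", 2.5, 1e-8),
    ("GENE_B", -1.8, 1e-5),
    ("GENE_C", 0.2, 1e-6),
    ("GENE_D", 3.0, 0.2),
    ("GENE_E", -2.2, 1e-3),
]
print(select_labels(results, top_n=2, genes_of_interest=["GENE_D"]))
```

Capping the list with `top_n` keeps the figure readable, and the genes-of-interest pass ensures that biologically relevant features are labeled even when they are not among the strongest hits.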

Ignoring the symmetry or lack thereof

A highly asymmetric volcano plot (many more hits on one side) could indicate a real biological signal or a normalization issue. Always verify with an MA plot to check whether your normalization assumption is valid.
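The MA plot coordinates are simple to compute, as sketched below. The function name `ma_coordinates` and the expression values are hypothetical, and the check on the median rests on an assumption common to many normalization methods: that most features do not change between conditions.

```python
import math
import statistics


def ma_coordinates(treated, control):
    """Return (M, A): log2 ratio and mean log2 expression for one feature."""
    m = math.log2(treated) - math.log2(control)           # M: log2 fold change
    a = 0.5 * (math.log2(treated) + math.log2(control))   # A: average log2 level
    return m, a


# Hypothetical (treated, control) expression values; all must be positive.
pairs = [(120.0, 100.0), (95.0, 100.0), (400.0, 100.0), (1000.0, 1050.0), (52.0, 50.0)]
m_values = [ma_coordinates(t, c)[0] for t, c in pairs]

# If most features are unchanged, the median M should sit close to 0.
median_m = statistics.median(m_values)
print(f"median M = {median_m:.3f}")
```

A median M far from zero, or a trend in M across A, points toward a normalization problem rather than a directional biological response. Zero counts would need a pseudocount or filtering before taking logs, which this sketch leaves out.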

// 07 — Real-world examples

Where you’ll see volcano plots used

01. Cancer genomics: tumor vs. normal tissue

Researchers compare cancerous and healthy tissue to identify differentially expressed genes. Volcano plots help prioritize potential drug targets and biomarkers by showing which genes are most strongly and reliably altered.

Oncology

02. Drug discovery: treated vs. untreated cells

Pharmaceutical companies use volcano plots to understand how a candidate drug affects gene expression across the entire transcriptome. The plot reveals both on-target effects and potential off-target concerns.

Pharmacology

03. Proteomics: comparing protein abundances

Mass spectrometry-based proteomics generates fold-change and significance data for thousands of proteins. Volcano plots help identify which proteins are significantly altered between disease states or treatment conditions.

Proteomics

// 08 — At a glance

Quick reference

Also known as: Significance vs fold-change plot
First used: Early 2000s, with the rise of microarray technology
Best for: Identifying features that are both statistically significant and biologically meaningful
Data types: log₂(fold change) on X-axis, −log₁₀(p-value) on Y-axis
Typical thresholds: p < 0.05 (adjusted), |log₂FC| > 1
Typical data size: 5,000 to 30,000+ features (genes, proteins)
Common tools: R (EnhancedVolcano, ggplot2), Python (bioinfokit, matplotlib), Prism
Common mistakes: Unadjusted p-values, overplotting, arbitrary thresholds, poor labeling

// 09 — Variations

Types of volcano plots

The basic volcano plot has been adapted for various analytical needs.

Enhanced volcano plot

Adds point size encoding (e.g., by base expression) and extensive labeling to create a publication-ready figure with multiple information layers.

Gradient volcano plot

Uses continuous color gradients instead of discrete categories, encoding additional information like expression level or pathway membership.

Side-by-side volcano plots

Multiple volcano plots arranged as small multiples, comparing differential expression across several conditions or time points.

Interactive volcano plot

Web-based versions with hover tooltips, clickable points for gene details, and adjustable threshold sliders for exploratory analysis.

// 10 — FAQs

Frequently asked questions

When should you use a volcano plot?

Use a volcano plot when analyzing differential gene expression from RNA-seq or microarray experiments. It also works well when you need to identify features that are both statistically significant and biologically meaningful, and when comparing two experimental conditions (treated vs. control, disease vs. healthy).

When should you avoid a volcano plot?

Avoid a volcano plot when you have more than two conditions — consider a heatmap or parallel coordinates instead. It is also a poor fit when your data doesn’t have both fold change and p-value information, or when you want to show genomic position — use a Manhattan plot instead.

Is a volcano plot suitable for dashboards?

Yes — a volcano plot can work well in dashboards as long as the panel is large enough for readers to perceive the encoded values, has a clear title, and includes the legend or axis labels needed to interpret it.

What category of chart is a volcano plot?

The volcano plot belongs to the Scientific family of charts. Charts in that family are designed to answer the same kind of question, so they often work as alternatives when one doesn't quite fit your data.

How do you read a volcano plot?

Start with the axes: log₂(fold change) on the X-axis and −log₁₀(p-value) on the Y-axis. Locate the significance and fold-change threshold lines, then focus on the upper-left and upper-right corners, where features pass both cutoffs. Check the labeled points and the overall symmetry, and verify any conclusion against the underlying numbers when precision matters.