Correlogram
A matrix of pairwise correlations that reveals how every variable in a dataset relates to every other — at a glance.
// 01 — The chart
What it looks like
A correlogram of four Iris dataset measurements. Strong positive correlations appear in deep red; weak or negative correlations in light tones.
// 02 — Definition
What is a correlogram?
A correlogram (also called a correlation matrix chart) is a visualization that displays the pairwise correlation coefficients between all variables in a dataset as a color-coded matrix. Each cell in the grid represents how strongly two variables are related, with colors and/or sizes encoding the strength and direction of the relationship.
The correlation coefficient ranges from −1 (perfect negative correlation) to +1 (perfect positive correlation), with 0 meaning no linear relationship. By arranging all variables along both axes, the correlogram lets analysts quickly scan dozens of relationships simultaneously.
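As a minimal sketch of what each cell holds, the Pearson coefficient can be computed directly from its definition: the covariance of the two variables divided by the product of their standard deviations. The function name `pearson_r` and the sample lists are illustrative assumptions, not part of any particular library.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up sample values, chosen only to show the two extremes.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # moves with x
falling = [10, 8, 6, 4, 2]   # moves against x

r_up = pearson_r(x, rising)     # close to +1
r_down = pearson_r(x, falling)  # close to -1
print(r_up, r_down)
```

The common factor 1/n cancels between numerator and denominator, which is why the sketch works with plain sums instead of averages.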
Correlograms are a cornerstone of exploratory data analysis (EDA). They help data scientists decide which variables to investigate further, which predictors to include in models, and where multicollinearity might be a concern.
Key insight: A correlogram is symmetric around its diagonal — the cell at (A, B) always equals (B, A). The diagonal itself always shows perfect correlation (1.00) because every variable is perfectly correlated with itself.
// 03 — Anatomy
Parts of a correlogram
// 04 — Usage
When to use it — and when not to
Use it when:
- You have many numeric variables and want to scan all pairwise relationships at once
- Performing exploratory data analysis (EDA) before modeling
- Checking for multicollinearity among predictor variables
- Deciding which variables to include in a regression or classification model
- Communicating a high-level overview of variable relationships to stakeholders
- Your audience understands correlation coefficients
Avoid it when:
- You have only 2–3 variables; a simple scatter plot is more informative
- Relationships are non-linear; Pearson correlation can understate them or miss them entirely
- You need to show causation, not just correlation
- Your audience is non-technical; the matrix format can be intimidating
- You have too many variables (50+); the matrix becomes unreadable
- Data contains mostly categorical variables; use a mosaic plot instead
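For the non-linear case in the list above, a rank-based coefficient is one possible workaround when the relationship is curved but still monotonic. The sketch below computes Spearman's ρ as the Pearson coefficient of the ranks; the helper names `pearson_r`, `average_ranks`, and `spearman_rho` and the cubic sample data are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # monotonic but strongly curved

r = pearson_r(x, y)       # below 1, because the points are not on a line
rho = spearman_rho(x, y)  # 1 up to rounding, because the ordering matches
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

A rank-based coefficient still misses non-monotonic shapes such as a U-curve, so it complements a scatter plot and does not replace it.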
// 05 — Reading guide
How to read a correlogram
Follow these steps whenever you encounter a correlogram.
Read the variable labels
Identify the variables along both axes. Since the matrix is symmetric, each variable appears on both the row and the column.
Locate the diagonal
The diagonal always shows 1.00 — every variable is perfectly correlated with itself. This serves as a visual anchor and color reference for the maximum value.
Scan for intense colors
The deepest-colored cells indicate the strongest correlations (positive or negative). These are the relationships worth investigating further.
Check the sign
If the chart uses a diverging color scale (e.g., red for positive, blue for negative), distinguish between variables that move together and those that move in opposite directions.
Note the coefficient values
If numeric values are displayed, use them for precision. A correlation of 0.85 is quite different from 0.50, even if the colors look similar on a saturated scale: squared, they correspond to roughly 72% versus 25% of shared variance.
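The scanning steps above can also be done programmatically. The sketch below lists the off-diagonal pairs by absolute coefficient, strongest first, and reports the sign. The function name `strongest_pairs`, the labels, and the matrix values are hypothetical placeholders, not results from a real dataset.

```python
def strongest_pairs(labels, corr, top=3):
    """Return the `top` off-diagonal pairs ordered by |r|, strongest first."""
    pairs = [
        (labels[i], labels[j], corr[i][j])
        for i in range(len(labels))
        for j in range(i + 1, len(labels))  # upper triangle only, skips the diagonal
    ]
    return sorted(pairs, key=lambda pair: abs(pair[2]), reverse=True)[:top]


# Illustrative placeholder values, not computed from any real dataset.
labels = ["a", "b", "c", "d"]
corr = [
    [1.00, 0.85, -0.90, 0.10],
    [0.85, 1.00, -0.50, 0.05],
    [-0.90, -0.50, 1.00, 0.20],
    [0.10, 0.05, 0.20, 1.00],
]

for left, right, r in strongest_pairs(labels, corr):
    direction = "together" if r > 0 else "in opposite directions"
    print(f"{left} and {right}: r = {r:+.2f} (move {direction})")
```

Sorting by absolute value keeps strong negative correlations from being buried below weak positive ones.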
// 06 — Data format
What your data should look like
The input is typically a table where each column is a numeric variable and each row is an observation. The correlogram is computed from the correlation matrix (e.g., using Pearson, Spearman, or Kendall methods).
// Raw data table
| sepal_len | sepal_wid | petal_len | petal_wid |
|-----------|-----------|-----------|-----------|
| 5.1 | 3.5 | 1.4 | 0.2 |
| 7.0 | 3.2 | 4.7 | 1.4 |
| 6.3 | 3.3 | 6.0 | 2.5 |
// 07 — Construction
How to build one
Compute the correlation matrix — calculate pairwise correlation coefficients (Pearson, Spearman, or Kendall) for all numeric variables.
Choose a color scale — use a sequential scale for all-positive correlations, or a diverging scale (e.g., blue-white-red) if negative correlations matter.
Map each cell — assign the correlation value to the color (and optionally the size) of each matrix cell.
Label axes — place variable names along both rows and columns for reference.
Optionally show only half — since the matrix is symmetric, you can display only the upper or lower triangle to reduce redundancy.
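The color-mapping and half-matrix steps above can be sketched as follows. This is one possible blue-white-red mapping, not a standard palette; the function name `diverging_color`, the labels, and the matrix values are assumptions made for illustration.

```python
def diverging_color(r):
    """Map r in [-1, 1] to a hex color on a simple blue-white-red scale."""
    r = max(-1.0, min(1.0, r))        # clamp against rounding drift
    fade = round(255 * (1 - abs(r)))  # 255 at r = 0 (white), 0 at |r| = 1
    if r >= 0:
        red, green, blue = 255, fade, fade  # white toward red
    else:
        red, green, blue = fade, fade, 255  # white toward blue
    return f"#{red:02x}{green:02x}{blue:02x}"


# Illustrative placeholder values, not computed from any real dataset.
labels = ["a", "b", "c"]
corr = [
    [1.0, 0.8, -0.4],
    [0.8, 1.0, 0.1],
    [-0.4, 0.1, 1.0],
]

# Lower triangle including the diagonal: one entry per cell to draw.
cells = [
    (labels[i], labels[j], corr[i][j], diverging_color(corr[i][j]))
    for i in range(len(labels))
    for j in range(i + 1)
]

for row_label, col_label, r, color in cells:
    print(f"{row_label} x {col_label}: r = {r:+.2f} -> {color}")
```

Anchoring white at zero keeps the scale symmetric, so equal magnitudes of opposite sign get equal color intensity.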
// 08 — Pitfalls
Common mistakes
Ignoring non-linear relationships
Pearson correlation only captures linear associations. A U-shaped or circular relationship can have r ≈ 0 while being strongly related.
Overloading with too many variables
A 50×50 matrix becomes a sea of colors with no actionable insight. Group or pre-filter variables first.
Using a poor color scale
A rainbow color scale makes it nearly impossible to compare magnitudes. Use a perceptually uniform sequential or diverging palette.
Confusing correlation with causation
A strong r-value between ice cream sales and drowning deaths doesn't mean one causes the other — both are driven by summer heat.
Not checking for outliers
A single extreme value can inflate or deflate Pearson r dramatically. Always pair correlograms with scatter plots for suspicious pairs.
// 09 — In the wild
Real-world examples
Finance
Correlograms of stock returns show which assets move together — essential for building diversified portfolios and understanding sector contagion during market downturns.
Genomics
Researchers use correlograms to identify co-expressed genes across thousands of samples, revealing regulatory networks and potential drug targets.
Marketing
Campaign analysts use correlograms to see which customer behavior metrics (page views, time on site, clicks, purchases) are correlated, helping them identify leading indicators of conversion.
// 10 — At a glance
Quick reference
Also known as
Correlation matrix chart, correlation heatmap
Category
Correlation
Typical data
Multiple numeric variables
Best for
Scanning all pairwise relationships
Difficulty
Intermediate
Key stat
Pearson r, Spearman ρ, or Kendall τ
// 11 — Accessibility
Making it accessible
Include numeric labels in each cell — color alone is insufficient for colorblind readers
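Use a colorblind-safe palette instead of a red-green diverging scale: a sequential one such as viridis or cividis when all correlations share a sign, or a blue-orange diverging one when the sign matters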
Provide a text summary or table alternative for screen readers
Add a clear color legend showing the correlation range
Use sufficient contrast between cell border and background for clarity
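One way to provide the table alternative mentioned above is to render the matrix as a plain HTML table with a caption and scoped header cells. This is a sketch only: the function name `correlation_table_html`, the caption text, and the matrix values are hypothetical.

```python
from html import escape


def correlation_table_html(labels, corr, caption):
    """Render a correlation matrix as an HTML table with scoped headers."""
    head = "".join(f'<th scope="col">{escape(name)}</th>' for name in labels)
    lines = [
        "<table>",
        f"<caption>{escape(caption)}</caption>",
        f"<tr><td></td>{head}</tr>",
    ]
    for name, row in zip(labels, corr):
        cells = "".join(f"<td>{value:+.2f}</td>" for value in row)
        lines.append(f'<tr><th scope="row">{escape(name)}</th>{cells}</tr>')
    lines.append("</table>")
    return "\n".join(lines)


# Illustrative placeholder values, not computed from any real dataset.
labels = ["a", "b", "c"]
corr = [
    [1.0, 0.8, -0.4],
    [0.8, 1.0, 0.1],
    [-0.4, 0.1, 1.0],
]

table_html = correlation_table_html(labels, corr, "Pairwise Pearson correlations")
print(table_html)
```

Printing every value with an explicit sign means the direction of each relationship is available without relying on color.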
// 12 — Variations
Common variations
Half-matrix correlogram
Shows only the upper or lower triangle, removing redundant duplicate information.
Circle-size correlogram
Uses circle radius in each cell to encode correlation magnitude alongside color.
Clustered correlogram
Reorders rows and columns using hierarchical clustering to group similar variables together.
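A simplified sketch of the reordering idea: agglomerative clustering with average linkage on the distance 1 − r, written in plain Python for readability. The function name `cluster_order` and the matrix values are assumptions for illustration; a real project would more likely rely on an established clustering library.

```python
def cluster_order(corr):
    """Order variable indices so strongly correlated ones end up adjacent.

    Uses average-linkage agglomerative clustering on the distance 1 - r.
    """
    clusters = [[i] for i in range(len(corr))]

    def distance(a, b):
        # Average pairwise distance between the members of two clusters.
        total = sum(1 - corr[i][j] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        _, first, second = min(
            (distance(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        merged = clusters[first] + clusters[second]  # members stay adjacent
        clusters = [c for k, c in enumerate(clusters) if k not in (first, second)]
        clusters.append(merged)
    return clusters[0]


# Illustrative placeholder values, not computed from any real dataset.
labels = ["a", "b", "c", "d"]
corr = [
    [1.0, 0.1, 0.9, 0.2],
    [0.1, 1.0, 0.0, 0.8],
    [0.9, 0.0, 1.0, 0.1],
    [0.2, 0.8, 0.1, 1.0],
]

order = cluster_order(corr)
reordered_labels = [labels[i] for i in order]
reordered = [[corr[i][j] for j in order] for i in order]
print(reordered_labels)
```

With 1 − r as the distance, strongly negative pairs are treated as far apart; using 1 − |r| instead would group variables by strength of association regardless of sign.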
Ellipse correlogram
Uses oriented ellipses: the narrower the ellipse, the stronger the correlation, and the direction of its tilt shows the sign.
// 13 — FAQs
Frequently asked questions
What is a correlogram?
A correlogram (also called a correlation matrix chart) is a visualization that displays the pairwise correlation coefficients between all variables in a dataset as a color-coded matrix. Each cell in the grid represents how strongly two variables are related, with colors and/or sizes encoding the strength and direction of the relationship.
When should you use a correlogram?
Use a correlogram when you have many numeric variables and want to scan all pairwise relationships at once. It also works well when performing exploratory data analysis (EDA) before modeling, and when checking for multicollinearity among predictor variables.
When should you avoid a correlogram?
Avoid a correlogram when you have only 2–3 variables, since a simple scatter plot is more informative. It is also a poor fit when relationships are non-linear, because Pearson correlation can understate them or miss them entirely, and when you need to show causation rather than correlation.
What data do you need to make a correlogram?
The input is typically a table where each column is a numeric variable and each row is an observation. The correlogram is computed from the correlation matrix (e.g., using Pearson, Spearman, or Kendall methods).
How is a correlogram different from a scatter plot?
A scatter plot shows the individual observations for one pair of variables, so you can see the shape of the relationship, including curves, clusters, and outliers. A correlogram reduces each pair to a single coefficient and shows all pairs at once. Use a correlogram to find the pairs worth a closer look, then use scatter plots to inspect them.
What is another name for a correlogram?
A correlogram is also known as a correlation matrix chart or a correlation heatmap. The name varies between fields, but the visualization technique is the same.
What size of dataset works best for a correlogram?
What matters most is the number of variables, not the number of rows. With only 2–3 variables, a simple scatter plot is more informative; beyond about 50 variables, the matrix becomes unreadable unless you group or pre-filter variables first.
Are correlograms accessible to screen readers?
Yes — a correlogram can be made accessible to screen readers by pairing it with a clear text summary of the key insight, ensuring color choices meet WCAG contrast guidelines, adding descriptive alt text or aria-label to the SVG, and offering the underlying data as an HTML table fallback for assistive technologies.