Correlation · Intermediate

Hexagonal Binning Plot

A density-aware alternative to the scatter plot that aggregates thousands of overlapping points into color-coded hexagonal bins. The chart of choice when scatter plots collapse into a solid blob.

// 01 The chart

What it looks like

[Figure: Example, flight delay vs. distance (n = 12,000 flights). X-axis: Distance, ticks at 500 mi, 1500 mi, 2500 mi. Y-axis: Delay (min), ticks at -60, 0, 60, 120.]

A hexbin plot showing flight delay vs. distance. Darker hexagons indicate higher concentrations of flights — most cluster around short distances with minimal delays.

// 02 Definition

What is a hexbin plot?

A hexagonal binning plot (hexbin plot) divides the 2D plotting area into a regular grid of hexagons and counts how many data points fall into each hex cell. The count is then mapped to color intensity, producing a density visualization that reveals patterns invisible in an overplotted scatter plot.
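To make the counting step concrete, here is a minimal pure-Python sketch of one common way to assign points to hexagons: treat the hexagon centers as two interleaved rectangular lattices, find the nearest center on each, and keep whichever is closer. The function name `hexbin_counts`, the pointy-top orientation, and the `radius` parameter are illustrative assumptions, and the sketch assumes x and y are already scaled to comparable units; plotting libraries ship their own implementations.

```python
import math
from collections import Counter


def hexbin_counts(points, radius):
    """Count points per pointy-top hexagon of the given radius.

    Returns a Counter mapping each hexagon center (cx, cy) to its count.
    """
    w = math.sqrt(3) * radius  # horizontal spacing of centers within a row
    h = 1.5 * radius           # vertical spacing between adjacent rows
    counts = Counter()
    for x, y in points:
        # Nearest center on the even rows (a plain rectangular lattice).
        ax = round(x / w) * w
        ay = round(y / (2 * h)) * 2 * h
        # Nearest center on the odd rows (same lattice shifted by (w/2, h)).
        bx = (round(x / w - 0.5) + 0.5) * w
        by = (round((y - h) / (2 * h)) * 2 + 1) * h
        # The hexagon containing the point is the one with the closer center.
        dist_a = (x - ax) ** 2 + (y - ay) ** 2
        dist_b = (x - bx) ** 2 + (y - by) ** 2
        counts[(ax, ay) if dist_a <= dist_b else (bx, by)] += 1
    return counts


sample = [(0.0, 0.0), (0.1, 0.1), (0.0, -0.2), (0.9, 1.4)]
for center, count in hexbin_counts(sample, radius=1.0).items():
    print(center, count)
```

Each point is touched once, which is where the linear cost in the number of rows comes from. The resulting counts are what a plotting library would then map to color.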

Hexagons are preferred over squares because they tile the plane with fewer edge artifacts, have no ambiguous corners, and provide a more uniform distance from center to edges. This makes hexagonal binning more perceptually accurate than rectangular binning. Each hex has six neighbors instead of four, which translates into smoother visual transitions between high-density and low-density regions.

Hexbin plots are the go-to solution when a scatter plot has so many points (thousands to millions) that individual dots overlap into a solid mass, hiding the true density distribution underneath. They preserve the bivariate relationship view of a scatter plot while replacing the ambiguous overplotted blob with a quantitative density map.

The technique was introduced by Daniel Carr and colleagues in 1987 specifically as a remedy for high-volume scatter plots. Today most major plotting ecosystems offer a hexbin function: plt.hexbin in Matplotlib, geom_hex in ggplot2, and d3-hexbin for the web.

Why hexagons? Each hexagon has 6 edge-sharing neighbors (a square has only 4), the center-to-edge distance is more uniform than in a square, and the near-circular cells are less prone than squares to suggesting spurious horizontal and vertical lines when readers estimate density.

// 03 When to use

When a hexbin plot is the right call

Reach for a hexbin plot whenever a scatter plot would bury its structure under thousands of overlapping dots. The win is showing density honestly, not just hiding overplotting.

✓ Use a hexbin plot when…
  • You have thousands or millions of data points that overplot in a scatter plot
  • Density patterns are more important than individual point positions
  • You want to reveal hidden clusters or hotspots in large datasets
  • You need to combine the relationship view of a scatter plot with density information
  • Working with GPS data, sensor data, click-stream logs, or any high-volume continuous measurements
  • You want a computationally efficient alternative to kernel density estimation
  • You want a discrete, honest density estimate without smoothing artifacts

// 04 When not to use

When a hexbin plot is the wrong call

Hexbin is a power tool. Below a few hundred points it adds noise instead of removing it; for some questions a smoother or simpler chart wins.

× Avoid a hexbin plot when…
  • You have fewer than a few hundred data points, where individual dots are clearer
  • You need to identify specific observations, not density regions
  • Your data has meaningful categorical grouping that color-coding would convey better
  • The exact shape of the distribution matters — use contour or 2D density plots instead
  • You want smooth density contours rather than discrete binned cells
  • Your audience could be misled by bin-size sensitivity: small changes in bin size can drastically alter the picture
  • Either axis is categorical — hexbin only makes sense on continuous bivariate data

// 05 Data requirements

What your data needs to look like

Before building the chart, your dataset needs to fit a specific shape. Use this checklist to confirm yours does.

Shape

One row per observation, with two continuous numeric columns. The hexagonal binning is computed at render time from the raw rows.

Minimum rows

~500 rows. Below that, a regular scatter plot communicates the same information more directly.

Maximum rows

Millions. Hexbin’s computational cost is O(n) and the visual output stays legible regardless of n.

Required fields
x (required)
number (continuous)

The horizontal continuous variable. Same role as the x in a scatter plot — distance, age, price, time, sensor reading. Hexbin works best when this variable spans a meaningful range with thousands of observations.

y (required)
number (continuous)

The vertical continuous variable. Paired with x for each observation. Both must be on a numeric scale; categorical variables won’t bin meaningfully into hexagons.

weight
number (optional)

Optional per-row weight that replaces the simple count in each bin. Use to plot a weighted density (e.g. revenue per location, rather than count of transactions).

Example data
distance_mi   delay_min
842           12
1,540         -3
320           45
2,180         8
540           22
1,260         -12

Tip: if your raw data is event logs at the millions-of-rows scale, you can pre-bin server-side in SQL with GROUP BY floor(x/bin), floor(y/bin) and pass the pre-binned counts to the chart. This produces square bins rather than hexagons; a true hexagonal grid reads better, but a square pre-bin is often a useful first pass.
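The square pre-bin in the tip can be sketched in a few lines. The version below is a Python stand-in for the SQL `GROUP BY floor(x/bin), floor(y/bin)` idea, not a database query; the function name `square_prebin` and the bin widths (500 mi, 30 min) are assumptions chosen for illustration, and the rows are the six example records from the table above.

```python
import math
from collections import Counter


def square_prebin(rows, bin_x, bin_y):
    """Group (x, y) rows into square cells and return (center_x, center_y, count)."""
    counts = Counter()
    for x, y in rows:
        # Same grouping key as GROUP BY floor(x / bin_x), floor(y / bin_y).
        counts[(math.floor(x / bin_x), math.floor(y / bin_y))] += 1
    return [
        ((ix + 0.5) * bin_x, (iy + 0.5) * bin_y, n)
        for (ix, iy), n in sorted(counts.items())
    ]


# (distance_mi, delay_min) pairs from the example data.
rows = [(842, 12), (1540, -3), (320, 45), (2180, 8), (540, 22), (1260, -12)]
for center_x, center_y, n in square_prebin(rows, bin_x=500, bin_y=30):
    print(center_x, center_y, n)
```

Note that `math.floor` rounds toward negative infinity, so negative delays land in the cell below zero rather than being folded into the zero cell; check that your SQL dialect's floor behaves the same way.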

// 06 Anatomy

Parts of a hexbin plot

Every hexbin plot is built from the same five pieces: two continuous axes, hexagonal bins, a low-density background, and a color legend that decodes the bin counts.

[Figure: annotated hexbin plot with callouts A to E]
A — Y-axis: The vertical continuous variable, same as a scatter plot
B — X-axis: The horizontal continuous variable, same as a scatter plot
C — Hexagonal bin: Each hexagon represents a region; color intensity encodes point count
D — Low-density hex: Light-colored hexagons indicate few points in that region of x/y space
E — Color legend: Maps bin colors to count values so readers can estimate density

// 07 Step-by-step

Step-by-step: how to build a good hexbin plot

An eight-step recipe that works regardless of tool. The bin-size choice is the only step that needs real iteration; the rest is mechanical once the data is in shape.

  1. Confirm you have enough points

    Hexbin needs density to make sense. Below ~500 points, a normal scatter plot reads better. Above a few thousand, overplotting starts to bite and hexbin earns its place.
  2. Pick the bin count

    Bin count (or grid size) is the only knob you really tune. Start with about √n bins on the long axis, where n is the row count, then adjust visually until the structure is visible without becoming a confetti mosaic.
  3. Pick a sequential colormap

    Use a sequential, perceptually uniform colormap (viridis, magma, inferno, cividis). Avoid rainbow / jet: they introduce visual artifacts that don’t match the underlying density gradient.
  4. Set the lowest bin to a contrasting background

    Empty hexes should disappear into the chart background or be drawn in the lightest sequential color so the eye reads them as ‘zero,’ not as another density tier.
  5. Always show a colorbar

    Without a legend, color is unreadable. Show the colorbar with a label like “count” or “density”, and prefer logarithmic color scaling when the count distribution is heavily skewed.
  6. Lock the axes to the data range

    Hexbin degrades badly when the axes are too generous: you end up with a tiny cloud of dark hexes in a sea of empty ones. Trim the axes to the meaningful range of x and y.
  7. Title the takeaway, not the chart type

    Replace “Hexbin plot of distance vs. delay” with the conclusion you want the reader to draw, for example “Most flights cluster under 1500 mi with a tight delay distribution; long-haul flights show much wider variance.”
  8. Validate that the story holds when the bin count is halved or doubled

    If halving or doubling the bin count changes which hex is the densest, your bin choice is unstable. Pick a bin count that produces a consistent visual story across nearby values.
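The bin-count steps above (start near √n, then check that halving or doubling does not move the densest hex) can be sketched with Matplotlib. This is one possible check, not a standard routine: the helper name `densest_hex_center` and the synthetic normal data are assumptions, and it relies on the hexbin return value exposing per-hex counts through `get_array()` and hex centers through `get_offsets()`.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def densest_hex_center(x, y, gridsize):
    """Return the (x, y) center of the hexagon holding the most points."""
    fig, ax = plt.subplots()
    hb = ax.hexbin(x, y, gridsize=gridsize)
    counts = hb.get_array()      # one count per hexagon
    centers = hb.get_offsets()   # matching hexagon centers in data units
    cx, cy = centers[int(np.argmax(counts))]
    plt.close(fig)
    return float(cx), float(cy)


# Synthetic stand-in data (an assumption, not the guide's flight dataset).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 5_000)
y = rng.normal(0.0, 1.0, 5_000)

base = round(math.sqrt(len(x)))  # the sqrt(n) starting point
for gridsize in (base // 2, base, base * 2):
    print(gridsize, densest_hex_center(x, y, gridsize))
```

If the printed centers jump around between grid sizes, the bins are probably too fine for the sample size; coarsen the grid until the densest region stays put.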

// 08 Real-world examples

Where you’ll see hexbin plots used

Hexbin earns its keep wherever bivariate data shows up at scale: aviation, finance, GPS, sensor networks, and the kinds of bioinformatics where a single experiment produces tens of thousands of measurements.

01

Aviation: flight delay analytics

BTS-style flight datasets contain hundreds of thousands of rows. A hexbin of distance vs. arrival delay quickly shows that short-haul flights cluster on time while long-haul flights have a far wider delay spread — a story scatter plots can’t tell.

Aviation
02

Finance: risk vs return scatter

Quant teams plot tens of thousands of asset combinations as risk vs. return. A hexbin reveals the efficient-frontier-like density that a scatter plot of ~50k dots completely obscures.

Finance
03

Genomics: Bland-Altman of expression data

Genome-wide expression studies yield a Bland-Altman plot with one dot per gene. Hexbin replaces 30k overplotted dots with a binned density that makes the heteroscedastic structure visible.

Genomics
04

Web analytics: dwell time vs scroll depth

Product teams hexbin every user session. The densest hexes reveal the typical engagement envelope; outlier hexes flag bot traffic and stuck-tab sessions.

Product analytics

// 09 Variations

Variants of the hexbin plot

The basic hexbin plot has several useful relatives. Each changes what the color encodes, how it is scaled, or how panels are arranged, to answer a different question.

Weighted hexbin

Color encodes the sum of a per-row weight (revenue, error magnitude) in each bin instead of the raw count. Useful when the count alone misses what matters.

Summary hexbin (mean / median)

Color encodes the mean or median of a third variable across points in each bin. Effectively a hexagonal heatmap of a continuous z value.
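In Matplotlib, both the weighted and the summary variants can be sketched with the `C` and `reduce_C_function` arguments of `hexbin`: `C` supplies the per-row value and `reduce_C_function` decides how the values in each hexagon are combined. The `revenue` column and the uniform synthetic data below are hypothetical, chosen only to make the example self-contained.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
n = 2_000
x = rng.uniform(0, 10, n)
y = rng.uniform(0, 10, n)
revenue = rng.gamma(2.0, 50.0, n)  # hypothetical per-row weight

fig, (ax_sum, ax_med) = plt.subplots(1, 2, figsize=(10, 4))

# Weighted hexbin: color is the total revenue of the rows in each hexagon.
hb_sum = ax_sum.hexbin(x, y, C=revenue, reduce_C_function=np.sum,
                       gridsize=15, cmap="viridis")
fig.colorbar(hb_sum, ax=ax_sum, label="total revenue per hex")

# Summary hexbin: color is the median revenue of the rows in each hexagon.
hb_med = ax_med.hexbin(x, y, C=revenue, reduce_C_function=np.median,
                       gridsize=15, cmap="viridis")
fig.colorbar(hb_med, ax=ax_med, label="median revenue per hex")
```

Call `fig.savefig(...)` to write the figure out. Because the two panels encode different quantities, each keeps its own colorbar and label.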

Log-scaled hexbin

Same plot with a logarithmic color scale. Use when bin counts span several orders of magnitude and a linear scale collapses mid-density structure.

Faceted hexbins (small multiples)

One hexbin per panel, with the same axes, color scale and bin size. Great for comparing density across categorical splits.

// 10 Comparisons

How it compares

Hexbin lives in the family of bivariate-density charts. The three most important comparisons are with the scatter plot it replaces, the smoothed 2D density that competes with it, and the heatmap that looks similar but answers a different question.

Hexbin vs scatter plot

A scatter plot shows every observation; a hexbin plot summarizes them. Use scatter when individual points matter and overplotting isn’t a problem; switch to hexbin once the cloud is dense enough to obscure structure.

Hexbin plot

Aggregates points into hexagonal bins colored by density. Reveals structure even with millions of points.

  • Designed for high-volume data (≥ 1k points)
  • Color encodes density, not category
  • Individual observations are not visible

Scatter plot

Plots every observation as a point. Best with hundreds of points or fewer; degrades into a blob beyond that.

  • Each observation is individually visible
  • Easy to color by category or label outliers
  • Suffers from overplotting at high volume

Hexbin vs 2D density plot

Both encode density, but a hexbin plot is discrete and a 2D density plot is smooth. Hexbin is honest about its bin grid; density plots interpolate, which can hide modes or invent ones.

Hexbin plot

Discrete tiles with a clearly visible grid. Each hex is an honest count of observations in that area.

  • Discrete bins, no smoothing
  • Easy to compute exact counts per region
  • Edge effects are visible, not hidden

2D density plot

Continuous contours produced by kernel density estimation. Looks smoother but depends on a bandwidth choice.

  • Smooth contours, no visible grid
  • Bandwidth selection affects the picture
  • Better for showing the shape of one mode

Hexbin vs heatmap

A heatmap uses square bins on a regular grid and is usually for matrix-like data (e.g. correlation matrices). A hexbin plot uses hexagonal bins on continuous x/y space and is usually for raw observation data.

Hexbin plot

Hexagonal bins over continuous x/y axes. Used to summarize raw observation pairs by density.

  • Continuous x and y axes
  • Hexagons reduce edge artifacts
  • Plots one row per observation

Heatmap

Square cells over discrete row/column indices. Often used to visualize a precomputed matrix of values.

  • Categorical or pre-binned axes
  • Square cells aligned to a grid
  • Plots one cell per row×column pair

// 11 Common mistakes

Common hexbin plot mistakes

The bin-size knob is responsible for almost every bad hexbin plot. The other mistakes mostly come from picking a flashy colormap or from overlaying the chart on top of the data it summarizes.

Choosing a bin size by accident

Default bin counts in plotting libraries (for example gridsize=100 in Matplotlib’s hexbin and bins = 30 in ggplot2’s geom_hex) are generic values, not tuned to your data. Always tune the bin count visually for your actual sample size.

Using a rainbow colormap

Jet, hsv, and similar palettes introduce false bands that don’t correspond to density. Always stick to perceptually uniform sequential maps (viridis, magma, inferno, cividis).

Linear color scale for skewed counts

When counts are heavily skewed, a linear color scale paints almost everything the lightest color and one tiny region the darkest. Switch to a log scale.
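A small arithmetic sketch shows why the log scale helps. It computes where a bin count lands on a 0-to-1 color ramp under linear and logarithmic scaling; the function names and the counts (1 to 1,000) are hypothetical, and this is a plain illustration of the two mappings rather than any library's internal normalization.

```python
import math


def linear_position(count, vmin, vmax):
    """Position of a count on a 0..1 color ramp with linear scaling."""
    return (count - vmin) / (vmax - vmin)


def log_position(count, vmin, vmax):
    """Position of a count on a 0..1 color ramp with log10 scaling."""
    span = math.log10(vmax) - math.log10(vmin)
    return (math.log10(count) - math.log10(vmin)) / span


# Hypothetical bin counts spanning three orders of magnitude.
for count in (1, 10, 100, 1000):
    lin = linear_position(count, 1, 1000)
    log = log_position(count, 1, 1000)
    print(f"count={count:>4}  linear={lin:.3f}  log={log:.3f}")
```

On the linear ramp, bins holding 10 or 100 points sit in the bottom tenth of the color range and look almost identical to near-empty bins; on the log ramp they land about one third and two thirds of the way up.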

Hiding low-density bins by default

Setting mincnt above 1 to remove sparse bins quietly hides outliers and overstates how clean the data is. Be explicit if you do it.

Overlaying scatter dots on the hexbins

If you can show every point, you don’t need a hexbin. Pick one encoding and commit.

No colorbar at all

Without a legend, color is just decoration. Always show a colorbar with a unit-bearing label.

// 12 Accessibility

Accessibility checklist

Run through this list before publishing. The chart should still communicate its message to readers using assistive technology, color-blind users, keyboard navigation, and reduced-motion settings.

  • ✓

    Use a perceptually uniform sequential colormap

    WCAG 1.4.1
    Viridis, magma, inferno, and cividis preserve density ordering for color-blind readers and convert correctly to grayscale. Avoid rainbow / jet — they introduce false bands that don’t reflect the data.
  • ✓

    Color contrast for axis text and labels

    WCAG 1.4.3
    Axis tick labels, the chart title, and the colorbar label must reach 4.5:1 contrast against the page background. Don’t place text on top of the densest hexagons unless contrast is verified for every fill.
  • ✓

    Provide a text alternative for the chart

    WCAG 1.1.1
    An accessible name should describe the takeaway, not the visual: “Flight delays cluster under 60 minutes for distances under 1500 miles, with a wider spread for long-haul flights.”
  • ✓

    Expose the underlying data

    WCAG 1.3.1
    Offer a downloadable CSV or a hidden data table that screen readers can navigate. Hexbin’s spatial encoding is essentially impossible to experience without sight, so structured data must be available.
  • ✓

    Keyboard-accessible interactivity

    WCAG 2.1.1
    If hexagons reveal a tooltip on hover, every hex should be reachable with the Tab key and the tooltip should appear on focus, with a visible focus ring.
  • ✓

    Respect prefers-reduced-motion

    WCAG 2.3.3
    Avoid animated transitions of bin sizes or colors unless gated behind a prefers-reduced-motion: no-preference media query.
  • ✓

    Resizable and zoomable

    WCAG 1.4.4
    Use a responsive viewBox and let the chart container reflow with the viewport. The hexagon grid should remain legible at 200% browser zoom.
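The 4.5:1 contrast requirement in the checklist above can be checked numerically. The sketch below follows the WCAG 2.x definitions of relative luminance and contrast ratio for 8-bit sRGB colors; the function names and the sample gray are illustrative assumptions.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to its linear-light value."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """WCAG relative luminance of an (r, g, b) color with 0..255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


# Mid-gray axis text on a white page background.
ratio = contrast_ratio((118, 118, 118), (255, 255, 255))
print(f"{ratio:.2f}:1", "passes" if ratio >= 4.5 else "fails", "the 4.5:1 threshold")
```

To vet text placed over the hexagons themselves, run the same check against every fill color the colormap can produce, not just the page background.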

// 13 Best practices

Design and craft tips

A short list of dos and don’ts that, in our experience, separate publication-quality hexbin plots from the demo-grade ones every plotting tutorial produces.

✓ Do

Tune bin size visually

Start at √n bins on the long axis and step through nearby values. The right bin size makes structure pop without devolving into noise.

× Don’t

Use a rainbow colormap

Rainbow scales (jet, hsv) introduce false bands that don’t correspond to the underlying density. Stick with viridis-family sequential maps.

✓ Do

Show the colorbar with units

Label the colorbar “count” or “flights per bin” so readers can map color back to a number, not just to a relative shade.

× Don’t

Hide low-count bins by default

Setting mincnt above 1 to filter sparse bins quietly hides outliers and creates a misleadingly clean picture. Make it explicit if you do.

✓ Do

Trim the axes to the data range

Generous padding leaves a tiny cloud floating in empty hexes. Crop tightly so the binned area fills the panel.

× Don’t

Mix hexbin with raw points

Overlaying scatter dots on top of hexagons defeats the purpose of binning. Pick one encoding and commit to it.

✓ Do

Use a log color scale for skewed counts

When a few bins hold most of the mass, switch the color scale to log so mid-density structure remains visible.

× Don’t

Compare two hexbin panels with different bin sizes

Side-by-side hexbins are only comparable when bin size, color scale, and axis limits match across panels.

// 15 Tool instructions

How to build it in your tool of choice

Recipes for the libraries and platforms that ship a real hexbin implementation, plus two BI tools that do not: Tableau’s built-in Density mark uses circular kernels rather than hexagons, and Power BI has no native hexbin visual. The workarounds are noted.

Python (Matplotlib)

Code — ~5 min
  1. Install Matplotlib and NumPy with pip install matplotlib numpy.
  2. Load the data into two NumPy arrays or pandas Series of equal length.
  3. Create a figure with fig, ax = plt.subplots(figsize=(8, 6)).
  4. Call hb = ax.hexbin(x, y, gridsize=40, cmap='inferno', mincnt=1).
  5. Add a colorbar with fig.colorbar(hb, label='count') and label both axes.
  6. Save with plt.savefig('hexbin.png', dpi=200, bbox_inches='tight').

Tip: pass bins='log' to plt.hexbin() when the count distribution is heavily skewed — it stretches the dynamic range so mid-density structure becomes visible.

R (ggplot2)

Code — ~5 min
  1. Install ggplot2 and hexbin with install.packages(c('ggplot2', 'hexbin')).
  2. Build a tibble or data frame with two numeric columns (x and y).
  3. Pipe it into ggplot(df, aes(x = distance, y = delay)).
  4. Add geom_hex(bins = 40) to compute and draw the hexagonal binning.
  5. Apply scale_fill_viridis_c(option = 'inferno', trans = 'log10') for a perceptual color scale.
  6. Polish with labs(), theme_minimal(), and fixed coord_cartesian() limits.

Use stat_summary_hex() instead of geom_hex() when you want to color hexes by a third variable’s mean or median, not by simple count.

JavaScript (D3 / Observable Plot)

Code — ~10 min
  1. Install with npm i d3 d3-hexbin or use CDN script tags for the browser.
  2. Set up scales with d3.scaleLinear() for x and y, mapping data range to pixel range.
  3. Compute bins with const hexbin = d3Hexbin().radius(8).extent([[0,0],[w,h]]), where d3Hexbin is the hexbin export of d3-hexbin.
  4. Project the data: const bins = hexbin(data.map(d => [x(d.x), y(d.y)])).
  5. Build a sequential color scale: d3.scaleSequential(d3.interpolateInferno).domain([0, d3.max(bins, b => b.length)]).
  6. Render <path d={hexbin.hexagon(radius)}/> at each bin position, fill by color scale, and add a legend.

Observable Plot ships a higher-level API: Plot.hexbin() is a transform that you pass to a mark such as Plot.dot or Plot.hexagon, so a complete chart takes only a few lines, with sensible defaults for bin size and color scale.

Tableau

BI — ~6 min
  1. Drag the x measure to Columns and the y measure to Rows.
  2. Switch the Marks card from Automatic to Density (Tableau’s built-in density mark).
  3. Open Color and choose a sequential palette like Viridis or Density-Multi-color.
  4. Adjust the Intensity slider and Size to control how far each point’s density spreads.
  5. Add a title that states the takeaway and remove the unhelpful default “Measure of …” header.
  6. Tableau doesn’t ship a true hexagonal density mark by default. The Density mark uses circular kernels; for true hexbins, use a Python TabPy script or pre-bin in SQL/Python.

If you must have hexagons specifically, pre-bin in SQL/Python and pass bin centers + counts to Tableau, then use a polygon mark with hex coordinates.
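The pre-binning route needs one row per hexagon corner so that a polygon mark can trace each cell. The sketch below generates those rows in Python from pre-binned centers and counts. The field names (`hex_id`, `path_order`, `x`, `y`, `count`) are hypothetical, the hexagons are pointy-top, and the code assumes x and y are already scaled to comparable units; how the fields map onto Tableau's polygon mark is left to your workbook.

```python
import math


def hexagon_vertices(cx, cy, radius):
    """Six (x, y) corners of a pointy-top hexagon, in drawing order."""
    return [
        (cx + radius * math.cos(math.radians(60 * k + 30)),
         cy + radius * math.sin(math.radians(60 * k + 30)))
        for k in range(6)
    ]


def polygon_rows(bins, radius):
    """Flatten (cx, cy, count) bins into one record per hexagon corner."""
    rows = []
    for hex_id, (cx, cy, count) in enumerate(bins):
        for order, (vx, vy) in enumerate(hexagon_vertices(cx, cy, radius)):
            rows.append({"hex_id": hex_id, "path_order": order,
                         "x": vx, "y": vy, "count": count})
    return rows


# Two hypothetical pre-binned cells: (center_x, center_y, count).
for row in polygon_rows([(0.0, 0.0, 12), (1.732, 0.0, 5)], radius=1.0):
    print(row)
```

Each hexagon contributes six rows that share a `hex_id` and `count`, so the count can drive the fill color while `path_order` fixes the order in which the corners are connected.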

Power BI

BI — ~6 min
  1. Power BI doesn’t ship a native hexbin chart; use the marketplace visual “Hex Bin Plot” or “R script visual.”
  2. For the R route, drag two numeric fields to the Values well and paste a ggplot2 + geom_hex script into the R script editor.
  3. Set the bin count via the bins parameter in geom_hex(bins = 40).
  4. Apply scale_fill_viridis_c() so the result reads correctly when the report is exported to PDF.
  5. Set a fixed plot size in the script (ggsave width/height) so the chart isn’t squashed in narrow report pages.
  6. Add a takeaway-style title via Format → Title.

If you can’t install custom visuals, fall back to a plain Scatter chart with low Mark Opacity (~10%) — it imitates a density view, though imperfectly.

// 16 Code examples

Working code in the most common stacks

A runnable Python snippet that produces the flight-delay hexbin plot. It uses a simulated flight-delay dataset, a perceptually uniform colormap, and a log color scale to handle skewed counts.

hexbin.py
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
n = 12_000

# Simulated flight data: shorter flights cluster at moderate delay,
# longer flights show much wider delay variance.
distance = rng.gamma(shape=2.0, scale=600, size=n)
delay = rng.normal(loc=10, scale=18, size=n) + (distance / 200) * rng.normal(0, 1, size=n)

fig, ax = plt.subplots(figsize=(8, 5.5))
hb = ax.hexbin(
    distance,
    delay,
    gridsize=40,
    cmap="inferno",
    mincnt=1,
    bins="log",       # stretch dynamic range when counts are skewed
)

ax.set_xlim(0, 3500)
ax.set_ylim(-60, 120)
ax.set_xlabel("Flight distance (mi)")
ax.set_ylabel("Arrival delay (min)")
ax.set_title("Most short-haul flights cluster on time — long-haul variance fans out", loc="left")
ax.spines[["top", "right"]].set_visible(False)

cb = fig.colorbar(hb, ax=ax, label="flights per hex (log scale)")
cb.outline.set_visible(False)

plt.tight_layout()
plt.savefig("hexbin.png", dpi=200, bbox_inches="tight")
plt.show()
$ python hexbin.py

// 17 — FAQs

Frequently asked questions

What is a hexbin plot?

A hexagonal binning plot (hexbin plot) divides the 2D plotting area into a regular grid of hexagons and counts how many data points fall into each hex cell. The count is then mapped to color intensity, producing a density visualization that reveals patterns invisible in an overplotted scatter plot.

When should you use a hexbin plot?

Use a hexbin plot when you have thousands or millions of data points that overplot in a scatter plot. It also works well when density patterns are more important than individual point positions, when you need to reveal hidden clusters or hotspots, and when you want a computationally efficient alternative to kernel density estimation.

When should you avoid a hexbin plot?

Avoid a hexbin plot when you have fewer than a few hundred data points, where individual dots are clearer. It is also a poor fit when you need to identify specific observations rather than density regions, when categorical grouping matters, or when smooth contours would communicate the distribution better than discrete cells.

Why hexagons instead of squares?

Hexagons tile the plane with fewer edge artifacts, have no ambiguous corners, and provide a more uniform distance from center to edges than squares do. Each hex has six neighbors instead of four, which produces smoother, more perceptually accurate density estimates than square binning.

How do you choose the right bin size?

Bin size is the most important parameter and the most common source of mistakes. Too few bins hide structure; too many produce a noisy mosaic that looks like a scatter plot of confetti. Start with about √n bins on the longer axis (where n is the number of points) and adjust visually.

Is a hexbin plot suitable for dashboards?

Yes — hexbin plots work well in dashboards as long as the panel is large enough to read individual hexagons, the color legend is visible, and the bin size is fixed across linked views so colors are comparable.

What category of chart is a hexbin plot?

Hexbin Plot belongs to the Correlation family of charts. Charts in that family are designed to answer the same kind of question — how two continuous variables relate — so they often work as alternatives when one doesn't quite fit your data.

How do you read a hexbin plot?

Start with the axis labels and color legend, then look for the darkest hexagons — those are the modal regions of your data. Note whether the high-density cloud is round (no correlation) or elongated (positive or negative correlation), and check the edges for outliers and how quickly density drops off.

What’s the best library for building hexbin plots?

For Python, matplotlib’s plt.hexbin() is the canonical choice and ships with every Anaconda install. For R, the hexbin package combined with ggplot2’s geom_hex() is the standard. For the web, d3-hexbin or Observable Plot’s hexbin transform produce publication-quality results.

// 18 References

References and further reading

Primary sources, official library documentation, and the cartography reference materials cited throughout this guide.

  • The original paper that introduced hexagonal binning as a technique for visualizing large bivariate datasets. Published in the Journal of the American Statistical Association.
    https://www.jstor.org/stable/2289444
  • Encyclopedia entry covering the algorithm, history, and visual encoding of hexbin plots.
    https://en.wikipedia.org/wiki/Hexagonal_binning
  • Official API reference for the Python hexbin helper used in this guide’s code sample. Documents gridsize, bins, mincnt, and the linear / log color scale options.
    https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hexbin.html
  • Tidyverse documentation for the ggplot2 hexagonal binning geometry, including its dependency on the hexbin package.
    https://ggplot2.tidyverse.org/reference/geom_hex.html
  • d3-hexbin (Library)
    Maintained D3 plugin that computes hexagonal bins and provides a hexagon path generator for SVG / Canvas rendering.
    https://github.com/d3/d3-hexbin
  • Higher-level Observable Plot transform for hexagonal binning, with worked examples and color-scale defaults.
    https://observablehq.com/plot/transforms/hexbin
  • Practical tutorial that includes hexbin plots as one of the recommended alternatives when point maps overplot.
    https://academy.datawrapper.de/article/302-what-to-use-instead-of-dot-maps
  • Web Accessibility Initiative guidance on text alternatives, long descriptions, and data tables for complex charts — directly applicable to dense hexbin visualizations.
    https://www.w3.org/WAI/tutorials/images/complex/
  • Catalog of perceptually uniform sequential colormaps designed to remain readable in grayscale and for color-vision-deficient viewers.
    https://www.fabiocrameri.ch/colourmaps/