
Correlation · Intermediate

Scatter Plot

A chart that plots individual data points on two axes to reveal relationships, correlations, and clusters between two continuous variables — the workhorse of exploratory analysis since Herschel sketched the first one in 1833.

// 01 — The chart

What it looks like

Example: Study hours vs. exam score (n = 24 students)

A scatter plot showing a positive correlation between study hours and exam scores. The dashed line shows the linear trend, and one outlier is highlighted.

// 02 — Definition

What is a scatter plot?

A scatter plot (also called a scatter chart, scattergram, or XY plot) displays individual data points on a two-dimensional plane, where one variable is plotted on the horizontal axis and another on the vertical axis. Each dot represents a single observation — one student, one country, one transaction — positioned by its X and Y values. The cloud of dots, taken together, is what answers the analyst’s question: do these two variables move together, and if so how tightly?

The primary purpose of a scatter plot is to reveal the relationship between two continuous variables. When the dots form a pattern — sloping upward, sloping downward, curving, clustering, or scattering randomly — the shape tells you how one variable behaves as the other changes. A tight upward cloud means strong positive correlation; a wide, shapeless cloud means little or no correlation. The slope of any fitted trend line is the average effect; the spread around the line is the noise.

Unlike bar charts (which compare categories) or line charts (which show change over time), scatter plots are uniquely designed to answer: “Does Y change as X changes, and how reliably?” They are the working tool of regression analysis, the screening device for outlier detection, and the visual anchor for almost every introductory statistics course because position on a common axis is the encoding humans read most accurately. Where they break down is at extremes of size: under ten points the pattern is noise, and past several thousand the markers overlap into a blob and you need a hex-bin or density variant.

The price of that simplicity is a single, persistent trap: correlation is not causation. A scatter plot can show that ice cream sales and drowning rates rise together, but it cannot say that ice cream causes drowning. Both rise with summer heat. Every scatter plot is a question (“why do these move together?”) rather than an answer, and the rest of this guide is about building scatters that reward careful questioning instead of inviting the wrong one.

Origin: The scatter plot was first used by English scientist John Frederick W. Herschel in 1833 to analyze the orbits of double stars. Francis Galton popularized it in the 1880s for studying human traits, plotting parents’ against children’s heights — the chart that gave us the word “regression.” Frank Anscombe’s 1973 quartet later cemented the scatter plot as the chart you must look at before trusting any summary statistic.

// 03 — When to use

When a scatter plot is the right call

Reach for a scatter plot whenever the question is about how two continuous variables move together and you have enough observations for a pattern to emerge. Below are the situations where it consistently wins against the alternatives.

✓ Use a scatter plot when…

You want to explore the relationship between two continuous variables
You’re looking for correlations — do values of X and Y tend to move together?
You need to identify outliers or unusual observations that deserve investigation
You have at least ~20 observations so a pattern can emerge above the noise
You’re checking whether a linear regression model fits the data before running one
You want to compare clusters or subgroups by coloring or shaping the markers
You’re building exploratory visuals where individual rows still need to be visible

// 04 — When not to use

When a scatter plot is the wrong call

Scatter plots can technically display many kinds of data, but “technically possible” is not the same as “good idea.” Below are the cases where the scatter plot actively hides information you need to communicate.

× Avoid a scatter plot when…

One axis is categorical (countries, products) — use a bar chart or strip plot instead
You want to show change over time — use a line chart so the temporal sequence is visible
You have very few data points (fewer than ~10) — patterns won’t be statistically meaningful
Both variables are categorical — use a mosaic plot or contingency table
Your data is so dense that points overlap into a single blob — use a hex bin or 2D density plot
You need exact values rather than patterns — a small data table communicates better
You want to show parts of a whole — use a pie, donut, or stacked bar instead
You have one continuous variable and want to show its distribution — use a histogram or density plot

// 05 — Data requirements

What your data needs to look like

Before building the chart, your dataset needs to fit a specific shape. Use this checklist to confirm yours does.

Shape

One row per observation, with two numeric columns for the X and Y coordinates. Optional columns for group (color/shape), size (bubble), and label (annotation).

Minimum rows

10 observations to start, ~20+ for a credible pattern. With fewer, the cloud is just noise.

Maximum rows

~5,000 observations before overplotting forces a switch to alpha blending or a hex bin / density variant.

Required fields

x (required)

number (continuous)

The horizontal coordinate of each marker. Typically the explanatory or predictor variable — the thing you suspect drives the outcome. Must be numeric; categorical X belongs on a strip plot or bar chart instead.

y (required)

number (continuous)

The vertical coordinate of each marker. Typically the outcome or response variable. Must be numeric and ideally measured on a scale where small differences are meaningful.

group

string (optional)

Optional categorical column used to color or shape markers so subgroups can be compared on the same axes. Keep groups to roughly 2–6; past that, marker styles become indistinguishable.

size

number (optional)

Optional positive quantitative column mapped to the marker area. Mapping a third variable to size promotes the chart from scatter plot to bubble chart — use only when the third variable answers a question.

label

string (optional)

Optional row label used to annotate notable points (e.g., country names on a Gapminder-style chart). Annotate two or three points by hand rather than every marker.

Example data

student	hours	score	group
S01	1.5	42	A
S02	2.0	48	A
S03	3.7	62	B
S04	5.5	70	B
S05	7.0	84	C
S06	9.2	95	C

Tip: if your raw data is pre-aggregated (means per group), the individual rows cannot be recovered from the means, so go back to the row-level source before plotting. Tableau and Power BI also aggregate silently by default: turn off Aggregate Measures in Tableau, or add a row identifier to the Details well in Power BI, so each row becomes its own marker.
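The checklist in this section can be turned into a quick pre-flight check. The sketch below is one possible shape for it: check_scatter_data is a hypothetical helper, the column names match the example table above, and the thresholds (10 rows, 5,000 rows, 6 groups) are the rough figures quoted in this section rather than hard limits.

```python
import math

def check_scatter_data(rows, x="hours", y="score", group="group",
                       min_rows=10, max_rows=5000, max_groups=6):
    """Return a list of human-readable problems; an empty list means the shape is fine."""
    problems = []
    for i, row in enumerate(rows):
        for col in (x, y):
            value = row.get(col)
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not math.isfinite(value):
                problems.append(f"row {i}: {col!r} is not a finite number")
    if len(rows) < min_rows:
        problems.append(f"only {len(rows)} rows; aim for at least {min_rows}")
    if len(rows) > max_rows:
        problems.append(f"{len(rows)} rows; consider alpha blending or a hex bin")
    groups = {row[group] for row in rows if row.get(group) is not None}
    if len(groups) > max_groups:
        problems.append(f"{len(groups)} groups; markers get hard to tell apart")
    return problems

# The six rows from the example table above.
rows = [
    {"student": "S01", "hours": 1.5, "score": 42, "group": "A"},
    {"student": "S02", "hours": 2.0, "score": 48, "group": "A"},
    {"student": "S03", "hours": 3.7, "score": 62, "group": "B"},
    {"student": "S04", "hours": 5.5, "score": 70, "group": "B"},
    {"student": "S05", "hours": 7.0, "score": 84, "group": "C"},
    {"student": "S06", "hours": 9.2, "score": 95, "group": "C"},
]
problems = check_scatter_data(rows)
print(problems)
```

The six-row example is flagged as too small, which matches the minimum-rows guidance: it is enough to show the data shape, not enough to trust a pattern.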

// 06 — Anatomy

Parts of a scatter plot

Every scatter plot is built from the same handful of parts. Knowing the names makes it easier to talk about what to keep, what to drop, and what most templates are getting wrong.

A — Y-axis: The vertical axis representing the dependent (outcome) variable

B — X-axis: The horizontal axis representing the independent (predictor) variable

C — Data point: Each dot represents one observation, positioned by its X and Y values

D — Trend line: A fitted line (often linear regression) showing the general direction of the relationship

E — Outlier: A point far from the cloud — may indicate an error, a special case, or an interesting anomaly

// 07 — Step-by-step

Step-by-step: how to build a good scatter plot

A nine-step recipe that works regardless of the tool. Walk through it the first few times and the moves become automatic; skip steps and the chart usually shows it.

1
Pick the question you want the chart to answer
A scatter plot answers “does Y change as X changes, and how tightly?” Write that question down before drawing anything. If your question is about ranking categories, change over time, or share of a whole, switch to a bar, line, or part-to-whole chart now — not after you have built it.
2
Get one row per observation
Each row in your data must represent one observation — one student, one country, one transaction. If your raw data is pre-aggregated (averages per group), un-aggregate it or accept that the chart will show group means, not the underlying spread.
3
Decide which variable goes on which axis
By convention, X is the explanatory variable and Y is the outcome. If you would describe the relationship as “Y depends on X,” put the candidate cause on X. If neither variable obviously depends on the other, the assignment is yours — just stay consistent across related charts.
4
Choose the axis ranges
Unlike a bar chart, scatter plots do not need to start at zero. Crop both axes to the data range plus a small margin so the cloud fills the chart and small differences become visible. Always show the actual numbers on the axis so the reader knows they are looking at a zoomed view.
5
Draw the markers and tame overplotting
If your dataset has fewer than ~500 points, draw plain filled circles. Past 500, lower marker alpha to roughly 0.2–0.4 so overlapping points darken into local density. Past several thousand, switch to a hex bin or 2D density plot — the scatter has stopped being legible.
6
Add a trend line if a relationship is plausible
A linear regression line summarizes the average direction of the cloud. Print the R² value next to it so the reader can judge fit quality. If the cloud curves, fit a LOESS or polynomial smoother instead — a straight line through curved data hides the real pattern.
7
Encode a third variable only when it earns its place
Color, shape, and size each let you add a third variable. Use color for nominal groups (with a colorblind-safe palette), shape for two or three groups, and size for a positive quantitative variable. If the third variable doesn’t change what the reader concludes, leave it out.
8
Annotate the points that tell the story
Label two or three notable observations with their row identifier (a country name, a player, a date). Don’t label everything — a sea of text removes the focus you just created. Outliers, the headline observation, and one or two extremes are enough.
9
Write a takeaway title and ship
“Study hours vs exam score” is a label. “Each extra hour of study correlated with about 6 more points on the exam” is a takeaway. Put the takeaway on the chart and the descriptive label as a subtitle, then verify the chart still works at the size your readers will see it.
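Steps 4 and 5 of the recipe are mechanical enough to sketch in code. The two helpers below are hypothetical names, not part of any plotting library: padded_range crops an axis to the data plus a margin, and marker_plan applies the point-count thresholds from step 5, using ~5,000 as the "several thousand" cutoff quoted in the data requirements.

```python
def padded_range(values, margin=0.05):
    """Step 4: crop an axis to the data range plus a small margin on each side."""
    lo, hi = min(values), max(values)
    pad = (hi - lo) * margin
    return lo - pad, hi + pad

def marker_plan(n_points):
    """Step 5: pick a rendering approach from the point count."""
    if n_points < 500:
        return {"kind": "scatter", "alpha": 1.0}
    if n_points <= 5000:
        return {"kind": "scatter", "alpha": 0.3}  # middle of the 0.2-0.4 band
    return {"kind": "hexbin", "alpha": None}

hours = [1.5, 2.0, 3.7, 5.5, 7.0, 9.2]
print(padded_range(hours))
print(marker_plan(len(hours)))
print(marker_plan(20_000))
```

The thresholds are rules of thumb; marker size and chart dimensions shift where overplotting actually starts, so check the rendered chart rather than relying on the count alone.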

// 08 — Real-world examples

Where you’ll see scatter plots used

Scatter plots show up in three places more than anywhere else: scientific papers, public-policy storytelling, and business analytics. Each context has its own conventions, and they all reward the same fundamentals.

Medicine: Drug dosage vs patient response

Pharmaceutical researchers use scatter plots to visualize the relationship between dosage levels and a measured outcome — blood pressure drop, antibody titer, time to remission. Points cluster tightly when the dose-response relationship is strong; outliers are flagged for chart review because they may be adverse reactions or non-responders. Trial reports almost always include the underlying scatter as well as the fitted curve.

Medical Research

Economics: GDP per capita vs life expectancy

The famous Gapminder visualization, popularized by Hans Rosling, shows countries as dots with GDP per capita on the X-axis and life expectancy on the Y-axis. Marker area encodes population and color encodes continent — a scatter plot that has graduated to a bubble chart. Two centuries of data played as an animation rewrote how lay audiences think about global development.

Economics

Sports: Player salary vs performance

Sports analysts use scatter plots to identify overpaid and underpaid players by plotting salary against a key performance metric (WAR for baseball, expected goals for soccer). Players above the trend line are outperforming their pay; those below are underperforming. The same chart drives front-office trade decisions and the analytics blogs that scrutinize them.

Sports Analytics

Business: Customer acquisition cost vs lifetime value

A SaaS dashboard shows one dot per acquisition channel, with cost per customer on X and lifetime value on Y. A 1:1 reference line splits the chart into healthy channels (above the line) and unprofitable ones (below). Operators glance at the chart, identify the channel that needs attention, and reallocate budget without ever opening the underlying spreadsheet.

Business Analytics
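The 1:1 reference line in the business example is a simple comparison per point. A minimal sketch, with entirely hypothetical channel names and dollar figures:

```python
# Hypothetical channels: (cost to acquire one customer, lifetime value), in USD.
channels = {
    "paid search": (120.0, 340.0),
    "events": (900.0, 650.0),
    "referrals": (40.0, 410.0),
    "display ads": (210.0, 180.0),
}

def side_of_reference_line(cac, ltv):
    """Points above the 1:1 line (ltv > cac) earn back more than they cost."""
    return "healthy" if ltv > cac else "unprofitable"

verdicts = {name: side_of_reference_line(cac, ltv)
            for name, (cac, ltv) in channels.items()}
for name, verdict in verdicts.items():
    print(f"{name}: {verdict}")
```

On the chart itself, the same rule is what the diagonal line communicates visually: with cost on X and value on Y, anything plotted above y = x is in the healthy region.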

// 09 — Variations

Types of scatter plots

The basic scatter plot has several important variants, each suited to a slightly different data situation. The headline rule is the same as ever: pick the variant whose strengths match your question.

Bubble chart

Encodes a third variable through the area of each marker. Use only when the third variable is meaningful and positive.

Connected scatter

Lines connect points in time order, showing how the relationship between two variables evolves over time.

Hex bin / density scatter

Aggregates points into hexagonal cells whose color encodes count. Solves overplotting past several thousand points.

Jitter / strip plot

Adds small random offsets to points to prevent overlap when one axis is categorical or has only a few unique values.
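Jitter is easy to apply by hand when a tool lacks it. The sketch below assumes uniform noise and a hypothetical jitter helper; the width of 0.2 is an illustrative choice and should stay small relative to the gap between neighboring values.

```python
import random

def jitter(values, width=0.2, seed=None):
    """Add uniform noise in [-width/2, +width/2] to each value."""
    rng = random.Random(seed)
    return [v + rng.uniform(-width / 2, width / 2) for v in values]

# Integer scores stack on top of each other; jitter spreads them out slightly.
scores = [3, 3, 3, 4, 4, 5, 5, 5, 5]
jittered = jitter(scores, width=0.2, seed=1)
print([round(v, 2) for v in jittered])
```

Plot the jittered values but compute statistics and tooltips from the originals, since the offsets are display noise rather than data.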

// 10 — Comparisons

Scatter plot vs other chart types

Scatter plots get confused with several other chart types because they all live in or near the Correlation family. The differences matter — picking the wrong one changes what your reader is allowed to conclude.

Scatter plot vs bubble chart

Both place markers in a continuous X–Y plane. A scatter plot encodes two variables; a bubble chart encodes a third by mapping it to marker area. Use the bubble version only when the extra variable answers a question — otherwise the size channel becomes decorative noise.

Scatter plot

One marker per observation. Position encodes two variables; markers are typically the same size. The cleanest way to show a two-variable relationship.

Two variables: X and Y
All markers same size
Easiest to read at a glance

Bubble chart

One marker per observation, but marker area encodes a third quantitative variable. Color often encodes a fourth, categorical variable. Powerful but easy to overload.

Three or four variables encoded at once
Map size to area, never radius
Best with a strong takeaway and few markers
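The "area, never radius" rule comes down to a square root. In this sketch, bubble_radii is a hypothetical helper and the values are illustrative; it scales radii so that a value four times larger gets four times the area, which means only twice the radius.

```python
import math

def bubble_radii(values, max_radius=20.0):
    """Scale radii so that marker AREA, not radius, is proportional to the value."""
    largest = max(values)
    return [max_radius * math.sqrt(v / largest) for v in values]

populations = [10, 40, 160]  # illustrative values; each is 4x the previous
radii = bubble_radii(populations)
print([round(r, 1) for r in radii])  # each radius is 2x the previous, so area is 4x
```

Mapping the value straight to radius would make the largest bubble here 256 times the area of the smallest instead of 16 times, which overstates the difference. Matplotlib's s argument is already an area in points², so value-proportional numbers can be passed to it directly.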

Scatter plot vs hex bin plot

Scatter plots show every observation; hex bin plots aggregate observations into hexagonal cells whose color encodes the count. Use a scatter for hundreds to a few thousand markers; switch to hex bins past that, where overplotting hides the underlying density.

Scatter plot

One mark per row. Outliers and individual points stay visible. Breaks down when markers overlap into a single solid blob.

Best for ~50 to ~5,000 observations
Outliers are obvious
Use alpha blending past ~500 points

Hex bin plot

Markers aggregated into hexagonal cells; cell color encodes the count. Density patterns become legible even with millions of points, but individual rows disappear.

Best past ~5,000 observations
Density patterns become readable
Outliers are smoothed away
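The aggregation behind a binned plot can be sketched without a plotting library. Note the swap: this example uses square cells rather than hexagons because the arithmetic is simpler, and the idea of counting points per cell is the same. The data is synthetic and bin_counts is a hypothetical helper.

```python
import random
from collections import Counter

def bin_counts(xs, ys, cell=1.0):
    """Count points per square cell; the key is the cell's (column, row) index."""
    return Counter((int(x // cell), int(y // cell)) for x, y in zip(xs, ys))

random.seed(0)
xs = [random.gauss(5, 2) for _ in range(20_000)]
ys = [random.gauss(5, 2) for _ in range(20_000)]

counts = bin_counts(xs, ys, cell=1.0)
densest_cell, densest_count = counts.most_common(1)[0]
print(f"{len(counts)} occupied cells; densest is {densest_cell} "
      f"with {densest_count} points")
```

Twenty thousand markers collapse into a few hundred colored cells, which is why density stays readable; the cost, as noted above, is that a single outlier becomes one faint cell.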

Scatter plot vs line chart (correlation vs trend)

A scatter plot answers “how are X and Y related?”; a line chart answers “how does Y change over time (or another ordered variable)?” The visual cue is whether marks are connected. Connect only when the X-axis has a meaningful order.

Scatter plot

Unconnected markers. The X axis is any continuous variable. The reader’s eye searches for a cloud shape: rising, falling, curved, or random.

X is any continuous variable
No implied order between points
Highlights correlation and outliers

Line chart

Markers (or just a polyline) connected in X order. The X axis almost always represents time. The reader’s eye follows trend, slope, and inflection points.

X is ordered (usually time)
Connecting line implies sequence
Highlights trend and change

Scatter plot vs scatter plot matrix

A scatter plot shows one X–Y pair; a scatter plot matrix (also called a SPLOM or pair plot) shows every pairwise combination of several variables in a small-multiples grid. Reach for a SPLOM when you have three or more numeric variables and want to compare relationships at once.

Scatter plot

Single panel. One pair of variables. Best when you already know which two variables you want to compare.

One X, one Y
Easiest to read at large sizes
Best for a single hypothesis

Scatter plot matrix

Grid of small scatter plots, one per pair of variables. Useful for exploratory analysis when you don’t yet know which pairs are interesting.

Compares 3–10 variables at once
Diagonal often shows distributions
Each panel is small — use for screening

// 11 — Common mistakes

Mistakes to watch out for

Almost every misleading scatter plot in the wild fails the same handful of ways. If you only memorize seven rules, make them these.

Overplotting (too many overlapping points)

When thousands of markers pile on top of each other, the chart becomes a solid blob and you cannot see the underlying density. The fix is the cheapest one in visualization: drop marker alpha to roughly 0.2–0.4. Past several thousand points, switch to a hex-bin or 2D density plot, or aggregate to one marker per group with whiskers.

Assuming correlation means causation

This is the most dangerous misinterpretation a scatter plot invites. A chart showing that ice cream sales and drownings rise together does not mean ice cream causes drowning — both rise with summer heat. Always look for confounding variables, write “associated with” instead of “causes” in your title, and resist the temptation to sell a story the chart cannot support.

Ignoring scale and aspect ratio

Stretching one axis can make a weak correlation look strong, and compressing it can make a strong one look weak. Keep aspect ratios proportional to the data range, never crop one axis to exaggerate a trend, and never start one axis at zero just to make the points look closer together. If the relationship is real it will survive an honest aspect ratio.

Fitting a straight line to curved data

If the cloud of points curves — say, in a U or an exponential — a linear regression line is the wrong summary. The reader will see a slope and conclude “small positive effect” when the truth is “strong non-linear effect.” Always look at the cloud shape first, and reach for a LOESS smoother or a polynomial fit when the data is curved.

Using too few data points

With fewer than ten observations, almost any pattern you see could be random noise. Even a beautiful upward trend through five points is one data collection bug away from disappearing. Wait for at least ~20 observations before drawing conclusions, and report sample size in the title or caption so the reader can calibrate trust.

Hiding subgroups inside one cloud

If the data contains two distinct groups (men/women, treatment/control, before/after), a single-color scatter can hide a Simpson’s paradox where the overall trend goes one way and each subgroup goes the other. Color or shape by the subgroup variable whenever there is a meaningful split, and always check at least one breakdown before trusting an aggregate trend line.

Encoding too many variables at once

A scatter that uses position for X and Y, color for one variable, shape for a second, and size for a third asks the reader to juggle five channels at once. The chart becomes a puzzle. If you find yourself reaching for the fourth channel, split the data into small multiples instead — the reader’s working memory will thank you.

// 12 — Accessibility

Accessibility checklist

Run through this list before publishing. The chart should still communicate its message to readers using assistive technology, color-blind users, keyboard navigation, and reduced-motion settings.

✓ Provide a text alternative for the chart
WCAG 1.1.1
Add an accessible name (alt text or aria-label) that summarizes the takeaway, not the chart type. “Scatter plot of two variables” is weak; “Study hours and exam scores show a positive correlation: each extra hour of study lined up with about six more exam points (R² = 0.71, n = 24)” is strong.
✓ Do not rely on color alone to encode groups
WCAG 1.4.1
If markers are color-coded by category, also vary their shape (circle, triangle, square, plus) so colorblind readers and grayscale printers can still tell groups apart. Roughly 1 in 12 men and 1 in 200 women have some form of color-vision deficiency.
✓ Marker contrast meets WCAG AA
WCAG 1.4.11 (graphical objects) and 1.4.3 (text)
Marker fill against the chart background should reach at least 3:1 contrast for graphical objects, and any text labels (titles, axes, point annotations) should reach 4.5:1 for body text or 3:1 for large text.
✓ Expose the underlying data
WCAG 1.3.1
Place the raw X, Y (and group, if present) values in a screen-reader-friendly table next to or beneath the chart. For dense scatters, also expose a per-group summary table so screen-reader users have a navigable equivalent of what sighted readers see at a glance.
✓ Provide a trend line and R² text alternative
WCAG 1.1.1
When a regression or smoother is drawn, always print the slope and R² (or the correlation coefficient) in plain text near the chart. A reader who can’t see the line still needs to know how strong the relationship is.
✓ Focusable points with visible focus rings
WCAG 2.4.7
If the scatter is interactive, every marker should be reachable with the Tab key in a sensible order, expose its X, Y, and identifier through aria-label, and gain a visible focus ring (outline, glow, or enlarged stroke) when the keyboard lands on it. Tooltips must appear on focus, not only on hover.
✓ Respect prefers-reduced-motion
WCAG 2.3.3
If markers fade or fly into position, gate the animation behind a prefers-reduced-motion: no-preference media query so motion-sensitive readers see the final state immediately. Zoom and pan transitions should also be skippable.
✓ Make the chart resizable and zoomable
WCAG 1.4.4
Let the chart container scale with the viewport and stay legible at 200% browser zoom. Avoid baking the SVG to a fixed pixel size; use a responsive viewBox so axis ticks stay readable on narrow screens.
✓ Label both axes with units
WCAG 3.3.2
“$1.2k” is fine in display, but the axis title or a nearby caption must state the unit (“customer lifetime value, USD”) so a reader who can’t see the chart still understands what is being measured on each axis.
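The contrast item in the checklist above can be verified numerically. The sketch below follows the WCAG 2.x relative-luminance and contrast-ratio formulas; the function names are illustrative, the marker color is the fill used in this guide's Matplotlib example, and the white background is an assumption.

```python
def relative_luminance(hex_color):
    """WCAG 2.x relative luminance of an sRGB color such as '#c94a2e'."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
              for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted((relative_luminance(color_a),
                              relative_luminance(color_b)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

marker = "#c94a2e"      # marker fill used in this guide's Matplotlib example
background = "#ffffff"  # assumed white chart background
ratio = contrast_ratio(marker, background)
print(f"{ratio:.2f}:1", "passes 3:1" if ratio >= 3 else "fails 3:1")
```

This checks the opaque color only. Lowering marker alpha to fight overplotting blends the fill toward the background and reduces the effective contrast, so re-check with the blended color if isolated points need to stay visible.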

Microsoft Excel

Spreadsheet — ~3 min

01 Place X values in column A and Y values in column B, with a header row. Both columns must be numeric.
02 Highlight both columns including the headers.
03 Open the Insert tab, choose Charts, then XY (Scatter), and pick the first preset (markers only, no connecting line).
04 Right-click any marker and choose Format Data Series; under Marker Options set the size to ~5 and the fill transparency to 60–80% if points overlap.
05 Right-click a marker again and choose Add Trendline → Linear; tick ‘Display Equation on chart’ and ‘Display R-squared value on chart’.
06 Edit each axis title to include units (‘Study hours’, ‘Exam score / 100’) and replace the default chart title with the takeaway sentence.
07 If you need a third variable, split the data into one series per category (Select Data → Add) so each category gets its own color; ‘Vary colors by point’ colors every marker differently rather than by group.

Tip: avoid the ‘Scatter with Smooth Lines’ preset. Connecting the markers implies a temporal order that doesn’t exist for most scatter data.

Google Sheets

Spreadsheet — ~3 min

01 Lay out your data with X values in the first column and Y values in the second column, with headers.
02 Select the range, then choose Insert → Chart.
03 In the Chart editor on the right, set Chart type to Scatter chart.
04 Open Customize → Series and set the Point size to 5 and the Point opacity to 0.4 if your markers overlap.
05 Under Customize → Series, tick the Trendline checkbox, choose Linear, and select ‘Use equation’ in the Label dropdown.
06 Edit the chart title under Customize → Chart & axis titles → Chart title and write a takeaway, not just a label.
07 For a third variable, switch to a Bubble chart type and map the third numeric column to bubble size.

Sheets has no built-in jitter. If many of your X values share the same value (e.g., integer scores), add tiny random noise (=A2 + RAND()*0.2 - 0.1) to a helper column and plot that.

Python (Matplotlib)

Code — ~5 min

01 Install Matplotlib with pip install matplotlib (and numpy if it isn’t already in your environment).
02 Import matplotlib.pyplot as plt and numpy as np, and load your X and Y arrays.
03 Call plt.scatter(x, y, alpha=0.4, s=30); alpha tames overplotting and s sets marker area in points².
04 Add a regression line by computing slope, intercept = np.polyfit(x, y, 1) and plotting plt.plot(x, slope*x + intercept).
05 Print the R² value with plt.text() so the reader can judge how tightly the cloud follows the line.
06 Add plt.xlabel(), plt.ylabel() with units, and plt.title() with the takeaway sentence; call plt.tight_layout() before plt.show() or plt.savefig().
07 For a third variable, pass c= an array of numeric group codes plus cmap= to color by group. For per-group shapes, call plt.scatter() once per group with a different marker=, since marker= accepts a single style per call.

Use plt.gca().spines[['top','right']].set_visible(False) to drop the top and right borders for a cleaner, publication-ready look.

R (ggplot2)

Code — ~5 min

01 Install ggplot2 with install.packages('ggplot2') and load it with library(ggplot2).
02 Build a data frame (called d here) with at least an x and a y column. Add a group column if you want to color or shape by category.
03 Pass the data frame to ggplot(d, aes(x = x, y = y)) and add geom_point(alpha = 0.4) for the markers.
04 Add geom_smooth(method = 'lm', se = TRUE) for a linear trend line with a confidence ribbon, or method = 'loess' for a smoother.
05 Annotate the R² by computing it with summary(lm(y ~ x, data = d))$r.squared and adding annotate('text', …).
06 Apply labs() with title, x, and y arguments (include units), then theme_minimal(base_size = 12) for a clean default.
07 For a third variable, map color or shape inside aes(): aes(x = x, y = y, color = group, shape = group). Pick a colorblind-friendly palette explicitly, for example scale_color_brewer(palette = 'Dark2') or scale_color_viridis_d().

ggplot2’s geom_jitter() is the easiest way to recover stacked points when the X axis has only a few unique values — swap it in for geom_point().

JavaScript (D3.js)

Code — ~10 min

01 Install D3 (npm i d3) or include the CDN script tag in your HTML.
02 Create an SVG container and set its viewBox, plus a margin object.
03 Build linear scales for both axes with d3.scaleLinear().domain(d3.extent(data, d => d.x)).nice(); don’t force the domain to start at zero.
04 Bind your data with selectAll('circle').data(data).join('circle') and set cx, cy from the scales, with a small radius (3–5 px) and fill-opacity ~0.5.
05 Render the axes with d3.axisBottom() and d3.axisLeft(); add label <text> elements that include units.
06 Compute and draw a regression line with d3.regressionLinear() from the d3-regression plugin, and print the resulting R² in a label.
07 Make markers focusable: set tabindex=0 and role=img, give each circle an aria-label like ‘x: 4 hours, y: 72 points’, and add a focus ring via circle:focus { stroke: black; stroke-width: 2px; }.

If you don’t need full control, Observable Plot, Plotly, ECharts, or Vega-Lite all give you a working scatter plot with tooltips and zoom in fewer lines.

Tableau

BI — ~4 min

01 Connect to your data and drag the first numeric measure to the Columns shelf and the second to the Rows shelf. Tableau will aggregate by default.
02 Disaggregate by toggling Analysis → Aggregate Measures off, so each row in the data becomes its own marker rather than a single mean.
03 Change the Marks card to Circle and reduce opacity to ~50% under the Color property to handle overplotting.
04 Drag a third dimension to the Color property of the Marks card to color markers by category, and to Shape if you want shapes too.
05 Add a trend line by going to Analytics → Trend Line → Linear; right-click the line and choose Describe Trend Line to see R² and p-values.
06 Right-click each axis and choose Edit Axis to crop the range to the data and add a unit-aware title.
07 Use the Highlight tool or annotations to label two or three notable observations rather than every marker.

Tableau’s default ‘SUM aggregation’ is the single most common scatter mistake — it collapses every row into a single point. Always disaggregate first.

Power BI

BI — ~4 min

01 In Power BI Desktop, open the Visualizations pane and choose the Scatter chart visual.
02 Drag your X numeric measure to the X Axis well and your Y numeric measure to the Y Axis well.
03 Drag a row identifier (an ID, country, or product key) to the Details well so each marker represents one observation rather than the aggregated total.
04 Open the Format pane, expand Markers, and lower the marker fill transparency to ~50% to handle overplotting.
05 Under Format → Analytics, add a Trend line and tick the option to display the equation and R² on the chart.
06 Drag a categorical column to the Legend well to color markers by category, and to the Size well only if a meaningful third quantitative variable exists.
07 Edit each axis under Format → X axis / Y axis to set Start and End to the data range plus a small margin, and label both with units.

If you forget the Details well, Power BI silently aggregates every row into one marker per legend group — just like Tableau. Always check the marker count.

// 16 — Code examples

Working code in the most common stacks

A runnable Python snippet that produces the example chart: a scatter of 24 students’ study hours vs exam scores, with a linear trend line and the R² printed in the corner. Copy, paste, and replace the data with yours.

scatter_plot.py

import matplotlib.pyplot as plt
import numpy as np

# Study hours vs exam score for 24 students.
hours  = np.array([1.5, 2.0, 2.2, 2.8, 3.0, 3.3, 3.7, 4.1, 4.4, 4.6,
                   5.0, 5.2, 5.5, 5.8, 6.1, 6.4, 6.7, 7.0, 7.3, 7.6,
                   8.0, 8.4, 8.8, 9.2])
scores = np.array([42, 48, 51, 55, 53, 58, 62, 60, 66, 64,
                   68, 71, 70, 74, 76, 79, 81, 84, 82, 87,
                   89, 92, 91, 95])

fig, ax = plt.subplots(figsize=(8, 4.8))
ax.scatter(hours, scores, s=42, alpha=0.55,
           color="#c94a2e", edgecolor="#c94a2e", linewidth=0.8)

# Linear regression line + R-squared.
slope, intercept = np.polyfit(hours, scores, 1)
xs = np.linspace(hours.min(), hours.max(), 100)
ax.plot(xs, slope * xs + intercept,
        color="#1a1a18", linewidth=1.2, linestyle="--", alpha=0.7)

r2 = np.corrcoef(hours, scores)[0, 1] ** 2
ax.text(0.04, 0.92, f"R² = {r2:.2f}", transform=ax.transAxes,
        fontsize=11, color="#1a1a18")

# Build the takeaway from the fitted slope so the title stays true when the data changes.
ax.set_title(f"Each extra hour of study lined up with about {slope:.1f} more exam points",
             loc="left", fontsize=13)
ax.set_xlabel("Study hours per week")
ax.set_ylabel("Exam score (out of 100)")
ax.spines[["top", "right"]].set_visible(False)
ax.set_xlim(hours.min() - 0.5, hours.max() + 0.5)
ax.set_ylim(scores.min() - 5, scores.max() + 5)

plt.tight_layout()
plt.savefig("scatter_plot.png", dpi=200)
plt.show()

$ python scatter_plot.py

// 17 — FAQs

Frequently asked questions

What is a scatter plot?

A scatter plot (also called a scatter chart, scattergram, or XY plot) displays individual data points on a two-dimensional plane, where one variable is plotted on the horizontal axis and another on the vertical axis. Each dot represents a single observation or measurement, and the cloud of dots reveals whether the two variables are related, how strong the relationship is, and where the unusual cases sit.

When should you use a scatter plot?

Use a scatter plot when you want to explore the relationship between two continuous variables — for example, study hours versus exam score, or advertising spend versus sales. Scatter plots are also the right choice when you need to spot outliers, check whether a linear regression model is appropriate, or compare clusters across colored or shaped subgroups.

When should you avoid a scatter plot?

Avoid a scatter plot when one axis is categorical (countries, products, teams); use a bar chart, dot plot, or strip plot instead. It is also a poor fit when you have fewer than ten observations (too few for a pattern to be statistically meaningful), when both variables are categorical (use a mosaic plot), or when the data is so dense that points overlap into a single blob, in which case reach for a hex bin or 2D density plot.

How is a scatter plot different from a line chart?

A line chart connects observations in time order to emphasize trend; a scatter plot leaves the points unconnected to emphasize relationship. Use a line chart when the X-axis is time and the order matters; use a scatter plot when X is any continuous variable and you care about whether Y goes up, down, or stays flat as X changes.

How is a scatter plot different from a bubble chart?

A scatter plot encodes two variables — one per axis. A bubble chart adds a third variable encoded as the area (not the radius) of each marker, and often a fourth variable as color. Use a scatter plot when you only have two continuous variables to compare; switch to a bubble chart when a meaningful third quantitative variable would otherwise be hidden.
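
As a minimal sketch of the area-versus-radius point: the helper below scales a positive third variable so that marker area, not radius, is proportional to the value. The function name `bubble_areas`, its `max_area` parameter, and the sample values are hypothetical and chosen only for illustration. Matplotlib's `scatter` reads its `s` argument as marker area in points squared, so the result can be passed to `s` directly.

```python
import numpy as np

def bubble_areas(values, max_area=600.0):
    """Scale positive values so that marker AREA is proportional to the value."""
    values = np.asarray(values, dtype=float)
    if np.any(values <= 0):
        raise ValueError("bubble sizes need strictly positive values")
    return values / values.max() * max_area

# Made-up third variable (say, class size) for four points.
class_size = np.array([10, 20, 40, 80])
areas = bubble_areas(class_size)

# The implied radius grows with the square root of the value, not the value itself.
radii = np.sqrt(areas / np.pi)

# With Matplotlib: ax.scatter(x, y, s=areas), because `s` is an area in points^2.
print("area ratios:  ", areas / areas[0])
print("radius ratios:", radii / radii[0])
```

A value eight times larger gets eight times the area but only about 2.8 times the radius. Scaling the radius by the value instead would make the largest bubble look far bigger than the data warrants.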

How is a scatter plot different from a hex bin plot?

Both show the relationship between two continuous variables, but a scatter plot draws one mark per row while a hex bin plot draws one hexagonal cell per local group of rows and color-codes the count. When you have more than a few thousand points and the marks overlap into a solid blob, swap the scatter for a hex bin so density becomes legible again.

What does correlation mean in a scatter plot?

Correlation is summarized by a coefficient between −1 and +1 that measures how tightly the two variables move together. A value near +1 means that as X rises, Y reliably rises; a value near −1 means that as X rises, Y reliably falls; a value near 0 means there is no consistent linear relationship, although a curved one may still be present. Most scatter plots are read with the Pearson correlation, which measures linear association; Spearman’s rank correlation measures monotonic association and is more robust to outliers.
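
To make the Pearson versus Spearman contrast concrete, here is a small sketch using only NumPy. The data is invented for illustration, and the `spearman` helper is a simplified version that assumes no tied values (a statistics library would handle ties properly).

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks.
    Simplification: argsort-based ranks assume there are no tied values."""
    def ranks(values):
        return np.argsort(np.argsort(values))
    return pearson(ranks(x), ranks(y))

# Illustrative data: a perfectly linear relationship ...
x = np.arange(1.0, 11.0)
y = 2 * x
print(f"clean:        Pearson {pearson(x, y):.2f}, Spearman {spearman(x, y):.2f}")

# ... then a single wild outlier replaces the fifth observation.
y_out = y.copy()
y_out[4] = 200.0
print(f"with outlier: Pearson {pearson(x, y_out):.2f}, Spearman {spearman(x, y_out):.2f}")
```

On this toy data the one outlier drags the Pearson coefficient down to roughly 0.04, while the rank-based Spearman coefficient stays around 0.82, because a rank can only move so far no matter how extreme the value is.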

Does correlation prove causation?

No. A scatter plot can only show that two variables move together; it cannot show why. Ice cream sales and drownings rise together every summer, but ice cream doesn’t cause drowning — hot weather drives both. Treat every scatter plot as a question (“why do these move together?”) rather than an answer.
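
The ice cream example can be sketched as a simulation. Everything below is synthetic: the coefficients, noise levels, and sample size are assumptions picked for illustration, not real sales or drowning figures. The idea is to let temperature drive both series, then check how much correlation is left once a straight-line fit on temperature has been removed from each (a simple partial correlation).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic data (made up): temperature drives both series; neither drives the other.
temperature = rng.normal(25, 5, n)
ice_cream = 10 * temperature + rng.normal(0, 20, n)
drownings = 0.3 * temperature + rng.normal(0, 1, n)

def residuals(y, x):
    """What is left of y after removing a straight-line fit on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

raw_r = np.corrcoef(ice_cream, drownings)[0, 1]

# Partial correlation: correlate the leftovers once temperature is accounted for.
partial_r = np.corrcoef(residuals(ice_cream, temperature),
                        residuals(drownings, temperature))[0, 1]

print(f"raw r = {raw_r:.2f}; controlling for temperature, r = {partial_r:.2f}")
```

The raw scatter of the two series would show a strong upward cloud, yet the relationship should all but vanish once the shared driver is removed. In real data the confounder is rarely this obvious, which is why the plot is a prompt for investigation rather than evidence of cause.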

How many data points do you need for a scatter plot?

About 20 is a sensible minimum for visually trusting a pattern. With fewer than 10, almost any pattern you see could be random noise. Past several thousand, individual marks stop being legible and you should switch to alpha blending, jittering, or a density-based variant such as hex bin or 2D KDE.
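
A rough way to see why tiny samples are untrustworthy is to simulate it. The sketch below draws two completely independent variables many times and counts how often they still show a correlation stronger than 0.5 by chance. The function name, the 0.5 threshold, and the trial count are illustrative assumptions, not a formal power analysis.

```python
import numpy as np

def share_of_spurious_patterns(n_points, threshold=0.5, trials=2000, seed=0):
    """Fraction of trials in which two INDEPENDENT variables show |r| > threshold."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        x = rng.normal(size=n_points)
        y = rng.normal(size=n_points)  # generated independently of x
        if abs(np.corrcoef(x, y)[0, 1]) > threshold:
            hits += 1
    return hits / trials

small = share_of_spurious_patterns(8)
larger = share_of_spurious_patterns(24)
print(f"n=8:  {small:.1%} of pure-noise clouds look correlated")
print(f"n=24: {larger:.1%} of pure-noise clouds look correlated")
```

With eight points a noticeable share of pure-noise samples should clear the threshold; with 24 points, the size of the example dataset in this guide, the share should fall to a small fraction of that.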

Should the axes start at zero in a scatter plot?

Not necessarily. Unlike a bar chart, a scatter plot encodes value with position, not length, so a non-zero baseline does not exaggerate ratios. Crop the axes to the data range so the cloud fills the chart — but always keep the visible range honest and label it clearly so readers know they are looking at a zoomed view.

What is Anscombe’s quartet?

Anscombe’s quartet is a set of four datasets, published by statistician Frank Anscombe in 1973, that share nearly identical means, variances, correlation, and regression line — yet look completely different when plotted. It is the canonical reason why every analyst should plot the data, not just compute the summary statistics, before reporting a result.
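
The claim is easy to check numerically. The sketch below uses the quartet's values as they are commonly reproduced from Anscombe's 1973 paper and computes the mean, fitted line, and correlation for each dataset with NumPy. The variable names are our own.

```python
import numpy as np

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by datasets I to III
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

summary = {}
for name, (xs, ys) in quartet.items():
    x = np.array(xs, dtype=float)
    y = np.array(ys, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # least-squares straight line
    r = np.corrcoef(x, y)[0, 1]
    summary[name] = {"mean_x": x.mean(), "mean_y": y.mean(),
                     "slope": slope, "intercept": intercept, "r": r}
    print(f"{name:>3}: mean x {x.mean():.2f}, mean y {y.mean():.2f}, "
          f"fit y = {intercept:.2f} + {slope:.3f}x, r = {r:.3f}")
```

All four rows should print nearly the same numbers. Only a scatter plot of each dataset shows that one is a clean linear cloud, one is a curve, one is a line with a single outlier, and one is a vertical stack with one high-leverage point.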

How do you handle overplotting in a scatter plot?

Three approaches solve overplotting in order of increasing effort: lower the marker alpha (e.g., 0.2) so density emerges from overlap, jitter near-identical values to reveal stacked points, or aggregate into a hex bin or 2D density plot when marker count exceeds ~5,000. Sampling a representative subset is a fourth option for exploratory work.
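
The three approaches can be sketched side by side in Matplotlib. The dataset is synthetic (20,000 made-up points with rounded X values so that markers stack), and the `jitter` helper, jitter width, grid size, and output file name are illustrative assumptions rather than recommended settings.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data: x is rounded to whole numbers, so many markers share a column.
x = rng.normal(50, 10, 20_000).round()
y = 0.8 * x + rng.normal(0, 8, x.size)

def jitter(values, width, rng):
    """Nudge each value by uniform noise in [-width/2, +width/2]."""
    return values + rng.uniform(-width / 2, width / 2, size=values.shape)

fig, (ax_alpha, ax_jitter, ax_hex) = plt.subplots(1, 3, figsize=(12, 4), sharey=True)

# 1. Lower the marker alpha so density emerges from overlap.
ax_alpha.scatter(x, y, s=8, alpha=0.2, linewidths=0)
ax_alpha.set_title("Alpha 0.2")

# 2. Jitter the rounded values to reveal stacked points.
ax_jitter.scatter(jitter(x, 1.0, rng), y, s=8, alpha=0.2, linewidths=0)
ax_jitter.set_title("Alpha + jitter")

# 3. Aggregate into hexagonal cells colored by count.
hb = ax_hex.hexbin(x, y, gridsize=40, cmap="viridis", mincnt=1)
ax_hex.set_title("Hex bin")
fig.colorbar(hb, ax=ax_hex, label="Points per cell")

fig.tight_layout()
fig.savefig("overplotting.png", dpi=150)
```

Keep the jitter width no larger than the rounding step (1.0 here) so that nudged points never cross into a neighboring value's territory, and say in the caption that jitter was applied.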

Can a scatter plot encode a third variable?

Yes — by mapping a third variable to the marker color, shape, or size. Color works for categorical or sequential variables, shape works for small numbers of categorical groups, and size (i.e., a bubble chart) works for a positive quantitative variable. Only encode a third variable when its presence answers a question your reader actually has.
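
For a categorical third variable, one hedged sketch is to draw each group in its own `scatter` call with a distinct color and marker shape, so the groups remain distinguishable without color. The two class sections, their colors, and the generated numbers are all made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical groups: each gets its own color AND its own marker shape.
groups = {
    "Section A": {"marker": "o", "color": "#c94a2e", "offset": 0.0},
    "Section B": {"marker": "s", "color": "#1f6f8b", "offset": 8.0},
}

fig, ax = plt.subplots(figsize=(6, 4))
for name, style in groups.items():
    # Made-up study hours and exam scores for 30 students per group.
    hours = rng.uniform(1, 9, 30)
    scores = 40 + 5 * hours + style["offset"] + rng.normal(0, 4, 30)
    ax.scatter(hours, scores, s=36, alpha=0.7,
               marker=style["marker"], color=style["color"], label=name)

ax.set_xlabel("Study hours per week")
ax.set_ylabel("Exam score (out of 100)")
ax.legend(title="Group", frameon=False)
fig.tight_layout()
fig.savefig("scatter_groups.png", dpi=150)
```

Pairing color with shape is redundant on purpose: readers who cannot separate the two hues, or who see the chart in grayscale, can still tell circles from squares.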

What category of chart is a scatter plot?

The scatter plot belongs to the Correlation family of charts. Charts in that family (bubble chart, hex bin plot, 2D density plot, scatter plot matrix) are all designed to show how two or more continuous variables relate, so they often work as alternatives when one doesn’t quite fit your data.

What’s the best library for building scatter plots in code?

For static, publication-quality scatter plots, Matplotlib (Python) and ggplot2 (R) are the standard choices. For interactive web scatter plots, D3.js gives the most control, while Plotly, Observable Plot, ECharts, and Vega-Lite get you to a working chart in fewer lines and ship reasonable defaults for tooltips and zoom.

// 18 — References

References and further reading

Primary sources, reference texts, and the official documentation for the libraries and tools referenced throughout this guide.

Wikipedia, Scatter plot [Reference]
Encyclopedia entry covering the history, variants, and visual encoding of scatter plots, including the Herschel and Galton origin stories. A solid neutral starting point with citations.
https://en.wikipedia.org/wiki/Scatter_plot

Francis Galton, Regression Towards Mediocrity in Hereditary Stature (1886) [Primary source]
Galton’s original paper plotting parents’ against children’s heights: the first published scatter plot used for regression analysis. Hosted by galton.org.
https://galton.org/essays/1880-1889/galton-1886-jaigi-regression-stature.pdf

F. J. Anscombe, Graphs in Statistical Analysis (1973) [Primary source]
The paper that introduced Anscombe’s quartet: four datasets with identical summary statistics but radically different scatter plots. The canonical reason to plot before you summarize.
https://www.jstor.org/stable/2682899

Edward Tufte, The Visual Display of Quantitative Information [Book]
Tufte’s foundational text on data graphics. The chapters on data-ink ratio and small multiples explain why plain scatters beat decorated ones, and motivate scatter plot matrices.
https://www.edwardtufte.com/book/the-visual-display-of-quantitative-information/

Datawrapper Academy, What to consider when creating scatterplots [Tutorial]
Hands-on tutorial with real published examples. Especially useful for handling overplotting, choosing axis ranges, and adding meaningful annotations.
https://academy.datawrapper.de/article/255-what-to-consider-when-creating-a-scatterplot

Financial Times, Visual Vocabulary [Reference]
Open-source poster categorizing chart types by intent. Scatter plots, bubble charts, and connected scatters all sit in the Correlation family alongside hex bins and density plots.
https://github.com/Financial-Times/chart-doctor/tree/main/visual-vocabulary

WAI, Complex Images: Charts and Graphs [Accessibility]
Web Accessibility Initiative guidance on making charts accessible: text alternatives, long descriptions, and data tables. Use this when building the accessibility checklist for a scatter plot.
https://www.w3.org/WAI/tutorials/images/complex/

Matplotlib documentation, plt.scatter [Docs]
Official API reference for the Python scatter plot helper used in this guide’s code sample, including marker, alpha, and colormap arguments.
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.scatter.html

ggplot2 reference, geom_point [Docs]
Tidyverse documentation for the ggplot2 point geometry. Covers aesthetics, jittering, and combining with geom_smooth() for trend lines.
https://ggplot2.tidyverse.org/reference/geom_point.html

D3.js, Scatterplot [Docs]
Maintained Observable notebook from the D3 team that mirrors the JavaScript code sample in this guide, with worked examples for axes, voronoi tooltips, and zoom.
https://observablehq.com/@d3/scatterplot

Cleveland & McGill, Graphical Perception (1984) [Primary source]
Landmark experimental study on which visual encodings let humans estimate quantities most accurately. Position along a common axis (the scatter plot’s native encoding) ranks at the top.
https://www.jstor.org/stable/2288400
