Scatter Plot
A chart that plots individual data points on two axes to reveal relationships, correlations, and clusters between two continuous variables — the workhorse of exploratory analysis since Herschel sketched the first one in 1833.
// 01 — The chart
What it looks like
A scatter plot showing a positive correlation between study hours and exam scores. The dashed line shows the linear trend, and one outlier is highlighted.
// 02 — Definition
What is a scatter plot?
A scatter plot (also called a scatter chart, scattergram, or XY plot) displays individual data points on a two-dimensional plane, where one variable is plotted on the horizontal axis and another on the vertical axis. Each dot represents a single observation — one student, one country, one transaction — positioned by its X and Y values. The cloud of dots, taken together, is what answers the analyst’s question: do these two variables move together, and if so how tightly?
The primary purpose of a scatter plot is to reveal the relationship between two continuous variables. When the dots form a pattern (sloping upward, sloping downward, curving, clustering, or scattering randomly), the shape tells you how one variable behaves as the other changes. A tight upward cloud means strong positive correlation; a wide, shapeless cloud means little or no correlation. The slope of a fitted trend line is the average change in Y per unit of X; the spread around the line is the variation the line does not explain.
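To make "slope" and "spread" concrete, here is a minimal sketch that fits a least-squares line using only the Python standard library. The data are the 24 study-hours and exam-score pairs from this guide's Matplotlib example; the helper name `fit_line` is an assumption made for illustration, not part of any library.

```python
import math

# Study hours and exam scores for 24 students (same data as the Matplotlib example).
hours = [1.5, 2.0, 2.2, 2.8, 3.0, 3.3, 3.7, 4.1, 4.4, 4.6,
         5.0, 5.2, 5.5, 5.8, 6.1, 6.4, 6.7, 7.0, 7.3, 7.6,
         8.0, 8.4, 8.8, 9.2]
scores = [42, 48, 51, 55, 53, 58, 62, 60, 66, 64,
          68, 71, 70, 74, 76, 79, 81, 84, 82, 87,
          89, 92, 91, 95]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residuals are the vertical distances from each point to the line.
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    spread = math.sqrt(ss_res / (n - 2))  # residual standard error
    return slope, intercept, r_squared, spread


slope, intercept, r_squared, spread = fit_line(hours, scores)
print(f"slope={slope:.2f}  intercept={intercept:.2f}  "
      f"R^2={r_squared:.3f}  spread={spread:.2f}")
```

The slope is in units of Y per unit of X (exam points per study hour here), and the spread is in units of Y, which makes both easy to state in a caption.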
Unlike bar charts (which compare categories) or line charts (which show change over time), scatter plots are uniquely designed to answer: “Does Y change as X changes, and how reliably?” They are the working tool of regression analysis, the screening device for outlier detection, and the visual anchor for almost every introductory statistics course because position on a common axis is the encoding humans read most accurately. Where they break down is at extremes of size: under ten points the pattern is noise, and past several thousand the markers overlap into a blob and you need a hex-bin or density variant.
The price of that simplicity is a single, persistent trap: correlation is not causation. A scatter plot can show that ice cream sales and drowning rates rise together, but it cannot say that ice cream causes drowning. Both rise with summer heat. Every scatter plot is a question (“why do these move together?”) rather than an answer, and the rest of this guide is about building scatters that reward careful questioning instead of inviting the wrong one.
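The confounding story can be sketched with simulated data. Everything below is synthetic and assumed for illustration: a made-up daily temperature drives both a made-up ice cream series and a made-up drowning series, and neither series influences the other. The helpers `pearson` and `residuals` are hypothetical names.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residuals(y, x):
    """What is left of y after removing its linear dependence on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


rng = random.Random(0)
# Synthetic data: temperature is the common cause of both series.
temperature = [rng.uniform(10, 35) for _ in range(200)]
ice_cream = [20 * t + rng.gauss(0, 40) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 1.0) for t in temperature]

raw = pearson(ice_cream, drownings)
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(drownings, temperature))
print(f"raw correlation: {raw:.2f}")
print(f"after removing temperature: {adjusted:.2f}")
```

In this simulation the raw correlation is strong while the adjusted one sits near zero, which is the pattern a confounder produces. Real data rarely separates this cleanly, so treat the adjustment as a screening step rather than proof.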
Origin: The scatter plot was first used by English scientist John Frederick W. Herschel in 1833 to analyze the orbits of double stars. Francis Galton popularized it in the 1880s for studying human traits, plotting parents’ against children’s heights — the chart that gave us the word “regression.” Frank Anscombe’s 1973 quartet later cemented the scatter plot as the chart you must look at before trusting any summary statistic.
// 03 — When to use
When a scatter plot is the right call
Reach for a scatter plot whenever the question is about how two continuous variables move together and you have enough observations for a pattern to emerge. Below are the situations where it consistently wins against the alternatives.
- You want to explore the relationship between two continuous variables
- You’re looking for correlations — do values of X and Y tend to move together?
- You need to identify outliers or unusual observations that deserve investigation
- You have at least ~20 observations so a pattern can emerge above the noise
- You’re checking whether a linear regression model fits the data before running one
- You want to compare clusters or subgroups by coloring or shaping the markers
- You’re building exploratory visuals where individual rows still need to be visible
// 04 — When not to use
When a scatter plot is the wrong call
Scatter plots can technically display many kinds of data, but “technically possible” is not the same as “good idea.” Below are the cases where the scatter plot actively hides information you need to communicate.
- One axis is categorical (countries, products) — use a bar chart or strip plot instead
- You want to show change over time — use a line chart so the temporal sequence is visible
- You have very few data points (fewer than ~10) — patterns won’t be statistically meaningful
- Both variables are categorical — use a mosaic plot or contingency table
- Your data is so dense that points overlap into a single blob — use a hex bin or 2D density plot
- You need exact values rather than patterns — a small data table communicates better
- You want to show parts of a whole — use a pie, donut, or stacked bar instead
- You have one continuous variable and want to show its distribution — use a histogram or density plot
// 05 — Data requirements
What your data needs to look like
Before building the chart, your dataset needs to fit a specific shape. Use this checklist to confirm yours does.
Shape
One row per observation, with two numeric columns for the X and Y coordinates. Optional columns for group (color/shape), size (bubble), and label (annotation).
Minimum rows
10 observations to start, ~20+ for a credible pattern. With fewer, the cloud is just noise.
Maximum rows
~5,000 observations. Past that, overplotting forces a switch to a hex bin / density variant; alpha blending helps well before then, from roughly 500 points.
X
The horizontal coordinate of each marker. Typically the explanatory or predictor variable: the thing you suspect drives the outcome. Must be numeric; categorical X belongs on a strip plot or bar chart instead.
Y
The vertical coordinate of each marker. Typically the outcome or response variable. Must be numeric and ideally measured on a scale where small differences are meaningful.
Group
Optional categorical column used to color or shape markers so subgroups can be compared on the same axes. Keep groups to roughly 2–6; past that, marker styles become indistinguishable.
Size
Optional positive quantitative column mapped to the marker area. Mapping a third variable to size promotes the chart from scatter plot to bubble chart, so use it only when the third variable answers a question.
Label
Optional row label used to annotate notable points (e.g., country names on a Gapminder-style chart). Annotate two or three points by hand rather than every marker.
| student | hours | score | group |
|---|---|---|---|
| S01 | 1.5 | 42 | A |
| S02 | 2.0 | 48 | A |
| S03 | 3.7 | 62 | B |
| S04 | 5.5 | 70 | B |
| S05 | 7.0 | 84 | C |
| S06 | 9.2 | 95 | C |
Tip: if your raw data is pre-aggregated (means per group), go back to the row-level source so you have one row per observation before plotting; averages cannot be un-aggregated after the fact. Tools like Tableau and Power BI silently aggregate by default, so turn off Aggregate Measures (Tableau) or add a row identifier to the Details well (Power BI) so each row becomes its own marker.
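The checklist above can be sketched as a small validation function. This is an illustration under assumptions: `check_scatter_data` is a hypothetical helper, the rows are the six sample rows from the table, and the thresholds (10 rows, 6 groups) are the ones quoted in this section.

```python
rows = [
    {"student": "S01", "hours": 1.5, "score": 42, "group": "A"},
    {"student": "S02", "hours": 2.0, "score": 48, "group": "A"},
    {"student": "S03", "hours": 3.7, "score": 62, "group": "B"},
    {"student": "S04", "hours": 5.5, "score": 70, "group": "B"},
    {"student": "S05", "hours": 7.0, "score": 84, "group": "C"},
    {"student": "S06", "hours": 9.2, "score": 95, "group": "C"},
]


def is_number(value):
    """True for int/float values, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_scatter_data(rows, x, y, group=None, min_rows=10, max_groups=6):
    """Return a list of human-readable problems (empty when the data fits)."""
    problems = []
    for column in (x, y):
        if not all(is_number(row[column]) for row in rows):
            problems.append(f"column '{column}' must be numeric")
    if len(rows) < min_rows:
        problems.append(f"only {len(rows)} rows; need at least {min_rows}")
    if group is not None:
        n_groups = len({row[group] for row in rows})
        if n_groups > max_groups:
            problems.append(f"{n_groups} groups; keep to {max_groups} or fewer")
    return problems


problems = check_scatter_data(rows, x="hours", y="score", group="group")
for problem in problems:
    print(problem)
```

Run against the six sample rows, the only complaint is the row count, which matches the guidance that a sample this small is for showing the shape of the data, not for drawing conclusions.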
// 06 — Anatomy
Parts of a scatter plot
Every scatter plot is built from the same handful of parts. Knowing the names makes it easier to talk about what to keep, what to drop, and what most templates are getting wrong.
// 07 — Step-by-step
Step-by-step: how to build a good scatter plot
A nine-step recipe that works regardless of the tool. Walk through it the first few times and the moves become automatic; skip steps and the chart usually shows it.
- 1. Pick the question you want the chart to answer
  A scatter plot answers “does Y change as X changes, and how tightly?” Write that question down before drawing anything. If your question is about ranking categories, change over time, or share of a whole, switch to a bar, line, or part-to-whole chart now, not after you have built it.
- 2. Get one row per observation
  Each row in your data must represent one observation: one student, one country, one transaction. If your raw data is pre-aggregated (averages per group), go back to the row-level source or accept that the chart will show group means, not the underlying spread.
- 3. Decide which variable goes on which axis
  By convention, X is the explanatory variable and Y is the outcome. If you would describe the relationship as “Y depends on X,” put the candidate cause on X. If neither variable obviously depends on the other, the assignment is yours; just stay consistent across related charts.
- 4. Choose the axis ranges
  Unlike a bar chart, scatter plots do not need to start at zero. Crop both axes to the data range plus a small margin so the cloud fills the chart and small differences become visible. Always show the actual numbers on the axis so the reader knows they are looking at a zoomed view.
- 5. Draw the markers and tame overplotting
  If your dataset has fewer than ~500 points, draw plain filled circles. Past 500, lower marker alpha to roughly 0.2–0.4 so overlapping points darken into local density. Past several thousand, switch to a hex bin or 2D density plot, because the scatter has stopped being legible.
- 6. Add a trend line if a relationship is plausible
  A linear regression line summarizes the average direction of the cloud. Print the R² value next to it so the reader can judge fit quality. If the cloud curves, fit a LOESS or polynomial smoother instead; a straight line through curved data hides the real pattern.
- 7. Encode a third variable only when it earns its place
  Color, shape, and size each let you add a third variable. Use color for nominal groups (with a colorblind-safe palette), shape for two or three groups, and size for a positive quantitative variable. If the third variable doesn’t change what the reader concludes, leave it out.
- 8. Annotate the points that tell the story
  Label two or three notable observations with their row identifier (a country name, a player, a date). Don’t label everything: a sea of text removes the focus you just created. Outliers, the headline observation, and one or two extremes are enough.
- 9. Write a takeaway title and ship
  “Study hours vs exam score” is a label. “Each extra hour of study correlated with about 6 more points on the exam” is a takeaway. Put the takeaway on the chart and the descriptive label as a subtitle, then verify the chart still works at the size your readers will see it.
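Steps 4 and 5 of the recipe reduce to two small rules, sketched below. Both function names are hypothetical, the 5% margin is an assumed default, and the 5,000-point cutoff is taken from the data requirements section as a stand-in for "several thousand".

```python
def padded_limits(values, margin=0.05):
    """Step 4: crop an axis to the data range plus a small margin."""
    lo, hi = min(values), max(values)
    # Fall back to a fixed pad when every value is identical.
    pad = (hi - lo) * margin or 1.0
    return lo - pad, hi + pad


def marker_alpha(n_points):
    """Step 5: pick marker opacity from the number of points.

    Returns None when the dataset is large enough that a hex bin or
    2D density plot is the better choice.
    """
    if n_points < 500:
        return 1.0   # plain filled circles
    if n_points <= 5000:
        return 0.3   # inside the suggested 0.2-0.4 band
    return None      # stop drawing individual markers


print(padded_limits([0, 10]))
for n in (100, 2000, 10000):
    print(n, marker_alpha(n))
```

The returned limits can be passed to whatever your tool uses for axis ranges (for example Matplotlib's `ax.set_xlim(*padded_limits(x))`), and the alpha value to its opacity setting.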
// 08 — Real-world examples
Where you’ll see scatter plots used
Scatter plots show up in three places more than anywhere else: scientific papers, public-policy storytelling, and business analytics. Each context has its own conventions, and they all reward the same fundamentals.
Medicine: Drug dosage vs patient response
Pharmaceutical researchers use scatter plots to visualize the relationship between dosage levels and a measured outcome — blood pressure drop, antibody titer, time to remission. Points cluster tightly when the dose-response relationship is strong; outliers are flagged for chart review because they may be adverse reactions or non-responders. Trial reports almost always include the underlying scatter as well as the fitted curve.
Medical Research
Economics: GDP per capita vs life expectancy
The famous Gapminder visualization, popularized by Hans Rosling, shows countries as dots with GDP per capita on the X-axis and life expectancy on the Y-axis. Marker area encodes population and color encodes continent — a scatter plot that has graduated to a bubble chart. Two centuries of data played as an animation rewrote how lay audiences think about global development.
Economics
Sports: Player salary vs performance
Sports analysts use scatter plots to identify overpaid and underpaid players by plotting salary on the X-axis against a key performance metric on the Y-axis (WAR for baseball, expected goals for soccer). Players above the trend line are outperforming their pay; those below are underperforming. The same chart drives front-office trade decisions and the analytics blogs that scrutinize them.
Sports Analytics
Business: Customer acquisition cost vs lifetime value
A SaaS dashboard showing one dot per acquisition channel, with cost per customer on X and lifetime value on Y. A 1:1 reference line splits the chart into healthy channels (above the line) and unprofitable ones (below). Operators glance at the chart, identify the channel that needs attention, and reallocate budget without ever opening the underlying spreadsheet.
Business Analytics
// 09 — Variations
Types of scatter plots
The basic scatter plot has several important variants, each suited to a slightly different data situation. The headline rule is the same as ever: pick the variant whose strengths match your question.
Bubble chart
Encodes a third variable through the area of each marker. Use only when the third variable is meaningful and positive.
Connected scatter
Lines connect points in time order, showing how the relationship between two variables evolves over time.
Hex bin / density scatter
Aggregates points into hexagonal cells whose color encodes count. Solves overplotting past several thousand points.
Jitter / strip plot
Adds small random offsets to points to prevent overlap when one axis is categorical or has only a few unique values.
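Jitter is simple enough to sketch in a few lines. The `jitter` helper and the sample scores below are assumptions for illustration; the offset is drawn uniformly from plus or minus half the chosen width.

```python
import random


def jitter(values, width=0.2, seed=None):
    """Return a copy of values with a small uniform random offset added.

    Each offset lies within +/- width / 2, so with width=0.2 an integer
    score of 4 ends up somewhere between 3.9 and 4.1.
    """
    rng = random.Random(seed)
    return [v + rng.uniform(-width / 2, width / 2) for v in values]


# Hypothetical integer scores that would otherwise stack on top of each other.
scores = [3, 3, 3, 4, 4, 5, 5, 5, 5]
jittered = jitter(scores, width=0.2, seed=42)
print([round(v, 3) for v in jittered])
```

Plot the jittered values but keep the originals for any statistics: jitter is a display trick, so the trend line and R² should still be computed from the unmodified column.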
// 10 — Comparisons
Scatter plot vs other chart types
Scatter plots get confused with several other chart types because they all live in or near the Correlation family. The differences matter — picking the wrong one changes what your reader is allowed to conclude.
Scatter plot vs bubble chart
Both place markers in a continuous X–Y plane. A scatter plot encodes two variables; a bubble chart encodes a third by mapping it to marker area. Use the bubble version only when the extra variable answers a question — otherwise the size channel becomes decorative noise.
Scatter plot
One marker per observation. Position encodes two variables; markers are typically the same size. The cleanest way to show a two-variable relationship.
- Two variables: X and Y
- All markers same size
- Easiest to read at a glance
Bubble chart
One marker per observation, but marker area encodes a third quantitative variable. Color often encodes a fourth, categorical variable. Powerful but easy to overload.
- Three or four variables encoded at once
- Map size to area, never radius
- Best with a strong takeaway and few markers
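The "area, never radius" rule is easy to get wrong in code, so here is a minimal sketch. The function names, the `max_area` default, and the population values are all hypothetical.

```python
import math


def bubble_areas(values, max_area=400.0):
    """Marker areas proportional to the values.

    The result can be passed to an area-based size parameter such as
    Matplotlib's s= (which is measured in points squared).
    """
    largest = max(values)
    return [max_area * v / largest for v in values]


def radius_from_area(area):
    """Radius for tools that size circles by radius (e.g. an SVG r attribute)."""
    return math.sqrt(area / math.pi)


populations = [10, 40, 160]  # hypothetical third variable
areas = bubble_areas(populations)
radii = [radius_from_area(a) for a in areas]

print(areas)
print([round(r, 2) for r in radii])
```

A value 16 times larger gets 16 times the area but only 4 times the radius. Mapping the value straight to radius would instead inflate the area 256-fold and badly exaggerate the difference.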
Scatter plot vs hex bin plot
Scatter plots show every observation; hex bin plots aggregate observations into hexagonal cells whose color encodes the count. Use a scatter for hundreds to a few thousand markers; switch to hex bins past that, where overplotting hides the underlying density.
Scatter plot
One mark per row. Outliers and individual points stay visible. Breaks down when markers overlap into a single solid blob.
- Best for ~50 to ~5,000 observations
- Outliers are obvious
- Use alpha blending past ~500 points
Hex bin plot
Markers aggregated into hexagonal cells; cell color encodes the count. Density patterns become legible even with millions of points, but individual rows disappear.
- Best past ~5,000 observations
- Density patterns become readable
- Outliers are smoothed away
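The aggregation idea behind a hex bin plot can be sketched without a plotting library. To stay short, this version uses square cells instead of hexagons, which is a simplification rather than true hex binning; the data are synthetic and `bin_counts` is a hypothetical helper. In practice a library routine such as Matplotlib's `Axes.hexbin` does the hexagonal version for you.

```python
import math
import random
from collections import Counter


def bin_counts(xs, ys, cell=0.5):
    """Count points per square cell; keys are (column, row) cell indices."""
    counts = Counter()
    for x, y in zip(xs, ys):
        counts[(math.floor(x / cell), math.floor(y / cell))] += 1
    return counts


rng = random.Random(7)
# Synthetic cloud of 20,000 points centred on the origin.
xs = [rng.gauss(0, 1) for _ in range(20_000)]
ys = [rng.gauss(0, 1) for _ in range(20_000)]

counts = bin_counts(xs, ys, cell=0.5)
densest_cell, densest_count = counts.most_common(1)[0]
print(f"{len(xs)} points collapsed into {len(counts)} cells")
print(f"densest cell {densest_cell} holds {densest_count} points")
```

Each cell's count is what the color scale encodes. Twenty thousand overlapping markers become a few hundred colored cells, at the cost of losing the individual rows.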
Scatter plot vs line chart (correlation vs trend)
A scatter plot answers “how are X and Y related?”; a line chart answers “how does Y change over time (or another ordered variable)?” The visual cue is whether marks are connected. Connect only when the X-axis has a meaningful order.
Scatter plot
Unconnected markers. The X axis is any continuous variable. The reader’s eye searches for a cloud shape: rising, falling, curved, or random.
- X is any continuous variable
- No implied order between points
- Highlights correlation and outliers
Line chart
Markers (or just a polyline) connected in X order. The X axis almost always represents time. The reader’s eye follows trend, slope, and inflection points.
- X is ordered (usually time)
- Connecting line implies sequence
- Highlights trend and change
Scatter plot vs scatter plot matrix
A scatter plot shows one X–Y pair; a scatter plot matrix (also called a SPLOM or pair plot) shows every pairwise combination of several variables in a small-multiples grid. Reach for a SPLOM when you have three or more numeric variables and want to compare relationships at once.
Scatter plot
Single panel. One pair of variables. Best when you already know which two variables you want to compare.
- One X, one Y
- Easiest to read at large sizes
- Best for a single hypothesis
Scatter plot matrix
Grid of small scatter plots, one per pair of variables. Useful for exploratory analysis when you don’t yet know which pairs are interesting.
- Compares 3–10 variables at once
- Diagonal often shows distributions
- Each panel is small — use for screening
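The screening role of a scatter plot matrix has a numeric counterpart: compute the correlation for every pair of columns and look at the strongest first. In the sketch below, `hours` and `score` come from the sample table in this guide, while the `absences` column is invented purely for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


columns = {
    "hours": [1.5, 2.0, 3.7, 5.5, 7.0, 9.2],
    "score": [42, 48, 62, 70, 84, 95],
    "absences": [4, 1, 3, 0, 2, 1],  # hypothetical third variable
}

# One entry per panel of the matrix, strongest relationship first.
ranked = sorted(
    ((pair, pearson(columns[pair[0]], columns[pair[1]]))
     for pair in combinations(columns, 2)),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for (a, b), r in ranked:
    print(f"{a:>8} vs {b:<8} r = {r:+.2f}")
```

Treat the ranking as a pointer to which panels to inspect, not a replacement for looking at them: a single coefficient cannot tell a clean line from a curve or an outlier-driven fluke.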
// 11 — Common mistakes
Mistakes to watch out for
Almost every misleading scatter plot in the wild fails the same handful of ways. If you only memorize seven rules, make them these.
Overplotting (too many overlapping points)
When thousands of markers pile on top of each other, the chart becomes a solid blob and you cannot see the underlying density. The fix is the cheapest one in visualization: drop marker alpha to roughly 0.2–0.4. Past several thousand points, switch to a hex-bin or 2D density plot, or aggregate to one marker per group with whiskers.
Assuming correlation means causation
This is the most dangerous misinterpretation a scatter plot invites. A chart showing that ice cream sales and drownings rise together does not mean ice cream causes drowning — both rise with summer heat. Always look for confounding variables, write “associated with” instead of “causes” in your title, and resist the temptation to sell a story the chart cannot support.
Ignoring scale and aspect ratio
Stretching one axis can make a weak correlation look strong, and compressing it can make a strong one look weak. Keep aspect ratios proportional to the data range, never crop one axis to exaggerate a trend, and never start one axis at zero just to make the points look closer together. If the relationship is real it will survive an honest aspect ratio.
Fitting a straight line to curved data
If the cloud of points curves — say, in a U or an exponential — a linear regression line is the wrong summary. The reader will see a slope and conclude “small positive effect” when the truth is “strong non-linear effect.” Always look at the cloud shape first, and reach for a LOESS smoother or a polynomial fit when the data is curved.
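A deliberately extreme sketch of this failure: a perfect U-shape on which a straight line explains nothing. The data are constructed for illustration, and regressing Y on X² stands in here for a proper polynomial or LOESS fit.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Constructed U-shaped data: y depends on x exactly, but not linearly.
x = list(range(-5, 6))
y = [xi ** 2 for xi in x]

# For a one-predictor linear fit, R^2 is the squared Pearson correlation.
r2_straight = pearson(x, y) ** 2
r2_curved = pearson([xi ** 2 for xi in x], y) ** 2

print(f"straight line: R^2 = {r2_straight:.2f}")
print(f"line in x^2:   R^2 = {r2_curved:.2f}")
```

The relationship is as strong as it can be, yet the straight-line R² is zero, which is why the shape of the cloud has to be checked before the summary number is trusted.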
Using too few data points
With fewer than ten observations, almost any pattern you see could be random noise. Even a beautiful upward trend through five points is one data collection bug away from disappearing. Wait for at least ~20 observations before drawing conclusions, and report sample size in the title or caption so the reader can calibrate trust.
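A quick simulation shows how often pure noise looks like a pattern. The sketch below draws X and Y independently, so any correlation is accidental, and counts how often |r| reaches 0.6. The threshold, trial count, seed, and function names are arbitrary choices for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def share_of_strong_correlations(n_points, trials=2000, threshold=0.6, seed=1):
    """Fraction of pure-noise samples whose |r| reaches the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(n_points)]
        y = [rng.gauss(0, 1) for _ in range(n_points)]  # independent of x
        if abs(pearson(x, y)) >= threshold:
            hits += 1
    return hits / trials


share_5 = share_of_strong_correlations(5)
share_20 = share_of_strong_correlations(20)
print(f"n = 5:  {share_5:.1%} of noise-only samples look strongly correlated")
print(f"n = 20: {share_20:.1%}")
```

With five points a "strong" correlation turns up by chance in a sizeable share of samples; with twenty it becomes rare. That gap is the reason for the ~20 observation guideline.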
Hiding subgroups inside one cloud
If the data contains two distinct groups (men/women, treatment/control, before/after), a single-color scatter can hide a Simpson’s paradox where the overall trend goes one way and each subgroup goes the other. Color or shape by the subgroup variable whenever there is a meaningful split, and always check at least one breakdown before trusting an aggregate trend line.
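Simpson’s paradox is easier to believe once you have seen it in numbers. The two groups below are constructed for illustration: within each group Y falls as X rises, yet the pooled trend line slopes upward. The `slope` helper is a hypothetical name.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Constructed data: each group trends downward on its own.
groups = {
    "A": ([1, 2, 3, 4], [10, 9, 8, 7]),
    "B": ([6, 7, 8, 9], [20, 19, 18, 17]),
}

pooled_x = [v for xs, _ in groups.values() for v in xs]
pooled_y = [v for _, ys in groups.values() for v in ys]

for name, (xs, ys) in groups.items():
    print(f"group {name}: slope = {slope(xs, ys):+.2f}")
print(f"pooled:  slope = {slope(pooled_x, pooled_y):+.2f}")
```

The pooled slope is positive only because group B sits higher and further right than group A. Coloring the markers by group makes that visible immediately.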
Encoding too many variables at once
A scatter that uses position for X and Y, color for one variable, shape for a second, and size for a third asks the reader to juggle five channels at once. The chart becomes a puzzle. If you find yourself reaching for the fourth channel, split the data into small multiples instead — the reader’s working memory will thank you.
// 12 — Accessibility
Accessibility checklist
Run through this list before publishing. The chart should still communicate its message to readers using assistive technology, color-blind users, keyboard navigation, and reduced-motion settings.
- ✓ Provide a text alternative for the chart (WCAG 1.1.1)
  Add an accessible name (alt text or aria-label) that summarizes the takeaway, not the chart type. “Scatter plot of two variables” is weak; “Study hours and exam scores show a positive correlation: each extra hour of study lined up with about six more exam points (R² = 0.71, n = 24)” is strong.
- ✓ Do not rely on color alone to encode groups (WCAG 1.4.1)
  If markers are color-coded by category, also vary their shape (circle, triangle, square, plus) so colorblind readers and grayscale printers can still tell groups apart. Roughly 1 in 12 men and 1 in 200 women have some form of color-vision deficiency.
- ✓ Marker contrast meets WCAG AA (WCAG 1.4.3, 1.4.11)
  Marker fill against the chart background should reach at least 3:1 contrast for graphical objects (WCAG 1.4.11), and any text labels (titles, axes, point annotations) should reach 4.5:1 for body text or 3:1 for large text (WCAG 1.4.3).
- ✓ Expose the underlying data (WCAG 1.3.1)
  Place the raw X, Y (and group, if present) values in a screen-reader-friendly table next to or beneath the chart. For dense scatters, also expose a per-group summary table so screen-reader users have a navigable equivalent of what sighted readers see at a glance.
- ✓ Provide a trend line and R² text alternative (WCAG 1.1.1)
  When a regression or smoother is drawn, always print the slope and R² (or the correlation coefficient) in plain text near the chart. A reader who can’t see the line still needs to know how strong the relationship is.
- ✓ Focusable points with visible focus rings (WCAG 2.4.7)
  If the scatter is interactive, every marker should be reachable with the Tab key in a sensible order, expose its X, Y, and identifier through aria-label, and gain a visible focus ring (outline, glow, or enlarged stroke) when the keyboard lands on it. Tooltips must appear on focus, not only on hover.
- ✓ Respect prefers-reduced-motion (WCAG 2.3.3)
  If markers fade or fly into position, gate the animation behind a prefers-reduced-motion: no-preference media query so motion-sensitive readers see the final state immediately. Zoom and pan transitions should also be skippable.
- ✓ Make the chart resizable and zoomable (WCAG 1.4.4)
  Let the chart container scale with the viewport and stay legible at 200% browser zoom. Avoid baking the SVG to a fixed pixel size; use a responsive viewBox so axis ticks stay readable on narrow screens.
- ✓ Label both axes with units (WCAG 3.3.2)
  “$1.2k” is fine in display, but the axis title or a nearby caption must state the unit (“customer lifetime value, USD”) so a reader who can’t see the chart still understands what is being measured on each axis.
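The contrast thresholds in the checklist can be checked in code. The sketch below follows the relative-luminance and contrast-ratio formulas published with WCAG 2.x. The function names are hypothetical, and the marker color `#c94a2e` is the one used in this guide's Matplotlib example, tested against an assumed white background.

```python
def relative_luminance(hex_color):
    """Relative luminance of an sRGB color such as '#c94a2e' (WCAG 2.x formula)."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve before weighting the channels.
    r, g, b = (
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    )
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


ratio = contrast_ratio("#c94a2e", "#ffffff")
print(f"marker vs background: {ratio:.2f}:1")
print("meets 3:1 for graphical objects:", ratio >= 3)
```

Note that alpha blending lightens markers against a white background, so run the check on the blended color a reader actually sees, not only on the nominal fill.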
// 13 — Best practices
Design and craft tips
The mistakes section above tells you what to avoid. The list below is the positive version: the small set of habits that separate a good scatter plot from a passable one.
Do: Use alpha blending when points overlap
Don’t: Connect the dots with a line
Do: Show a trend line and its R²
Don’t: Force the axes to start at zero
Do: Encode subgroups with both color and shape
Don’t: Imply causation in the title
Do: Annotate two or three notable points
Don’t: Use a bubble chart by accident
// 15 — Tool instructions
How to build it in your tool of choice
Every major analysis tool ships a scatter plot. The recipes below get you to a clean, alpha-blended scatter plot with a sensible trend line in each of the most common platforms.
Microsoft Excel
Spreadsheet — ~3 min
- 01 Place X values in column A and Y values in column B, with a header row. Both columns must be numeric.
- 02 Highlight both columns including the headers.
- 03 Open the Insert tab, choose Charts, then XY (Scatter), and pick the first preset (markers only, no connecting line).
- 04 Right-click any marker and choose Format Data Series; under Marker Options set the size to ~5 and the fill transparency to 60–80% if points overlap.
- 05 Right-click a marker again and choose Add Trendline → Linear; tick ‘Display Equation on chart’ and ‘Display R-squared value on chart’.
- 06 Edit each axis title to include units (‘Study hours’, ‘Exam score / 100’) and replace the default chart title with the takeaway sentence.
- 07 If you need a third variable, color markers by category by adding each category as its own data series (Select Data → Add). ‘Vary colors by point’ gives every marker a different color rather than one color per category.
Tip: avoid the ‘Scatter with Smooth Lines’ preset. Connecting the markers implies a temporal order that doesn’t exist for most scatter data.
Google Sheets
Spreadsheet — ~3 min
- 01 Lay out your data with X values in the first column and Y values in the second column, with headers.
- 02 Select the range, then choose Insert → Chart.
- 03 In the Chart editor on the right, set Chart type to Scatter chart.
- 04 Open Customize → Series and set the Point size to 5 and the Point opacity to 0.4 if your markers overlap.
- 05 Under Customize → Series, tick the Trendline checkbox, choose Linear, and select ‘Use equation’ in the Label dropdown.
- 06 Edit the chart title under Customize → Chart & axis titles → Chart title and write a takeaway, not just a label.
- 07 For a third variable, switch to a Bubble chart type and map the third numeric column to bubble size.
Sheets has no built-in jitter. If many of your X values share the same value (e.g., integer scores), add tiny random noise (=A2 + RAND()*0.2 - 0.1) to a helper column and plot that.
Python (Matplotlib)
Code — ~5 min
- 01 Install Matplotlib with pip install matplotlib (and numpy if it isn’t already in your environment).
- 02 Import matplotlib.pyplot as plt and numpy as np, and load your X and Y arrays.
- 03 Call plt.scatter(x, y, alpha=0.4, s=30): alpha tames overplotting and s sets marker area in points².
- 04 Add a regression line by computing slope, intercept = np.polyfit(x, y, 1) and plotting plt.plot(x, slope*x + intercept).
- 05 Print the R² value with plt.text() so the reader can judge how tightly the cloud follows the line.
- 06 Add plt.xlabel(), plt.ylabel() with units, and plt.title() with the takeaway sentence; call plt.tight_layout() before plt.show() or plt.savefig().
- 07 For a third variable, pass c= an array of numeric group codes plus a colormap to color by group. For different shapes, call plt.scatter() once per group with its own marker=, because marker takes a single style per call.
Use plt.gca().spines[['top','right']].set_visible(False) (or ax.spines[...] if you already hold an Axes object) to drop the top and right border for a cleaner, publication-ready look.
R (ggplot2)
Code — ~5 min
- 01 Install ggplot2 with install.packages('ggplot2') and load it with library(ggplot2).
- 02 Build a data frame (called d here) with at least an x and a y column. Add a group column if you want to color or shape by category.
- 03 Pass the data frame to ggplot(d, aes(x = x, y = y)) and add geom_point(alpha = 0.4) for the markers.
- 04 Add geom_smooth(method = 'lm', se = TRUE) for a linear trend line with a confidence ribbon, or method = 'loess' for a smoother.
- 05 Annotate the R² by computing it with summary(lm(y ~ x, data = d))$r.squared and adding annotate('text', …).
- 06 Apply labs() with title, x, and y arguments (include units), then theme_minimal(base_size = 12) for a clean default.
- 07 For a third variable, map color or shape inside aes(): aes(x = x, y = y, color = group, shape = group). Use scale_color_brewer(palette = 'Dark2') or another ColorBrewer palette flagged as colorblind-safe; not every Brewer palette is.
ggplot2’s geom_jitter() is the easiest way to recover stacked points when the X axis has only a few unique values — swap it in for geom_point().
JavaScript (D3.js)
Code — ~10 min
- 01 Install D3 (npm i d3) or include the CDN script tag in your HTML.
- 02 Create an SVG container and set its viewBox, plus a margin object.
- 03 Build linear scales for both axes with d3.scaleLinear().domain(d3.extent(data, d => d.x)).nice(); don’t force the domain to start at zero.
- 04 Bind your data with selectAll('circle').data(data).join('circle') and set cx, cy from the scales, with a small radius (3–5 px) and fill-opacity ~0.5.
- 05 Render the axes with d3.axisBottom() and d3.axisLeft(); add label <text> elements that include units.
- 06 Compute and draw a regression line with d3.regressionLinear() from the d3-regression plugin, and print the resulting R² in a label.
- 07 Make markers focusable: set tabindex=0, give each circle an aria-label like ‘x: 4 hours, y: 72 points’, and add a focus ring via :focus { stroke: black; stroke-width: 2px; }.
If you don’t need full control, Observable Plot, Plotly, ECharts, or Vega-Lite all give you a working scatter plot with tooltips and zoom in fewer lines.
Tableau
BI — ~4 min
- 01 Connect to your data and drag the first numeric measure to the Columns shelf and the second to the Rows shelf. Tableau will aggregate by default.
- 02 Disaggregate by toggling Analysis → Aggregate Measures off, so each row in the data becomes its own marker rather than a single mean.
- 03 Change the Marks card to Circle and reduce opacity to ~50% under the Color property to handle overplotting.
- 04 Drag a third dimension to the Color property of the Marks card to color markers by category, and to Shape if you want shapes too.
- 05 Add a trend line by going to Analytics → Trend Line → Linear; right-click the line and choose Describe Trend Line to see R² and p-values.
- 06 Right-click each axis and choose Edit Axis to crop the range to the data and add a unit-aware title.
- 07 Use the Highlight tool or annotations to label two or three notable observations rather than every marker.
Tableau’s default ‘SUM aggregation’ is the single most common scatter mistake — it collapses every row into a single point. Always disaggregate first.
Power BI
BI — ~4 min
- 01 In Power BI Desktop, open the Visualizations pane and choose the Scatter chart visual.
- 02 Drag your X numeric measure to the X Axis well and your Y numeric measure to the Y Axis well.
- 03 Drag a row identifier (an ID, country, or product key) to the Details well so each marker represents one observation rather than the aggregated total.
- 04 Open the Format pane, expand Markers, and set the marker transparency to ~50% to handle overplotting.
- 05 Under Format → Analytics, add a Trend line and tick the option to display the equation and R² on the chart.
- 06 Drag a categorical column to the Legend well to color markers by category, and to the Size well only if a meaningful third quantitative variable exists.
- 07 Edit each axis under Format → X axis / Y axis to set Start and End to the data range plus a small margin, and label both with units.
If you forget the Details well, Power BI silently aggregates every row into one marker per legend group — just like Tableau. Always check the marker count.
// 16 — Code examples
Working code in the most common stacks
A runnable Python (Matplotlib) snippet that produces a scatter of 24 students’ study hours vs exam scores, with a linear trend line and the R² printed in the corner. Copy, paste, and replace the data with yours.
import matplotlib.pyplot as plt
import numpy as np

# Study hours vs exam score for 24 students.
hours = np.array([1.5, 2.0, 2.2, 2.8, 3.0, 3.3, 3.7, 4.1, 4.4, 4.6,
                  5.0, 5.2, 5.5, 5.8, 6.1, 6.4, 6.7, 7.0, 7.3, 7.6,
                  8.0, 8.4, 8.8, 9.2])
scores = np.array([42, 48, 51, 55, 53, 58, 62, 60, 66, 64,
                   68, 71, 70, 74, 76, 79, 81, 84, 82, 87,
                   89, 92, 91, 95])

fig, ax = plt.subplots(figsize=(8, 4.8))

# Semi-transparent markers so overlapping points darken into local density.
ax.scatter(hours, scores, s=42, alpha=0.55,
           color="#c94a2e", edgecolor="#c94a2e", linewidth=0.8)

# Linear regression line + R-squared.
slope, intercept = np.polyfit(hours, scores, 1)
xs = np.linspace(hours.min(), hours.max(), 100)
ax.plot(xs, slope * xs + intercept,
        color="#1a1a18", linewidth=1.2, linestyle="--", alpha=0.7)

# For a one-predictor linear fit, R² is the squared Pearson correlation.
r2 = np.corrcoef(hours, scores)[0, 1] ** 2
ax.text(0.04, 0.92, f"R² = {r2:.2f}", transform=ax.transAxes,
        fontsize=11, color="#1a1a18")

# Build the takeaway from the fitted slope so it stays true when the data changes.
ax.set_title(f"Each extra hour of study lined up with about {slope:.0f} more exam points",
             loc="left", fontsize=13)
ax.set_xlabel("Study hours per week")
ax.set_ylabel("Exam score (out of 100)")
ax.spines[["top", "right"]].set_visible(False)

# Crop both axes to the data range plus a small margin (no forced zero).
ax.set_xlim(hours.min() - 0.5, hours.max() + 0.5)
ax.set_ylim(scores.min() - 5, scores.max() + 5)

plt.tight_layout()
plt.savefig("scatter_plot.png", dpi=200)
plt.show()
// 17 — FAQs
Frequently asked questions
What is a scatter plot?
A scatter plot (also called a scatter chart, scattergram, or XY plot) displays individual data points on a two-dimensional plane, where one variable is plotted on the horizontal axis and another on the vertical axis. Each dot represents a single observation or measurement, and the cloud of dots reveals whether the two variables are related, how strong the relationship is, and where the unusual cases sit.
When should you use a scatter plot?
Use a scatter plot when you want to explore the relationship between two continuous variables — for example, study hours versus exam score, or advertising spend versus sales. Scatter plots are also the right choice when you need to spot outliers, check whether a linear regression model is appropriate, or compare clusters across colored or shaped subgroups.
When should you avoid a scatter plot?
Avoid a scatter plot when one axis is categorical (countries, products, teams); use a bar chart, dot plot, or strip plot instead. It is also a poor fit when you have fewer than ten observations (too few to tell a real pattern from noise), when both variables are categorical (use a mosaic plot), or when the data is so dense that points overlap into a single blob. In that last case, reach for a hex bin or 2D density plot.
How is a scatter plot different from a line chart?
A line chart connects observations in time order to emphasize trend; a scatter plot leaves the points unconnected to emphasize relationship. Use a line chart when the X-axis is time and the order matters; use a scatter plot when X is any continuous variable and you care about whether Y goes up, down, or stays flat as X changes.
How is a scatter plot different from a bubble chart?
A scatter plot encodes two variables — one per axis. A bubble chart adds a third variable encoded as the area (not the radius) of each marker, and often a fourth variable as color. Use a scatter plot when you only have two continuous variables to compare; switch to a bubble chart when a meaningful third quantitative variable would otherwise be hidden.
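The area-versus-radius distinction can be sketched in a few lines. This is a minimal illustration, not a definitive recipe: `bubble_areas` is a hypothetical helper, the population figures are made up, and it relies on Matplotlib's `scatter` reading its `s` argument as marker area in points squared.

```python
import math


def bubble_areas(values, max_area=400.0):
    """Scale positive values to marker areas (points^2) proportional to value.

    Hypothetical helper for illustration; max_area is an arbitrary choice.
    """
    largest = max(values)
    return [max_area * v / largest for v in values]


# Made-up values: the last is four times the first.
population = [5.0, 10.0, 20.0]
areas = bubble_areas(population)

# Radius implied by each area, to show what the eye would see if radius were used.
radii = [math.sqrt(a / math.pi) for a in areas]

print(areas)                # areas grow linearly with the value
print(radii[2] / radii[0])  # the radius only doubles when the value quadruples

# With Matplotlib, the areas go straight into the size argument:
#   ax.scatter(x, y, s=areas)
```

Mapping the value to radius instead would make the largest bubble look sixteen times bigger than the smallest here, which is why area is the safer encoding.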
How is a scatter plot different from a hex bin plot?
Both show the relationship between two continuous variables, but a scatter plot draws one mark per row while a hex bin plot draws one hexagonal cell per local group of rows and color-codes the count. When you have more than a few thousand points and the marks overlap into a solid blob, swap the scatter for a hex bin so density becomes legible again.
What does correlation mean in a scatter plot?
Correlation is summarized by a coefficient between −1 and 1 that describes how tightly the two variables move together. A value near +1 means that as X rises, Y reliably rises; a value near −1 means that as X rises, Y reliably falls; a value near 0 means there is no consistent linear relationship, although a curved one may still exist. The usual default is Pearson’s r, which measures linear association; Spearman’s rank correlation is more robust to outliers and also picks up monotonic but nonlinear trends.
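To see how differently the two coefficients react to a single outlier, here is a small standard-library sketch. The data is invented for illustration (a perfect line with one wild value), and the helper names `pearson`, `average_ranks`, and `spearman` are hypothetical, not part of any library.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Invented data: y equals x everywhere except one outlier at x = 5.
x = list(range(1, 11))
y = [1, 2, 3, 4, 60, 6, 7, 8, 9, 10]

print(round(pearson(x, y), 2))   # 0.12
print(round(spearman(x, y), 2))  # 0.82
```

One extreme value drags Pearson close to zero, while the rank-based coefficient still reports a strong upward relationship.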
Does correlation prove causation?
No. A scatter plot can only show that two variables move together; it cannot show why. Ice cream sales and drownings rise together every summer, but ice cream doesn’t cause drowning — hot weather drives both. Treat every scatter plot as a question (“why do these move together?”) rather than an answer.
How many data points do you need for a scatter plot?
About 20 is a sensible minimum for visually trusting a pattern. With fewer than 10, almost any pattern you see could be random noise. Past several thousand, individual marks stop being legible and you should switch to alpha blending, jittering, or a density-based variant such as hex bin or 2D KDE.
Should the axes start at zero in a scatter plot?
Not necessarily. Unlike a bar chart, a scatter plot encodes value with position, not length, so a non-zero baseline does not exaggerate ratios. Crop the axes to the data range so the cloud fills the chart — but always keep the visible range honest and label it clearly so readers know they are looking at a zoomed view.
What is Anscombe’s quartet?
Anscombe’s quartet is a set of four datasets, published by statistician Frank Anscombe in 1973, that share nearly identical means, variances, correlation, and regression line — yet look completely different when plotted. It is the canonical reason why every analyst should plot the data, not just compute the summary statistics, before reporting a result.
How do you handle overplotting in a scatter plot?
Three approaches solve overplotting in order of increasing effort: lower the marker alpha (e.g., 0.2) so density emerges from overlap, jitter near-identical values to reveal stacked points, or aggregate into a hex bin or 2D density plot when marker count exceeds ~5,000. Sampling a representative subset is a fourth option for exploratory work.
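The three approaches can be compared side by side with Matplotlib. This is a rough sketch on synthetic data: the sample sizes, the jitter width of 0.2, and the `gridsize` of 40 are arbitrary starting points to tune for your own data, not recommended values.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Synthetic data, for illustration only.
n = 20_000
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(scale=0.8, size=n)
stars_x = rng.integers(1, 6, size=500).astype(float)  # discrete 1-5 ratings
stars_y = rng.integers(1, 6, size=500).astype(float)

fig, (ax_alpha, ax_jitter, ax_hex) = plt.subplots(1, 3, figsize=(12, 4))

# 1. Low alpha: overlapping markers add up, so dense regions look darker.
ax_alpha.scatter(x, y, s=6, alpha=0.2, linewidths=0)
ax_alpha.set_title("alpha = 0.2")

# 2. Jitter: small uniform noise separates points stacked on identical values.
jitter_x = stars_x + rng.uniform(-0.2, 0.2, size=stars_x.size)
jitter_y = stars_y + rng.uniform(-0.2, 0.2, size=stars_y.size)
ax_jitter.scatter(jitter_x, jitter_y, s=10, alpha=0.5)
ax_jitter.set_title("jittered ratings")

# 3. Hex bin: one cell per neighbourhood, coloured by how many points fall in it.
hb = ax_hex.hexbin(x, y, gridsize=40, mincnt=1)
fig.colorbar(hb, ax=ax_hex, label="Points per cell")
ax_hex.set_title("hex bin")

fig.tight_layout()
```

Call `fig.savefig(...)` or drop the `matplotlib.use("Agg")` line and call `plt.show()` to view the result. Jitter is only appropriate when the noise is small relative to the spacing between real values, so readers are not misled about where points actually sit.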
Can a scatter plot encode a third variable?
Yes — by mapping a third variable to the marker color, shape, or size. Color works for categorical or sequential variables, shape works for small numbers of categorical groups, and size (i.e., a bubble chart) works for a positive quantitative variable. Only encode a third variable when its presence answers a question your reader actually has.
What category of chart is a scatter plot?
The scatter plot belongs to the Correlation family of charts. Charts in that family (bubble chart, hex bin plot, 2D density plot, scatter plot matrix) are all designed to show how two or more continuous variables relate, so they often work as alternatives when one doesn’t quite fit your data.
What’s the best library for building scatter plots in code?
For static, publication-quality scatter plots, Matplotlib (Python) and ggplot2 (R) are the standard choices. For interactive web scatter plots, D3.js gives the most control, while Plotly, Observable Plot, ECharts, and Vega-Lite get you to a working chart in fewer lines and ship reasonable defaults for tooltips and zoom.
// 18 — References
References and further reading
Primary sources, reference texts, and the official documentation for the libraries and tools referenced throughout this guide.
- Wikipedia, Scatter plot [Reference]. Encyclopedia entry covering the history, variants, and visual encoding of scatter plots, including the Herschel and Galton origin stories. A solid neutral starting point with citations. https://en.wikipedia.org/wiki/Scatter_plot
- Francis Galton, Regression Towards Mediocrity in Hereditary Stature (1886). Galton’s original paper plotting parents’ against children’s heights, the first published scatter plot used for regression analysis. Hosted by galton.org. https://galton.org/essays/1880-1889/galton-1886-jaigi-regression-stature.pdf
- F. J. Anscombe, Graphs in Statistical Analysis (1973) [Primary source]. The paper that introduced Anscombe’s quartet: four datasets with identical summary statistics but radically different scatter plots. The canonical reason to plot before you summarize. https://www.jstor.org/stable/2682899
- Edward Tufte, The Visual Display of Quantitative Information. Tufte’s foundational text on data graphics. The chapters on data-ink ratio and small multiples explain why plain scatters beat decorated ones, and motivate scatter plot matrices. https://www.edwardtufte.com/book/the-visual-display-of-quantitative-information/
- Datawrapper Academy, What to consider when creating a scatterplot. Hands-on tutorial with real published examples. Especially useful for handling overplotting, choosing axis ranges, and adding meaningful annotations. https://academy.datawrapper.de/article/255-what-to-consider-when-creating-a-scatterplot
- Financial Times, Visual Vocabulary [Reference]. Open-source poster categorizing chart types by intent. Scatter plots, bubble charts, and connected scatters all sit in the Correlation family alongside hex bins and density plots. https://github.com/Financial-Times/chart-doctor/tree/main/visual-vocabulary
- WAI, Complex Images: Charts and Graphs [Accessibility]. Web Accessibility Initiative guidance on making charts accessible: text alternatives, long descriptions, and data tables. Use this when building the accessibility checklist for a scatter plot. https://www.w3.org/WAI/tutorials/images/complex/
- Matplotlib, matplotlib.pyplot.scatter. Official API reference for the Python scatter plot helper used in this guide’s code sample, including marker, alpha, and colormap arguments. https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.scatter.html
- ggplot2, geom_point. Tidyverse documentation for the ggplot2 point geometry. Covers aesthetics, jittering, and combining with geom_smooth() for trend lines. https://ggplot2.tidyverse.org/reference/geom_point.html
- D3, Scatterplot (Observable notebook). Maintained Observable notebook from the D3 team that mirrors the JavaScript code sample in this guide, with worked examples for axes, Voronoi tooltips, and zoom. https://observablehq.com/@d3/scatterplot
- Cleveland & McGill, Graphical Perception (1984) [Primary source]. Landmark experimental study on which visual encodings let humans estimate quantities most accurately. Position along a common axis (the scatter plot’s native encoding) ranks at the top. https://www.jstor.org/stable/2288400