
ECDF Plot

The empirical cumulative distribution function — a step-function chart that answers “what percentage of data falls below this value?” without any binning.

// 01 — The chart

What it looks like

Example: Cumulative distribution of page load times (n = 1,000)
[Figure: ECDF step curve. x-axis: page load time, 0.5s to 3.5s. y-axis: cumulative proportion, 0% to 100%. Marker: 50th percentile = 1.5s]

An ECDF of page load times. The curve rises from 0% to 100%, and the median (50th percentile) can be read directly from the chart at 1.5 seconds.

// 02 — Definition

What is an ECDF plot?

The empirical cumulative distribution function (ECDF) plot shows, for every value on the x-axis, the proportion of data points that are less than or equal to that value. The result is a monotonically increasing step function that starts near 0% on the left and reaches 100% on the right.
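The definition above can be written as a formula. This is the standard textbook form, with notation chosen here for illustration: x_1, ..., x_n are the observed values and the indicator equals 1 when its condition holds and 0 otherwise.

```latex
\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{x_i \le x\}
```

Each observation contributes 1/n, which is why the curve can only stay flat or step upward as x increases.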

Unlike histograms and density plots, an ECDF requires no binning decisions. There is no bandwidth parameter or bin width to tune — the ECDF is a direct, unsmoothed representation of the data. This makes it uniquely objective: two analysts looking at the same dataset will always produce the identical ECDF.

ECDFs are the backbone of many statistical tests (Kolmogorov-Smirnov, Anderson-Darling) that compare observed data to theoretical distributions or to each other. They’re also the most direct way to read percentiles: draw a horizontal line at any proportion, and where it hits the curve tells you the corresponding value.
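As a rough sketch of how such a test uses the ECDF, the Kolmogorov-Smirnov distance is the largest vertical gap between the ECDF and a reference CDF. The helper names below (`normal_cdf`, `ks_distance`) are made up for this example; in practice a library routine such as `scipy.stats.kstest` also returns a p-value.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, computed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_distance(values, cdf):
    """Largest vertical gap between the ECDF of `values` and `cdf`."""
    xs = sorted(values)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The ECDF jumps from (i - 1)/n to i/n at x, so check both sides of the step.
        gap = max(gap, i / n - f, f - (i - 1) / n)
    return gap

# Three evenly spaced points compared with the uniform CDF on [0, 1], F(x) = x.
uniform_gap = ks_distance([0.25, 0.5, 0.75], lambda x: x)

# A single observation at 0 compared with the standard normal CDF.
normal_gap = ks_distance([0.0], normal_cdf)
```

A small distance means the ECDF hugs the reference curve; how small is "small enough" depends on the sample size, which is what the test's p-value accounts for.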

Key advantage: No binning means no arbitrary choices. The ECDF shows every data point exactly as it is, with no smoothing, grouping, or approximation.

// 03 — Anatomy

Parts of an ECDF plot

A — Value axis (x): The measured variable — each data point maps to a position along this axis
B — Step function: The ECDF curve — rises by 1/n at each data point; steep sections = dense data
C — Proportion axis (y): Shows cumulative proportion from 0 (none below) to 1 (all below)

// 04 — Usage

When to use it — and when not to

✓Use an ECDF plot when…
  • You need to read exact percentiles from the chart (median, P95, P99)
  • You want to avoid binning decisions — ECDFs require no parameters
  • Comparing two or more distributions — overlaid ECDFs make differences obvious
  • Performing goodness-of-fit tests (K-S test): the test statistic is the largest vertical gap between the ECDF and the reference CDF
  • SLA monitoring — ‘what percentage of requests complete under 200ms?’ is a direct read
×Avoid an ECDF plot when…
  • Your audience is unfamiliar with cumulative functions — a histogram is more intuitive
  • You want to see the distribution shape (peaks, skew) — density plots are better for that
  • You have only a handful of data points — the step function looks jagged and sparse
  • You need to see the mode — ECDFs don’t show peaks, only cumulative proportions
  • Visual impact matters more than precision — ECDFs look dry compared to filled charts

// 05 — Reading guide

How to read an ECDF plot

ECDFs answer percentile questions directly — no calculation needed.

1

Pick a value on the x-axis

Draw a vertical line up to the curve. Where it meets the curve, read across to the y-axis — that’s the proportion of data at or below that value.
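The same lookup can be done numerically. A minimal sketch, assuming the data fits in memory as a sorted list; `ecdf_at` is a name invented for this example, and the values are the sample column from the data-format section.

```python
from bisect import bisect_right

def ecdf_at(sorted_values, x):
    """Proportion of observations less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)

load_times_ms = sorted([320, 480, 510, 750, 1020, 1480, 2100, 3200])

# Four of the eight values are at or below 1000 ms.
share_under_1s = ecdf_at(load_times_ms, 1000)
```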

2

Pick a proportion on the y-axis

Draw a horizontal line from (say) 0.5 across to the curve. Where it hits, read down to the x-axis — that’s the median. Any proportion gives you the corresponding percentile.
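Reading in this direction inverts the ECDF. One possible sketch, using the convention "smallest observed value whose cumulative proportion reaches p"; `ecdf_quantile` is a hypothetical helper name.

```python
import math

def ecdf_quantile(values, p):
    """Smallest observed value whose cumulative proportion is at least p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    xs = sorted(values)
    # Number of sorted points needed for the cumulative proportion to reach p.
    # The small tolerance guards against floating-point noise in p * n.
    k = max(1, math.ceil(p * len(xs) - 1e-9))
    return xs[k - 1]

load_times_ms = [320, 480, 510, 750, 1020, 1480, 2100, 3200]
median_ms = ecdf_quantile(load_times_ms, 0.5)   # 4th of 8 sorted values
p95_ms = ecdf_quantile(load_times_ms, 0.95)     # 8th of 8 sorted values
```

Other quantile conventions interpolate between neighbouring points, so a routine such as `numpy.percentile` with its default settings may return a slightly different number for the same data.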

3

Read the steepness

Where the curve rises steeply, data is densely packed. Where it flattens, data is sparse. A sudden vertical jump = many tied values.

4

Compare overlaid ECDFs

The curve that is consistently to the left has lower values. The vertical gap between two curves at any x-value shows how much their cumulative proportions differ.

5

Check for stochastic dominance

If one ECDF lies at or above another everywhere, its values are systematically smaller; if it lies at or below everywhere, its values are systematically larger. This is useful for ranking treatment effects.
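Both comparisons can be checked numerically by evaluating the two ECDFs at every observed value. A sketch with made-up samples; `compare_ecdfs` is a name chosen for illustration. The largest vertical gap it returns is the two-sample Kolmogorov-Smirnov statistic.

```python
from bisect import bisect_right

def ecdf_at(sorted_values, x):
    """Proportion of observations less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)

def compare_ecdfs(a, b):
    """Return (largest vertical gap, True if the ECDF of a is never below that of b)."""
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))  # both ECDFs only change at observed values
    diffs = [ecdf_at(a, x) - ecdf_at(b, x) for x in grid]
    max_gap = max(abs(d) for d in diffs)
    a_never_below = all(d >= 0 for d in diffs)
    return max_gap, a_never_below

# Hypothetical samples: group_a is shifted two units lower than group_b.
group_a = [1, 2, 3, 4]
group_b = [3, 4, 5, 6]
max_gap, a_never_below = compare_ecdfs(group_a, group_b)
```

Here the curve for `group_a` is never below the curve for `group_b`, which matches `group_a` holding the smaller values.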

// 06 — Data format

What your data should look like

A single column of numeric values. No pre-aggregation needed — the ECDF is computed from raw data.

load_time_ms
320
480
510
750
1020
1480
2100
3200

Code sketch — Python

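import pandas as pd
import seaborn as sns
df = pd.DataFrame({"load_time_ms": [320, 480, 510, 750, 1020, 1480, 2100, 3200]})  # sample column from above
sns.ecdfplot(data=df, x="load_time_ms")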

// 07 — Construction

How to build one, step by step

01.

Sort the data values from smallest to largest.

02.

Assign each value its cumulative proportion: the i-th value out of n gets proportion i/n.

03.

Plot each (value, proportion) pair as a point.

04.

Connect the points with horizontal-then-vertical step segments. With a large sample the steps become too small to see, so the curve looks smooth without any smoothing applied.

05.

Set the y-axis from 0 to 1 (or 0% to 100%). The x-axis spans the data range.

06.

Optionally overlay a theoretical CDF (e.g., normal) for comparison.
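Steps 01 to 03 above can be sketched in a few lines of plain Python. The function name `ecdf_points` is invented for this example, and the input is the sample column from the data-format section in shuffled order.

```python
def ecdf_points(values):
    """Return the sorted values and their cumulative proportions i/n."""
    xs = sorted(values)                      # step 01: sort ascending
    n = len(xs)
    ys = [i / n for i in range(1, n + 1)]    # step 02: i-th value gets i/n
    return xs, ys

load_times_ms = [1480, 320, 3200, 510, 750, 2100, 480, 1020]
xs, ys = ecdf_points(load_times_ms)

# Step 03: list the (value, proportion) pairs that would be plotted.
for x, y in zip(xs, ys):
    print(f"{x:>5} ms  {y:.3f}")
```

For steps 04 and 05, one option is matplotlib's `plt.step(xs, ys, where="post")`, which holds each proportion until the next value and so draws the horizontal-then-vertical segments.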

// 08 — Common mistakes

Mistakes to avoid

Confusing ECDF with density

The ECDF shows cumulative proportion, not frequency density. Density corresponds to the slope of the curve, not its height: a steep section means many values are packed into a narrow range.

Not labeling the y-axis

Always label the y-axis as 'Proportion' or 'Cumulative %'. Without it, readers may misinterpret the chart as a time series.

Overplotting many ECDFs

Beyond 4–5 overlaid ECDFs, lines tangle. Use small multiples or color-code with a clear legend.

Interpolating between steps

With small samples, each step matters. Don’t smooth or interpolate — the step function IS the ECDF.

// 09 — In the wild

Real-world examples

01

Site reliability engineering

P50, P95, P99 latency SLAs are read directly from ECDF plots of request response times.

02

Clinical trials

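Survival analysis plots the complement of the cumulative distribution to estimate patient survival probabilities. The Kaplan-Meier curve extends this idea to censored data and reduces to 1 - ECDF when no observations are censored.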

03

Quality assurance

Manufacturing uses ECDFs to check what proportion of products fall within tolerance limits.

// 10 — At a glance

Quick reference

Category

Distribution

Data type

Continuous numeric

Best for

Percentile reading

Parameters

None (bin-free)

Difficulty

Intermediate

Related test

K-S test

// 11 — Accessibility

Accessibility notes

✓

Use a thick line (2–3px) for the ECDF step function so it remains visible at small sizes

✓

When overlaying groups, use dash patterns in addition to color for colorblind accessibility

✓

Add interactive tooltips showing exact (value, percentile) pairs on hover

✓

Provide a companion table of key percentiles (P10, P25, P50, P75, P90, P95, P99)

✓

Label the y-axis clearly as proportion or percentage to avoid misinterpretation

// 12 — Variations

Variations

Complementary ECDF (CCDF)

Plots 1 - ECDF, answering ‘what proportion exceeds this value?’ Common in survival analysis and tail-risk assessment.
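A minimal sketch of the complement, reusing the sample load times from the data-format section; `ccdf_at` is a hypothetical helper name.

```python
from bisect import bisect_right

def ccdf_at(sorted_values, x):
    """Proportion of observations strictly greater than x, i.e. 1 - ECDF(x)."""
    n = len(sorted_values)
    return (n - bisect_right(sorted_values, x)) / n

load_times_ms = sorted([320, 480, 510, 750, 1020, 1480, 2100, 3200])

# Two of the eight values (2100 and 3200) exceed 2000 ms.
share_over_2s = ccdf_at(load_times_ms, 2000)
```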

ECDF with confidence band

Adds a shaded confidence region (e.g., Dvoretzky-Kiefer-Wolfowitz band) showing uncertainty around the empirical estimate.
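For the Dvoretzky-Kiefer-Wolfowitz band, the half-width depends only on the sample size n and the chosen level alpha: epsilon = sqrt(ln(2 / alpha) / (2n)). The sketch below applies that formula and clips the band to the range 0 to 1; `dkw_band` is a name chosen for illustration.

```python
import math

def dkw_band(values, alpha=0.05):
    """Return (epsilon, [(value, lower, upper), ...]) for a 1 - alpha DKW band."""
    xs = sorted(values)
    n = len(xs)
    eps = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))
    band = []
    for i, x in enumerate(xs, start=1):
        f = i / n  # ECDF height at x
        band.append((x, max(f - eps, 0.0), min(f + eps, 1.0)))
    return eps, band

load_times_ms = [320, 480, 510, 750, 1020, 1480, 2100, 3200]
eps, band = dkw_band(load_times_ms, alpha=0.05)
```

With only 8 points the band is very wide. At n = 1,000 and alpha = 0.05 the same formula gives a half-width of roughly 0.043, which is why confidence bands become informative mainly for larger samples.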

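ECDF vs. theoretical CDF

Overlays the theoretical CDF on the ECDF; departures show where the data diverges from the model. A Q-Q plot makes the same comparison on the quantile scale, plotting sample quantiles against theoretical quantiles instead of overlaying curves.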

Weighted ECDF

Each data point contributes a different weight to the cumulative sum — used in survey data and importance sampling.
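One way to sketch the weighted version: sort the values, accumulate their weights, and divide by the total weight. This assumes non-negative weights with a positive sum; `weighted_ecdf_points` and the sample numbers are made up for this example.

```python
def weighted_ecdf_points(values, weights):
    """Return sorted values and cumulative weight shares (assumes weights >= 0)."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    xs, ys, running = [], [], 0.0
    for x, w in pairs:
        running += w                 # each point adds its own weight, not 1/n
        xs.append(x)
        ys.append(running / total)
    return xs, ys

# Hypothetical survey responses where the last respondent counts double.
wx, wy = weighted_ecdf_points([30, 10, 20], [2, 1, 1])
```

With equal weights this reduces to the ordinary ECDF, where every step has height 1/n.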

// 13 — FAQs

Frequently asked questions

What is an ECDF plot?

The empirical cumulative distribution function (ECDF) plot shows, for every value on the x-axis, the proportion of data points that are less than or equal to that value. The result is a monotonically increasing step function that starts near 0% on the left and reaches 100% on the right.

When should you use an ECDF plot?

Use an ECDF plot when you need to read exact percentiles from the chart (median, P95, P99). It also works well when you want to avoid binning decisions, since ECDFs require no parameters, and when comparing two or more distributions, since overlaid ECDFs make differences obvious.

When should you avoid an ECDF plot?

Avoid an ECDF plot when your audience is unfamiliar with cumulative functions; a histogram is more intuitive. It is also a poor fit when you want to see the distribution shape (peaks, skew), where density plots are better, or when you have only a handful of data points, where the step function looks jagged and sparse.

What data do you need to make an ECDF plot?

A single column of numeric values. No pre-aggregation needed — the ECDF is computed from raw data.

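What size of dataset works best for an ECDF plot?

An ECDF has no upper limit on sample size, because each data point adds just one step of height 1/n. With only a handful of points the step function looks jagged and sparse; with large samples the steps shrink until the curve looks smooth.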

Are ECDF plots accessible to screen readers?

Yes. An ECDF plot can be made accessible to screen readers by pairing it with a clear text summary of the key insight, ensuring color choices meet WCAG contrast guidelines, adding descriptive alt text or aria-label to the SVG, and offering the underlying data as an HTML table fallback for assistive technologies.