Hierarchy · Intermediate

Dendrogram

A tree-shaped diagram where branch heights encode the distance at which clusters merge — the go-to visualization for hierarchical clustering results.

// 01 — The chart

What it looks like

Example — Hierarchical clustering of six data samples
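[Figure: dendrogram of six samples labeled A to F, distance axis with ticks at 0, 2, 4, 6, 8, and a dashed horizontal line marked "cut"]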

A dendrogram showing hierarchical clustering of six samples. The dashed line indicates a cut that produces three clusters: {A,B}, {C,D}, {E,F}.

// 02 — Definition

What is a dendrogram?

A dendrogram (from Greek dendron, “tree”) is a tree diagram where the height of each branch point encodes the distance or dissimilarity at which two clusters merge. Individual observations sit at the bottom as leaves, and successively larger clusters form as you move upward.

Unlike a regular tree diagram (where link lengths are arbitrary), dendrograms have a meaningful vertical axis. The higher a merge occurs, the more dissimilar the two clusters being combined. This makes dendrograms the standard output of hierarchical clustering algorithms.
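To make the link between merges and heights concrete, here is a minimal pure-Python sketch of agglomerative clustering with single linkage. The 1-D positions in `points` and the helper names are hypothetical, chosen only for illustration; real projects would typically rely on a library such as SciPy's `scipy.cluster.hierarchy` rather than this toy loop.

```python
from itertools import combinations

# Hypothetical 1-D positions for six samples (illustrative values only).
points = {"A": 0.0, "B": 1.0, "C": 4.0, "D": 5.5, "E": 10.0, "F": 12.0}

def single_linkage(a, b):
    """Cluster distance = smallest distance between any two members."""
    return min(abs(points[x] - points[y]) for x in a for y in b)

def agglomerate(labels, cluster_distance):
    """Repeatedly merge the two closest clusters and record each merge height."""
    clusters = [(label,) for label in sorted(labels)]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        height = cluster_distance(a, b)
        merged = tuple(sorted(a + b))
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
        merges.append((a, b, height))
    return merges

merges = agglomerate(points, single_linkage)
for a, b, height in merges:
    print(f"merge {''.join(a)} + {''.join(b)} at height {height}")
```

Each recorded height is where the corresponding horizontal bar would be drawn: with these made-up positions, A and B join first at 1.0, and the final merge sits at 4.5.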

To decide on the number of clusters, you draw a horizontal cut line across the dendrogram. The number of vertical branches that cross this line equals the number of clusters. Moving the cut line up produces fewer, coarser clusters; moving it down produces more, finer clusters.
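The counting rule above can be sketched in a few lines. The merge heights below are hypothetical, and the sketch assumes heights never decrease as you move up the tree, which holds for single, complete, average, and Ward linkage.

```python
# Hypothetical merge heights for six samples, lowest merge first.
merge_heights = [1.0, 1.5, 2.0, 6.0, 7.5]
n_leaves = len(merge_heights) + 1

def clusters_at(cut):
    """Every merge at or below the cut joins two groups, so it removes one cluster."""
    return n_leaves - sum(h <= cut for h in merge_heights)

for cut in (0.5, 4.0, 6.5, 8.0):
    print(f"cut at {cut}: {clusters_at(cut)} clusters")
```

Raising the cut from 0.5 to 8.0 takes the count from six singletons down to a single cluster, which is the coarser-versus-finer trade-off described above.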

Linkage matters: The shape of a dendrogram depends heavily on the linkage method used (single, complete, average, Ward), as well as on the distance metric. Different methods can produce very different trees from the same data. Always report which linkage method was used.
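As a rough illustration of how much linkage can matter, the sketch below clusters the same four points twice, once with single linkage (minimum pairwise distance) and once with complete linkage (maximum pairwise distance). The positions are hypothetical and were picked so that the two methods disagree; `agglomerate` is an illustrative helper, not a library function.

```python
from itertools import combinations

# Hypothetical 1-D positions chosen so the two linkages build different trees.
points = {"A": 0.0, "B": 2.0, "C": 4.5, "D": 7.5}

def agglomerate(linkage):
    """Merge the two closest clusters until one remains.

    `linkage` is `min` for single linkage or `max` for complete linkage.
    """
    def dist(a, b):
        return linkage(abs(points[x] - points[y]) for x in a for y in b)

    clusters = [(name,) for name in sorted(points)]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append(("".join(a), "".join(b), dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
    return merges

single = agglomerate(min)
complete = agglomerate(max)
print("single:  ", single)
print("complete:", complete)
```

On this toy data, single linkage chains C and then D onto the A/B group with a top merge at 3.0, while complete linkage pairs C with D first and joins the two pairs at 7.5. Same data, different tree shape and a very different height scale.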

// 03 — Anatomy

Parts of a dendrogram

[Figure: dendrogram annotated with callouts A to D, described below]
A — Leaves: Individual observations at the bottom of the tree — the most granular level
B — Merge points: Horizontal bars connecting two branches — height encodes the distance at which clusters merge
C — Distance axis: Vertical axis measuring dissimilarity — higher means more different
D — Cut line: A horizontal line drawn to determine the number of clusters — branches crossing it define cluster membership

// 04 — Usage

When to use it — and when not to

✓Use a dendrogram when…
  • Presenting hierarchical clustering results
  • Showing evolutionary or phylogenetic relationships
  • Exploring natural groupings at multiple granularity levels
  • The merge distance is meaningful and worth encoding
  • You need to choose the number of clusters by visual inspection
  • Comparing similarity between items in a small-to-medium dataset
×Avoid a dendrogram when…
  • You have thousands of leaves — it becomes an unreadable wall
  • The distance metric is arbitrary or meaningless
  • You need non-hierarchical clusters — use k-means scatter plots instead
  • Your hierarchy doesn't have meaningful branch lengths — use a simple tree
  • The audience is unfamiliar with clustering — simpler charts tell the story faster
  • You only care about the final clusters, not the merge process

// 05 — Reading guide

How to read a dendrogram

Follow these steps whenever you encounter a dendrogram in the wild.

1

Start at the leaves

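Read the labels at the bottom. Each leaf is an individual observation or item. Similar items join low in the tree; sitting side by side is not enough on its own, because sibling branches can be flipped without changing the tree.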

2

Read the vertical axis

The y-axis shows the distance or dissimilarity metric. Higher merge points mean the clusters being joined are more different from each other.

3

Follow the merges upward

Each horizontal bar shows two branches merging into one cluster. The height of this bar tells you how dissimilar those two sub-clusters were.

4

Look for large gaps

A big jump in merge height suggests a natural cluster boundary. Items joined below the jump are similar to each other; groups that only join above it are clearly distinct.
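One way to turn the "large gap" idea into a number is to find the widest gap between consecutive merge heights and cut in the middle of it. This is only a heuristic sketch with hypothetical heights, not a substitute for judgment about the data.

```python
# Hypothetical merge heights for six samples, sorted from lowest to highest.
merge_heights = [1.0, 1.5, 2.0, 6.0, 7.5]

# Pair each height with the next one up and measure the gap between them.
gaps = [(upper - lower, lower, upper)
        for lower, upper in zip(merge_heights, merge_heights[1:])]
widest, lower, upper = max(gaps)

# Cut halfway through the widest gap and count the resulting clusters.
cut = (lower + upper) / 2
n_clusters = len(merge_heights) + 1 - sum(h <= cut for h in merge_heights)

print(f"widest gap: {widest} (between {lower} and {upper})")
print(f"suggested cut: {cut} -> {n_clusters} clusters")
```

With these values the widest gap lies between 2.0 and 6.0, so the suggested cut at 4.0 yields three clusters.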

5

Draw a cut line

Place a horizontal line at a chosen height. Count how many vertical branches cross it — that's your number of clusters. Items within each branch form one cluster.
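To go from a cut height to actual cluster membership, you can replay only the merges that happen at or below the cut. The sketch below does this with a small union-find structure; the merge record and its heights are hypothetical, and `cut_tree` is an illustrative name rather than a library function.

```python
# Hypothetical merge record: (leaf in first group, leaf in second group, height).
merges = [("A", "B", 1.0), ("C", "D", 1.5), ("E", "F", 2.0),
          ("B", "C", 6.0), ("D", "E", 7.5)]
leaves = "ABCDEF"

def cut_tree(cut):
    """Return the clusters formed by all merges at or below `cut`."""
    parent = {leaf: leaf for leaf in leaves}

    def find(x):
        # Follow parent links up to the representative of x's group.
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b, height in merges:
        if height <= cut:
            parent[find(b)] = find(a)

    groups = {}
    for leaf in leaves:
        groups.setdefault(find(leaf), []).append(leaf)
    return sorted(groups.values())

print(cut_tree(4.0))
print(cut_tree(6.5))
```

A cut at 4.0 gives three clusters, {A,B}, {C,D}, and {E,F}; raising it to 6.5 fuses the first two into {A,B,C,D}.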

// 06 — Pitfalls

Common mistakes

×

Not reporting the linkage method

Single, complete, average, and Ward linkage can produce very different dendrograms from the same data. Always state which method was used.

×

Ignoring the distance scale

Without a meaningful y-axis, readers can't interpret merge heights. Always include axis labels and units.

×

Over-interpreting leaf order

The left-right order of leaves is partially arbitrary — any two sibling branches can be swapped without changing the tree's meaning.
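The sketch below illustrates why leaf order should not be over-read. It represents a tree as nested tuples (a hypothetical format chosen for brevity), flips sibling branches, and checks that the height at which any two leaves first join is unchanged.

```python
from itertools import combinations

# A node is (height, left, right); a leaf is a string. Heights are hypothetical.
tree = (6.0, (1.0, "A", "B"), (1.5, "C", "D"))
swapped = (6.0, (1.5, "D", "C"), (1.0, "A", "B"))  # same tree, siblings flipped

def leaf_order(node):
    """Left-to-right order of leaves as they would be drawn."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaf_order(left) + leaf_order(right)

def merge_height(node, x, y):
    """Height of the lowest node containing both x and y (x and y distinct)."""
    _, left, right = node
    for child in (left, right):
        if not isinstance(child, str):
            members = leaf_order(child)
            if x in members and y in members:
                return merge_height(child, x, y)
    return node[0]

print(leaf_order(tree), leaf_order(swapped))
for x, y in combinations("ABCD", 2):
    print(x, y, merge_height(tree, x, y), merge_height(swapped, x, y))
```

In the first layout B and C are drawn next to each other, yet they only join at height 6.0. Both layouts report identical merge heights for every pair, which is the information the dendrogram actually encodes.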

×

Too many leaves

With hundreds of leaves, individual labels become unreadable. Consider showing a truncated dendrogram or using a heatmap with dendrogram margins.

×

Using inappropriate distance metrics

Euclidean distance isn't always right. Cosine distance, correlation-based distance, or a domain-specific measure may be more meaningful for your data.
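A small, hedged example of how the choice of metric changes which items look similar: the three customer vectors below are hypothetical purchase counts, and the two helper functions are written out by hand only to keep the sketch self-contained.

```python
import math

def euclidean(u, v):
    """Straight-line distance; sensitive to overall magnitude."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_distance(u, v):
    """One minus cosine similarity; compares direction and ignores magnitude."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Hypothetical purchase counts across three product categories.
small_buyer = (1, 2, 0)
big_buyer = (10, 20, 0)    # same mix as small_buyer, ten times the volume
other_buyer = (2, 1, 0)    # similar volume to small_buyer, different mix

for name, metric in (("euclidean", euclidean), ("cosine", cosine_distance)):
    print(name,
          round(metric(small_buyer, big_buyer), 3),
          round(metric(small_buyer, other_buyer), 3))
```

Under Euclidean distance the small buyer sits closest to the customer with a different mix; under cosine distance it sits closest to the big buyer with the same mix. A dendrogram built on one metric would group these customers differently from one built on the other.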

// 07 — In the wild

Real-world examples

Gene expression analysis

Bioinformaticians cluster genes by expression profiles, using dendrograms to identify co-regulated gene groups and functional categories.

Market segmentation

Marketers cluster customers by purchase behavior, using dendrogram branch heights to decide how many segments are meaningfully distinct.

Phylogenetic trees

Evolutionary biologists use dendrograms to show how species diverged over time, with branch lengths proportional to evolutionary distance.

// 08 — Quick reference

Key facts

Also known as

Cluster tree, hierarchical clustering tree

Data type

Distance matrix or hierarchical clustering output

Best for

Clustering results, similarity analysis, phylogenetics

Audience level

Intermediate — common in scientific contexts

Leaf limit

~50–100 for readability

Related to

Tree diagram, heatmap with dendrogram, radial tree

// 09 — Variations

Variations and extensions

Clustered heatmap

A heatmap with dendrograms on the margins — rows and/or columns are reordered by cluster, making block patterns emerge.

Circular dendrogram

Leaves arranged in a circle with branches growing inward. Fits more leaves in less space and creates a visually striking display.

Tanglegram

Two dendrograms placed face-to-face with lines connecting matching leaves. Used to compare two different clustering solutions or evolutionary trees.

// 10 — FAQs

Frequently asked questions

What is a dendrogram?

A dendrogram (from Greek dendron, "tree") is a tree diagram where the height of each branch point encodes the distance or dissimilarity at which two clusters merge. Individual observations sit at the bottom as leaves, and successively larger clusters form as you move upward.

When should you use a dendrogram?

Use a dendrogram when presenting hierarchical clustering results. It also works well when showing evolutionary or phylogenetic relationships, and when exploring natural groupings at multiple granularity levels.

When should you avoid a dendrogram?

Avoid a dendrogram when you have thousands of leaves — it becomes an unreadable wall. It is also a poor fit when the distance metric is arbitrary or meaningless, or when you need non-hierarchical clusters — use k-means scatter plots instead.

What is another name for a dendrogram?

A dendrogram is also known as a cluster tree or hierarchical clustering tree. The name varies between fields, but the visualisation technique is the same.

What size of dataset works best for a dendrogram?

A dendrogram works best with roughly 50–100 leaves or fewer. Beyond that, individual labels become too cluttered to read; consider a truncated dendrogram or a heatmap with dendrogram margins.

Is a dendrogram suitable for dashboards?

Yes — a dendrogram can work well in dashboards as long as the panel is large enough for readers to perceive the encoded values, has a clear title, and includes the legend or axis labels needed to interpret it.