Dendrogram
A tree-shaped diagram where branch heights encode the distance at which clusters merge — the go-to visualization for hierarchical clustering results.
// 01 — The chart
What it looks like
A dendrogram showing hierarchical clustering of six samples. The dashed line indicates a cut that produces three clusters: {A,B}, {C,D}, {E,F}.
// 02 — Definition
What is a dendrogram?
A dendrogram (from Greek dendron, “tree”) is a tree diagram where the height of each branch point encodes the distance or dissimilarity at which two clusters merge. Individual observations sit at the bottom as leaves, and successively larger clusters form as you move upward.
Unlike a regular tree diagram (where link lengths are arbitrary), dendrograms have a meaningful vertical axis. The higher a merge occurs, the more dissimilar the two clusters being combined. This makes dendrograms the standard output of hierarchical clustering algorithms.
To decide on the number of clusters, you draw a horizontal cut line across the dendrogram. The number of vertical branches that cross this line equals the number of clusters. Moving the cut line up produces fewer, coarser clusters; moving it down produces more, finer clusters.
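The effect of moving the cut line can be sketched in a few lines with SciPy. This is a minimal illustration, not a recommended workflow: the coordinates are hypothetical, chosen so that six samples form three tight pairs, and the helper `clusters_at` is a name made up for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Hypothetical 2-D coordinates for six samples, chosen to form three tight pairs.
names = ["A", "B", "C", "D", "E", "F"]
points = np.array([
    [0.0, 0.0], [0.0, 1.0],    # A, B
    [5.0, 0.0], [5.0, 1.5],    # C, D
    [10.0, 0.0], [10.0, 2.0],  # E, F
])

# Each row of Z records one merge; column 2 is the height drawn in the dendrogram.
Z = linkage(points, method="average", metric="euclidean")
merge_heights = Z[:, 2]

def clusters_at(cut_height):
    """Group sample names by the cluster they fall in below a horizontal cut."""
    ids = fcluster(Z, t=cut_height, criterion="distance")
    groups = {}
    for name, cluster_id in zip(names, ids):
        groups.setdefault(int(cluster_id), []).append(name)
    return sorted(groups.values())

print(merge_heights.round(2))
print(clusters_at(3.0))  # lower cut: more, finer clusters
print(clusters_at(6.0))  # higher cut: fewer, coarser clusters
```

With these made-up points, a cut at 3.0 separates the three pairs, while a cut at 6.0 has already absorbed two of the pairs into one larger cluster.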
Linkage matters: The shape of a dendrogram depends heavily on the linkage method used (single, complete, average, Ward), as well as on the distance metric. Different methods can produce very different trees from the same data. Always report which linkage method was used.
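As a rough illustration of how much the linkage method can change the tree, the sketch below runs four methods on the same data and prints the height of the final merge. The data is hypothetical (five evenly spaced points plus one distant point), and `top_merge` is just a name used here.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical 1-D data: five evenly spaced points plus one distant point.
points = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [10.0]])

top_merge = {}
for method in ("single", "complete", "average", "ward"):
    Z = linkage(points, method=method)   # Euclidean distance by default
    top_merge[method] = float(Z[-1, 2])  # height of the final (root) merge
    print(f"{method:>8}: root merges at {top_merge[method]:.2f}")
```

Single linkage measures the closest pair between two clusters, complete linkage the farthest pair, and average linkage the mean over all pairs, so the same distant point joins the rest at a different height under each method.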
// 03 — Anatomy
Parts of a dendrogram
// 04 — Usage
When to use it — and when not to
Use a dendrogram when:
- You are presenting hierarchical clustering results
- You are showing evolutionary or phylogenetic relationships
- You are exploring natural groupings at multiple granularity levels
- The merge distance is meaningful and worth encoding
- You need to choose the number of clusters by visual inspection
- You are comparing similarity between items in a small-to-medium dataset
Avoid a dendrogram when:
- You have thousands of leaves, because the chart becomes an unreadable wall
- The distance metric is arbitrary or meaningless
- You need non-hierarchical clusters (use k-means scatter plots instead)
- Your hierarchy doesn't have meaningful branch lengths (use a simple tree instead)
- The audience is unfamiliar with clustering and a simpler chart would tell the story faster
- You only care about the final clusters, not the merge process
// 05 — Reading guide
How to read a dendrogram
Follow these steps whenever you encounter a dendrogram in the wild.
Start at the leaves
Read the labels at the bottom: each leaf is an individual observation or item. Similar items tend to sit near each other, but judge similarity by the height at which two leaves join, not by their horizontal distance.
Read the vertical axis
The y-axis shows the distance or dissimilarity metric. Higher merge points mean the clusters being joined are more different from each other.
Follow the merges upward
Each horizontal bar shows two branches merging into one cluster. The height of this bar tells you how dissimilar those two sub-clusters were.
Look for large gaps
A big jump in merge height suggests a natural cluster boundary. Items that merge below the jump are similar to each other; clusters joined above it are clearly different.
Draw a cut line
Place a horizontal line at a chosen height. Count how many vertical branches cross it — that's your number of clusters. Items within each branch form one cluster.
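The last two steps, looking for a large gap and drawing a cut line, can be sketched in plain Python. This is a heuristic under stated assumptions, not a definitive rule: `suggest_cut` is a hypothetical helper, the merge heights are illustrative values rather than real data, and the function expects at least two merges.

```python
def suggest_cut(merge_heights):
    """Return (cut_height, n_clusters) for the largest gap between merges.

    merge_heights: one height per merge, so a tree with n leaves has
    n - 1 entries. At least two merges are required.
    """
    heights = sorted(merge_heights)
    n_leaves = len(heights) + 1
    # Gap between each merge and the next one up the tree.
    gaps = [upper - lower for lower, upper in zip(heights, heights[1:])]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    cut_height = (heights[i] + heights[i + 1]) / 2  # middle of the widest gap
    # Every merge below the cut has already joined two clusters into one.
    merges_below = sum(h < cut_height for h in heights)
    return cut_height, n_leaves - merges_below

# Illustrative merge heights for a six-leaf tree (not taken from real data).
heights = [1.0, 1.5, 2.0, 5.1, 7.6]
cut, k = suggest_cut(heights)
print(f"cut at {cut:.2f} -> {k} clusters")
```

Placing the cut in the middle of the widest gap is one common convention; any height inside that gap yields the same clusters.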
// 06 — Pitfalls
Common mistakes
Not reporting the linkage method
Single, complete, average, and Ward linkage produce very different dendrograms. Always state which method was used.
Ignoring the distance scale
Without a meaningful y-axis, readers can't interpret merge heights. Always include axis labels and units.
Over-interpreting leaf order
The left-right order of leaves is partially arbitrary — any two sibling branches can be swapped without changing the tree's meaning.
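A small check can make this concrete. The sketch below, using hypothetical coordinates, swaps the two children of the root merge in a SciPy linkage matrix and confirms that the leaf order changes while the height at which every pair of leaves joins stays the same.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, leaves_list, linkage

# Hypothetical coordinates for six samples forming three pairs.
points = np.array([
    [0.0, 0.0], [0.0, 1.0],
    [5.0, 0.0], [5.0, 1.5],
    [10.0, 0.0], [10.0, 2.0],
])
Z = linkage(points, method="average")

# Swap the left and right child of the root merge (the last row of Z).
Z_swapped = Z.copy()
Z_swapped[-1, [0, 1]] = Z_swapped[-1, [1, 0]]

order_before = leaves_list(Z).tolist()
order_after = leaves_list(Z_swapped).tolist()
print(order_before, order_after)

# Cophenetic distance = height at which two leaves first share a cluster.
same_tree = np.allclose(cophenet(Z), cophenet(Z_swapped))
print("leaf order changed:", order_before != order_after)
print("pairwise merge heights unchanged:", same_tree)
```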
Too many leaves
With hundreds of leaves, individual labels become unreadable. Consider showing a truncated dendrogram or using a heatmap with dendrogram margins.
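One way to truncate is SciPy's `dendrogram` function with `truncate_mode="lastp"`, which keeps only the last `p` merged clusters as leaves. The sketch below uses synthetic random data and `no_plot=True`, so it only computes the layout; the choice of 200 points and `p=8` is arbitrary.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))  # synthetic data: 200 leaves
Z = linkage(points, method="ward")

# no_plot=True returns the layout without drawing anything.
full = dendrogram(Z, no_plot=True)
truncated = dendrogram(Z, truncate_mode="lastp", p=8, no_plot=True)

print(len(full["ivl"]))   # number of leaf labels in the full tree
print(truncated["ivl"])   # labels of the collapsed tree
```

Dropping `no_plot=True` draws the chart with matplotlib, assuming it is installed.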
Using inappropriate distance metrics
Euclidean distance isn't always right: cosine distance, correlation-based distance, or a domain-specific measure may be more meaningful for your data.
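To see how the metric alone can change which items merge first, the sketch below clusters three hypothetical profiles under three metrics. Rows 0 and 1 share a shape at very different scales; row 2 has the reversed shape at a scale close to row 0. The helper `first_merge` is a name invented for this example. Average linkage is used because SciPy documents the Ward, centroid, and median methods as correctly defined only for Euclidean distances.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

# Hypothetical profiles: rows 0 and 1 have the same shape at different scales,
# row 2 has the reversed shape at a scale close to row 0.
profiles = np.array([
    [1.0, 2.0, 3.0],
    [10.0, 20.0, 31.0],
    [3.0, 2.0, 1.0],
])

def first_merge(metric):
    """Return the pair of rows that average linkage joins first under a metric."""
    distances = pdist(profiles, metric=metric)  # condensed distance matrix
    Z = linkage(distances, method="average")
    return {int(Z[0, 0]), int(Z[0, 1])}

for metric in ("euclidean", "cosine", "correlation"):
    print(metric, first_merge(metric))
```

Euclidean distance pairs the rows with similar magnitude, while cosine and correlation distance pair the rows with similar shape.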
// 07 — In the wild
Real-world examples
Gene expression analysis
Bioinformaticians cluster genes by expression profiles, using dendrograms to identify co-regulated gene groups and functional categories.
Market segmentation
Marketers cluster customers by purchase behavior, using dendrogram branch heights to decide how many segments are meaningfully distinct.
Phylogenetic trees
Evolutionary biologists use dendrograms to show how species diverged over time, with branch lengths proportional to evolutionary distance.
// 08 — Quick reference
Key facts
Also known as
Cluster tree, hierarchical clustering tree
Data type
Distance matrix or hierarchical clustering output
Best for
Clustering results, similarity analysis, phylogenetics
Audience level
Intermediate — common in scientific contexts
Leaf limit
~50–100 for readability
Related to
Tree diagram, heatmap with dendrogram, radial tree
// 09 — Variations
Variations and extensions
Clustered heatmap
A heatmap with dendrograms on the margins — rows and/or columns are reordered by cluster, making block patterns emerge.
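The reordering step behind a clustered heatmap can be sketched as follows, assuming SciPy and a small hypothetical matrix whose row and column groups are deliberately interleaved. Only the reordering is shown; drawing the heatmap is left to the plotting library of your choice.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage

# Hypothetical matrix with interleaved row groups (even vs odd rows)
# and interleaved column groups (even vs odd columns).
M = np.array([
    [9.0, 1.0, 8.0, 0.0],
    [1.0, 9.0, 0.0, 8.0],
    [8.0, 0.0, 9.0, 1.0],
    [0.0, 8.0, 1.0, 9.0],
    [9.0, 0.0, 9.0, 0.0],
    [1.0, 8.0, 0.0, 9.0],
])

# Cluster rows and columns separately; leaves_list gives the left-to-right leaf order.
row_order = leaves_list(linkage(M, method="average"))
col_order = leaves_list(linkage(M.T, method="average"))

# Reordering by leaf order puts similar rows and columns next to each other.
reordered = M[np.ix_(row_order, col_order)]
print(row_order, col_order)
print(reordered)
```

After reordering, the high values gather into two blocks on one diagonal, which is the block pattern the heatmap makes visible.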
Circular dendrogram
Leaves arranged in a circle with branches growing inward. Fits more leaves in less space and creates a visually striking display.
Tanglegram
Two dendrograms placed face-to-face with lines connecting matching leaves. Used to compare two different clustering solutions or evolutionary trees.
// 10 — FAQs
Frequently asked questions
What is a dendrogram?
A dendrogram (from Greek dendron, "tree") is a tree diagram where the height of each branch point encodes the distance or dissimilarity at which two clusters merge. Individual observations sit at the bottom as leaves, and successively larger clusters form as you move upward.
When should you use a dendrogram?
Use a dendrogram when presenting hierarchical clustering results. It also works well when showing evolutionary or phylogenetic relationships, and when exploring natural groupings at multiple granularity levels.
When should you avoid a dendrogram?
Avoid a dendrogram when you have thousands of leaves — it becomes an unreadable wall. It is also a poor fit when the distance metric is arbitrary or meaningless, or when you need non-hierarchical clusters — use k-means scatter plots instead.
What is another name for a dendrogram?
A dendrogram is also known as a cluster tree or a hierarchical clustering tree. The name varies between fields, but the visualisation technique is the same.
What size of dataset works best for a dendrogram?
A dendrogram reads best with up to roughly 50–100 leaves. Beyond that, individual labels become too cluttered to read; consider a truncated dendrogram or a heatmap with dendrogram margins.
Is a dendrogram suitable for dashboards?
Yes — a dendrogram can work well in dashboards as long as the panel is large enough for readers to perceive the encoded values, has a clear title, and includes the legend or axis labels needed to interpret it.