
Hierarchical Clustering (Dendrograms)

What hierarchical clustering does

Hierarchical clustering creates a hierarchy (tree) of clusters.

Two main styles:

  • Agglomerative (bottom-up): start with points, merge them
  • Divisive (top-down): start with one cluster, split it
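A minimal sketch of the bottom-up (agglomerative) process using SciPy's `linkage` function; the five 2-D points are made up for illustration. Each row of the returned matrix records one merge, which is exactly the information a dendrogram draws.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Five illustrative 2-D points: two tight pairs plus one outlier.
X = np.array([[0.0, 0.0], [0.1, 0.0],
              [5.0, 5.0], [5.1, 5.0],
              [10.0, 0.0]])

# Each row of Z records one merge, bottom-up:
# [cluster_i, cluster_j, merge_distance, size_of_new_cluster]
Z = linkage(X, method="ward")
print(Z)  # n points -> (n - 1) merge steps
```

For n points there are always n - 1 merges, ending with one cluster containing everything.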

Dendrogram intuition

A dendrogram is a tree diagram showing merge/split steps.



  flowchart TD
  A[Point A] --> AB[Merge]
  B[Point B] --> AB
  C[Point C] --> CD[Merge]
  D[Point D] --> CD
  AB --> ALL[Merge higher]
  CD --> ALL


Linkage criteria

The linkage criterion defines how the distance between two clusters is measured:

  • single linkage: minimum distance between any pair of points, one from each cluster
  • complete linkage: maximum distance between points across the two clusters
  • average linkage: mean pairwise distance between points across the two clusters
  • Ward linkage: merges the pair whose union least increases within-cluster variance (the default in sklearn; requires Euclidean distance)
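You can see the criteria disagree on the same data. The sketch below (synthetic two-blob data, my choice of seed and blob positions) compares the height of the final merge, i.e. the distance at which the two blobs are joined, for each method.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(0)
# Two loose blobs of 20 points each (illustrative synthetic data).
X = np.vstack([rng.normal(0, 1, (20, 2)),
               rng.normal(8, 1, (20, 2))])

for method in ["single", "complete", "average", "ward"]:
    Z = linkage(X, method=method)
    # Z[-1, 2] is the height of the last merge: the distance at
    # which the two blobs finally join under this criterion.
    print(method, round(float(Z[-1, 2]), 2))
```

Single linkage joins the blobs at the gap between their closest points, complete linkage at their farthest points, so single's final merge height is always the smallest of the four.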

Scikit-learn example

from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# X can be any (n_samples, n_features) array; synthetic blobs here.
X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

hc = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = hc.fit_predict(X)
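If you would rather not fix the number of clusters up front, scikit-learn can cut the tree at a distance instead: pass `n_clusters=None` together with `distance_threshold`. A short sketch, with synthetic data and a threshold value chosen for illustration:

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Synthetic data: three well-separated blobs (illustrative choice).
X, _ = make_blobs(n_samples=150, centers=3, cluster_std=0.5,
                  random_state=0)

# Cut the merge tree at distance 10 instead of fixing K.
hc = AgglomerativeClustering(n_clusters=None, distance_threshold=10.0)
labels = hc.fit_predict(X)
print(hc.n_clusters_)  # number of clusters implied by the cut
```

The threshold plays the role of the horizontal line you would draw across a dendrogram: every merge above it is undone.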

Pros and cons

Pros:

  • no need to choose K in advance (you can cut the dendrogram later)
  • can work with different distance metrics

Cons:

  • can be slow on large datasets (standard agglomerative algorithms need O(n²) memory for the pairwise distances, and roughly O(n² log n) time or worse)
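The first pro above, cutting the dendrogram later, is easy to see with SciPy's `fcluster`: build the tree once, then extract flat clusterings at different heights. The data and cut heights below are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
# Three small blobs along a diagonal (illustrative synthetic data).
X = np.vstack([rng.normal(c, 0.3, (15, 2)) for c in (0, 4, 8)])

Z = linkage(X, method="ward")  # build the full merge tree once

# Cut the same tree at two different heights: a high cut gives
# few coarse clusters, a low cut gives many fine ones.
coarse = fcluster(Z, t=20.0, criterion="distance")
fine = fcluster(Z, t=2.0, criterion="distance")
print(len(set(coarse)), len(set(fine)))
```

No re-fitting is needed between the two cuts, which is exactly what "choose K later" means in practice.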

Mini-checkpoint

Fit the same dataset with Ward linkage and with complete linkage, and compare how the resulting clusters differ.

If this helped you, consider buying me a coffee ☕
