The Elbow Method for Finding "K"

Why choosing K is hard

K-means requires K as an input.

Too small a K:

  • clusters are too broad and merge genuinely distinct groups

Too large a K:

  • clusters become fragmented, splitting natural groups apart

Inertia (within-cluster sum of squares)

K-means minimizes "inertia": the sum of squared distances from each point to its assigned cluster centroid.

As K increases:

  • inertia never increases, and it drops to zero when K equals the number of points

The elbow method looks for a point where improvement slows.
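Inertia can be computed by hand for a fixed assignment, which makes the definition concrete. A minimal sketch with NumPy; the toy points and labels below are illustrative assumptions, not from the original:

```python
import numpy as np

# Toy 2-D points and a hypothetical assignment to 2 clusters
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels = np.array([0, 0, 1, 1])

# Inertia = sum of squared distances from each point to its cluster centroid
inertia = 0.0
for c in np.unique(labels):
    pts = X[labels == c]
    centroid = pts.mean(axis=0)
    inertia += ((pts - centroid) ** 2).sum()

print(inertia)  # each cluster contributes 0.25 + 0.25 = 0.5, so 1.0 total
```

This matches the value scikit-learn exposes as `inertia_` after fitting.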


  flowchart LR
  K[K increases] --> I[Inertia decreases]
  I --> E["Look for elbow (diminishing returns)"]


Typical workflow

  1. run K-means for K = 1..N
  2. plot K vs inertia
  3. pick the elbow point

Scikit-learn snippet

Elbow method (inertia)
from sklearn.cluster import KMeans

# X is your feature matrix of shape (n_samples, n_features)
inertias = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=42)
    km.fit(X)
    inertias.append(km.inertia_)
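The full workflow can be run end to end and the elbow read off a plot. A self-contained sketch; the `make_blobs` data and the output filename are assumptions for illustration:

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with 3 true clusters (an assumption for illustration)
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Step 1: run K-means for K = 1..10 and record inertia
ks = list(range(1, 11))
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=42).fit(X).inertia_
    for k in ks
]

# Step 2: plot K vs inertia; step 3 is reading the elbow off the curve
plt.plot(ks, inertias, marker="o")
plt.xlabel("K (number of clusters)")
plt.ylabel("Inertia")
plt.savefig("elbow.png")
```

On data like this the curve drops steeply up to K = 3 and flattens afterwards, which is the elbow.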

Limitations

  • sometimes no clear elbow
  • prefers spherical clusters (because K-means does)

You can also use:

  • silhouette score
  • domain knowledge
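Unlike inertia, the silhouette score does not decrease automatically as K grows, so it can be maximized directly. A sketch using `sklearn.metrics.silhouette_score`; the synthetic data is an assumption:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data with 3 true clusters (an assumption for illustration)
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Silhouette needs at least 2 clusters, so start at K = 2
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Pick the K with the highest score (range is -1 to 1; higher is better)
best_k = max(scores, key=scores.get)
print(best_k)
```

For well-separated blobs like these, the maximum typically lands on the true cluster count.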

Mini-checkpoint

If there is no elbow, try:

  • silhouette score
  • DBSCAN
  • hierarchical clustering
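DBSCAN sidesteps the problem entirely by inferring the number of clusters from density rather than taking K as input. A sketch; the `eps` and `min_samples` values and the synthetic data are illustrative guesses:

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Synthetic data; cluster_std, eps, and min_samples are illustrative choices
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.5, random_state=42)

labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(X)

# DBSCAN labels noise points as -1; exclude them from the cluster count
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(n_clusters)
```

The trade-off is that `eps` and `min_samples` still need tuning, so the selection problem moves rather than disappears.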
