Customer Churn Analysis

Goal

Given a customer dataset with a churn column (0/1), analyze:

  • Overall churn rate
  • Churn by segment (plan, region)
  • Numeric differences (tenure, usage)

Step 1: Load

Load
import pandas as pd
 
df = pd.read_csv("data/churn.csv")
print(df.shape)
print(df.head())
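
Before computing anything, it can help to check column types, missing values, and that the target really is 0/1. The sketch below uses a small made-up DataFrame in place of data/churn.csv; the plan and tenure column names follow the later steps, and the values are purely illustrative.

```python
import pandas as pd

# Hypothetical stand-in for data/churn.csv (values are made up)
df = pd.DataFrame({
    "plan": ["basic", "pro", "basic", None, "pro"],
    "tenure": [3, 24, 5, 12, None],
    "churn": [1, 0, 1, 0, 0],
})

# Column types, then the number of missing values per column
print(df.dtypes)
missing = df.isna().sum()
print(missing)

# The target should contain only 0 and 1
churn_values = set(df["churn"].dropna().unique())
print("Target is binary:", churn_values <= {0, 1})
```

Columns with missing values found here are the ones Step 5 will need to handle.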

Step 2: Churn rate

Overall churn
rate = df["churn"].mean()
print("Churn rate:", rate)
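
Because churn is coded 0/1, its mean equals the share of customers who churned. A minimal sketch on made-up data that also prints the raw counts and formats the rate as a percentage:

```python
import pandas as pd

# Made-up target column: 2 churned customers out of 8
df = pd.DataFrame({"churn": [1, 0, 0, 1, 0, 0, 0, 0]})

# Raw counts per class, useful for spotting class imbalance
counts = df["churn"].value_counts()
print(counts)

# Mean of a 0/1 column = fraction of ones
rate = df["churn"].mean()
print(f"Churn rate: {rate:.1%}")
```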

Step 3: Churn by category

Churn by segment
import seaborn as sns
import matplotlib.pyplot as plt
 
plt.figure(figsize=(7, 4))
sns.barplot(data=df, x="plan", y="churn")
plt.title("Churn rate by plan")
plt.tight_layout()
plt.show()
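
The bar heights above are group means, since sns.barplot aggregates with the mean by default. The same numbers can be read as a table, which also shows how many customers sit behind each bar. The sketch below assumes plan and region columns as named in the Goal; the helper churn_by and the data are hypothetical.

```python
import pandas as pd

# Made-up data; "plan" and "region" are the segment columns named in the Goal
df = pd.DataFrame({
    "plan": ["basic", "basic", "basic", "pro", "pro", "pro"],
    "region": ["north", "south", "north", "south", "north", "south"],
    "churn": [1, 1, 0, 0, 0, 1],
})

def churn_by(frame, column):
    """Customer count and churn rate per category, highest churn first."""
    return (
        frame.groupby(column)["churn"]
        .agg(customers="count", churn_rate="mean")
        .sort_values("churn_rate", ascending=False)
    )

by_plan = churn_by(df, "plan")
by_region = churn_by(df, "region")
print(by_plan)
print(by_region)
```

Small groups can show extreme rates by chance, so it is worth reading the customers column alongside churn_rate.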

Step 4: Numeric differences

Tenure by churn
import seaborn as sns
import matplotlib.pyplot as plt
 
plt.figure(figsize=(7, 4))
sns.boxplot(data=df, x="churn", y="tenure")
plt.title("Tenure vs churn")
plt.tight_layout()
plt.show()
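
A boxplot shows the distributions; a grouped summary puts numbers on the difference. The sketch below compares mean and median of tenure and usage between churned and retained customers. The usage column name is an assumption taken from the Goal, and the data is made up.

```python
import pandas as pd

# Made-up data; "usage" is an assumed column name
df = pd.DataFrame({
    "tenure": [2, 4, 6, 20, 30, 40],
    "usage": [10.0, 12.0, 8.0, 30.0, 25.0, 35.0],
    "churn": [1, 1, 1, 0, 0, 0],
})

# One row per churn value, mean and median for each numeric column
summary = df.groupby("churn")[["tenure", "usage"]].agg(["mean", "median"])
print(summary)
```

A large gap between mean and median within a group suggests outliers, in which case the median is the safer number to report.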

Step 5: Create a simple model-ready dataset

  • Handle missing values
  • Encode categories
  • Split train/test

This connects back to Phase 4 preprocessing.
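
One possible way to carry out these three steps with pandas alone is sketched below. It is not necessarily the approach used in Phase 4. The data, the column lists, and the helper prepare are hypothetical. The split is done first so that fill values come from training rows only.

```python
import pandas as pd

# Made-up data with a few missing values
df = pd.DataFrame({
    "plan": ["basic", "pro", None, "basic", "pro",
             "basic", "pro", "basic", "pro", "basic"],
    "region": ["north", "south", "north", "south", "north",
               "south", "north", "south", "north", "south"],
    "tenure": [3, 24, 5, None, 18, 2, 36, 7, 12, 1],
    "churn": [1, 0, 1, 0, 0, 1, 0, 1, 0, 1],
})

num_cols = ["tenure"]
cat_cols = ["plan", "region"]

# Split train/test: sample 20% within each churn class (a stratified split)
test = df.groupby("churn").sample(frac=0.2, random_state=42)
train = df.drop(test.index)

# Handle missing values: medians are computed on the training rows only
medians = train[num_cols].median()

def prepare(frame):
    """Fill missing values, then one-hot encode the categorical columns."""
    out = frame.copy()
    out[num_cols] = out[num_cols].fillna(medians)
    out[cat_cols] = out[cat_cols].fillna("missing")
    return pd.get_dummies(out, columns=cat_cols, dtype=int)

X_train = prepare(train).drop(columns="churn")
y_train = train["churn"]

# Align test columns with train: a category absent from test becomes all zeros
X_test = (
    prepare(test)
    .drop(columns="churn")
    .reindex(columns=X_train.columns, fill_value=0)
)
y_test = test["churn"]

print(X_train.shape, X_test.shape)
```

If scikit-learn is available, train_test_split from sklearn.model_selection with its stratify argument is the more common way to do the split.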

Deliverable

Write insights:

  • Which plan/segment churns more?
  • Which features differ strongly?
  • What interventions might reduce churn?
