Customer Churn Analysis
Goal
Given a customer dataset with a binary churn column (0/1), analyze:
- Overall churn rate
- Churn by segment (plan, region)
- Numeric differences (tenure, usage)
Step 1: Load
Load
import pandas as pd
df = pd.read_csv("data/churn.csv")
print(df.shape)
print(df.head())
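Before computing rates, it can help to confirm that the target really is 0/1 and to see where values are missing. The following is a minimal sketch, not part of the original lesson: the helper name `check_churn_column` is hypothetical, and the small frame is made-up data standing in for `data/churn.csv`.

```python
import pandas as pd


def check_churn_column(df: pd.DataFrame, target: str = "churn") -> pd.Series:
    """Check that `target` holds only 0/1 and return missing-value counts per column."""
    if target not in df.columns:
        raise KeyError(f"expected a '{target}' column")
    values = set(df[target].dropna().unique())
    if not values <= {0, 1}:
        raise ValueError(f"'{target}' should contain only 0/1, found {sorted(values)}")
    return df.isna().sum()


# Made-up rows for illustration only (not the real dataset)
demo = pd.DataFrame({
    "plan": ["basic", "pro", "basic", None],
    "tenure": [3, 24, None, 12],
    "churn": [1, 0, 1, 0],
})
missing = check_churn_column(demo)
print(missing)
```

On the real data, call `check_churn_column(df)` right after loading. The missing-value counts it returns feed directly into the preprocessing step later on.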
Step 2: Churn rate
Overall churn
rate = df["churn"].mean()
print("Churn rate:", rate)
Step 3: Churn by category
Churn by segment
import seaborn as sns
import matplotlib.pyplot as plt
plt.figure(figsize=(7, 4))
sns.barplot(data=df, x="plan", y="churn")
plt.title("Churn rate by plan")
plt.tight_layout()
plt.show()
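The goal also lists region, which the plot above does not cover, and a bar chart hides how many customers sit behind each bar. A minimal sketch of a tabular version follows, assuming the segment columns are named `plan` and `region`. The `region` column name and the `churn_by_segment` helper are assumptions, and the frame is made-up data.

```python
import pandas as pd


def churn_by_segment(df: pd.DataFrame, segment: str, target: str = "churn") -> pd.DataFrame:
    """Return customer count and churn rate for each value of `segment`."""
    return (
        df.groupby(segment)[target]
        .agg(customers="count", churn_rate="mean")
        .sort_values("churn_rate", ascending=False)
    )


# Made-up rows for illustration only (not the real dataset)
demo = pd.DataFrame({
    "plan":   ["basic", "basic", "basic", "pro", "pro", "pro"],
    "region": ["north", "south", "north", "south", "north", "south"],
    "churn":  [1, 1, 0, 0, 0, 1],
})
for col in ["plan", "region"]:
    print(churn_by_segment(demo, col), end="\n\n")
```

Keeping the `customers` count next to the rate makes it easier to spot segments whose high churn rate rests on only a handful of rows.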
Step 4: Numeric differences
Tenure by churn
import seaborn as sns
import matplotlib.pyplot as plt
plt.figure(figsize=(7, 4))
sns.boxplot(data=df, x="churn", y="tenure")
plt.title("Tenure vs churn")
plt.tight_layout()
plt.show()
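A box plot shows the shape of the distribution, while a summary table puts numbers on the gap and extends easily to usage, the other numeric feature named in the goal. This is a minimal sketch under assumptions: the column names `tenure` and `usage` may differ in your file, the helper `numeric_summary_by_churn` is hypothetical, and the frame is made-up data.

```python
import pandas as pd


def numeric_summary_by_churn(df: pd.DataFrame, features: list, target: str = "churn") -> pd.DataFrame:
    """Mean and median of each numeric feature, split by churn status."""
    return df.groupby(target)[features].agg(["mean", "median"])


# Made-up rows for illustration only (not the real dataset)
demo = pd.DataFrame({
    "tenure": [2, 4, 6, 20, 30, 40],
    "usage":  [5.0, 7.0, 9.0, 20.0, 25.0, 30.0],
    "churn":  [1, 1, 1, 0, 0, 0],
})
summary = numeric_summary_by_churn(demo, ["tenure", "usage"])
print(summary)
```

The result has one row per churn value and a two-level column index of (feature, statistic), so `summary.loc[1, ("tenure", "mean")]` reads the mean tenure of churned customers.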
Step 5: Create a simple model-ready dataset
- Handle missing values
- Encode categories
- Split train/test
This connects back to Phase 4 preprocessing.
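The three bullets above can be sketched with pandas alone. This is one possible approach under stated assumptions, not necessarily the one used in Phase 4: the helper `make_model_ready`, the "missing" category label, median imputation, and the 80/20 split are all illustrative choices, and the frame is made-up data. scikit-learn's `train_test_split` with its `stratify` argument is a common alternative for the split.

```python
import pandas as pd


def make_model_ready(df: pd.DataFrame, target: str = "churn",
                     test_frac: float = 0.2, seed: int = 42):
    """Encode categories, split train/test stratified on the target, then impute numerics.

    Assumes the DataFrame index is unique (e.g. the default RangeIndex).
    """
    features = df.columns.drop(target)
    num_cols = list(df[features].select_dtypes(include="number").columns)
    cat_cols = [c for c in features if c not in num_cols]

    # Encode categories before splitting so train and test share the same columns.
    # Missing categories get their own explicit "missing" label.
    filled = df.copy()
    filled[cat_cols] = filled[cat_cols].fillna("missing")
    encoded = pd.get_dummies(filled, columns=cat_cols, dtype=int)

    # Split: sample test rows within each churn class so both sets keep a similar churn rate.
    test = encoded.groupby(target).sample(frac=test_frac, random_state=seed)
    train = encoded.drop(test.index)

    # Handle missing numeric values with medians computed from the training rows only,
    # so no information from the test set leaks into the imputation.
    medians = train[num_cols].median()
    return train.fillna(medians), test.fillna(medians)


# Made-up rows for illustration only (not the real dataset)
demo = pd.DataFrame({
    "plan":   ["basic", "pro", None, "basic", "pro", "basic", "pro", "basic", "pro", "basic"],
    "tenure": [2, 30, 5, None, 24, 3, 36, 4, 18, 6],
    "usage":  [5.0, 22.0, 7.5, 6.0, None, 4.0, 30.0, 8.0, 19.0, 9.0],
    "churn":  [1, 0, 1, 1, 0, 1, 0, 1, 0, 0],
})
train, test = make_model_ready(demo)
print(train.shape, test.shape)
print(train.columns.tolist())
```

Splitting before imputing is a deliberate choice here: the medians come from the training rows and are then reused for the test rows, which mirrors how the model would meet unseen data.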
Deliverable
Write up your insights:
- Which plan/segment churns more?
- Which features differ strongly?
- What interventions might reduce churn?
