Chi-Square Test (categorical association)

When to use

Use the chi-square test of independence when you have:

  • Two categorical variables
  • Counts in a contingency table

Example questions:

  • Is purchase (yes/no) associated with plan type (basic/pro)?
  • Is churn associated with region?
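Data often arrive as one row per customer rather than as counts. A minimal sketch of turning such rows into a contingency table with pandas is shown here; the column names `plan` and `churn` and the rows themselves are made up for illustration.

```python
import pandas as pd

# Hypothetical raw data: one row per customer (not from a real dataset)
df = pd.DataFrame(
    {
        "plan": ["basic"] * 100 + ["pro"] * 180,
        "churn": ["no"] * 80 + ["yes"] * 20 + ["no"] * 120 + ["yes"] * 60,
    }
)

# Cross-tabulate the two categorical columns into a table of counts
# rows: plan, columns: churn
ct = pd.crosstab(df["plan"], df["churn"])
print(ct)
```

The resulting table can be passed directly to chi2_contingency.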

Example

Chi-square test
import pandas as pd
from scipy.stats import chi2_contingency
 
# Example contingency table
# rows: plan, columns: churn
ct = pd.DataFrame(
    {
        "churn_no": [80, 120],
        "churn_yes": [20, 60],
    },
    index=["basic", "pro"],
)
 
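# Note: for 2x2 tables (dof == 1), chi2_contingency applies Yates'
# continuity correction by default (correction=True)
chi2, p, dof, expected = chi2_contingency(ct)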
print("chi2:", chi2)
print("p:", p)
print("dof:", dof)
print("expected:\n", expected)

Interpreting results

  • Small p-value (below your chosen significance level, e.g. 0.05) → evidence that the two variables are associated
  • Expected counts should not be too small (rule of thumb: mostly >= 5)
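The expected-count rule of thumb can be checked from the array that chi2_contingency returns. The sketch below reads "mostly" as at least 80% of cells with an expected count of 5 or more and none below 1; that cutoff is a common convention assumed here, not something this page specifies. For a 2x2 table that fails the check, Fisher's exact test is one possible alternative.

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

# Same counts as the example table (rows: plan, columns: churn)
observed = np.array([[80, 20], [120, 60]])

chi2, p, dof, expected = chi2_contingency(observed)

# Share of cells whose expected count is at least 5
share_ok = (expected >= 5).mean()
min_expected = expected.min()
print("share of cells with expected >= 5:", share_ok)
print("smallest expected count:", min_expected)

# Assumed cutoff: at least 80% of cells >= 5 and no cell below 1
if share_ok < 0.8 or min_expected < 1:
    # Fisher's exact test in SciPy handles 2x2 tables
    _, p_exact = fisher_exact(observed)
    print("expected counts too small; Fisher exact p:", p_exact)
```

For the example table every expected count is well above 5, so the chi-square result can be used as is.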

Effect size (optional)

After finding a significant result, consider reporting an effect size such as Cramér's V, since the p-value alone does not say how strong the association is.
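Cramér's V can be computed from the chi-square statistic as V = sqrt(chi2 / (n * min(r - 1, c - 1))), where n is the total count and r and c are the numbers of rows and columns. The sketch below applies this to the example counts; it uses correction=False so that V is based on the uncorrected statistic, which is a choice made here rather than a requirement.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Same counts as the example table (rows: plan, columns: churn)
observed = np.array([[80, 20], [120, 60]])

# Uncorrected chi-square statistic (no Yates' continuity correction)
chi2, _, _, _ = chi2_contingency(observed, correction=False)

n = observed.sum()            # total number of observations
k = min(observed.shape) - 1   # min(rows - 1, columns - 1)

cramers_v = np.sqrt(chi2 / (n * k))
print("Cramer's V:", round(cramers_v, 3))
```

V ranges from 0 (no association) to 1 (perfect association), so the value of about 0.14 here points to a weak association even though the test may be significant.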
