
A/B Testing Basics

What is A/B testing?

A/B testing compares two variants:

  • A = control
  • B = treatment

You measure a metric (conversion, revenue, retention) in both groups and test whether B's improvement over A is larger than random variation alone would explain.

Choose the right metric

  • Primary metric (one)
  • Guardrail metrics (to prevent harm)

Examples:

  • Primary: conversion rate
  • Guardrail: refund rate, latency
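One way to make the primary/guardrail split concrete is a small decision rule: a guardrail that is clearly harmed blocks the launch, and otherwise the primary metric decides. The sketch below is only an illustration under stated assumptions. `MetricResult`, `decide`, the "CI entirely on one side of zero" rule, and all the numbers are hypothetical, not a standard API or policy.

```python
from dataclasses import dataclass


@dataclass
class MetricResult:
    """Hypothetical container: a confidence interval for (B - A) on one metric."""
    name: str
    ci_low: float
    ci_high: float
    higher_is_better: bool


def improved(m: MetricResult) -> bool:
    # The whole interval lies on the "good" side of zero.
    return m.ci_low > 0 if m.higher_is_better else m.ci_high < 0


def harmed(m: MetricResult) -> bool:
    # The whole interval lies on the "bad" side of zero.
    return m.ci_high < 0 if m.higher_is_better else m.ci_low > 0


def decide(primary: MetricResult, guardrails: list[MetricResult]) -> str:
    """Assumed rule: any harmed guardrail blocks; otherwise the primary decides."""
    broken = [g.name for g in guardrails if harmed(g)]
    if broken:
        return "hold: guardrail harmed (" + ", ".join(broken) + ")"
    return "ship" if improved(primary) else "inconclusive"


# Illustrative numbers only, not real experiment results.
primary = MetricResult("conversion rate", 0.001, 0.012, higher_is_better=True)
guardrails = [
    MetricResult("refund rate", -0.002, 0.003, higher_is_better=False),
    MetricResult("latency (ms)", 1.0, 8.0, higher_is_better=False),
]
print(decide(primary, guardrails))
```

Keeping a single primary metric means the ship decision rests on one pre-declared test, while guardrails can only veto it.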

Example: compare conversion rates

Two-proportion z-test (approx)
from math import erf, sqrt

import numpy as np

# A (control): 120 conversions out of 4000
x1, n1 = 120, 4000
# B (treatment): 150 conversions out of 4100
x2, n2 = 150, 4100

# Observed conversion rates
p1 = x1 / n1
p2 = x2 / n2

# Pooled rate and standard error under the null hypothesis (A and B share one rate)
p_pool = (x1 + x2) / (n1 + n2)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z = (p2 - p1) / se

def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Two-sided p-value using the normal approximation
p_value = 2 * (1 - norm_cdf(abs(z)))

print("p1:", p1)
print("p2:", p2)
print("z:", z)
print("p-value:", p_value)
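The same counts also give an effect size with a confidence interval. The sketch below computes a Wald interval for the difference p2 - p1, using the unpooled standard error (the usual choice for an interval, as opposed to the pooled one used in the test). The function name `diff_ci` and the 1.96 critical value for a 95% interval are choices made here for illustration.

```python
from math import sqrt


def diff_ci(x1, n1, x2, n2, z_crit=1.96):
    """Wald confidence interval for p2 - p1 (normal approximation, unpooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p2 - p1
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z_crit * se, diff + z_crit * se


# Same counts as the z-test example: A = 120/4000, B = 150/4100
diff, lo, hi = diff_ci(120, 4000, 150, 4100)
print(f"difference: {diff:.4%}")
print(f"95% CI: [{lo:.4%}, {hi:.4%}]")
```

For these counts the interval runs from roughly -0.1 to +1.4 percentage points. It straddles zero, which agrees with a z-test p-value above 0.05.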

Practical guidance

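  • Run for at least one full week, ideally a whole number of weeks, so that weekday and weekend behavior are both covered.
  • Don’t peek too often: stopping the first time the p-value dips below your threshold inflates the false-positive rate. Decide the sample size or duration before you start.
  • Report the effect size with a confidence interval, not just the p-value.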
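The cost of peeking can be checked with a small simulation. The sketch below runs many A/A tests (both arms share the same true rate, so every "significant" result is a false positive). It compares testing once at the end with testing at every checkpoint and stopping at the first significant look. The function name, the 3% base rate, the 10 looks, and the other defaults are assumptions picked for illustration.

```python
import numpy as np


def false_positive_rates(n_sims=2000, n_per_arm=2000, n_looks=10,
                         p=0.03, z_crit=1.96, seed=0):
    """Simulate A/A tests; return (rate with one final look, rate with peeking)."""
    rng = np.random.default_rng(seed)
    # Sample sizes per arm at which the experiment is inspected.
    checkpoints = np.arange(1, n_looks + 1) * (n_per_arm // n_looks)

    # Per-user conversion outcomes, shape (n_sims, n_per_arm); same true rate in both arms.
    a = rng.random((n_sims, n_per_arm)) < p
    b = rng.random((n_sims, n_per_arm)) < p

    # Cumulative conversions in each arm at each checkpoint.
    xa = np.cumsum(a, axis=1)[:, checkpoints - 1]
    xb = np.cumsum(b, axis=1)[:, checkpoints - 1]

    # Two-proportion z statistic with pooled standard error (equal n per arm).
    n = checkpoints
    p_pool = (xa + xb) / (2 * n)
    se = np.sqrt(p_pool * (1 - p_pool) * (2 / n))
    with np.errstate(divide="ignore", invalid="ignore"):
        z = np.where(se > 0, (xb - xa) / n / se, 0.0)

    significant = np.abs(z) > z_crit
    final_only = significant[:, -1].mean()     # test once, at the planned end
    any_look = significant.any(axis=1).mean()  # stop at the first significant peek
    return final_only, any_look


final_only, any_look = false_positive_rates()
print(f"false positives, single final look: {final_only:.3f}")
print(f"false positives, stopping at any of 10 looks: {any_look:.3f}")
```

With these settings the single-look rate should land near the nominal 5%, while the peeking rate comes out well above it. If interim looks are needed, sequential testing methods are designed to keep the overall error rate controlled.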
