Descriptive Statistics (mean, median, variance)
Central tendency
Mean
- Sensitive to outliers
- Good for symmetric distributions
Mean
import numpy as np
x = np.array([10, 12, 12, 13, 12, 11, 100])
print(np.mean(x))
Median
- Robust to outliers
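To see what "robust" means in practice, the sketch below compares the mean and the median of the same sample with and without its largest value. The data is the sample used elsewhere on this page; the names `with_outlier` and `without_outlier` are only illustrative.

```python
import numpy as np

with_outlier = np.array([10, 12, 12, 13, 12, 11, 100])
# Drop the single extreme value (100) to get a "clean" version of the sample.
without_outlier = with_outlier[with_outlier < 100]

# How far does each statistic move when the outlier is added?
mean_shift = np.mean(with_outlier) - np.mean(without_outlier)
median_shift = np.median(with_outlier) - np.median(without_outlier)

print("mean shift:", mean_shift)
print("median shift:", median_shift)
```

For this sample the median does not move at all, while the mean more than doubles.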
Median
import numpy as np
x = np.array([10, 12, 12, 13, 12, 11, 100])
print(np.median(x))
Mode
The most frequent value. Useful for categorical data.
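For labels rather than numbers, one possible approach is the standard library's `collections.Counter`. This is a minimal sketch with made-up data (`colors` is a hypothetical list), not the only way to compute a mode.

```python
from collections import Counter

# Made-up categorical data for illustration.
colors = ["red", "blue", "red", "green", "blue", "red"]

counts = Counter(colors)
# most_common(1) returns a list with one (value, count) pair.
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)
```

If several values tie for the highest count, `most_common(1)` reports only one of them, so it is worth checking `counts` directly when ties matter.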
Mode (SciPy)
import numpy as np
from scipy import stats
x = np.array([1, 1, 2, 2, 2, 3])
print(stats.mode(x, keepdims=True))
Spread (variability)
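- Range: max - min (very sensitive to outliers, since it depends only on the two extreme values)
- Variance: average squared distance from the mean (the sample version, ddof=1, divides by n - 1 instead of n)
- Standard deviation (std): sqrt(variance), in the same units as the data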
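The three definitions above can be spelled out by hand, as in the sketch below. It assumes the sample formulas (divide by n - 1), which is what `ddof=1` selects in NumPy; the population versions would divide by n instead.

```python
import numpy as np

x = np.array([10, 12, 12, 13, 12, 11, 100])

# Range: distance between the largest and smallest value.
value_range = x.max() - x.min()

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
deviations = x - x.mean()
sample_var = np.sum(deviations ** 2) / (len(x) - 1)

# Standard deviation: square root of the variance.
sample_std = np.sqrt(sample_var)

print("range:", value_range)
print("var:", sample_var)
print("std:", sample_std)
```

The hand-computed values should agree with `np.var(x, ddof=1)` and `np.std(x, ddof=1)` up to floating-point rounding.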
Variance / Std
import numpy as np
x = np.array([10, 12, 12, 13, 12, 11, 100])
print("var:", np.var(x, ddof=1))
print("std:", np.std(x, ddof=1))
IQR (interquartile range)
Robust measure of spread: the width of the middle 50% of the data (Q3 - Q1).
IQR
import numpy as np
x = np.array([10, 12, 12, 13, 12, 11, 100])
q1 = np.percentile(x, 25)
q3 = np.percentile(x, 75)
print("IQR:", q3 - q1)
Quick checklist
- Use median/IQR when outliers exist
- Use mean/std when distribution is roughly symmetric
- Always visualize (histogram/boxplot) before trusting summary stats
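As a rough illustration of the first two checklist items, the sketch below picks which pair of statistics to report. The helper `summarize` is hypothetical, and the 1.5 * IQR cutoff for flagging outliers is a common rule of thumb assumed here, not something this page prescribes. It does not replace looking at a plot.

```python
import numpy as np

def summarize(x):
    """Hypothetical helper: choose center/spread using a 1.5 * IQR outlier check."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    # Assumed rule of thumb: values beyond 1.5 * IQR from the quartiles are outliers.
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    has_outliers = bool(np.any((x < lower) | (x > upper)))
    if has_outliers:
        return {"kind": "median/IQR", "center": np.median(x), "spread": iqr}
    return {"kind": "mean/std", "center": np.mean(x), "spread": np.std(x, ddof=1)}

summary = summarize([10, 12, 12, 13, 12, 11, 100])
print(summary)
```

For the sample with the value 100, the helper falls back to median/IQR; for a sample without extreme values it reports mean/std.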
