Statistical Functions in NumPy

Why statistics in NumPy?

Quick descriptive stats help you:

  • Understand distributions
  • Detect outliers
  • Summarize data before modeling/visualization

Sample data

data
import numpy as np
 
data = np.array([12, 15, 14, 10, 25, 19, 18])

Mean, median

mean-median
import numpy as np
 
data = np.array([12, 15, 14, 10, 25, 19, 18])
 
print(np.mean(data))
print(np.median(data))
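
The mean and median can react very differently to an extreme value, which is one reason to print both. The sketch below appends a made-up value of 100 to the sample data (the value and the name `with_outlier` are illustrative assumptions, not part of the original data) and compares how far each statistic moves.

```python
import numpy as np

data = np.array([12, 15, 14, 10, 25, 19, 18])

# Hypothetical extreme value, appended purely for illustration
with_outlier = np.append(data, 100)

# How far each statistic moves once the extreme value is present
mean_shift = np.mean(with_outlier) - np.mean(data)
median_shift = np.median(with_outlier) - np.median(data)

print(mean_shift)    # the mean is pulled strongly toward the extreme value
print(median_shift)  # the median moves far less
```

A large gap between the mean and the median is often a first hint that the data is skewed or contains outliers.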

Min, max, range

min-max
import numpy as np
 
data = np.array([12, 15, 14, 10, 25, 19, 18])
 
print(np.min(data))
print(np.max(data))
print(np.ptp(data))  # peak-to-peak = max - min
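
If you also need to know where the extremes sit, `np.argmin` and `np.argmax` return their positions rather than their values. This is a small sketch on the same sample data; the variable names are illustrative.

```python
import numpy as np

data = np.array([12, 15, 14, 10, 25, 19, 18])

# Index of the smallest and largest element (first occurrence if tied)
lowest_idx = np.argmin(data)
highest_idx = np.argmax(data)

print(lowest_idx, data[lowest_idx])
print(highest_idx, data[highest_idx])

# The range can be rebuilt from the two extremes
print(data[highest_idx] - data[lowest_idx] == np.ptp(data))
```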

Variance and standard deviation

var-std
import numpy as np
 
data = np.array([12, 15, 14, 10, 25, 19, 18])
 
print(np.var(data))
print(np.std(data))
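
Note that `np.var` and `np.std` default to `ddof=0`, which divides by `n` (the population formula). If the array is a sample from a larger population, you may want `ddof=1`, which divides by `n - 1`. The sketch below compares the two on the sample data; which one is appropriate depends on your use case.

```python
import numpy as np

data = np.array([12, 15, 14, 10, 25, 19, 18])
n = data.size

pop_var = np.var(data)             # divides by n (default, ddof=0)
sample_var = np.var(data, ddof=1)  # divides by n - 1

print(pop_var, sample_var)

# The two differ only by the factor n / (n - 1)
print(np.isclose(sample_var, pop_var * n / (n - 1)))

# The standard deviation is the square root of the matching variance
sample_std = np.std(data, ddof=1)
print(np.isclose(sample_std, np.sqrt(sample_var)))
```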

Percentiles / quantiles

percentile
import numpy as np
 
data = np.array([12, 15, 14, 10, 25, 19, 18])
 
print(np.percentile(data, 25))
print(np.percentile(data, 50))
print(np.percentile(data, 75))
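
`np.quantile` computes the same values as `np.percentile`, but takes fractions between 0 and 1 instead of percentages. Percentiles are also a common building block for outlier detection. The sketch below uses the 1.5 × IQR rule of thumb on a hypothetical array, `readings`, made by appending an invented value of 60 to the sample data. Treat the threshold as a convention rather than a fixed rule.

```python
import numpy as np

data = np.array([12, 15, 14, 10, 25, 19, 18])

# Same result, different scale: 0.25 as a fraction vs. 25 as a percentage
print(np.isclose(np.quantile(data, 0.25), np.percentile(data, 25)))

# Hypothetical data with one invented extreme value, for illustration only
readings = np.append(data, 60)

# Several percentiles can be requested in one call
q1, q3 = np.percentile(readings, [25, 75])
iqr = q3 - q1  # interquartile range

# Rule-of-thumb fences: 1.5 * IQR beyond the quartiles
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

outliers = readings[(readings < lower) | (readings > upper)]
print(outliers)
```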

Working across axes (2D)

axes
import numpy as np
 
mat = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
 
print(np.mean(mat, axis=0))  # per column
print(np.mean(mat, axis=1))  # per row
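
Leaving out `axis` reduces over the whole array. The `keepdims=True` argument keeps the reduced axis with length 1, which makes the result broadcast cleanly against the original array. As a minimal sketch, here is one way to center each row of `mat` around its own mean (the names `row_means` and `centered` are illustrative).

```python
import numpy as np

mat = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

# No axis: a single mean over all elements
print(np.mean(mat))

# keepdims=True gives shape (2, 1) instead of (2,)
row_means = np.mean(mat, axis=1, keepdims=True)
print(row_means.shape)

# Broadcasting subtracts each row's mean from that row
centered = mat - row_means
print(centered)
```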

Correlation and covariance

corr-cov
import numpy as np
 
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])
 
cov = np.cov(x, y)
print("cov:\n", cov)
 
corr = np.corrcoef(x, y)
print("corr:\n", corr)
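
Both functions return a 2×2 matrix: the diagonal holds each variable's own variance (or a correlation of 1), and the off-diagonal entries hold the value for the pair. The sketch below pulls out the off-diagonal entry and rebuilds the correlation from the covariance matrix, to show how the two are related. The name `manual_corr` is illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])

cov = np.cov(x, y)  # uses n - 1 in the denominator by default
cov_xy = cov[0, 1]  # covariance between x and y

# Correlation = covariance divided by the product of standard deviations
manual_corr = cov_xy / np.sqrt(cov[0, 0] * cov[1, 1])

corr_xy = np.corrcoef(x, y)[0, 1]

print(cov_xy)
print(manual_corr, corr_xy)  # y is an exact multiple of x, so both are 1
```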

Next

Continue to: Saving and Loading NumPy Data to persist arrays efficiently.

🧪 Try It Yourself

Exercise 1: Create a NumPy Array

Exercise 2: Array Shape and Reshape

Exercise 3: Array Arithmetic
