Feature Scaling (MinMax vs Standard)

Why scale features?

Scaling is mainly important for ML algorithms that use distances or gradients:

  • k-NN
  • k-means
  • SVM
  • linear/logistic regression (especially with regularization or gradient-based solvers)
  • neural networks

Tree-based models (like Random Forest) usually don’t need scaling, because their splits depend only on the ordering of feature values, not their magnitude.

Two common scalers

Min-Max scaling

Maps values to a fixed range (usually 0 to 1):

  • x_scaled = (x - min) / (max - min)

Standardization (z-score scaling)

Centers and scales to mean=0 and std=1:

  • x_scaled = (x - mean) / std

Example (using scikit-learn)

MinMaxScaler vs StandardScaler:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

x = np.array([[10], [20], [30], [100]])

mm = MinMaxScaler()
ss = StandardScaler()

print("MinMax:")
print(mm.fit_transform(x))

print("Standard:")
print(ss.fit_transform(x))
```

Important rule: fit on train only

When doing ML:

  • Fit scaler on training data
  • Transform both train and test using that scaler

This avoids data leakage.
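A minimal sketch of this pattern (the array values and variable names are illustrative, not from the article):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[10.0], [20.0], [30.0]])
X_test = np.array([[40.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit: learn mean/std from train only
X_test_scaled = scaler.transform(X_test)        # transform: reuse the train statistics
```

Calling `fit_transform` on the test set would recompute the mean and std from test data, silently leaking information about it into preprocessing.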

Practical guidance

  • If the data has strong outliers: they pull both the min/max and the mean/std, so MinMaxScaler and StandardScaler are both distorted; consider RobustScaler, which uses the median and interquartile range instead.
  • If you need values bounded to a fixed range (0–1): use MinMaxScaler.
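As a quick sketch of the outlier case (the data values here are made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Illustrative data: 1000.0 is a strong outlier.
x = np.array([[10.0], [20.0], [30.0], [1000.0]])

# RobustScaler subtracts the median and divides by the IQR,
# so the outlier does not dominate the centering and spread.
rs = RobustScaler()
out = rs.fit_transform(x)
print(out)
```

With StandardScaler on the same data, the outlier inflates the std so much that the three inliers get squeezed into a narrow band; RobustScaler keeps the outlier clearly separated while scaling by statistics the outlier barely affects.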
