Bagging - Random Forest Regressor/Classifier

What bagging is

Bagging = Bootstrap Aggregating.

Steps:

  1. sample (with replacement) multiple datasets from the training set
  2. train one model per sample
  3. aggregate predictions (vote/average)
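
The three steps above can be sketched by hand. This is a minimal illustration, not how any library implements bagging internally; the synthetic dataset, the choice of 25 models, and the variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

# Synthetic data, used only for illustration.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

rng = np.random.default_rng(0)
n_models = 25  # arbitrary choice for the sketch
models = []

for _ in range(n_models):
    # 1. Bootstrap sample: draw len(X) row indices with replacement.
    idx = rng.integers(0, len(X), size=len(X))
    # 2. Train one model on that sample.
    tree = DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx])
    models.append(tree)

# 3. Aggregate: average the per-model predictions (regression case).
all_preds = np.stack([m.predict(X) for m in models])  # (n_models, n_samples)
bagged_pred = all_preds.mean(axis=0)
```

For classification, step 3 would use a vote over predicted classes instead of an average.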

  flowchart TD
  D[Training data] --> S1[Bootstrap sample 1]
  D --> S2[Bootstrap sample 2]
  D --> S3[Bootstrap sample 3]
  S1 --> T1[Tree 1]
  S2 --> T2[Tree 2]
  S3 --> T3[Tree 3]
  T1 --> A[Aggregate]
  T2 --> A
  T3 --> A
  A --> P[Final prediction]

Random Forest in one line

A Random Forest is bagging + decision trees + random feature selection at each split.

The extra randomness makes the trees less correlated with each other, so aggregating them reduces variance and usually improves generalization.
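
To make the "bagging + decision trees + random feature selection" recipe concrete, the sketch below assembles the same ingredients by hand with `BaggingClassifier` and compares that to the built-in forest. The two are similar in spirit rather than identical in implementation; the dataset and the 50-tree setting are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 16 features, used only for illustration.
X, y = make_classification(
    n_samples=300, n_features=16, n_informative=6, random_state=0
)

# Hand-assembled: bagging over trees that look at a random subset
# of features at each split.
manual = BaggingClassifier(
    DecisionTreeClassifier(max_features="sqrt"),
    n_estimators=50,
    random_state=0,
).fit(X, y)

# Built-in: the same three ingredients in one estimator.
rf = RandomForestClassifier(
    n_estimators=50, max_features="sqrt", random_state=0
).fit(X, y)

# Number of candidate features each tree considers per split:
# sqrt(16) = 4 here.
features_per_split = rf.estimators_[0].max_features_
```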

Classification vs regression

  • RandomForestClassifier: majority vote (scikit-learn averages the trees' class probabilities and predicts the most probable class)
  • RandomForestRegressor: average prediction
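
The aggregation rule can be checked against the individual trees, which a fitted forest exposes as `estimators_`. This is a small sanity-check sketch on synthetic data (the datasets and the 20-tree setting are assumptions for the example); note that scikit-learn's classifier aggregates by averaging per-tree class probabilities rather than counting hard votes.

```python
import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Regression: the forest prediction is the mean of the tree predictions.
Xr, yr = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)
reg = RandomForestRegressor(n_estimators=20, random_state=0).fit(Xr, yr)
tree_mean = np.mean([t.predict(Xr) for t in reg.estimators_], axis=0)
reg_matches = np.allclose(reg.predict(Xr), tree_mean)

# Classification: average the per-tree class probabilities,
# then pick the class with the highest mean probability.
Xc, yc = make_classification(n_samples=200, n_features=8, random_state=0)
clf = RandomForestClassifier(n_estimators=20, random_state=0).fit(Xc, yc)
mean_proba = np.mean([t.predict_proba(Xc) for t in clf.estimators_], axis=0)
manual_labels = clf.classes_[mean_proba.argmax(axis=1)]
clf_matches = np.array_equal(clf.predict(Xc), manual_labels)
```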

Scikit-learn examples

Random Forest (classification)
from sklearn.ensemble import RandomForestClassifier
 
rf = RandomForestClassifier(
    n_estimators=200,
    max_depth=None,
    random_state=42,
    n_jobs=-1,
)
Random Forest (regression)
from sklearn.ensemble import RandomForestRegressor
 
rf = RandomForestRegressor(
    n_estimators=300,
    random_state=42,
    n_jobs=-1,
)
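
The snippets above only construct the estimators. A minimal end-to-end sketch, assuming synthetic data and a simple hold-out split (both are illustrative choices, not part of the original examples), could look like this:

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

# --- Classification ---
Xc, yc = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=42
)
Xc_tr, Xc_te, yc_tr, yc_te = train_test_split(
    Xc, yc, test_size=0.25, random_state=42
)
clf = RandomForestClassifier(
    n_estimators=200, max_depth=None, random_state=42, n_jobs=-1
)
clf.fit(Xc_tr, yc_tr)
accuracy = clf.score(Xc_te, yc_te)  # mean accuracy on held-out data

# --- Regression ---
Xr, yr = make_regression(
    n_samples=500, n_features=10, noise=15.0, random_state=42
)
Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(
    Xr, yr, test_size=0.25, random_state=42
)
reg = RandomForestRegressor(n_estimators=300, random_state=42, n_jobs=-1)
reg.fit(Xr_tr, yr_tr)
r2 = reg.score(Xr_te, yr_te)  # R^2 on held-out data

print(f"accuracy={accuracy:.3f}  R^2={r2:.3f}")
```

`score` returns mean accuracy for the classifier and R² for the regressor, so the two numbers are not directly comparable.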

Useful hyperparameters

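  • n_estimators: more trees → more stable predictions (with diminishing returns), but slower training and prediction
  • max_depth: caps how deep each tree can grow; lower values limit overfitting
  • min_samples_leaf: minimum number of samples per leaf; larger values smooth the leaves
  • max_features: number of features considered at each split; controls feature randomness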
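
One way to see what these settings do is to inspect the fitted trees directly. The sketch below is illustrative only: the dataset, the 30-tree forest, and the helper `describe` are assumptions made for the example, and the specific values `max_depth=4` and `min_samples_leaf=10` are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, used only for illustration.
X, y = make_regression(n_samples=400, n_features=10, noise=20.0, random_state=0)

def describe(**params):
    """Fit a small forest and return (max tree depth, mean leaves per tree)."""
    rf = RandomForestRegressor(n_estimators=30, random_state=0, **params).fit(X, y)
    depths = [t.get_depth() for t in rf.estimators_]
    leaves = [t.get_n_leaves() for t in rf.estimators_]
    return max(depths), float(np.mean(leaves))

default_depth, default_leaves = describe()                     # unconstrained trees
capped_depth, capped_leaves = describe(max_depth=4)            # depth limit
smooth_depth, smooth_leaves = describe(min_samples_leaf=10)    # larger leaves

print("default:", default_depth, default_leaves)
print("max_depth=4:", capped_depth, capped_leaves)
print("min_samples_leaf=10:", smooth_depth, smooth_leaves)
```

Both constraints produce smaller trees than the default: `max_depth` by cutting growth off at a fixed level, `min_samples_leaf` by refusing splits that would leave too few samples in a leaf.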

Mini-checkpoint

First try:

  • deep trees in the forest (the default, max_depth=None)
  • tune max_depth and min_samples_leaf if the model overfits
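
A simple way to judge whether tuning is needed is to compare training and held-out scores: a large gap suggests overfitting. The following sketch assumes synthetic data with some label noise; the helper `train_test_scores` and the values `max_depth=5`, `min_samples_leaf=5` are hypothetical starting points, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data with 10% flipped labels, used only for illustration.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, flip_y=0.1, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def train_test_scores(**params):
    """Fit a forest and return (train accuracy, test accuracy)."""
    rf = RandomForestClassifier(n_estimators=100, random_state=0, **params)
    rf.fit(X_tr, y_tr)
    return rf.score(X_tr, y_tr), rf.score(X_te, y_te)

# First try: deep trees (defaults).
deep_train, deep_test = train_test_scores()

# If the gap is large: constrain the trees and compare again.
constrained_train, constrained_test = train_test_scores(
    max_depth=5, min_samples_leaf=5
)

print(f"deep:        train={deep_train:.3f} test={deep_test:.3f}")
print(f"constrained: train={constrained_train:.3f} test={constrained_test:.3f}")
```

Whether the constrained forest actually scores better on held-out data depends on the dataset, so the comparison should be rerun on real data, ideally with cross-validation.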
