RandomizedSearchCV for Large Parameter Spaces
Why randomized search
Randomized search samples a fixed number of combinations from distributions.
It's often more efficient than grid search because:
- not all hyperparameters matter equally
- you can explore wide ranges quickly
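The first point is easy to see with a toy budget comparison: with 9 trials, a 3×3 grid tests only 3 distinct values per hyperparameter, while 9 random draws give up to 9 distinct values of whichever parameter actually matters. A minimal sketch (the ranges are illustrative):

```python
import random

random.seed(0)

# Budget of 9 trials over a learning-rate axis.
grid_lr = [1e-4, 1e-3, 1e-2] * 3  # 3x3 grid: only 3 distinct values
random_lr = [10 ** random.uniform(-4, -2) for _ in range(9)]  # 9 distinct draws

print(len(set(grid_lr)), "distinct grid values")
print(len(set(random_lr)), "distinct random values")
```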
The idea
flowchart LR
    D[Parameter distributions] --> R[Sample N combos]
    R --> CV[Cross-validation]
    CV --> B[Best found]
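The loop in the diagram can be sketched in plain Python; the scoring function below is a stand-in for real cross-validation, not part of any library:

```python
import random

random.seed(42)

space = {
    "n_estimators": [200, 400, 800],
    "max_depth": [None, 5, 10, 20],
}

# Sample N combos from the space (lists are sampled uniformly).
n_iter = 5
combos = [{k: random.choice(v) for k, v in space.items()} for _ in range(n_iter)]

def evaluate(params):
    # Stand-in for cross-validation scoring (assumption: higher is better).
    return random.random()

# Keep the best-scoring sampled combination.
best = max(combos, key=evaluate)
print("best found:", best)
```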
Scikit-learn example
RandomizedSearchCV
from sklearn.model_selection import RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier

# Lists are sampled uniformly; scipy.stats distributions also work here.
params = {
    "n_estimators": [200, 400, 800],
    "max_depth": [None, 5, 10, 20],
    "min_samples_leaf": [1, 2, 4, 8],
}

search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_distributions=params,
    n_iter=20,  # number of sampled combinations
    cv=5,
    scoring="accuracy",
    n_jobs=-1,
    random_state=42,
)

# X, y: your feature matrix and target labels
search.fit(X, y)
print("best params:", search.best_params_)
print("best score:", search.best_score_)
When to use which
- GridSearchCV: small search space
- RandomizedSearchCV: large/continuous search space
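The cost difference is easy to verify: both searches below cover the same 16-combination space, but the randomized one evaluates only a 6-combination sample. The dataset comes from make_classification and the space is illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Toy classification data; any dataset works here.
X, y = make_classification(n_samples=200, random_state=0)
params = {"max_depth": [None, 5, 10, 20], "min_samples_leaf": [1, 2, 4, 8]}
model = RandomForestClassifier(n_estimators=10, random_state=0)

grid = GridSearchCV(model, params, cv=3)
rand = RandomizedSearchCV(model, params, n_iter=6, cv=3, random_state=0)
grid.fit(X, y)
rand.fit(X, y)

print(len(grid.cv_results_["params"]))  # 16 candidates (4 x 4)
print(len(rand.cv_results_["params"]))  # 6 sampled candidates
```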
Mini-checkpoint
If you're tuning a learning rate in [1e-4, 1], randomized search is usually better, because you can sample the continuous range directly instead of committing to a handful of grid points.
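For a range like [1e-4, 1] that spans several orders of magnitude, a log-uniform distribution keeps the small values from being under-sampled; scipy provides one that plugs straight into param_distributions:

```python
from scipy.stats import loguniform

# Log-uniform over [1e-4, 1]: each decade gets equal probability mass.
lr_dist = loguniform(1e-4, 1)
samples = lr_dist.rvs(size=5, random_state=42)

print(samples)
# Usable directly: param_distributions={"learning_rate": lr_dist}
```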
