Support Vector Machines (SVM)
Core idea: maximum margin
SVM finds the decision boundary that maximizes the margin, i.e. the distance from the boundary to the closest training points of each class.
flowchart LR
    A[Class -1 points] --> B[Max-margin hyperplane]
    C[Class +1 points] --> B
Support vectors
Only a subset of points (support vectors) define the boundary.
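To see this, you can fit scikit-learn's SVC on a tiny dataset and inspect the fitted `support_vectors_` attribute. This is a minimal sketch; the data values below are made up for illustration:

```python
import numpy as np
from sklearn.svm import SVC

# Tiny illustrative 2-D dataset (made-up values): two well-separated clusters
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [2.0, 2.0], [2.2, 1.9], [1.8, 2.1]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Only the points closest to the boundary are kept as support vectors;
# the rest of the training set could be deleted without changing the model.
print(clf.n_support_)        # count of support vectors per class
print(clf.support_vectors_)  # the support vector points themselves
```

With well-separated clusters like these, only a handful of the six points end up as support vectors.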
Kernels (non-linear boundaries)
SVM can use kernels to separate data that isn't linearly separable.
Common kernels:
- linear
- RBF (radial basis function)
- polynomial
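All three plug into the same SVC interface via the `kernel` argument. A minimal sketch, using made-up blob data from `make_blobs` just to exercise each kernel:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Toy two-class data, only to illustrate the shared interface
X, y = make_blobs(n_samples=100, centers=2, random_state=0)

# Same estimator, different kernel argument
kernels = {
    "linear": SVC(kernel="linear", C=1.0),
    "rbf": SVC(kernel="rbf", C=1.0, gamma="scale"),
    "poly": SVC(kernel="poly", degree=3, C=1.0, gamma="scale"),
}
scores = {name: model.fit(X, y).score(X, y) for name, model in kernels.items()}
print(scores)
```

On easy, well-separated blobs all three kernels fit well; the choice matters once the class boundary is curved.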
Scaling matters
SVM is sensitive to feature scales.
Standardize features first, e.g. with scikit-learn's StandardScaler.
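To illustrate why, here is a sketch using scikit-learn's built-in wine dataset, whose features span very different ranges (some are fractions, one is in the hundreds):

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Without scaling: large-range features dominate the RBF kernel distances
raw_acc = SVC().fit(X_train, y_train).score(X_test, y_test)

# With scaling: every feature contributes on a comparable scale
scaler = StandardScaler().fit(X_train)
scaled_acc = SVC().fit(scaler.transform(X_train), y_train).score(
    scaler.transform(X_test), y_test
)
print(f"unscaled: {raw_acc:.2f}, scaled: {scaled_acc:.2f}")
```

The scaled model typically scores markedly higher on this dataset, which is exactly the effect the pipeline below guards against.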
Scikit-learn example
SVM with scaling
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
svm = Pipeline(
    steps=[
        ("scaler", StandardScaler()),
        ("model", SVC(kernel="rbf", C=1.0, gamma="scale", probability=True)),
    ]
)
Pros and cons
Pros:
- strong on small-to-medium-sized datasets
- effective in high-dimensional spaces
Cons:
- training scales poorly with the number of samples, so it can be slow on very large datasets
- less interpretable than simple linear models
Mini-checkpoint
Try a linear kernel and an RBF kernel on the same dataset and compare their accuracy.
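As a starting point, here is a sketch using scikit-learn's `make_circles` generator: two concentric rings are not linearly separable by construction, so the linear kernel does little better than chance while RBF separates them cleanly.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: no straight line can separate these classes
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

results = {}
for kernel in ("linear", "rbf"):
    results[kernel] = SVC(kernel=kernel).fit(X_train, y_train).score(X_test, y_test)
    print(f"{kernel}: {results[kernel]:.2f}")
```

Swap in your own dataset to see whether the gap persists when the boundary is closer to linear.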
