
Support Vector Machines (SVM)

Core idea: maximum margin

An SVM finds the decision boundary (a hyperplane) that maximizes the margin: the distance from the boundary to the closest training points of each class.



  flowchart LR
  A[Class -1 points] --> B[Max-margin hyperplane]
  C[Class +1 points] --> B

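The margin can be read off a fitted linear SVM: its width is 2 / ||w||, where w is the learned weight vector. A minimal sketch on a hand-made toy dataset (the points and the large C, which approximates a hard margin, are illustrative choices, not from the text):

import numpy as np
from sklearn.svm import SVC

# Two well-separated toy classes in 2-D
X = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates a hard-margin SVM on separable data
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]
margin_width = 2.0 / np.linalg.norm(w)  # distance between the two margin lines
print(round(margin_width, 3))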

Support vectors

Only a subset of the training points, the support vectors, determine the boundary; the remaining points could be removed without changing it.
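In scikit-learn, a fitted SVC exposes the support vectors directly. A quick sketch (the dataset and parameters are illustrative):

from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)
clf = SVC(kernel="linear", C=1.0).fit(X, y)

print(clf.support_vectors_.shape)  # (n_support_vectors, n_features)
print(clf.n_support_)              # number of support vectors per class

Typically far fewer points end up as support vectors than there are training samples.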

Kernels (non-linear boundaries)

SVM can use kernels to separate data that isn’t linearly separable.

Common kernels:

  • linear
  • RBF (radial basis function)
  • polynomial
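These kernels map to SVC's kernel parameter. A sketch comparing them on a dataset with a curved class boundary, where a linear kernel typically underfits (make_moons and the C/gamma values are illustrative choices):

from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.15, random_state=0)

scores = {}
for kernel in ["linear", "rbf", "poly"]:
    clf = SVC(kernel=kernel, C=1.0, gamma="scale").fit(X, y)
    scores[kernel] = clf.score(X, y)  # training accuracy, for illustration
    print(kernel, round(scores[kernel], 3))

On this kind of data the RBF kernel usually scores noticeably higher than the linear one.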

Scaling matters

SVM is sensitive to feature scales, because kernels such as RBF depend on distances between points.

Use StandardScaler to standardize features before fitting.

Scikit-learn example

SVM with scaling
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
 
svm = Pipeline(
    steps=[
        ("scaler", StandardScaler()),
        ("model", SVC(kernel="rbf", C=1.0, gamma="scale", probability=True)),
    ]
)

Pros and cons

Pros:

  • strong on small-to-medium datasets
  • effective in high-dimensional feature spaces

Cons:

  • training can be slow on very large datasets (kernel SVMs scale poorly with sample count)
  • harder to interpret than linear models or decision trees

Mini-checkpoint

Try both a linear and an RBF kernel on the same dataset and compare their accuracy.
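One way to run this checkpoint: cross-validate both kernels inside the scaling pipeline from above. The dataset and the C/gamma values here are illustrative choices:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

results = {}
for kernel in ["linear", "rbf"]:
    pipe = Pipeline(
        steps=[
            ("scaler", StandardScaler()),
            ("model", SVC(kernel=kernel, C=1.0, gamma="scale")),
        ]
    )
    # 5-fold cross-validated accuracy for each kernel
    results[kernel] = cross_val_score(pipe, X, y, cv=5).mean()
    print(kernel, round(results[kernel], 3))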

If this helped you, consider buying me a coffee ☕
