The ROC Curve and AUC
What ROC is
ROC stands for Receiver Operating Characteristic.
It’s a curve that shows the tradeoff between:
- TPR (True Positive Rate) = recall = TP / (TP + FN)
- FPR (False Positive Rate) = FP / (FP + TN)
As you change the probability threshold, TPR and FPR change.
flowchart LR
  S[Model outputs scores/probabilities] --> T[Choose threshold]
  T --> R[Compute TPR and FPR]
  R --> C[Plot ROC curve]
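Sweeping the threshold is exactly what scikit-learn's roc_curve does for you: it returns one (FPR, TPR) point per candidate threshold. A minimal sketch with made-up labels and scores:

```python
from sklearn.metrics import roc_curve

# Toy data (made up for illustration)
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# One (FPR, TPR) point per threshold, from strictest to loosest
fpr, tpr, thresholds = roc_curve(y_true, y_score)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  TPR={t:.2f}  FPR={f:.2f}")
```

Lowering the threshold moves you along the curve: both TPR and FPR can only go up as more examples get classified as positive.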
How to interpret the curve
- A curve closer to the top-left is better.
- The diagonal line is a “random guess” baseline.
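One way to see both points at once is to plot a model's curve against the diagonal baseline. A sketch using matplotlib and scikit-learn's roc_curve, with made-up scores and a hypothetical output filename:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

# Toy data (made up for illustration)
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
fpr, tpr, _ = roc_curve(y_true, y_score)

plt.plot(fpr, tpr, label="model")
plt.plot([0, 1], [0, 1], linestyle="--", label="random baseline")
plt.xlabel("FPR")
plt.ylabel("TPR")
plt.legend()
plt.savefig("roc.png")  # hypothetical filename
```

The closer the solid curve bows toward the top-left corner, the more TPR the model buys per unit of FPR.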
AUC
AUC is the Area Under the ROC Curve.
- AUC = 1.0 → perfect ranking
- AUC = 0.5 → random
AUC measures how well the model ranks positives higher than negatives.
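That ranking interpretation can be checked directly: AUC equals the probability that a randomly chosen positive receives a higher score than a randomly chosen negative (ties count half). A small sketch with made-up scores:

```python
from itertools import product
from sklearn.metrics import roc_auc_score

# Toy data (made up for illustration)
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# Compare every (positive, negative) score pair
pos = [s for y, s in zip(y_true, y_score) if y == 1]
neg = [s for y, s in zip(y_true, y_score) if y == 0]
pairs = list(product(pos, neg))
wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)

print(wins / len(pairs))               # 0.75 (3 of 4 pairs ranked correctly)
print(roc_auc_score(y_true, y_score))  # 0.75 (same number)
```

Because AUC only depends on the ordering of scores, any monotonic rescaling of the scores leaves it unchanged.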
When ROC/AUC is useful
- good for comparing classifiers independent of a single threshold
- useful when you care about ranking quality
When ROC can mislead
With highly imbalanced data, ROC curves can look overly optimistic: when negatives vastly outnumber positives, even a large number of false positives barely moves the FPR (its denominator, FP + TN, is huge). In that setting, PR curves (precision-recall) are often more informative.
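A small illustration with synthetic scores (all numbers made up): the model ranks most negatives below the positives, so ROC-AUC looks high, while average precision (the usual PR-curve summary) exposes the 50 high-scoring false positives drowning out the 10 positives.

```python
from sklearn.metrics import roc_auc_score, average_precision_score

# Imbalanced toy set: 10 positives, 1000 negatives
y_true = [1] * 10 + [0] * 1000
# Positives score 0.9, but 50 negatives score even higher (0.95)
y_score = [0.9] * 10 + [0.95] * 50 + [0.1] * 950

print(roc_auc_score(y_true, y_score))            # 0.95 — looks excellent
print(average_precision_score(y_true, y_score))  # ~0.17 — poor precision revealed
```

The 50 false positives cost only 50/1000 = 0.05 in FPR, but at the threshold that recovers all positives, precision is just 10/60.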
Scikit-learn example
Compute ROC-AUC

from sklearn.metrics import roc_auc_score

# y_score is the predicted probability for the positive class
auc = roc_auc_score(y_true, y_score)
print("ROC-AUC:", auc)

Mini-checkpoint
If a model has high accuracy but ROC-AUC ~ 0.5, what might be happening?
- The model may be predicting the majority class and not learning useful ranking.
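A sketch of that failure mode, using a made-up 95/5 class split and a constant-score model:

```python
from sklearn.metrics import accuracy_score, roc_auc_score

# 95% negatives: always predicting the majority class looks accurate
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100    # hard labels: always predict negative
y_score = [0.5] * 100  # constant score: carries no ranking information

print(accuracy_score(y_true, y_pred))   # 0.95 — high accuracy
print(roc_auc_score(y_true, y_score))   # 0.5  — no better than chance
```

Accuracy rewards agreeing with the majority class; AUC asks whether positives are scored above negatives, which a constant score can never do.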