Precision, Recall, and F1-Score

Precision

Precision answers:

“Of all predicted positives, how many were correct?”

precision = TP / (TP + FP)

High precision means few false alarms.
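As a toy sketch with made-up counts (TP and FP are assumed values, not from any real model), the formula is just:

```python
# hypothetical confusion-matrix counts for a binary classifier
TP = 8   # true positives: predicted positive, actually positive
FP = 2   # false positives: predicted positive, actually negative

precision = TP / (TP + FP)
print(precision)  # 0.8 -> 8 of the 10 positive predictions were correct
```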

Recall

Recall answers:

“Of all actual positives, how many did we catch?”

recall = TP / (TP + FN)

High recall means you miss fewer true cases.

F1-score

F1 is the harmonic mean:

F1 = 2 * (precision * recall) / (precision + recall)

Use it when you need a single number that balances precision and recall; because it is a harmonic mean, it is dragged down by whichever of the two is lower.
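Plugging in the example precision and recall values from above (assumed numbers, not real measurements):

```python
# example values: precision 0.8, recall 2/3 (from the toy counts above)
precision = 0.8
recall = 2 / 3

# harmonic mean of precision and recall
f1 = 2 * (precision * recall) / (precision + recall)
print(f1)  # ~0.727 -> sits between the two, pulled toward the lower value
```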

When to prefer which

  • fraud detection: often prioritize recall (don’t miss fraud)
  • email spam: balance (F1) or precision (don’t block good emails)
  • medical screening: often recall (catch cases) with follow-up tests

Scikit-learn example

The original snippet assumes `y_true` and `y_pred` already exist; the example labels below are made up so the block runs on its own:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# example labels; replace with your model's outputs
y_true = [0, 1, 1, 0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]

print("precision:", precision_score(y_true, y_pred))
print("recall:", recall_score(y_true, y_pred))
print("f1:", f1_score(y_true, y_pred))
```

Mini-checkpoint

If you increase the classification threshold from 0.5 to 0.9:

  • precision usually goes (up/down)?
  • recall usually goes (up/down)?

(Precision up, recall down.)
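You can verify the checkpoint answer directly. This sketch uses made-up predicted probabilities (`y_score`) and labels, thresholds them at 0.5 and 0.9, and scores each result:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# hypothetical true labels and predicted probabilities
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0, 1, 1])
y_score = np.array([0.2, 0.6, 0.95, 0.7, 0.55, 0.4, 0.92, 0.85, 0.6, 0.97])

for threshold in (0.5, 0.9):
    # predict positive only when the score clears the threshold
    y_pred = (y_score >= threshold).astype(int)
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

At the higher threshold the classifier makes fewer, more confident positive predictions, so precision rises while recall falls.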
