Cost Functions - Mean Squared Error (MSE)

What a cost function is

A cost function summarizes, as a single number, how far a model's predictions are from the true values. Lower is better.

Training usually means:

find the parameters that minimize the cost.

For regression, the most common cost function is Mean Squared Error (MSE).
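To make "find the parameters that minimize the cost" concrete, here is a minimal sketch. It assumes a one-parameter model y_hat = w * x and made-up toy data; the names x, y, cost, and candidates are illustrative only. It simply tries many slopes and keeps the one with the lowest MSE. Real training would normally use an optimizer such as gradient descent rather than a grid.

```python
import numpy as np

# Hypothetical toy data, roughly following y = 2 * x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

def cost(w):
    """Mean squared error of the one-parameter model y_hat = w * x."""
    y_hat = w * x
    return np.mean((y - y_hat) ** 2)

# Try a grid of candidate slopes and keep the one with the lowest cost.
candidates = np.linspace(0.0, 4.0, 401)  # slopes 0.00, 0.01, ..., 4.00
costs = np.array([cost(w) for w in candidates])
best_w = candidates[np.argmin(costs)]

print("best w:", best_w)
print("cost at best w:", costs.min())
```

The best slope lands close to 2, which matches how the toy data was constructed.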

MSE definition

For N samples, where yi is the true value and ŷi is the prediction for sample i:

MSE = (1/N) * Σ (yi - ŷi)²

Why square?

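  • positive and negative errors cannot cancel each other out
  • penalizes large errors disproportionately (doubling an error quadruples its contribution)
  • differentiable everywhere (good for gradient-based optimization)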
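The points above can be checked numerically. The sketch below is illustrative: it reuses small made-up arrays, shows that signed errors partly cancel while squared errors do not, and compares the analytic gradient of MSE with respect to each prediction, -(2/N) * (yi - ŷi), against a finite-difference estimate. The helper name mse is an assumption for this example.

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.5, 5.2, 1.8, 7.9])

def mse(pred):
    """Mean squared error against the fixed y_true above."""
    return np.mean((y_true - pred) ** 2)

# Signed errors can cancel; squared errors cannot.
errors = y_true - y_pred
print("mean signed error:", errors.mean())
print("mean squared error:", mse(y_pred))

# Analytic gradient of MSE with respect to each prediction:
# d(MSE)/d(pred_i) = -(2 / N) * (y_i - pred_i)
analytic_grad = -2.0 * errors / len(y_true)

# Central finite-difference approximation, for comparison.
eps = 1e-6
numeric_grad = np.zeros_like(y_pred)
for i in range(len(y_pred)):
    step = np.zeros_like(y_pred)
    step[i] = eps
    numeric_grad[i] = (mse(y_pred + step) - mse(y_pred - step)) / (2 * eps)

print("analytic gradient:", analytic_grad)
print("numeric gradient: ", numeric_grad)
```

The two gradients should agree to several decimal places, which is what makes MSE convenient for gradient-based optimizers.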

Intuition

If you miss by:

  • 1 unit → error contributes 1
  • 10 units → error contributes 100

MSE pushes the model to avoid big mistakes, which also makes it sensitive to outliers.
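A quick way to see this is to look at how much of the total squared error comes from a single large miss. The error values below are made up for illustration.

```python
import numpy as np

# Hypothetical errors: three small misses and one large one.
errors = np.array([1.0, 1.0, 1.0, 10.0])

squared = errors ** 2
share = squared / squared.sum()  # fraction of total squared error per sample

print("squared errors:", squared)
print("MSE:", squared.mean())
print("share of total from the 10-unit miss:", share[-1])
```

Here the single 10-unit miss accounts for 100 of the 103 total squared-error units, so reducing it lowers the cost far more than fixing the three small misses.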

MSE vs RMSE

  • MSE: squared units, e.g. dollars² for a price target (harder to interpret)
  • RMSE: square root of MSE, back to the original units (e.g. dollars)
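The difference in units shows up when the target is rescaled. The sketch below uses made-up house prices and a hypothetical helper mse; it computes both metrics in dollars and again in thousands of dollars.

```python
import numpy as np

# Hypothetical house prices in dollars.
y_true = np.array([200_000.0, 350_000.0, 500_000.0])
y_pred = np.array([210_000.0, 340_000.0, 530_000.0])

def mse(a, b):
    """Mean squared error between two arrays of equal length."""
    return np.mean((a - b) ** 2)

mse_dollars = mse(y_true, y_pred)
rmse_dollars = np.sqrt(mse_dollars)

# The same data expressed in thousands of dollars.
mse_k = mse(y_true / 1000, y_pred / 1000)
rmse_k = np.sqrt(mse_k)

print("MSE  (dollars^2):", mse_dollars, "| (thousands^2):", mse_k)
print("RMSE (dollars):  ", rmse_dollars, "| (thousands):  ", rmse_k)
```

Changing the unit by a factor of 1,000 changes RMSE by the same factor but changes MSE by a factor of 1,000,000, which is why RMSE reads like "a typical miss in dollars" and MSE does not.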

Code example

Compute MSE and RMSE
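import numpy as np

# True target values and the model's predictions
y_true = np.array([3, 5, 2, 7])
y_pred = np.array([2.5, 5.2, 1.8, 7.9])

# MSE: average of the squared differences
mse = np.mean((y_true - y_pred) ** 2)
# RMSE: square root of MSE, in the same units as the target
rmse = np.sqrt(mse)

print("MSE:", mse)    # approximately 0.285
print("RMSE:", rmse)  # approximately 0.534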

Mini-checkpoint

If your target is “price” in dollars:

  • which is easier to explain to a business stakeholder: MSE or RMSE?

(Usually RMSE, because it is expressed in dollars rather than squared dollars.)
