Join Our 5-Week ML/AI Engineer Interview Bootcamp 🚀 led by ML Tech Leads at FAANGs

Back to Questions

21. Calibration error

medium
GeneralGeneral
staff

Compute the Expected Calibration Error (ECE) to measure how well a classifier’s predicted probabilities match actual outcomes. ECE bins predictions by confidence and compares average confidence to empirical accuracy in each bin.

The metric is defined as:

ECE=∑b=1B∣Sb∣n∣acc(Sb)−conf(Sb)∣\mathrm{ECE} = \sum_{b=1}^{B}\frac{|S_b|}{n}\left|\mathrm{acc}(S_b)-\mathrm{conf}(S_b)\right|

where (S_b) is the set of indices in bin (b), (n) is the total number of samples, (\mathrm{acc}(S_b)) is average correctness, and (\mathrm{conf}(S_b)) is average predicted confidence.

Requirements

Implement the function:

python

Rules:

  • Use equal-width bins over the interval ([0, 1]) (e.g., for 10 bins: [0.0,0.1), …, [0.9,1.0]).
  • For each bin, compute accuracy as the fraction of samples with y_true == (y_prob >= 0.5) (i.e., predicted label uses threshold 0.5).
  • For each bin, compute confidence as the mean of y_prob in that bin.
  • Combine bins using the weighted sum shown in the formula (empty bins contribute 0).
  • Use only NumPy and Python built-in libraries (no sklearn/scipy).

Example

python

Output:

python
Input Signature
ArgumentType
n_binsint
y_probnp.ndarray
y_truenp.ndarray
Output Signature
Return NameType
valuefloat

Constraints

  • Use NumPy only; no sklearn/scipy.

  • Equal-width bins on [0,1].

  • Threshold predictions at 0.5 for accuracy.

Hint 1

Start by converting y_true and y_prob to NumPy arrays, then create n_bins+1 equally spaced edges in [0,1] using np.linspace.

Hint 2

Assign each probability to a bin index (0..n_bins-1). Watch the edge case where y_prob == 1.0 should fall into the last bin (not overflow). np.digitize + clamping works well.

Hint 3

For each bin: compute count, acc = mean(y_true == (y_prob>=0.5)), and conf = mean(y_prob). Accumulate ece += (count/n) * abs(acc-conf); skip empty bins.

Roles
ML Engineer
AI Engineer
Data Scientist
Quantitative Analyst
Companies
GeneralGeneral
Levels
staff
senior
Tags
calibration-metrics
binning
numpy-vectorization
binary-classification
29 people are solving this problem
Python LogoPython Editor
Ln 1, Col 1

Input Arguments

Edit values below to test with custom inputs

You need tolog in/sign upto run or submit