Optimize a binary-classification decision threshold to maximize an evaluation metric on a validation set, a common step in ML evaluation when converting predicted probabilities into hard labels.
Given predicted probabilities $p_i$ and a threshold $t$, the predicted label is $\hat{y}_i = \mathbb{1}[p_i \ge t]$, and you should choose the $t$ that maximizes the F1 score:

$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
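As a quick illustration of these definitions, here is a minimal NumPy sketch (the array values are made up for demonstration):

```python
import numpy as np

# Hypothetical example values, for illustration only.
y_true = np.array([1, 0, 1, 1, 0])
y_prob = np.array([0.9, 0.4, 0.6, 0.3, 0.5])
t = 0.5

y_pred = (y_prob >= t).astype(int)  # hard labels: 1[p_i >= t]

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))

precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
f1 = 2 * precision * recall / (precision + recall) if (precision + recall) > 0 else 0.0
```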
Implement the function.

Rules:

- Consider candidate thresholds from the unique values of y_prob (plus 0.0 and 1.0 if you want), and pick the one with the highest F1.
- Use only NumPy (no sklearn.metrics).
- Return (best_threshold, best_f1) as Python floats.

Input:
| Argument | Type |
|---|---|
| y_prob | np.ndarray |
| y_true | np.ndarray |
Output:

| Return Name | Type |
|---|---|
| value | tuple |
Constraints:

- Use only NumPy; no sklearn metrics/utilities.
- Take thresholds from the unique values of y_prob; on F1 ties, choose the smallest threshold.
- Return Python floats (best_threshold, best_f1).
- Computations should handle NumPy array inputs.
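One way to satisfy these constraints is a brute-force scan over the unique probabilities. A minimal sketch follows, with `max_f1_threshold` as a hypothetical name since the statement does not fix one:

```python
import numpy as np

def max_f1_threshold(y_prob: np.ndarray, y_true: np.ndarray) -> tuple:
    """Brute-force O(n * k) sketch: evaluate F1 at every unique probability."""
    best_t, best_f1 = 0.0, -1.0
    # np.unique returns values sorted ascending, so scanning in order with a
    # strict '>' comparison keeps the smallest threshold on F1 ties.
    for t in np.unique(y_prob):
        y_pred = (y_prob >= t).astype(int)
        tp = np.sum((y_pred == 1) & (y_true == 1))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) > 0 else 0.0
        if f1 > best_f1:  # strict: earlier (smaller) threshold wins ties
            best_t, best_f1 = float(t), float(f1)
    return best_t, best_f1
```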
Approach:

- Sort by y_prob (descending) and sweep the threshold from high to low, updating TP/FP counts as more items become predicted-positive.
- Process equal-probability values as a group (since p >= t includes every item with p == t). Update TP/FP per group and compute precision/recall/F1 with divide-by-zero guards. A sketch of this sweep follows.
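Here is a sketch of that grouped sweep, again using the hypothetical name `max_f1_threshold` and assuming 0/1 integer labels:

```python
import numpy as np

def max_f1_threshold(y_prob: np.ndarray, y_true: np.ndarray) -> tuple:
    """O(n log n) sweep: sort descending, grow the predicted-positive set."""
    order = np.argsort(-y_prob)      # indices sorted by probability, descending
    p_sorted = y_prob[order]
    y_sorted = y_true[order]

    total_pos = int(np.sum(y_true))  # TP + FN is constant for every threshold
    tp, fp = 0, 0
    best_t, best_f1 = 0.0, -1.0

    i, n = 0, len(p_sorted)
    while i < n:
        t = p_sorted[i]
        # Group all items tied at this probability: p >= t includes every p == t.
        j = i
        while j < n and p_sorted[j] == t:
            tp += int(y_sorted[j])
            fp += 1 - int(y_sorted[j])
            j += 1
        precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
        recall = tp / total_pos if total_pos > 0 else 0.0
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) > 0 else 0.0
        # '>=' lets a later (smaller) threshold with equal F1 replace the earlier
        # one, matching the "ties choose smallest threshold" rule.
        if f1 >= best_f1:
            best_t, best_f1 = float(t), float(f1)
        i = j
    return best_t, best_f1
```

The grouping matters because raising the threshold past one tied value but not another is impossible under p >= t; flipping a whole tie group at once keeps TP/FP consistent with the actual predictions at that threshold.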