Implement a numerically stable softmax, as used inside Transformer attention, so you can safely turn attention logits into probabilities without overflow. The softmax is defined as:

$$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$$
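In practice the row maximum is subtracted before exponentiating. Because softmax is invariant to adding a constant to every logit, this leaves the result unchanged while keeping the exponentials from overflowing:

$$\mathrm{softmax}(x)_i = \frac{e^{x_i - \max_j x_j}}{\sum_{k} e^{x_k - \max_j x_j}}$$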
Implement the function with the input and output specified below.
Rules:
- Use NumPy only.
- Compute the softmax row-wise over the last dimension.
- Return a NumPy array.

Input:

| Argument | Type |
|---|---|
| logits | np.ndarray |

Output:

| Return Name | Type |
|---|---|
| value | np.ndarray |
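For example, a 2×3 input such as `[[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]]` should map to a 2×3 output whose rows each sum to 1, approximately `[[0.659, 0.242, 0.099], [0.333, 0.333, 0.333]]` (illustrative values, not part of the problem's official examples).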
1. Ensure `logits` is a float array, and operate row-wise with `axis=1`.
2. For numerical stability, subtract each row's maximum before `np.exp`: `shifted = arr - arr.max(axis=1, keepdims=True)`.
3. Compute `exp = np.exp(shifted)`, then normalize per row: `probs = exp / exp.sum(axis=1, keepdims=True)`.
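A minimal sketch of these steps, assuming the function is named `softmax` and that `logits` is a 2D array (one row of attention logits per query):

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    # Assumed name and signature; adapt to the exact spec you are given.
    arr = np.asarray(logits, dtype=float)           # ensure float dtype
    shifted = arr - arr.max(axis=1, keepdims=True)  # subtract row max: largest exponent is 0
    exp = np.exp(shifted)                           # no overflow after the shift
    return exp / exp.sum(axis=1, keepdims=True)     # normalize so each row sums to 1
```

For example, `softmax(np.array([[1000.0, 1001.0, 1002.0]]))` returns roughly `[[0.0900, 0.2447, 0.6652]]`, whereas exponentiating the raw logits directly would overflow to `inf`.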