197. Masked attention scores

medium
General
staff

Implement masked attention scores for a Transformer, where certain key positions are blocked so they contribute zero probability to the final attention. You’ll compute the scaled dot-product scores and apply a boolean mask before the softmax.

The attention scores are defined as:

S = \frac{Q K^\top}{\sqrt{d}}, \qquad A = \text{softmax}(S + M)

where M_{ij} = 0 if key j is allowed for query i, and M_{ij} = -10^9 if it is masked.
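
For intuition: if a single query has raw scores [0.7, 0.1] against two keys and the second key is masked, the biased scores are [0.7, 0.1 - 10^9], so the softmax puts essentially all probability on the first key, A ≈ [1, 0].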

Requirements

Implement the function described below.
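
The exact stub is not reproduced on this page, so here is one possible signature; the name masked_attention is an assumption, and the argument order follows the signature tables further down.

```python
import numpy as np

def masked_attention(K: np.ndarray, Q: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Return the (num_queries, num_keys) matrix of attention probabilities.

    Note: the function name is hypothetical; only the argument names and
    types come from the signature tables on this page.
    """
    ...
```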

Rules:

  • Compute scaled dot-product scores between every query and key.
  • Apply the mask by adding -1e9 to masked score entries (where mask[i][j] == 0).
  • Apply a numerically stable softmax across each row (over keys) to get probabilities.
  • Return the attention probabilities as a NumPy array.
  • Don’t use any prebuilt attention utilities (no torch, no scipy, etc.).

Example

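The original example values are not reproduced here, so the input below is illustrative (chosen for this sketch); the output is what the reference computation yields for it, rounded to four decimals.

```python
K = np.array([[1.0, 0.0],
              [0.0, 1.0]])
Q = np.array([[1.0, 0.0],
              [0.0, 1.0]])
mask = np.array([[1, 0],
                 [1, 1]])
```

Output:

```python
array([[1.    , 0.    ],
       [0.3302, 0.6698]])
```
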
Input Signature

  Argument     Type
  K            np.ndarray
  Q            np.ndarray
  mask         np.ndarray

Output Signature

  Return Name  Type
  value        np.ndarray

Constraints

  • Use NumPy only; no torch/scipy.
  • Return an np.ndarray.
  • The softmax must be numerically stable across each row.

Hint 1

Compute the raw score matrix with a dot product: scores = (Q @ K.T) / sqrt(d) where d is the embedding dimension.
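
A rough sketch of that step, assuming Q has shape (num_queries, d) and K has shape (num_keys, d):

```python
d = Q.shape[-1]                    # embedding dimension
scores = (Q @ K.T) / np.sqrt(d)    # shape: (num_queries, num_keys)
```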

Hint 2

Turn the 0/1 mask into an additive bias: add -1e9 wherever mask[i][j] == 0 so those entries become ~0 after softmax.
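
One way to express that additive bias in NumPy, assuming mask holds 0/1 entries aligned with the score matrix:

```python
# Add -1e9 wherever the key is blocked; allowed entries are left unchanged.
scores = np.where(mask == 0, scores - 1e9, scores)
```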

Hint 3

Implement a numerically stable row-wise softmax: subtract each row’s max before exp, then divide by the row sum.
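
Putting the three hints together, a minimal end-to-end sketch; the function name masked_attention is an assumption, and the argument order mirrors the signature tables above.

```python
import numpy as np

def masked_attention(K: np.ndarray, Q: np.ndarray, mask: np.ndarray) -> np.ndarray:
    # Hypothetical name; only the argument names/types come from this page.
    d = Q.shape[-1]

    # Scaled dot-product scores between every query and key.
    scores = (Q @ K.T) / np.sqrt(d)

    # Additive mask: push blocked positions toward -inf so softmax ~= 0 there.
    scores = np.where(mask == 0, scores - 1e9, scores)

    # Numerically stable row-wise softmax over the keys.
    scores = scores - scores.max(axis=-1, keepdims=True)
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=-1, keepdims=True)
```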

Roles
ML Engineer
AI Engineer
Companies
General
Levels
staff
senior
entry
Tags
scaled-dot-product-attention
masking
numerically-stable-softmax
numpy-matrix-multiplication