Implement mean pooling to turn a sequence of token embeddings into a single fixed-size vector, a common step in embeddings-and-retrieval pipelines. Given token embeddings for one text, you’ll average each embedding dimension across tokens.
Mean pooling is defined as:

$$\text{mean\_pool}(E)_d = \frac{1}{T} \sum_{t=1}^{T} E_{t,d}, \qquad d = 1, \dots, D$$

where $E$ is the token-embedding matrix with shape $(T, D)$.
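As a quick worked example of the definition above (the matrix values here are made up for illustration), averaging each column across the tokens collapses a $(T, D)$ matrix into a single $D$-dimensional vector:

```python
import numpy as np

# Hypothetical 3-token, 2-dimensional embedding matrix (T=3, D=2).
E = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Average each embedding dimension (column) across the T tokens.
pooled = E.mean(axis=0)
print(pooled)  # → [3. 4.]
```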
Implement the function described below.

Inputs:

| Argument | Type |
|---|---|
| embeddings | np.ndarray |

Output:

| Return Name | Type |
|---|---|
| value | np.ndarray |

Rules:

- Return a NumPy array, not a Python list.
- No deep-learning pooling utilities are allowed.
- The input has shape (T, D): T token rows, D embedding dimensions.
- Treat embeddings as a matrix with T rows (tokens) and D columns (dimensions), and average column-wise.
- Use np.mean(embeddings, axis=0) to compute the mean along the first axis.
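Putting the rules together, a minimal sketch of a solution might look like this (the function name `mean_pool` is an assumption; the exercise does not specify one):

```python
import numpy as np

def mean_pool(embeddings: np.ndarray) -> np.ndarray:
    """Mean-pool token embeddings into a single fixed-size vector.

    Assumed name; the exercise does not fix a signature.
    embeddings: array of shape (T, D) — T tokens, D dimensions.
    Returns an np.ndarray of shape (D,): the column-wise mean.
    """
    # axis=0 averages over the token axis, leaving one value per dimension.
    return np.mean(embeddings, axis=0)
```

Because `np.mean` already returns an `np.ndarray`, no explicit conversion from a Python list is needed, and no deep-learning pooling utility is involved.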