
251. Mean pooling of embeddings

Difficulty: easy

Implement mean pooling to turn a sequence of token embeddings into a single fixed-size vector, a common step in embeddings-and-retrieval pipelines. Given token embeddings for one text, you’ll average each embedding dimension across tokens.

Mean pooling is defined as:

$$\text{pooled}_j = \frac{1}{T}\sum_{i=1}^{T} E_{i,j},$$

where $E$ is the token-embedding matrix with shape $(T, D)$.
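
For instance (values chosen purely for illustration), with $T = 2$ tokens and $D = 3$ dimensions:

$$E = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 4 & 5 \end{pmatrix} \quad\Rightarrow\quad \text{pooled} = \left(\tfrac{1+3}{2},\ \tfrac{2+4}{2},\ \tfrac{3+5}{2}\right) = (2,\ 3,\ 4).$$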

Requirements

Implement the function

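The original code stub did not survive extraction; based on the Input and Output Signatures below, a plausible stub looks like this (the function name `mean_pooling` is an assumption, not from the original page):

```python
import numpy as np

def mean_pooling(embeddings: np.ndarray) -> np.ndarray:
    """Collapse a (T, D) token-embedding matrix into a single (D,) vector."""
    ...
```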

Rules:

  • Compute the element-wise mean across the token dimension (average over all tokens).
  • Return a 1D NumPy array.
  • Do not use any prebuilt pooling utilities (e.g., from PyTorch/TensorFlow).
  • Use only NumPy and Python built-in libraries.
  • Keep it a single function (no helper classes).

Example

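The example's concrete values did not survive extraction; here is a small illustrative case instead (values are mine, not the original's). With $T = 3$ tokens and $D = 2$ dimensions, mean pooling returns the column-wise means:

```python
import numpy as np

embeddings = np.array([[1.0, 2.0],
                       [3.0, 4.0],
                       [5.0, 6.0]])  # shape (3, 2)
```

Output:

```python
array([3., 4.])  # column-wise means: (1+3+5)/3 and (2+4+6)/3
```
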
Input Signature

  • embeddings: np.ndarray

Output Signature

  • value: np.ndarray

Constraints

  • Return a NumPy array, not a Python list.
  • No deep learning pooling utilities allowed.
  • Input is a (T, D) matrix of token embeddings.

Hint 1

Treat embeddings as a matrix with T rows (tokens) and D columns (dimensions), and average column-wise.

Hint 2

Use np.mean(embeddings, axis=0) to compute the mean along the first axis.
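
Putting both hints together, a minimal sketch of a solution (reusing the assumed `mean_pooling` name from the stub above):

```python
import numpy as np

def mean_pooling(embeddings: np.ndarray) -> np.ndarray:
    # Axis 0 is the token axis, so this averages each of the D columns
    # over all T tokens and returns a 1D array of shape (D,).
    return np.mean(embeddings, axis=0)
```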

Roles: ML Engineer, AI Engineer
Companies: General
Levels: senior, entry
Tags: mean-pooling, embeddings, array-aggregation, numpy