
18. Word embedding lookup

Difficulty: easy

Implement a word embedding lookup step, which maps token IDs to dense vectors used in NLP models. Given a matrix of embeddings and a list of token IDs, return the corresponding sequence of embedding vectors using the lookup rule (E[t]).

$$\mathbf{Y}_i = \mathbf{E}_{t_i}$$

where $\mathbf{E} \in \mathbb{R}^{V \times D}$ is the embedding matrix and $t_i$ is the token ID at index $i$.

Requirements

Implement the function

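The signature block didn't survive extraction; a minimal stub consistent with the Input/Output Signature tables below (the function name `embedding_lookup` is an assumption):

```python
import numpy as np

def embedding_lookup(token_ids: np.ndarray, embeddings: np.ndarray) -> np.ndarray:
    """Return one embedding row per token ID, in the same order as token_ids."""
    ...  # your implementation here
```
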
Rules:

  • Use the lookup rule $\mathbf{Y}_j = \mathbf{E}[\text{token\_ids}_j]$ for each position $j$.
  • Preserve the order of token_ids in the output.
  • Return a np.ndarray.
  • Do not use any prebuilt embedding layers (e.g., from PyTorch/TensorFlow).

Example

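The original example values didn't survive extraction; an illustrative stand-in with made-up numbers, assuming `embedding_lookup` is implemented as in the sketch after Hint 2:

```python
import numpy as np

embeddings = np.array([
    [0.0, 0.1],  # row 0
    [1.0, 1.1],  # row 1
    [2.0, 2.1],  # row 2
    [3.0, 3.1],  # row 3
])  # shape (V=4, D=2)
token_ids = np.array([2, 0, 2])  # note the repeated ID

print(embedding_lookup(token_ids, embeddings))
```

Output:

```python
[[2.  2.1]
 [0.  0.1]
 [2.  2.1]]
```

Each output row is the embedding of the token ID at that position, so the repeated ID 2 simply repeats row 2.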
Input Signature

Argument     Type
token_ids    np.ndarray
embeddings   np.ndarray

Output Signature

Return Name  Type
value        np.ndarray

Constraints

  • Return NumPy array

  • Preserve token_ids order in output

  • No ML embedding layers (PyTorch/TensorFlow)

Hint 1

Start with the rule: for each id in token_ids, gather embeddings[id].
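Concretely, that per-position gather can be written as an explicit Python loop; a sketch (the helper name `embedding_lookup_loop` is ours, not from the problem):

```python
import numpy as np

def embedding_lookup_loop(token_ids: np.ndarray, embeddings: np.ndarray) -> np.ndarray:
    """Gather embeddings[id] for each id in order, one row at a time."""
    rows = [embeddings[t] for t in token_ids]  # list of (D,) vectors
    return np.stack(rows)  # shape (len(token_ids), D)
```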

Hint 2

Use NumPy fancy indexing: E[token_ids] selects rows efficiently and preserves order.
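Putting both hints together gives a one-line vectorized implementation; a reference sketch under the same assumed function name:

```python
import numpy as np

def embedding_lookup(token_ids: np.ndarray, embeddings: np.ndarray) -> np.ndarray:
    """Map token IDs to embedding rows via fancy indexing, preserving order.

    embeddings has shape (V, D); token_ids is an integer array with
    values in [0, V). embeddings[token_ids] gathers row token_ids[j]
    into output row j, giving shape (len(token_ids), D).
    """
    token_ids = np.asarray(token_ids)    # accept lists as well as arrays
    embeddings = np.asarray(embeddings)
    return embeddings[token_ids]
```

Unlike basic slicing, fancy indexing copies the selected rows, so mutating the result never touches `embeddings`; repeated IDs are handled naturally, and the Python-level loop from Hint 1 collapses into a single vectorized gather.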

Roles
ML Engineer
AI Engineer
Data Scientist
Quantitative Analyst
Companies
General
Levels
senior
entry
Tags
numpy-fancy-indexing
nlp-embeddings
array-indexing
data-structures