Implement a word embedding lookup step, which maps token IDs to dense vectors used in NLP models. Given a matrix of embeddings and a list of token IDs, return the corresponding sequence of embedding vectors using the lookup rule (E[t]).
where E is the embedding matrix and t_i is the token ID at index i, so the i-th output vector is E[t_i].
Implement the function.

Input:

| Argument | Type |
|---|---|
| token_ids | np.ndarray |
| embeddings | np.ndarray |

Output:

| Return Name | Type |
|---|---|
| value | np.ndarray |

Rules:
- Return a NumPy array (np.ndarray).
- Preserve the order of token_ids in the output.
- No ML embedding layers (PyTorch/TensorFlow).
Start with the rule: for each id in token_ids, gather embeddings[id].
Use NumPy fancy indexing: E[token_ids] selects rows efficiently and preserves order.
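A minimal sketch of the lookup using fancy indexing. The problem does not name the function, so `embedding_lookup` and its argument order here are assumptions chosen to match the input table:

```python
import numpy as np

def embedding_lookup(token_ids: np.ndarray, embeddings: np.ndarray) -> np.ndarray:
    # Fancy indexing gathers the row embeddings[id] for each id in
    # token_ids, preserving their order. The result has shape
    # (len(token_ids), embedding_dim).
    return embeddings[token_ids]

# Example: vocabulary of 4 tokens, embedding dimension 3.
E = np.array([[0.0, 0.1, 0.2],
              [1.0, 1.1, 1.2],
              [2.0, 2.1, 2.2],
              [3.0, 3.1, 3.2]])
ids = np.array([2, 0, 2])
vectors = embedding_lookup(ids, E)
print(vectors)  # rows 2, 0, 2 of E, in that order
```

Note that repeated IDs simply select the same row more than once, which is exactly the behavior an embedding layer exhibits.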