Implement a word embedding lookup step, which maps token IDs to dense vectors used in NLP models. Given a matrix of embeddings and a list of token IDs, return the corresponding sequence of embedding vectors using the lookup rule (E[t]).
where E is the embedding matrix and t_i is the token ID at index i, so the i-th output vector is E[t_i].
Implement the function.

Input:

| Argument | Type |
|---|---|
| token_ids | np.ndarray |
| embeddings | np.ndarray |

Output:

| Return Name | Type |
|---|---|
| value | np.ndarray |

Rules:
- Return a NumPy array (np.ndarray).
- Preserve the order of token_ids in the output.
- No ML embedding layers (PyTorch/TensorFlow).
Start with the rule: for each id in token_ids, gather embeddings[id].
Use NumPy fancy indexing: E[token_ids] selects rows efficiently and preserves order.
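A minimal sketch of the lookup using fancy indexing. The problem does not name the function, so `embedding_lookup` and its argument order here are assumptions chosen to match the input table:

```python
import numpy as np

def embedding_lookup(token_ids: np.ndarray, embeddings: np.ndarray) -> np.ndarray:
    # Fancy indexing gathers the row embeddings[id] for each id in
    # token_ids, preserving their order. The result has shape
    # (len(token_ids), embedding_dim).
    return embeddings[token_ids]

# Example: vocabulary of 4 tokens, embedding dimension 3.
E = np.array([[0.0, 0.1, 0.2],
              [1.0, 1.1, 1.2],
              [2.0, 2.1, 2.2],
              [3.0, 3.1, 3.2]])
ids = np.array([2, 0, 2])
vectors = embedding_lookup(ids, E)
print(vectors)  # rows 2, 0, 2 of E, in that order
```

Note that repeated IDs simply select the same row more than once, which is exactly the behavior an embedding layer exhibits.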