
248. Top-k embeddings retrieval

easy
General
senior

Retrieve the top‑k most similar embeddings for a query embedding using cosine similarity, which is a common building block in embedding search. You’ll compute similarity scores and return the indices of the best matches.

Cosine similarity is defined as:

$$\text{cos\_sim}(q, x_i) = \frac{q \cdot x_i}{\|q\|\,\|x_i\|}$$
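For example, with q = (1, 0) and x_i = (1, 1), cos_sim = 1 / (1 · √2) ≈ 0.707; a similarity of 1 means the vectors point in the same direction, and 0 means they are orthogonal.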

Requirements

Implement the function


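A stub consistent with the Input and Output Signature tables below (the function name `top_k_embeddings` is an assumption):

```python
import numpy as np

def top_k_embeddings(k: int, query: np.ndarray, embeddings: np.ndarray) -> list:
    # Return indices of the k rows of `embeddings` most cosine-similar
    # to `query`, ordered from most to least similar.
    ...
```
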
Rules:

  • Compute cosine similarity between query and every row in embeddings.
  • Return the indices of the top k most similar embeddings, sorted from most similar to least similar.
  • Do not use any prebuilt nearest-neighbor/search libraries (e.g., FAISS, sklearn NearestNeighbors).
  • Use NumPy for vectorized computation.
  • If there are ties in similarity, break ties by smaller index first.

Example

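An illustrative input, assuming the stub above (the values here are chosen for demonstration):

```python
import numpy as np

k = 2
query = np.array([1.0, 0.0])
embeddings = np.array([
    [1.0, 0.0],   # same direction as query -> sim = 1.0
    [0.0, 1.0],   # orthogonal to query     -> sim = 0.0
    [1.0, 1.0],   # 45 degrees from query   -> sim ≈ 0.707
])

top_k_embeddings(k, query, embeddings)
```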

Output:

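```python
[0, 2]   # index 0 (sim = 1.0) first, then index 2 (sim ≈ 0.707)
```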

Input Signature

  Argument      Type
  k             int
  query         np.ndarray
  embeddings    np.ndarray

Output Signature

  Return Name   Type
  value         list

Constraints

  • Use NumPy for vectorization.
  • Return indices sorted from most to least similar; break ties by smaller index first.
  • Assume k <= n, where embeddings has shape (n, d).

Hint 1

Compute norms: q_norm = np.linalg.norm(query) and e_norms = np.linalg.norm(embeddings, axis=1).

Hint 2

Compute dot products: dots = embeddings @ query.

Hint 3

Calculate similarities: sims = dots / (q_norm * e_norms); create pairs (sim, idx) and sort with key (-sim, idx).
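
Putting the three hints together, here is one vectorized sketch (assuming the `top_k_embeddings` stub above; a stable argsort on the negated scores stands in for the explicit (sim, idx) pairs):

```python
import numpy as np

def top_k_embeddings(k: int, query: np.ndarray, embeddings: np.ndarray) -> list:
    # Hint 1: norm of the query and of each embedding row.
    q_norm = np.linalg.norm(query)
    e_norms = np.linalg.norm(embeddings, axis=1)

    # Hint 2: one matrix-vector product yields all n dot products.
    dots = embeddings @ query

    # Hint 3: cosine similarity of the query with every row.
    sims = dots / (q_norm * e_norms)

    # Stable sort on the negated scores: equal similarities keep their
    # original order, so ties break toward the smaller index, as required.
    order = np.argsort(-sims, kind="stable")
    return order[:k].tolist()
```

For large n, np.argpartition can select the top k in O(n) before fully sorting only those k entries, though preserving the smaller-index tie-break then takes extra care.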

Roles
ML Engineer
AI Engineer
Companies
General
Levels
senior
entry
Tags
cosine-similarity
top-k-retrieval
numpy-vectorization
similarity-search