Implement the forward pass of a single-layer GRU to process a sequence of input vectors and produce the sequence of hidden states. You'll compute the GRU gates and hidden updates step-by-step using the standard GRU equations.
A GRU update at time step t is defined as:

z_t = sigmoid(Wz @ x_t + Uz @ h_{t-1} + bz)  (update gate)

r_t = sigmoid(Wr @ x_t + Ur @ h_{t-1} + br)  (reset gate)

h_tilde_t = tanh(Wh @ x_t + Uh @ (r_t * h_{t-1}) + bh)  (candidate state)

h_t = (1 - z_t) * h_{t-1} + z_t * h_tilde_t  (hidden update)
Implement a function that performs this forward pass.
Rules:

- Treat each x_t as a row of X and compute h_t sequentially from t=0 to T-1.
- Implement sigmoid and tanh yourself using NumPy (no deep learning frameworks).

Output: an np.ndarray containing the T hidden states.
| Argument | Type | Description |
|---|---|---|
| X | np.ndarray | Input sequence of shape (T, input_dim); row t is x_t |
| h0 | np.ndarray | Initial hidden state of shape (hidden_dim,) |
| params | dict | Weight matrices Wz, Uz, Wr, Ur, Wh, Uh and biases bz, br, bh |

| Return Name | Type | Description |
|---|---|---|
| value | np.ndarray | Hidden states h_1..h_T, shape (T, hidden_dim) |
- Use only NumPy; no deep learning frameworks.
- Matrix shapes must match the specified dimensions.
- Return an np.ndarray of T hidden states.
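Since the rules require hand-rolled activations, here is one minimal sketch in NumPy; the overflow-safe formulation is a design choice on my part, not something the problem demands:

```python
import numpy as np

def sigmoid(x):
    # Overflow-safe logistic: exp is only ever evaluated on non-positive values.
    e = np.exp(-np.abs(x))
    return np.where(x >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

def tanh(x):
    # tanh via the logistic identity: tanh(x) = 2*sigmoid(2x) - 1.
    return 2.0 * sigmoid(2.0 * x) - 1.0
```

The identity also shows both activations can be derived from one stable primitive, which keeps the implementation small.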
Convert X, h0, and all params entries to NumPy arrays first so you can use @ for matrix-vector products and * for elementwise multiply.
Inside a loop over timesteps, compute the gates in the exact order: z_t = sigmoid(Wz@x_t + Uz@h + bz) and r_t = sigmoid(Wr@x_t + Ur@h + br); watch shapes: x_t is (input_dim,), h is (hidden_dim,).
Compute h_tilde = tanh(Wh@x_t + Uh@(r_t*h) + bh), then update: h = (1-z_t)*h + z_t*h_tilde. Append a copy each step (e.g., h.copy() or h.tolist()) so later updates don't overwrite earlier states, and return all T hidden states.
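Putting the hints together, a minimal end-to-end sketch might look like the following; the function name `gru_forward` is my own choice, while the params keys mirror the notation used in the hints (Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):

```python
import numpy as np

def gru_forward(X, h0, params):
    """Single-layer GRU forward pass.

    X:      (T, input_dim) -- row t is the input vector x_t
    h0:     (hidden_dim,)  -- initial hidden state
    params: dict with keys Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh
    Returns a (T, hidden_dim) array of hidden states h_1..h_T.
    """
    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    # Convert everything to NumPy arrays so @ and * behave as expected.
    X = np.asarray(X, dtype=float)
    h = np.asarray(h0, dtype=float)
    p = {k: np.asarray(v, dtype=float) for k, v in params.items()}

    states = []
    for x_t in X:  # iterate rows: t = 0 .. T-1
        z_t = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h + p["bz"])  # update gate
        r_t = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h + p["br"])  # reset gate
        h_tilde = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r_t * h) + p["bh"])
        h = (1.0 - z_t) * h + z_t * h_tilde                   # hidden update
        states.append(h.copy())  # copy so later steps don't mutate it
    return np.array(states)
```

A handy sanity check: with all-zero weights and biases, z_t = sigmoid(0) = 0.5 and h_tilde = tanh(0) = 0, so each step halves the hidden state.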