Build cumulative-sum (running total) features, a simple but useful feature-engineering trick for time-ordered data. You’ll compute per-group running totals so that each row's feature depends only on that row and earlier rows in the same group, never on later rows.
Implement the function
Rules:
Treat values as already sorted in time order (do not sort).
| Argument | Type |
|---|---|
| groups | np.ndarray |
| values | np.ndarray |

| Return Name | Type |
|---|---|
| value | np.ndarray |
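To pin down what the returned array should contain, here is a deliberately slow reference definition that can serve as a checking oracle. The function name `running_sum_reference` is hypothetical (the problem statement does not give one), and the sketch assumes the running total at row i includes values[i] itself, which is how the hints describe the update. It is O(n²), so it does not meet the single-pass expectation and is only meant for verifying a faster solution on small inputs.

```python
import numpy as np


def running_sum_reference(groups, values):
    """Brute-force per-group running total (hypothetical helper, O(n^2))."""
    groups = np.asarray(groups)
    values = np.asarray(values, dtype=np.float64)
    out = np.empty(len(values), dtype=np.float64)
    for i in range(len(values)):
        # Rows 0..i that belong to the same group as row i.
        same_group = groups[: i + 1] == groups[i]
        # Sum their values, including values[i] itself.
        out[i] = values[: i + 1][same_group].sum()
    return out


example_groups = np.array([1, 2, 1, 1, 2])
example_values = np.array([5.0, 1.0, 2.0, 3.0, 4.0])
reference = running_sum_reference(example_groups, example_values)
# reference is [5.0, 1.0, 7.0, 10.0, 5.0]
```

Group 1 accumulates 5, 7, 10 and group 2 accumulates 1, 5, each written at the position where the row appears in the input.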
No pandas; only Python/NumPy.
Input order is time order; do not sort.
Single pass O(n) expected.
Use a dictionary/map: group_id -> running_sum while scanning left-to-right.
For each index i, read previous sum via totals.get(group, 0.0), add values[i], store back, and write to output.
Single-pass pattern: for i, (v,g) in enumerate(zip(values, groups)): update totals[g] and set out[i] = totals[g].
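The hints above can be put together into a minimal sketch like the following. The function name `grouped_running_sum` is an assumption, since the problem statement does not name the function, and the output is assumed to be a float array because the hints start each group's total at 0.0.

```python
import numpy as np


def grouped_running_sum(groups: np.ndarray, values: np.ndarray) -> np.ndarray:
    """Per-group running total in one left-to-right pass (hypothetical name)."""
    totals = {}  # group_id -> running sum seen so far
    out = np.empty(len(values), dtype=np.float64)
    for i, (v, g) in enumerate(zip(values, groups)):
        # Read the previous total (0.0 for a new group), add the current value,
        # store it back, and write it to the output position.
        totals[g] = totals.get(g, 0.0) + v
        out[i] = totals[g]
    return out


groups = np.array(["a", "b", "a", "b", "a"])
values = np.array([1.0, 10.0, 2.0, 20.0, 3.0])
result = grouped_running_sum(groups, values)
# result is [1.0, 10.0, 3.0, 30.0, 6.0]
```

Each row is visited once and each dictionary lookup is O(1) on average, so the whole pass is O(n) in time and uses O(k) extra memory for k distinct groups. Because the input is never reordered, the output stays aligned with the original rows.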