Build a simple sliding-window object detector that scans an image and returns the best-matching window for a given template. You’ll compute a similarity score at each window location and pick the top result.
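The matching score is normalized cross-correlation (NCC). For an image patch P and template T, with mean-centered versions P0 = P - P.mean() and T0 = T - T.mean():
NCC = sum(P0*T0) / (||P0||*||T0||)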
Implement the function with the arguments and return value described below.
Rules:
Scan the image from top-left to bottom-right using the given stride.
Score each window against the template.
Arguments and output:
| Argument | Type |
|---|---|
| image | np.ndarray |
| stride | int |
| template | np.ndarray |

| Return Name | Type |
|---|---|
| value | tuple |

Use NumPy; no OpenCV template-matching functions.
Only fully-contained windows; top-left coordinates returned.
Handle zero-variance denominator without crashing.
Convert image and template to NumPy arrays, and iterate only over valid top-left corners: r in [0..H-h], c in [0..W-w] stepping by stride.
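The valid corner ranges above can be sketched as follows. The helper name `valid_corners` is hypothetical and not part of the problem statement; it only illustrates which top-left positions keep the window fully inside the image.

```python
def valid_corners(H, W, h, w, stride):
    """Return top-left (r, c) corners of all fully contained h x w windows."""
    # range(0, H - h + 1, stride) yields r = 0, stride, 2*stride, ... up to H - h.
    # It is empty when the template is taller or wider than the image.
    return [(r, c)
            for r in range(0, H - h + 1, stride)
            for c in range(0, W - w + 1, stride)]


# A 5x6 image, a 2x3 template, and stride 2
corners = valid_corners(5, 6, 2, 3, 2)
print(corners)  # [(0, 0), (0, 2), (2, 0), (2, 2)]
```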
Precompute the template’s mean-centered version T0 = T - T.mean() and its norm ||T0|| once; only the image patch statistics change per window.
For each patch P, compute P0 = P - P.mean(), then NCC = sum(P0*T0) / (||P0||*||T0||). If the denominator is 0 (zero-variance patch or template), treat the score as -inf so it can’t win.
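Putting the hints together, one possible sketch is shown below. The function name `best_match`, the `(row, col, score)` layout of the returned tuple, and the `(-1, -1, -inf)` result when no window gets a finite score are all assumptions; the problem statement only says that a tuple is returned.

```python
import numpy as np


def best_match(image, template, stride=1):
    """Sketch: return (row, col, score) of the window with the highest NCC.

    The name and the return layout are assumptions, not given by the problem.
    """
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    H, W = image.shape[:2]
    h, w = template.shape[:2]

    # Template statistics are computed once, outside the loop.
    t0 = template - template.mean()
    t_norm = np.sqrt((t0 * t0).sum())

    # Assumed placeholder result when no window has a finite score.
    best = (-1, -1, -np.inf)
    for r in range(0, H - h + 1, stride):
        for c in range(0, W - w + 1, stride):
            patch = image[r:r + h, c:c + w]
            p0 = patch - patch.mean()
            denom = np.sqrt((p0 * p0).sum()) * t_norm
            # Zero-variance patch or template: score is -inf so it cannot win.
            score = (p0 * t0).sum() / denom if denom > 0 else -np.inf
            if score > best[2]:
                best = (r, c, float(score))
    return best


# Usage: plant the template at row 2, column 3 of an otherwise empty image.
tmpl = np.array([[1, 2], [3, 4]])
img = np.zeros((6, 6))
img[2:4, 3:5] = tmpl
r, c, s = best_match(img, tmpl, stride=1)
print((r, c))  # (2, 3)
```

Because the comparison is a strict `>`, ties keep the first window encountered in scan order, which is one reasonable tie-breaking choice among several.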