Explain and implement a simple bounding box conversion utility used in computer vision, where different models and datasets store boxes in different formats.
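A common conversion is from corner format $(x_{\min}, y_{\min}, x_{\max}, y_{\max})$ to center format $(c_x, c_y, w, h)$, defined as:

$$
c_x = \frac{x_{\min} + x_{\max}}{2}, \quad c_y = \frac{y_{\min} + y_{\max}}{2}, \quad w = x_{\max} - x_{\min}, \quad h = y_{\max} - y_{\min}
$$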
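As a quick sanity check, assuming the usual definitions (the center is the midpoint of the two corners, and the width and height are the differences between them), the corner-format box $(10, 20, 50, 80)$ would convert as follows:

$$
c_x = \frac{10 + 50}{2} = 30, \quad c_y = \frac{20 + 80}{2} = 50, \quad w = 50 - 10 = 40, \quad h = 80 - 20 = 60
$$

so the center-format box is $(30, 50, 40, 60)$.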
Implement the function
Rules:
boxes will be an array of shape (N, 4).
Output:
| Argument | Type |
|---|---|
| boxes | np.ndarray |
| Return Name | Type |
|---|---|
| value | np.ndarray |
Return a NumPy array, not a Python list.
Use NumPy vectorized operations; do not use torchvision utilities.
Compute using a floating-point dtype.
Start by converting boxes into a NumPy array of dtype=float so arithmetic stays floating-point and can be vectorized.
Use column slicing: x_min = arr[:,0], y_min = arr[:,1], x_max = arr[:,2], y_max = arr[:,3] and apply the given formulas elementwise.
Reassemble with np.stack([c_x, c_y, w, h], axis=1).
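
The three hints above can be put together as a minimal sketch. The function name `corners_to_center` is a placeholder chosen for illustration (the required name is not given here), and the sketch assumes the input already has shape (N, 4) and that the center is the midpoint of the corners:

```python
import numpy as np


def corners_to_center(boxes):
    """Convert (x_min, y_min, x_max, y_max) boxes to (c_x, c_y, w, h).

    `corners_to_center` is a hypothetical name used for this sketch.
    """
    # Cast to float so the arithmetic stays floating-point, and so that
    # Python lists are accepted as input as well as arrays.
    arr = np.asarray(boxes, dtype=float)

    # Column slicing: each slice is a 1-D array of length N.
    x_min, y_min, x_max, y_max = arr[:, 0], arr[:, 1], arr[:, 2], arr[:, 3]

    # Elementwise (vectorized) formulas; no Python loop over boxes.
    c_x = (x_min + x_max) / 2.0
    c_y = (y_min + y_max) / 2.0
    w = x_max - x_min
    h = y_max - y_min

    # Stack the four length-N columns back into an (N, 4) array.
    return np.stack([c_x, c_y, w, h], axis=1)


# Example usage with two integer-valued boxes.
boxes = np.array([[0, 0, 10, 20], [5, 5, 15, 25]])
out = corners_to_center(boxes)
print(out)
```

Because the input is cast with `dtype=float`, the result is a float array even when the boxes are given as integers. This sketch does not validate the input shape or check that `x_max >= x_min`; whether such checks are needed depends on the caller.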