181. Binning continuous variables

easy
General
senior

Binning (a.k.a. discretization) turns a continuous feature into bucket IDs that you can feed into models or use for analysis. You’ll implement equal-width binning for one feature and return both the integer bin IDs and their one-hot encoding.

Requirements

Implement the function

python

Rules:

  • Use equal-width binning with bin width \( w = \dfrac{\max(x) - \min(x)}{\text{num\_bins}} \).
  • Assign each value \( x_i \) to a bin using \( \text{bin}(x_i) = \left\lfloor \dfrac{x_i - \min(x)}{w} \right\rfloor \), but ensure the maximum value maps to the last bin (num_bins - 1).
  • Return both the integer bin_ids and the one_hot encoding for each value.
  • Use only NumPy and/or Python built-in libraries (no pandas, no scikit-learn).
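
To make the two formulas above concrete, here is a small worked computation. The input values and variable names are illustrative assumptions, not taken from the problem statement.

```python
import numpy as np

# Illustrative values (an assumption, not the problem's own example).
x = np.array([0.0, 2.5, 5.0, 7.5, 10.0])
num_bins = 4

# Bin width: (max - min) / num_bins = (10 - 0) / 4 = 2.5
w = (x.max() - x.min()) / num_bins

# Raw floor formula: the maximum value lands on index num_bins (out of range).
raw = np.floor((x - x.min()) / w).astype(int)   # [0, 1, 2, 3, 4]

# Map the maximum back into the last bin, num_bins - 1.
bin_ids = np.minimum(raw, num_bins - 1)         # [0, 1, 2, 3, 3]
```

Only the maximum value needs the correction; every other value already falls in the range 0 to num_bins - 1.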

Example

python

Output:

python

Input Signature

Argument    Type
x           np.ndarray
num_bins    int

Output Signature

Return Name    Type
value          tuple

Constraints

  • Use only NumPy and Python built-in libraries

  • Output bin_ids array and one_hot matrix

  • Clamp max value to last bin

Hint 1

Compute x_min = np.min(x) and x_max = np.max(x), then bin width w = (x_max - x_min) / num_bins.

Hint 2

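For each value v, compute idx = floor((v - x_min) / w); in NumPy use ((x - x_min) / w).astype(int). The cast truncates toward zero, which matches floor here because (x - x_min) / w is never negative.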
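
A quick check of why the integer cast can stand in for floor in this hint. The arrays below are made-up values chosen for illustration.

```python
import numpy as np

# Non-negative values, like (x - x_min) / w: the cast and floor agree.
scaled = np.array([0.0, 0.9, 1.0, 2.7])
assert np.array_equal(np.floor(scaled).astype(int), scaled.astype(int))

# For a negative value they differ, which cannot happen after subtracting x_min.
negative = np.array([-0.5])
print(np.floor(negative).astype(int))  # [-1]
print(negative.astype(int))            # [0]  (truncated toward zero)
```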

Hint 3

Edge case: when v == x_max, the formula yields num_bins; clamp with np.clip(idx, 0, num_bins-1). Then build the one-hot array with advanced indexing.
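
Putting the three hints together, one possible sketch looks like the following. The function name equal_width_bins, the integer dtype of the one-hot matrix, and the handling of constant input (everything goes to bin 0) are assumptions; the problem statement does not specify them. The sketch also assumes x is a non-empty 1-D array.

```python
import numpy as np

def equal_width_bins(x, num_bins):
    """Hypothetical name. Returns (bin_ids, one_hot) for a non-empty 1-D input."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    w = (x_max - x_min) / num_bins          # Hint 1: bin width

    if w == 0:
        # Assumption: constant input is not covered by the problem;
        # avoid dividing by zero and place every value in bin 0.
        idx = np.zeros(x.shape[0], dtype=int)
    else:
        idx = np.floor((x - x_min) / w).astype(int)   # Hint 2: floor formula

    # Hint 3: the maximum value yields num_bins, so clamp into the valid range.
    bin_ids = np.clip(idx, 0, num_bins - 1)

    # One-hot via advanced indexing: row i gets a 1 in column bin_ids[i].
    one_hot = np.zeros((x.shape[0], num_bins), dtype=int)
    one_hot[np.arange(x.shape[0]), bin_ids] = 1
    return bin_ids, one_hot

bin_ids, one_hot = equal_width_bins([1, 2, 3, 4, 5], 2)
print(bin_ids)   # [0 0 1 1 1]
```

In this usage the width is (5 - 1) / 2 = 2, so 1 and 2 fall in bin 0, 3 and 4 fall in bin 1, and the maximum 5 is clamped from index 2 back to bin 1.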

Roles
ML Engineer
AI Engineer
Companies
General
Levels
senior
entry
Tags
discretization
equal-width-binning
one-hot-encoding
numpy