Implement linear regression from scratch using NumPy, trained with batch gradient descent to minimize mean squared error. Provide a fit method that learns the weights and bias, and a predict method that outputs continuous values.
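For reference, one common way to write the objective and its batch gradients; the 1/(2n) scaling is an assumption on my part, chosen so the gradient expressions match the update hints further down (many write plain MSE and absorb the factor of 2 into the learning rate):

$$
J(\mathbf{w}, b) = \frac{1}{2n}\sum_{i=1}^{n}\bigl(\mathbf{x}_i^\top \mathbf{w} + b - y_i\bigr)^2,
\qquad
\frac{\partial J}{\partial \mathbf{w}} = \frac{1}{n} X^\top(\hat{\mathbf{y}} - \mathbf{y}),
\qquad
\frac{\partial J}{\partial b} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)
$$

Each batch step then applies w ← w − learning_rate · dw and b ← b − learning_rate · db.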
Implement a class LinearRegressionGD:
Rules:

- Learn a weight vector w and a separate bias b; do not add an intercept column to X.
- __init__ must set defaults and initialize params (e.g., self.w = np.array([]), self.b = 0.0).
- fit learns w and b; predict returns X @ w + b.
- Default hyperparameters: learning_rate = 0.1, n_iters = 1000.

| Argument | Type |
|---|---|
| X_test | list |
| X_train | list |
| y_train | list |

Output:

| Return Name | Type |
|---|---|
| value | list |

Constraints:

- NumPy only; no sklearn/statsmodels
- Vectorized batch GD; no per-sample loops
- No intercept column; use separate bias b
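A minimal starter skeleton for the interface described above (a sketch, assuming the class name and defaults from the rules; the method bodies are left to be filled in using the steps below):

```python
import numpy as np

class LinearRegressionGD:
    def __init__(self, learning_rate=0.1, n_iters=1000):
        # Defaults and parameter initialization required by the rules.
        self.learning_rate = learning_rate
        self.n_iters = n_iters
        self.w = np.array([])   # weight vector, learned by fit
        self.b = 0.0            # separate bias scalar (no intercept column)

    def fit(self, X, y):
        # Learn self.w and self.b with vectorized batch gradient descent.
        raise NotImplementedError

    def predict(self, X):
        # Return continuous predictions X @ w + b for the given samples.
        raise NotImplementedError
```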
Start by converting X to a 2D np.ndarray and y to a 1D array (y.reshape(-1)), then initialize w with zeros of length n_features and b = 0.0.
Use a fully vectorized forward pass each iteration: y_pred = X @ w + b and error = y_pred - y. Batch gradient descent means computing the gradients from all samples at each step (no loops over rows).
For MSE, the gradients are dw = (1/n) * (X.T @ error) and db = (1/n) * error.sum() (the factor of 2 from the squared term is conventionally absorbed into the learning rate, or equivalently the loss is scaled by 1/(2n)). Update with w -= lr * dw and b -= lr * db, and ensure predict(X) returns X @ w + b without adding an intercept column.
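Putting the three steps together, a reference sketch of the full class might look like the following (the dtype=float casts are my assumption for handling the list inputs named in the argument table):

```python
import numpy as np

class LinearRegressionGD:
    def __init__(self, learning_rate=0.1, n_iters=1000):
        self.learning_rate = learning_rate
        self.n_iters = n_iters
        self.w = np.array([])   # weight vector, set by fit
        self.b = 0.0            # separate bias scalar

    def fit(self, X, y):
        # Convert list inputs to arrays: X as 2D, y as 1D.
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float).reshape(-1)
        n_samples, n_features = X.shape

        # Initialize parameters.
        self.w = np.zeros(n_features)
        self.b = 0.0

        # Vectorized batch gradient descent on the squared error.
        for _ in range(self.n_iters):
            y_pred = X @ self.w + self.b     # forward pass over all samples
            error = y_pred - y               # residuals, shape (n_samples,)
            dw = (X.T @ error) / n_samples   # gradient w.r.t. weights
            db = error.sum() / n_samples     # gradient w.r.t. bias
            self.w -= self.learning_rate * dw
            self.b -= self.learning_rate * db
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return X @ self.w + self.b           # bias added explicitly, no intercept column
```

A quick sanity check on data the model should recover almost exactly (the toy inputs here are made up for illustration, not problem data):

```python
X_train = [[1.0], [2.0], [3.0], [4.0]]
y_train = [3.0, 5.0, 7.0, 9.0]          # y = 2x + 1
X_test = [[5.0], [6.0]]

model = LinearRegressionGD()            # learning_rate=0.1, n_iters=1000
model.fit(X_train, y_train)
value = model.predict(X_test).tolist()  # list output, per the return table
print(value)                            # approximately [11.0, 13.0]
```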