Feature correlation filtering is a simple feature-engineering step that removes redundant input columns that are strongly correlated with each other. You'll implement a function that drops features whose absolute correlation with any already-kept feature exceeds a threshold, using Pearson correlation.
The Pearson correlation between two features $x$ and $y$ is:

$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
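As a quick sanity check, here is a minimal NumPy sketch (the values are made up) that evaluates the formula directly and compares it against `np.corrcoef`:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 6.0, 8.2, 10.0])

# Pearson correlation straight from the formula above.
xc, yc = x - x.mean(), y - y.mean()
r_manual = (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r_xy.
r_numpy = np.corrcoef(x, y)[0, 1]

print(r_manual, r_numpy)  # the two values agree up to floating-point error
```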
Implement the function:
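The exact signature is not reproduced here; a reasonable stub, with the hypothetical name `filter_correlated_features` taken from the argument and return tables below, might look like:

```python
import numpy as np

def filter_correlated_features(X: np.ndarray, threshold: float, feature_names: list) -> list:
    # Hypothetical name and signature, inferred from the tables below.
    ...
```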
Rules:

- Compute pairwise Pearson correlations between the columns of X.
- Compare abs(corr) to the threshold; a feature is dropped when it exceeds the threshold with any already-kept feature.
- Return the kept feature_names (don't return indices or a modified X).
| Argument | Type |
|---|---|
| X | np.ndarray |
| threshold | float |
| feature_names | list |

Output:

| Return Name | Type |
|---|---|
| value | list |
- Use NumPy only; no feature-selection utilities.
- Return kept feature_names; preserve original order.
- Drop if abs(Pearson corr) > threshold (see the toy example after this list).
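For intuition, here is a small hand-checked example (feature names and values are made up): the third column is an exact multiple of the first, so its absolute correlation with an already-kept feature is 1.0 and it should be dropped, while the surviving names keep their original order.

```python
import numpy as np

X = np.array([
    [1.0, 10.0, 2.0],
    [2.0,  9.5, 4.0],
    [3.0, 11.0, 6.0],
    [4.0,  9.0, 8.0],
])
feature_names = ["a", "b", "c"]
threshold = 0.9

corr = np.corrcoef(X, rowvar=False)
print(np.round(np.abs(corr), 3))
# Column "c" is 2 * column "a", so |corr(a, c)| == 1.0 > 0.9 -> drop "c".
# Column "b" is only weakly correlated with "a" (about 0.23), so it is kept.
# Expected result for this exercise: ["a", "b"], in the original order.
```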
Convert X to a NumPy float array so you can compute correlations column-wise.
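A one-line sketch of that conversion (assuming X may arrive as a nested list or an integer array):

```python
import numpy as np

X = [[1, 10], [2, 9], [3, 11]]   # e.g. a nested list of ints
X = np.asarray(X, dtype=float)   # now a (3, 2) float ndarray, ready for column-wise stats
```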
Use np.corrcoef(X, rowvar=False) to get an n_features × n_features Pearson correlation matrix; compare abs(corr[i, j]) to the threshold.
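A minimal sketch of that step, using random placeholder data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))          # 50 samples, 4 features
corr = np.corrcoef(X, rowvar=False)   # (4, 4) Pearson correlation matrix
mask = np.abs(corr) > 0.9             # True where a pair of features is "too correlated"
print(corr.shape, mask[0, 1])
```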
Apply a greedy left-to-right rule: maintain a list of kept feature indices; for each new feature, drop it if it exceeds the threshold with any previously kept feature, otherwise keep it and continue.
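Putting the hints together, a complete sketch might look like the following; the function name `filter_correlated_features` is assumed (matching the stub above), not given by the problem statement:

```python
import numpy as np

def filter_correlated_features(X, threshold, feature_names):
    """Greedily keep features whose |Pearson corr| with every kept feature is <= threshold.

    Hypothetical name/signature; returns the kept feature names in their original order.
    """
    X = np.asarray(X, dtype=float)
    corr = np.abs(np.corrcoef(X, rowvar=False))   # (n_features, n_features)

    kept = []                                     # indices of features kept so far
    for j in range(X.shape[1]):
        # Drop feature j if it exceeds the threshold with any previously kept feature.
        if any(corr[j, i] > threshold for i in kept):
            continue
        kept.append(j)

    return [feature_names[j] for j in kept]


# Toy usage: column "c" duplicates "a" (|corr| = 1.0), so it is dropped.
X = [[1, 10, 2], [2, 9.5, 4], [3, 11, 6], [4, 9, 8]]
print(filter_correlated_features(X, 0.9, ["a", "b", "c"]))  # expected: ['a', 'b']
```

Checking a candidate only against previously kept features (rather than all features) is what makes the rule greedy and order-dependent: the first feature always survives, and the kept names naturally come out in their original order.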