Quantitative Researcher at a Glance
Total Compensation
$150k - $5M/yr
Interview Rounds
6 rounds
Difficulty
Levels
Entry - Principal
Education
Master's / PhD
Experience
0–25+ yrs
From hundreds of mock interviews, the single biggest surprise is how many candidates with perfect probability scores get cut for lacking finance intuition. Quant researcher comp ranges from $150K at the junior floor to $5M at the principal ceiling, yet the interview loop cares less about your math ceiling than whether you can reason about why a signal decays after transaction costs. That gap between "smart" and "profitable" is exactly what firms screen for.
What Quantitative Researchers Actually Do
Primary Focus
Skill Profile
Math & Stats
Expert: Deep understanding of mathematics, probability, statistics, and linear algebra is fundamental for quantitative research and developing predictive signals.
Software Eng
High: Strong programming skills are required for manipulating large financial datasets, conducting empirical research, and developing and enhancing proprietary research systems.
Data & SQL
Medium: Ability to collect, clean, process, and analyze large, potentially high-frequency datasets. Experience with databases (e.g., SQL Server, MySQL) is a plus.
Machine Learning
High: Expertise in machine learning is crucial for developing new return-predictive signals and advanced quantitative models.
Applied AI
Medium: Modern generative AI rarely appears in job descriptions for this role; the emphasis is on traditional machine learning and statistical modeling.
Infra & Cloud
Low: The role centers on quantitative research and model development rather than infrastructure, cloud platforms, or deployment.
Business
High: Deep understanding of financial markets and trading, enabling the design of impactful strategies, alpha signal research, and performance enhancement.
Viz & Comms
High: Clear and effective communication skills are required for collaboration within a fast-paced, highly intellectual environment.
You'll find this role at multi-strategy hedge funds (Citadel, Two Sigma, DE Shaw, Millennium), prop trading firms (Jane Street, Optiver, SIG), and occasionally on banks' electronic trading desks. The work spans everything from prototyping cross-sectional equity factor models in Python to refining C++ execution algorithms that reduce market impact by a few basis points. Success after year one means a signal you built made it into the live portfolio and generated attributable PnL, not just a well-received research memo.
A Typical Week
A Week in the Life of a Quantitative Researcher
Weekly time split
Only about 15% of the week is meetings, but each one is high-density: think PnL attribution reviews where a strategy head grills your out-of-sample methodology, not status standups. What candidates consistently underestimate is the 10% writing allocation. Internal research memos circulate to the CIO's office, and sloppy prose erodes your credibility with portfolio managers just as fast as a flawed backtest.
Skills & What's Expected
Machine learning scores "high" in the skill profile for good reason: you'll use scikit-learn for feature selection pipelines and, at some firms, TensorFlow for nonlinear signal models. But the ML that matters here is interpretable and low-latency (penalized regression, gradient-boosted trees, Kalman filters), not billion-parameter architectures. The truly underrated skill is writing production-grade C++ for an optimizer module or transaction cost model. Plenty of PhD candidates can derive a shrinkage estimator on a whiteboard yet freeze when asked to refactor that math into tested, reviewable code. Business acumen also scores surprisingly high because you're expected to reason about market microstructure, capacity constraints, and Sharpe degradation from trading costs without someone holding your hand.
Levels & Career Growth
Quantitative Researcher Levels
Each level has different expectations, compensation, and interview focus.
What This Level Looks Like
You work on well-scoped research tasks: backtesting a signal, implementing a pricing model, running a statistical analysis. A senior researcher defines the problem; you build the implementation.
Interview Focus at This Level
Probability and statistics (combinatorics, distributions, conditional probability), math puzzles/brainteasers, coding (algorithms in Python/C++), and basic finance concepts.
Most hires land at entry or mid level, even with a PhD, because firms want to see you produce PnL before granting autonomy. The jump from mid to senior is where careers stall: it demands demonstrating independent alpha generation, not just executing on someone else's research agenda. At staff and above, the path splits between IC leadership (shaping firm-wide research methodology across multiple desks) and portfolio management (running your own book with direct capital allocation). Either way, comp becomes overwhelmingly bonus-driven, and a bad year on your strategy can cut total pay by half or more.
Quantitative Researcher Compensation
The bonus is the comp. Bases at quant firms cluster in a narrow band regardless of seniority, so your total package lives or dies on the variable piece. At pod shops like Citadel or Millennium, that variable piece is a cut of your desk's PnL. A strong year might push total comp to 3x base, while a flat year could leave you near base with a bonus that barely covers taxes. Multi-strat firms like Two Sigma and DE Shaw tend toward more structured bonus pools, which smooths the volatility but caps the upside. Bank quant desks (Goldman, JPMorgan) sit at the lower end of each band, though they offer more stability since bonuses there are tied to firm-wide revenue pools rather than a single strategy's P&L.
Deferred compensation is the gotcha most candidates miss. At senior levels, 20-40% of your bonus may vest over two to three years, and unvested portions are typically forfeited if you leave (though some firms allow partial vesting or negotiate exceptions during lateral moves). For negotiation, the lever that actually moves is the first-year guaranteed minimum, not base salary. Competing offers from big tech companies force quant firms to compete on guaranteed money rather than projections about future PnL splits. Push hard on guarantee length: locking in a two-year floor instead of one year can mean six figures of downside protection while your strategy ramps. And when a recruiter quotes an "expected" bonus number, pin them down on what a bad year looks like, because that figure is often a rosy target, not a floor.
Quantitative Researcher Interview Process
6 rounds · ~6 weeks end to end
Initial Screen
1 round · Recruiter Screen
An initial phone call with a recruiter to discuss your background, interest in the role, and confirm basic qualifications. Expect questions about your experience, compensation expectations, and timeline.
Tips for this round
- Prepare a 60–90 second narrative linking your research (thesis/papers) to alpha research skills: hypothesis → test → validate → iterate
- Have 2–3 crisp examples of projects where you handled noisy data and avoided overfitting (e.g., cross-validation, regularization, out-of-sample testing)
- Know the basics of the company’s style (systematic, empirical research) and be able to articulate why you prefer systematic investing vs discretionary
- Be ready to summarize your coding comfort (Python/R/C++), including libraries (NumPy/pandas/statsmodels) and scale (vectorization, profiling)
Technical Assessment
4 rounds · Coding & Algorithms
You'll be given an online assessment designed to test your foundational quantitative and programming skills. This typically involves problems related to logic, data manipulation, basic algorithms, and statistical reasoning, often requiring code implementation.
Tips for this round
- Be prepared to code solutions to algorithmic problems in your preferred language, explaining your thought process clearly.
- Review common data structures like arrays, linked lists, trees, and hash maps, and their time/space complexities.
- Practice explaining your approach to probability and statistics problems step-by-step.
- Articulate your assumptions and consider edge cases when solving problems.
Statistics & Probability
This round tests your statistical intuition: hypothesis testing, confidence intervals, probability, distributions, and experimental design applied to real product scenarios.
Machine Learning & Modeling
Covers model selection, feature engineering, evaluation metrics, and deploying ML in production. You'll discuss tradeoffs between model types and explain how you'd approach a real business problem.
Case Study
Prepare for a practical case study where you'll be presented with a real-world financial problem or dataset. You'll need to outline an approach, discuss potential models, data considerations, and how you would evaluate the solution's effectiveness.
Onsite
1 round · Behavioral
Assesses collaboration, leadership, conflict resolution, and how you handle ambiguity. Interviewers look for structured answers (STAR format) with concrete examples and measurable outcomes.
Tips for this round
- Prepare 5 STAR stories tailored to research work: failed hypothesis, debugging a data issue, influencing peers, and delivering under uncertainty
- Demonstrate intellectual honesty: emphasize how you document assumptions, run ablations, and report negative findings
- Practice explaining a complex model to a non-expert in 2 minutes, focusing on inputs/outputs/risks rather than equations
- Highlight teamwork mechanisms: code reviews, experiment tracking (MLflow/W&B), and written research memos
Expect roughly 6 weeks from your first recruiter call to a final offer, based on data across 14 firms. Larger multi-strat funds tend to add buffer for PM interviews and committee sign-offs, while smaller systematic shops sometimes move faster when they have an urgent seat to fill. The process structure above is consistent across firm types, but the format shifts: some firms run rounds 2 through 5 as back-to-back on-site sessions, others spread them across weeks of video calls.
The most common elimination point, from what candidates report, is the statistics and probability round. It's not that people get answers wrong; it's that they can't articulate a structured approach under time pressure, which interviewers read as weak fundamentals rather than nerves. One pattern worth knowing: strong case study performance can rescue a wobbly earlier round, because that's where you demonstrate the end-to-end thinking (hypothesis, feature engineering, backtest, cost-adjusted evaluation) that separates researchers from puzzle-solvers.
Quantitative Researcher Interview Questions
Probability & Mathematical Reasoning
Your execution model assumes each of $n$ child orders fills independently with probability $p$, but the desk thinks there is a common liquidity shock so fills are positively dependent; name a realistic one-factor model for this and explain how it changes $\mathrm{Var}(K)$ for the fill count $K$. Be explicit about whether variance goes up or down versus $\mathrm{Binomial}(n,p)$.
Sample Answer
You could model dependence with a Beta-Binomial mixture (random $p$ driven by a latent liquidity factor) or with a Gaussian copula on Bernoulli latents. The Beta-Binomial wins here because it gives an analytic variance decomposition: $\mathrm{Var}(K)=\mathbb{E}[\mathrm{Var}(K\mid p)]+\mathrm{Var}(\mathbb{E}[K\mid p])=n\mathbb{E}[p(1-p)]+n^2\mathrm{Var}(p)$. Compared to $\mathrm{Binomial}(n,p)$, the extra $n^2\mathrm{Var}(p)$ term makes variance strictly larger when $\mathrm{Var}(p)>0$; that is the overdispersion you see under common shocks.
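A quick simulation makes the overdispersion concrete. This is a sketch, not part of the question: the Beta(2, 6) mixing parameters and sample count are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, a, b = 100, 2.0, 6.0  # illustrative Beta(a, b) mixing parameters

# Moments of the latent fill probability p ~ Beta(a, b).
p_bar = a / (a + b)
var_p = a * b / ((a + b) ** 2 * (a + b + 1))
e_p1mp = p_bar - var_p - p_bar**2  # E[p(1 - p)]

# One common p per day (the shared liquidity shock), then K | p ~ Binomial(n, p).
p = rng.beta(a, b, size=500_000)
k = rng.binomial(n, p)

binom_var = n * p_bar * (1 - p_bar)     # benchmark: independent fills
theory_var = n * e_p1mp + n**2 * var_p  # law of total variance
print(binom_var, theory_var, k.var())   # empirical variance matches the larger value
```

With these parameters the $n^2\mathrm{Var}(p)$ term is an order of magnitude larger than the binomial term, which is why fill-count risk explodes under a common shock.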
A daily PnL stream has i.i.d. returns with tail $\mathbb{P}(R>x)\sim Cx^{-\alpha}$ for large $x$ with $\alpha\in(1,2)$; how does the $q$-quantile of the $n$-day sum $S_n=\sum_{t=1}^n R_t$ scale with $n$ for fixed high $q$ close to 1? State the scaling and justify it without invoking a full theorem statement.
Statistics
Most candidates underestimate how much precision matters when you talk about estimators, bias/variance, uncertainty, and asymptotics. You’ll need to translate messy market-style data situations into statistically sound procedures and sanity checks.
What is a confidence interval and how do you interpret one?
Sample Answer
A 95% confidence interval is a range of values that, if you repeated the experiment many times, would contain the true population parameter 95% of the time. For example, if a survey gives a mean satisfaction score of 7.2 with a 95% CI of [6.8, 7.6], it means you're reasonably confident the true mean lies between 6.8 and 7.6. A common mistake is saying "there's a 95% probability the true value is in this interval" — the true value is fixed, it's the interval that varies across samples. Wider intervals indicate more uncertainty (small sample, high variance); narrower intervals indicate more precision.
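The repeated-sampling interpretation is easy to check by simulation. A minimal sketch, assuming normally distributed scores with a made-up mean and spread:

```python
import numpy as np

rng = np.random.default_rng(1)
true_mean, sigma, n_obs, n_trials = 7.2, 1.5, 400, 5_000

covered = 0
for _ in range(n_trials):
    sample = rng.normal(true_mean, sigma, size=n_obs)
    m = sample.mean()
    se = sample.std(ddof=1) / np.sqrt(n_obs)
    # Normal-approximation 95% CI for the mean.
    if m - 1.96 * se <= true_mean <= m + 1.96 * se:
        covered += 1

print(covered / n_trials)  # close to 0.95: the interval varies, the true mean does not
```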
You run a daily cross-sectional regression of next-day returns on a single standardized value signal across 2,000 US equities, then average the daily slope to estimate the signal premium; what is the right way to compute a standard error and $t$-stat when slopes are serially correlated? Name a concrete adjustment and what drives the lag choice.
Your return-prediction regression uses daily stock returns and 20 characteristics, and you see clear heteroskedasticity across market cap; how does this change coefficient inference, and when would you switch from OLS standard errors to robust or GLS-style approaches? Be explicit about what remains unbiased and what breaks.
Machine Learning & Modeling
Your ability to choose and critique models is tested more than your ability to name them. You’ll need to diagnose overfitting, leakage, non-stationarity, and evaluation pitfalls common in alpha research, and justify modeling tradeoffs with rigor.
What is the bias-variance tradeoff?
Sample Answer
Bias is error from oversimplifying the model (underfitting) — a linear model trying to capture a nonlinear relationship. Variance is error from the model being too sensitive to training data (overfitting) — a deep decision tree that memorizes noise. The tradeoff: as you increase model complexity, bias decreases but variance increases. The goal is to find the sweet spot where total error (bias squared + variance + irreducible noise) is minimized. Regularization (L1, L2, dropout), cross-validation, and ensemble methods (bagging reduces variance, boosting reduces bias) are practical tools for managing this tradeoff.
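One way to see the tradeoff numerically: refit polynomials of increasing degree on fresh noisy samples of a fixed nonlinear function and decompose the error at a test point. A sketch with made-up settings (sine target, noise level 0.3):

```python
import numpy as np

rng = np.random.default_rng(2)

def bias_variance(degree, n_fits=500, n_train=30, x_test=0.5):
    """Refit a degree-d polynomial on fresh noisy samples of sin(x),
    then split squared error at x_test into bias^2 and variance."""
    preds = []
    for _ in range(n_fits):
        x = rng.uniform(-3, 3, n_train)
        y = np.sin(x) + rng.normal(0, 0.3, n_train)
        preds.append(np.polyval(np.polyfit(x, y, degree), x_test))
    preds = np.asarray(preds)
    bias_sq = (preds.mean() - np.sin(x_test)) ** 2
    return bias_sq, preds.var()

results = {d: bias_variance(d) for d in (1, 3, 9)}
for d, (b2, v) in results.items():
    print(d, round(b2, 4), round(v, 4))  # bias^2 falls, variance rises with degree
```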
You are building a monthly cross-sectional equity return model using fundamentals, analyst revisions, and price-based signals, and you must choose between (a) L1-regularized linear regression on standardized features and (b) gradient-boosted trees. Which do you pick for a first production-quality signal, and how do you validate it to avoid lookahead and overfitting?
You train a model to predict next-month returns using daily features aggregated to month-end, and backtest performance collapses when you switch from random CV to time-based CV. Walk through, step by step, the most likely leakage paths and the fixes, including how you would implement purged and embargoed validation for a label with horizon $h$.
Mathematics
You fit a linear signal $\hat y = X\beta$ for intraday return prediction with ridge: $$\min_\beta \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2.$$ Write the closed form $\hat\beta$ and explain what happens as $\lambda \to \infty$ to the fitted exposures in the eigenbasis of $X^\top X$.
Sample Answer
This question is checking whether you can move between algebra and geometry, then reason about shrinkage where collinearity is severe. The solution is $$\hat\beta = (X^\top X + \lambda I)^{-1} X^\top y,$$ assuming $\lambda > 0$ so the matrix is invertible. Diagonalize $X^\top X = Q\Lambda Q^\top$, then $$\hat\beta = Q(\Lambda + \lambda I)^{-1}Q^\top X^\top y,$$ so each eigen-direction is scaled by $\frac{1}{\lambda_i + \lambda}$. As $\lambda \to \infty$, all directions shrink to $0$, and the model collapses toward zero exposures, which is exactly the point when you do not trust the cross-sectional conditioning.
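Both the closed form and the eigen-direction shrinkage are a few lines of NumPy to verify. A sketch on random data:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = rng.normal(size=200)

def ridge(X, y, lam):
    # Closed form: (X'X + lam I)^{-1} X'y, solved rather than inverted.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

lam = 10.0
beta = ridge(X, y, lam)

# Eigenbasis view: each direction of X'y is scaled by 1 / (lambda_i + lam).
evals, Q = np.linalg.eigh(X.T @ X)
beta_eig = Q @ ((Q.T @ (X.T @ y)) / (evals + lam))
print(np.allclose(beta, beta_eig))       # the two forms agree
print(np.linalg.norm(ridge(X, y, 1e8)))  # exposures collapse toward 0 as lam grows
```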
You need to compute a risk parity portfolio for a futures book by minimizing $$f(w)=\frac{1}{2} w^\top \Sigma w - \sum_{i=1}^n \log w_i$$ with constraint $w_i>0$ (barrier encourages diversification). Derive $\nabla f(w)$ and $\nabla^2 f(w)$, then state a sufficient condition for $f$ to be strictly convex.
Coding & Algorithms
In this section you are proving you can turn a tight mathematical idea into correct, fast Python. Expect problems where edge cases matter, you need clean reasoning about complexity, and your code has to be readable under pressure.
In a backtest you have daily simple returns $r_t$ and you want the maximum sum over any contiguous subarray of $r_t$ after applying a per-day risk penalty $\lambda r_t^2$ (the objective is to maximize $\sum_t (r_t - \lambda r_t^2)$). Return the max objective value and the start and end indices in $O(n)$ time.
Sample Answer
You could do brute force over all $O(n^2)$ subarrays or use Kadane-style dynamic programming on the transformed values $x_t = r_t - \lambda r_t^2$. The brute-force approach is dead on arrival for realistic backtests, since it is quadratic and easy to TLE. Kadane wins here because it maintains the best subarray ending at each index in $O(1)$, giving $O(n)$ time and $O(1)$ extra space while still letting you recover the argmax interval.
from typing import List, Tuple


def max_risk_penalized_subarray(returns: List[float], lam: float) -> Tuple[float, int, int]:
    """Maximize sum_{t=i..j} (r_t - lam * r_t^2) over contiguous subarrays.

    Args:
        returns: Daily simple returns r_t.
        lam: Risk penalty coefficient (can be 0).

    Returns:
        (best_value, start_idx, end_idx), where indices are inclusive.
        For empty input, returns (0.0, -1, -1).
    """
    n = len(returns)
    if n == 0:
        return 0.0, -1, -1
    # Transform each day into a score.
    x = [r - lam * (r * r) for r in returns]
    best_sum = x[0]
    best_l = best_r = 0
    curr_sum = x[0]
    curr_l = 0
    for i in range(1, n):
        # Either extend or restart at i.
        if curr_sum + x[i] >= x[i]:
            curr_sum += x[i]
        else:
            curr_sum = x[i]
            curr_l = i
        # Update global best.
        if curr_sum > best_sum:
            best_sum = curr_sum
            best_l = curr_l
            best_r = i
    return float(best_sum), int(best_l), int(best_r)


if __name__ == "__main__":
    r = [0.01, -0.02, 0.015, 0.005, -0.001]
    print(max_risk_penalized_subarray(r, lam=10.0))
You are building a daily rebalancer: given $N$ assets with expected returns $\mu_i$, risk model covariance $\Sigma$, per-asset transaction cost rates $c_i$, and current weights $w^{(0)}$, compute the new weights $w$ that maximize $\mu^\top w - \lambda w^\top \Sigma w - \sum_i c_i |w_i - w_i^{(0)}|$ subject to $\sum_i w_i = 1$ and box constraints $l_i \le w_i \le u_i$. Implement a solver using coordinate descent with soft-thresholding and projection, and return $w$ plus the objective value.
Business & Finance
What is ROI and how would you calculate it for a data science project?
Sample Answer
ROI (Return on Investment) = (Net Benefit - Cost) / Cost x 100%. For a data science project, costs include engineering time, compute, data acquisition, and maintenance. Benefits might be revenue uplift from a recommendation model, cost savings from fraud detection, or efficiency gains from automation. Example: a churn prediction model costs $200K to build and maintain, and saves $1.2M/year in retained revenue, so ROI = ($1.2M - $200K) / $200K = 500%. The hard part is isolating the model's contribution from other factors — use a holdout group or A/B test to measure incremental impact rather than attributing all improvement to the model.
You have a daily cross-sectional value signal for US equities and you suspect it is just a dressed-up low-beta or quality tilt; what exact regression or portfolio tests do you run to separate alpha from risk premia, and what would convince you the signal is still real after controls?
You are asked to implement a daily long-short equity factor with 15 bps one-way costs and a 10% daily ADV participation cap; how do you decide whether to trade at the close or use a one-day delay, and how do you estimate capacity in dollars for the strategy?
Math
Your ability to reason about objective functions, constraints, and stability is a direct proxy for whether you can build tradable models. Interviewers look for linear algebra fluency, convexity intuition, and the ability to sanity-check derivations.
You have 3 years of daily returns for 2,000 equities and 50 candidate alpha signals computed at EOD, and you want to ship a market-neutral long-short portfolio with 10 bps expected daily turnover cost. Design a research plan to estimate out-of-sample Sharpe and decide whether any signal is real after multiple testing, and specify what you would plot as sanity checks.
Sample Answer
This question is checking whether you can separate signal from noise under dependence, costs, and selection bias. You need a time-series-aware validation scheme (for example, rolling or purged walk-forward), explicit transaction cost modeling in the objective, and a multiple testing correction (for example, BH-FDR on per-signal IC t-stats with Newey-West standard errors). Sanity plots should include cumulative PnL with drawdowns, IC over time, turnover and capacity curves, and performance versus market regime buckets to catch instability.
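The BH-FDR step in that plan is short enough to implement directly. A minimal sketch in pure NumPy with two-sided normal p-values; the t-stats here are illustrative numbers, and the Newey-West adjustment that would produce them is not shown.

```python
import numpy as np
from math import erfc, sqrt

def bh_fdr(pvals, q=0.10):
    """Benjamini-Hochberg: boolean mask of discoveries at FDR level q."""
    p = np.asarray(pvals, dtype=float)
    order = np.argsort(p)
    m = p.size
    # Compare sorted p-values to the stepped thresholds q * k / m.
    passed = p[order] <= q * np.arange(1, m + 1) / m
    k = passed.nonzero()[0].max() + 1 if passed.any() else 0
    mask = np.zeros(m, dtype=bool)
    mask[order[:k]] = True
    return mask

# Two-sided p-values from per-signal IC t-stats (made-up values).
t_stats = np.array([4.8, 2.6, 1.1, 0.4, -0.2])
pvals = np.array([erfc(abs(t) / sqrt(2)) for t in t_stats])
print(bh_fdr(pvals))  # only the strongest t-stats survive the correction
```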
A high frequency market making model shows a strong edge in backtest, but live PnL collapses after launch, and the only change is that quotes are now subject to queue priority and occasional partial fills. Propose a minimal mathematical model for fill probability as a function of spread, queue position, and latency, then explain how you would re-estimate expected PnL and risk with that model.
ML Coding & Implementation
You are prototyping a cross-sectional signal for a market-neutral equities book: each minute, rank stocks by a feature $x_{i,t}$ and trade the next minute return $r_{i,t+1}$. Write code that computes Spearman rank IC per minute and the overall mean IC, while correctly handling ties and missing data.
Sample Answer
The standard move is to compute Spearman IC as the Pearson correlation of ranks between $x_{i,t}$ and $r_{i,t+1}$ per timestamp. But here, ties and NaNs matter because they are common in minute bars and they bias rank statistics if you drop or rank incorrectly. You need per-minute filtering, tie-aware ranking, and a guardrail for small cross sections. Then aggregate, usually with a simple mean and a count of valid minutes.
import numpy as np


def _rankdata_average_ties(a: np.ndarray) -> np.ndarray:
    """Return 1..n ranks with average ranks for ties, NaNs preserved as NaN."""
    a = np.asarray(a, dtype=float)
    n = a.size
    ranks = np.full(n, np.nan)
    finite_idx = np.where(np.isfinite(a))[0]
    if finite_idx.size == 0:
        return ranks
    vals = a[finite_idx]
    order = np.argsort(vals, kind="mergesort")  # stable
    sorted_idx = finite_idx[order]
    sorted_vals = vals[order]
    # Walk runs of equal values.
    i = 0
    while i < sorted_vals.size:
        j = i
        while j + 1 < sorted_vals.size and sorted_vals[j + 1] == sorted_vals[i]:
            j += 1
        # Average rank for positions i..j in 1-based rank space.
        avg_rank = (i + 1 + j + 1) / 2.0
        ranks[sorted_idx[i : j + 1]] = avg_rank
        i = j + 1
    return ranks


def spearman_ic_by_minute(x: np.ndarray, r_next: np.ndarray, min_names: int = 5):
    """Compute per-minute Spearman rank IC and overall mean IC.

    Parameters
    ----------
    x : np.ndarray
        Shape (T, N) features at minute t.
    r_next : np.ndarray
        Shape (T, N) next-minute returns aligned so r_next[t] is r_{t+1}.
    min_names : int
        Minimum number of valid names required to compute IC for a minute.

    Returns
    -------
    ic_t : np.ndarray
        Shape (T,) IC per minute, NaN where undefined.
    mean_ic : float
        Mean IC over valid minutes.
    valid_minutes : np.ndarray
        Boolean mask over T.
    """
    x = np.asarray(x, dtype=float)
    r_next = np.asarray(r_next, dtype=float)
    if x.shape != r_next.shape or x.ndim != 2:
        raise ValueError("x and r_next must be 2D arrays with the same shape (T, N)")
    T, N = x.shape
    ic_t = np.full(T, np.nan)
    valid_minutes = np.zeros(T, dtype=bool)
    for t in range(T):
        xt = x[t]
        rt = r_next[t]
        ok = np.isfinite(xt) & np.isfinite(rt)
        if ok.sum() < min_names:
            continue
        rx = _rankdata_average_ties(xt)
        rr = _rankdata_average_ties(rt)
        # Restrict to valid names (ranks are NaN where invalid).
        ok2 = np.isfinite(rx) & np.isfinite(rr)
        if ok2.sum() < min_names:
            continue
        a = rx[ok2]
        b = rr[ok2]
        # Pearson correlation of ranks.
        a0 = a - a.mean()
        b0 = b - b.mean()
        denom = np.sqrt((a0 * a0).sum() * (b0 * b0).sum())
        if denom <= 0:
            continue
        ic_t[t] = float((a0 * b0).sum() / denom)
        valid_minutes[t] = True
    mean_ic = float(np.nanmean(ic_t)) if np.any(valid_minutes) else np.nan
    return ic_t, mean_ic, valid_minutes


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, N = 4, 6
    x = rng.normal(size=(T, N))
    r = rng.normal(size=(T, N))
    x[0, 1] = np.nan
    r[2, 3] = np.nan
    x[1, :] = 1.0  # ties
    ic_t, mean_ic, vm = spearman_ic_by_minute(x, r)
    print(ic_t)
    print(mean_ic)
    print(vm)
You are building a next-5-minute return model using minute bars and a feature pipeline that includes rolling means and cross-sectional ranks. Write code that produces walk-forward splits with an embargo of $E$ minutes around each test block, and verify in code that no training row uses any timestamp in the forbidden region.
The split between probability, statistics, mathematics, and math (linear algebra, optimization) on one side and coding, ML modeling, and ML implementation on the other is closer to even than most candidates assume. Where it gets painful is onsite rounds that blend both halves: you derive a ridge penalty's gradient analytically, then open a blank Python file and implement the optimizer before time runs out. The prep mistake this distribution screams at you is over-indexing on brainteaser collections while barely touching the 32% of questions that require you to write real code under pressure.
Practice quant-calibrated questions across all eight areas at datainterview.com/questions.
How to Prepare
Probability and statistics deserve outsized prep time relative to other areas. They account for roughly a third of all questions, and from what candidates report, these rounds have the steepest penalty for hesitation. Spend weeks 1-2 working through Mark Joshi's "green book" and Heard on the Street, but pair each puzzle with a full derivation using Bayes' theorem, conditional expectation, or generating functions so you internalize the reasoning, not just the answer.
Layer in hypothesis testing (power analysis, multiple comparisons via Bonferroni/BH) and distribution theory (conjugate priors, moment-generating functions) during those same two weeks. Weeks 3-4, solve one medium-hard algorithm problem daily in Python or C++, targeting dynamic programming, graph search, and numerical methods like Newton-Raphson. For ML, implement logistic regression's update rule, k-fold cross-validation, and a gradient boosted tree from scratch without importing sklearn.
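As a concrete example of that "from scratch" bar, the logistic regression update rule is only a few lines of NumPy. A sketch on synthetic data; the learning rate, sample size, and true weights are made-up illustration values:

```python
import numpy as np

def logistic_fit(X, y, lr=0.1, n_iter=3000):
    """Batch gradient descent on the mean negative log-likelihood."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # sigmoid
        w -= lr * X.T @ (p - y) / len(y)   # gradient of the mean log loss
    return w

rng = np.random.default_rng(5)
X = np.c_[np.ones(1000), rng.normal(size=(1000, 2))]
true_w = np.array([0.5, 2.0, -1.0])
y = (rng.uniform(size=1000) < 1.0 / (1.0 + np.exp(-X @ true_w))).astype(float)
print(logistic_fit(X, y))  # lands near the true weights
```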
Weeks 5-6, run two or three full case studies end-to-end: engineer features like rolling volatility and order flow imbalance from publicly available equity data (Yahoo Finance daily bars work fine), backtest a simple mean-reversion signal, then estimate its Sharpe ratio net of transaction costs. This is where generalist data scientists consistently fail the quant loop, because they've never had to connect a model's predictions to a PnL statement with realistic slippage and commission assumptions.
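The cost-adjusted Sharpe plumbing from that exercise fits in a dozen lines. A sketch on synthetic returns, which by construction have no real edge, so the point is the mechanics (no lookahead, turnover-based costs) rather than the result; the 5 bps one-way cost is an assumption:

```python
import numpy as np

rng = np.random.default_rng(6)
rets = rng.normal(0.0, 0.01, 2000)  # stand-in for real daily close-to-close returns

# Toy mean reversion: fade yesterday's move with a unit position.
pos = -np.sign(rets)             # position decided at close of day t...
pnl_gross = pos[:-1] * rets[1:]  # ...earns day t+1's return (no lookahead)

cost_bps = 5.0  # assumed one-way transaction cost
trades = np.abs(np.diff(np.concatenate([[0.0], pos])))  # units traded each day
pnl_net = pnl_gross - (cost_bps / 1e4) * trades[:-1]

def ann_sharpe(x):
    return float(x.mean() / x.std() * np.sqrt(252))

print(ann_sharpe(pnl_gross), ann_sharpe(pnl_net))  # costs drag the net Sharpe down
```

On pure noise the gross Sharpe hovers near zero while the net Sharpe goes clearly negative, which is exactly the slippage lesson the case study is testing.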
Try a Real Interview Question
Log Score and Brier Score for Prediction Market Forecasts
You are given $n$ prediction market probabilities $p_i$ for a binary event and realized outcomes $y_i \in \{0,1\}$. Implement a function that returns the mean negative log loss $-\frac{1}{n}\sum_{i=1}^{n}\left[y_i\log(p_i)+(1-y_i)\log(1-p_i)\right]$ and the mean Brier score $\frac{1}{n}\sum_{i=1}^{n}(p_i-y_i)^2$, using clipping $p_i \leftarrow \min(1-\epsilon,\max(\epsilon,p_i))$ with input $\epsilon$. Return the tuple $(\text{logloss}, \text{brier})$ as floats.
from typing import Iterable, Tuple


def score_forecasts(ps: Iterable[float], ys: Iterable[int], eps: float = 1e-15) -> Tuple[float, float]:
    """Compute mean negative log loss and mean Brier score for binary forecasts.

    Args:
        ps: Iterable of predicted probabilities.
        ys: Iterable of realized outcomes, each 0 or 1.
        eps: Clipping parameter for probabilities.

    Returns:
        (mean_logloss, mean_brier)
    """
    pass
Quant coding rounds reward recognizing mathematical structure (convexity, recurrence relations, Lagrangian relaxation) over brute-force pattern matching. Practice more problems calibrated to this difficulty at datainterview.com/coding.
Test Your Readiness
Quantitative Researcher Readiness Assessment
1 / 10 · Can you derive and interpret the bias-variance tradeoff in a regression setting, and explain how it affects out-of-sample error and model selection?
Timed practice is the closest simulation of the speed expected in probability and statistics rounds. Find more questions at datainterview.com/questions.
Frequently Asked Questions
What technical skills are tested in Quantitative Researcher interviews?
Core skills tested are probability, statistics, mathematical reasoning (brainteasers, proofs), coding (Python, C++, R), and finance (factor models, portfolio construction, risk). The math depth far exceeds other data roles.
How long does the Quantitative Researcher interview process take?
Most candidates report 3 to 6 weeks, though some quant firms move faster. The process typically includes a phone screen, math/probability round (often the hardest), coding round, research presentation, and behavioral/fit interview.
What is the total compensation for a Quantitative Researcher?
Total compensation across the industry ranges from $150k to $5M depending on level, location, and firm. It is primarily base salary plus annual bonus; at senior levels a meaningful share of the bonus is often deferred over multiple years, so weight guaranteed cash more heavily when comparing offers.
What education do I need to become a Quantitative Researcher?
A Master's is the practical minimum. A PhD in Mathematics, Statistics, Physics, CS, or a related quantitative field is strongly preferred at most top quant firms. The math depth required makes graduate training very valuable.
How should I prepare for Quantitative Researcher behavioral interviews?
Use the STAR format (Situation, Task, Action, Result). Prepare 5 stories covering cross-functional collaboration, handling ambiguity, failed projects, technical disagreements, and driving impact without authority. Keep each answer under 90 seconds. Most interview loops include 1-2 dedicated behavioral rounds.
How many years of experience do I need for a Quantitative Researcher role?
Entry-level positions typically require 0+ years (including internships and academic projects). Senior roles expect 12-25+ years of industry experience. What matters more than raw years is demonstrated impact: shipped models, experiments that changed decisions, or pipelines you built and maintained.
