Virtu Financial filed to go public in 2014 and disclosed that it had been profitable on all but one of 1,238 trading days, a span of roughly five years. Not most days. All but one. That's what a well-run market making operation looks like from the outside. What it looks like from the inside is a constant, microsecond-level fight against your own inventory.
Market making is simple to describe and brutal to execute. You post a price to buy and a price to sell simultaneously, collect the spread when both sides fill, and repeat millions of times a day across thousands of instruments. The spread is your revenue. The problem is that fills don't come in matched pairs. You buy 10,000 shares of SPY and the market drops before anyone buys them back from you. That spread you expected to earn is gone, and you're still holding the bag.
Every filled order creates a position, and every position is a liability until it's flat. The quoting logic, the inventory tracker, and the risk controls aren't separate concerns; they're a feedback loop that has to close in under a millisecond or you're quoting the wrong price into a moving market. Firms like Citadel, HRT, and Virtu expect you to reason about all three forces at once: spread capture (how you make money), inventory accumulation (how you take on risk), and adverse selection (how informed traders pick you off before you can react). Candidates who only describe the happy-path quoting loop don't make it far in these interviews.
Every market making cycle starts the same way: a market data update arrives. Maybe the best bid on SPY just moved up a tick. The fair value engine sees that, recomputes its theoretical price, and the quoting engine immediately cancels the old orders and publishes fresh ones. Bid at fair value minus half the spread. Ask at fair value plus half the spread. Done. The whole loop, from incoming market data to new quotes resting on the exchange, needs to complete in microseconds.
Think of it like a store that reprices its shelves every time the wholesale cost changes, except the shelves update thousands of times per second and the store is simultaneously buying and selling the same item.
In outline, the flow is: market data update, then fair value engine, then quoting engine, then fresh quotes resting on the exchange. That is the happy path. Here's where it gets interesting.
Every time one of your quotes gets hit, you own something you didn't own before (or you're short something). A seller aggressively hits your bid, and now you're net long. That position is risk. It earns you nothing until you unwind it, and if the price moves against you while you're holding it, it can erase hours of spread income in seconds.
The inventory tracker feeds this net position back into the quoting engine continuously. If you're long 10,000 shares, you shift both your bid and your ask down slightly. You're making your ask cheaper (more attractive to buyers) and your bid lower (less attractive to sellers, which discourages further buying on your part). Offsetting flow naturally comes to you and flattens your book. This is inventory skew, and it's the core feedback mechanism that keeps a market maker from drifting into dangerous one-sided exposure.
When interviewers ask "how do you compute your quote prices?", the wrong answer is "take the midpoint of the best bid and ask." That's a starting point, not a fair value model.
A real fair value engine layers in order book imbalance (if 80% of the visible depth is on the bid side, the instrument is probably going to tick up), recent trade flow (directional aggression in the last 50 milliseconds is a signal), and cross-asset correlations (ES futures moved, so SPY fair value just changed even before SPY's own book updates). The output is a theoretical price that represents where the instrument should be trading right now, not just where it last traded.
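As a rough illustration of the first layer, the sketch below computes a size-weighted mid (often called a microprice) from top-of-book data and leaves room for trade-flow and cross-asset terms. The names `TopOfBook`, `flow_weight`, and `beta`, and the linear form of the adjustments, are assumptions made for this example, not a description of any firm's model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopOfBook:
    bid_px: float
    bid_sz: float
    ask_px: float
    ask_sz: float


def mid_price(book: TopOfBook) -> float:
    """Naive starting point: midpoint of the best bid and ask."""
    return (book.bid_px + book.ask_px) / 2.0


def book_imbalance(book: TopOfBook) -> float:
    """Share of visible top-of-book size resting on the bid, in [0, 1]."""
    return book.bid_sz / (book.bid_sz + book.ask_sz)


def fair_value(book: TopOfBook, trade_flow: float = 0.0, hedge_move: float = 0.0,
               flow_weight: float = 0.0, beta: float = 0.0) -> float:
    # Size-weighted mid: heavy bid-side depth pulls the estimate toward the ask.
    imb = book_imbalance(book)
    micro = imb * book.ask_px + (1.0 - imb) * book.bid_px
    # Hypothetical linear adjustments for signed recent trade flow and for a
    # move in a correlated instrument (for example, a futures contract).
    return micro + flow_weight * trade_flow + beta * hedge_move


# 80% of the visible size is on the bid, so fair value sits above the mid.
book = TopOfBook(bid_px=99.99, bid_sz=800, ask_px=100.01, ask_sz=200)
print(mid_price(book), fair_value(book))
```

With the weights left at zero the function reduces to the size-weighted mid; in practice the weights would have to be fitted to data, which this sketch does not attempt.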
This matters in your interview because the sophistication of your fair value model is directly tied to your adverse selection risk. A naive mid-price model means informed traders know your fair value is stale. A richer model means you're harder to pick off.
Sitting above the quoting engine is a risk layer that can override everything. Position limits cap how much inventory you can accumulate in a single instrument. For options market makers, Greeks exposure (delta, gamma, vega) matters more than raw share count because a large gamma position can blow up violently if the underlying moves sharply. Real-time P&L monitoring watches for drawdown thresholds that trigger automatic spread widening or full quote withdrawal.
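One way to picture the override is as a small function the quoting engine consults on every cycle. The sketch below is a minimal version under stated assumptions: the thresholds, the `QuoteMode` names, and the rule that a full position only disables the side that would add to it are all illustrative choices, not taken from any production system.

```python
from __future__ import annotations

from enum import Enum


class QuoteMode(Enum):
    NORMAL = "normal"
    WIDEN = "widen"   # keep quoting, but with wider spreads
    PULL = "pull"     # withdraw all quotes


def drawdown_mode(drawdown: float, *, widen_at: float, pull_at: float) -> QuoteMode:
    """Map the current P&L drawdown (a positive number) to a quoting mode."""
    if drawdown >= pull_at:
        return QuoteMode.PULL
    if drawdown >= widen_at:
        return QuoteMode.WIDEN
    return QuoteMode.NORMAL


def allowed_sides(position: int, position_limit: int) -> tuple[bool, bool]:
    """Return (quote_bid, quote_ask); stop quoting the side that adds to a full position."""
    quote_bid = position < position_limit    # a bid fill makes us longer
    quote_ask = position > -position_limit   # an ask fill makes us shorter
    return quote_bid, quote_ask
```

For an options book the `position` input would be replaced by Greeks, as discussed later in the text.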
This isn't a safety net you bolt on at the end. It's a first-class part of the architecture, running in the same latency budget as the quoting engine itself.
The quoting engine, risk engine, and order management system all need to agree on the current inventory position at the same instant. If the OMS processes a fill confirmation 200 microseconds before the quoting engine sees it, you'll publish quotes that don't reflect your actual exposure. In a fast-moving market, that gap is enough to get picked off again on the same side.
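A simple guard against this kind of staleness is to stamp every fill with a sequence number and refuse to quote from a view that has not caught up. The sketch below assumes the OMS exposes a monotonically increasing fill sequence; `InventoryView` and `safe_to_quote` are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass
class InventoryView:
    """One component's local copy of the net position."""
    position: int = 0
    last_fill_seq: int = 0  # sequence number of the last fill applied

    def apply_fill(self, seq: int, signed_qty: int) -> None:
        # Fills must be applied in order with no gaps.
        if seq != self.last_fill_seq + 1:
            raise ValueError(f"fill gap: expected {self.last_fill_seq + 1}, got {seq}")
        self.position += signed_qty
        self.last_fill_seq = seq


def safe_to_quote(view: InventoryView, oms_fill_seq: int) -> bool:
    """Quote only if this view has applied every fill the OMS has confirmed."""
    return view.last_fill_seq == oms_fill_seq


view = InventoryView()
view.apply_fill(1, +500)        # bought 500
print(safe_to_quote(view, 1))   # view is current
print(safe_to_quote(view, 2))   # OMS has a fill this view has not seen yet
```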
This is why the architecture diagram isn't just a nice illustration. The connections between components, and the latency on each of those connections, are the engineering problem. Interviewers at firms like Citadel and Virtu care deeply about whether you understand that the strategy and the infrastructure are inseparable.
In an interview, you'll usually need to pick a specific approach. Here are the ones worth knowing.
This is the baseline. You compute a fair value for the instrument, apply a half-spread in each direction, and post a bid and ask simultaneously. If your fair value is $100.00 and your half-spread is $0.05, you're bidding $99.95 and offering $100.05. Every fill earns you the spread, assuming the market doesn't move against you before you can rebalance.
The spread itself isn't arbitrary. A good answer connects it to realized volatility: wider when the instrument is moving fast (more adverse selection risk), tighter when it's calm and you want to capture more flow. The system component doing this work is the Spread Calculator, sitting between the Fair Value Engine and the Order Publisher.
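The baseline can be sketched in a few lines. The example below assumes realized volatility is estimated as the standard deviation of recent mid-price changes and that the half-spread scales linearly with it above a floor; `vol_mult` and `floor` are illustrative parameters, and tick-size rounding is left out.

```python
from __future__ import annotations

import statistics


def realized_vol(mids: list[float]) -> float:
    """Standard deviation of consecutive mid-price changes."""
    changes = [b - a for a, b in zip(mids, mids[1:])]
    return statistics.pstdev(changes) if len(changes) >= 2 else 0.0


def vol_scaled_half_spread(mids: list[float], *, vol_mult: float = 2.0,
                           floor: float = 0.01) -> float:
    """Wider when the instrument is moving fast, never tighter than the floor."""
    return max(floor, vol_mult * realized_vol(mids))


def symmetric_quote(fair_value: float, half_spread: float) -> tuple[float, float]:
    """Return (bid, ask) centered on fair value."""
    return fair_value - half_spread, fair_value + half_spread


# The example from the text: fair value $100.00, half-spread $0.05.
bid, ask = symmetric_quote(100.00, 0.05)
print(round(bid, 2), round(ask, 2))
```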
When to reach for this: any time the interviewer asks you to describe the basic quoting loop, or when they want you to establish a baseline before layering in complexity.

Symmetric quoting assumes you're flat. You won't be. Every fill moves your position, and a growing long or short creates directional risk that can dwarf your spread income if the market moves against you. Inventory skew is how you fight back passively: you shift both your bid and ask in the direction that attracts offsetting flow. Long 500 shares? Lower your bid and ask slightly so buyers find your offer more attractive and sellers find your bid less so. You're not changing your spread width; you're moving the whole quote down to mean-revert your position.
The tradeoff is real and the interviewer will probe it. Skewing down when you're long means you sell at a lower price than a symmetric quote would get and you post a less competitive bid, so you give up edge on the side you're unloading and lose fills on the other side. You're explicitly trading expected P&L for risk reduction. The Inventory Tracker feeds a skew magnitude (proportional to your net exposure) into the Quoting Engine on every quote cycle. The larger the position, the more aggressive the skew.
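A minimal sketch of the skew calculation, assuming the skew is linear in net position and capped: a long position moves both quotes down (a cheaper ask attracts buyers, a lower bid discourages more buying), and a short position moves them up. `skew_per_share` and `max_skew` are hypothetical tuning parameters.

```python
from __future__ import annotations


def skewed_quote(fair_value: float, half_spread: float, position: int, *,
                 skew_per_share: float = 0.00001,
                 max_skew: float = 0.04) -> tuple[float, float]:
    """Return (bid, ask) shifted against the current position.

    The spread width is unchanged; only the quote center moves.
    """
    raw_skew = skew_per_share * position
    skew = max(-max_skew, min(max_skew, raw_skew))  # cap the shift in both directions
    center = fair_value - skew
    return center - half_spread, center + half_spread


flat = skewed_quote(100.00, 0.05, position=0)
long_500 = skewed_quote(100.00, 0.05, position=500)
print(flat, long_500)
```

The cap matters: without it, a very large position would push the quote so far from fair value that every fill on the unloading side would lose money outright.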
When to reach for this: whenever the interviewer introduces inventory accumulation as a problem, or asks how a market maker manages directional exposure without crossing the spread.

The most dangerous fills you'll take are from informed traders who know something you don't. They see a large order about to move the market, they hit your stale quote, and by the time you realize what happened, you're holding inventory at the wrong price. Adverse selection is the single biggest threat to a market maker's P&L, and the defense is speed.
The Trade Flow Analyzer watches for toxicity signals: one-sided aggression across recent prints, fill rate spiking on one side of your book, correlated moves across venues. When the toxicity score crosses a threshold, the Quote Guard fires a cancel instruction to the OMS. This path must complete in under one microsecond. That's not a goal, it's a hard requirement. If you're using a message bus between the Quote Guard and the OMS, you've already lost. Shared memory or a direct function call is the only architecture that works here. The cancel reaches the exchange before the informed trader's next order does, or it doesn't matter.
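The detection logic, as opposed to the cancel path, can be sketched as a rolling measure of one-sided aggression. The example below is only that logic: Python is nowhere near the latency budget described above, and the window length, threshold, and scoring formula are assumptions chosen for illustration.

```python
from collections import deque


class TradeFlowAnalyzer:
    """Scores how one-sided the most recent prints have been."""

    def __init__(self, window: int = 20, threshold: float = 0.8) -> None:
        self.signed_sizes = deque(maxlen=window)  # +size for buys, -size for sells
        self.threshold = threshold

    def on_trade(self, size: float, aggressor_is_buyer: bool) -> None:
        self.signed_sizes.append(size if aggressor_is_buyer else -size)

    def toxicity(self) -> float:
        """|net signed volume| / total volume: 0 is balanced, 1 is fully one-sided."""
        total = sum(abs(s) for s in self.signed_sizes)
        if total == 0:
            return 0.0
        return abs(sum(self.signed_sizes)) / total

    def should_pull(self) -> bool:
        # Wait for a full window so a single print cannot trigger a pull.
        window_full = len(self.signed_sizes) == self.signed_sizes.maxlen
        return window_full and self.toxicity() >= self.threshold
```

A production version would also look at fill rates on your own quotes and at moves on other venues, the other signals named above.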
When to reach for this: any time the interviewer asks about informed traders, quote protection, or what happens when someone consistently picks off your best prices.

Passive skewing works when the market will naturally bring you offsetting flow. Sometimes it won't, especially in a trending market or after a large one-sided fill. When your inventory exposure grows past the point where skewing can realistically mean-revert it within your risk tolerance, you stop waiting for the market to come to you and actively take liquidity in a correlated instrument.
The Hedge Selector identifies which instrument gives you the best risk reduction per unit of transaction cost. For an equity options market maker, that's usually the underlying stock or a closely correlated ETF. For a futures market maker, it might be a related contract on another exchange. The Execution Engine then aggressively crosses the spread in the hedge instrument, accepting the cost of taking liquidity in exchange for immediate risk reduction. The fill confirmation flows back to the Consolidated Risk View, which confirms your net exposure is back within limits before the quoting engine resumes normal operation.
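"Best risk reduction per unit of transaction cost" can be made concrete with a minimum-variance hedge. In the sketch below the hedge ratio is correlation times the ratio of volatilities, the fraction of variance removed is the squared correlation, and candidates are ranked by variance removed per dollar of hedging cost. The candidate names and numbers are made up, and the correlation and volatility inputs are assumed to be estimated elsewhere.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class HedgeCandidate:
    symbol: str
    correlation: float    # correlation with the position being hedged
    vol: float            # volatility of the hedge, in $ per unit
    cost_per_unit: float  # half-spread plus fees, in $ per unit traded


def hedge_ratio(position_vol: float, c: HedgeCandidate) -> float:
    """Minimum-variance hedge units per unit of position."""
    return c.correlation * position_vol / c.vol


def score(position_vol: float, c: HedgeCandidate) -> float:
    """Fraction of variance removed per dollar spent hedging one unit of position."""
    cost = abs(hedge_ratio(position_vol, c)) * c.cost_per_unit
    return 0.0 if cost == 0 else c.correlation ** 2 / cost


def select_hedge(position_vol: float, candidates: list[HedgeCandidate]) -> HedgeCandidate:
    return max(candidates, key=lambda c: score(position_vol, c))


def hedge_order_qty(position: float, position_vol: float, c: HedgeCandidate) -> float:
    """Signed quantity to trade in the hedge: opposite sign to a positively correlated position."""
    return -hedge_ratio(position_vol, c) * position
```

The weakness is the one the text goes on to describe: every quantity here depends on a correlation estimate that can break down exactly when the hedge is needed.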
This pattern is expensive. You're paying the spread in the hedge instrument, and if your correlation assumption breaks down during a stress event, you can end up with two bad positions instead of one. Interviewers at firms like Citadel will push on exactly this: what's your hedge ratio, how do you estimate it in real time, and what happens when it drifts?
When to reach for this: when the interviewer asks how you handle large inventory that passive skewing can't resolve, or when they introduce a scenario where the primary market has dried up and you can't attract offsetting flow.

| Pattern | Trigger | Latency Budget | Primary Cost |
|---|---|---|---|
| Symmetric Quoting | Baseline, flat inventory | ~10 microseconds | Adverse selection on fills |
| Inventory-Skewed Quoting | Growing net position | ~10 microseconds | Reduced fill rate, worse prices |
| Adverse Selection Detection | Toxic flow signal | Under 1 microsecond | Missed fills from over-pulling |
| Cross-Asset Hedging | Inventory exceeds passive skew capacity | Milliseconds acceptable | Transaction cost, correlation risk |
For most interview problems, you'll default to inventory-skewed quoting as the primary mechanism for managing position risk. Reach for adverse selection detection when the interviewer introduces informed traders or asks how you protect against being picked off. Cross-asset hedging is the answer when they push on what happens when passive approaches aren't enough and your risk limits are approaching.
These patterns run simultaneously in production, not sequentially. The interview signal that separates strong candidates is knowing the priority ordering when signals conflict: adverse selection detection always wins, because a toxic fill you can't cancel will cost more than any skew adjustment can recover.
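The priority ordering can be written down as a single decision function. This is a sketch of conflict resolution only, not of how the patterns are scheduled; `hedge_threshold` and the `Action` names are assumptions for this example.

```python
from enum import Enum


class Action(Enum):
    PULL_QUOTES = 1   # adverse selection detection
    HEDGE = 2         # cross-asset hedging
    SKEW = 3          # inventory-skewed quoting
    SYMMETRIC = 4     # baseline symmetric quoting


def choose_action(toxic: bool, position: int, *, hedge_threshold: int) -> Action:
    """Pick the dominant response when several signals fire at once."""
    if toxic:
        # Always wins: a toxic fill costs more than any skew can recover.
        return Action.PULL_QUOTES
    if abs(position) >= hedge_threshold:
        return Action.HEDGE
    if position != 0:
        return Action.SKEW
    return Action.SYMMETRIC
```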
Here's where candidates lose points, and it's almost always one of these.
A candidate will say something like: "To make more money, you just widen the spread. More revenue per fill." The interviewer nods, waits, and then asks a follow-up. The candidate has nothing.
Wider spreads don't just mean more revenue per fill. They mean fewer fills, because you're no longer competitive with other market makers. Worse, the fills you do get are increasingly from informed traders who are willing to cross your wide spread because they know something you don't. You've filtered out the noise and kept the toxic flow. That's not a business model, that's a slow bleed.
The optimal spread is a function of volatility, your adverse selection rate, and your inventory carrying cost. In a low-volatility regime, you tighten to stay competitive. When realized vol spikes, you widen to protect yourself from being run over. The spread is a risk parameter, not a revenue lever.
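A toy model makes the point numerically. Assume every fill earns the half-spread, and a fraction of fills come from informed traders, after which the price moves against you by a fixed amount. Both the model and the numbers below are illustrative assumptions, and inventory carrying cost is ignored.

```python
def expected_edge_per_fill(half_spread: float, toxic_fraction: float,
                           adverse_move: float) -> float:
    """Expected P&L per fill: spread earned minus expected adverse move."""
    return half_spread - toxic_fraction * adverse_move


def break_even_half_spread(toxic_fraction: float, adverse_move: float) -> float:
    """Half-spread at which expected edge per fill is zero."""
    return toxic_fraction * adverse_move


# Tight quote, mostly uninformed flow: 10% toxic fills, $0.10 adverse move.
tight = expected_edge_per_fill(0.02, toxic_fraction=0.10, adverse_move=0.10)
# Wide quote that has driven uninformed flow away: 50% toxic fills.
wide = expected_edge_per_fill(0.04, toxic_fraction=0.50, adverse_move=0.10)
print(tight, wide)
```

In this toy setup, doubling the half-spread turns a positive expected edge into a negative one because the mix of flow got worse faster than the spread got wider.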
Candidates who know about skewing often present it like this: "When I'm long, I just skew my quotes down to attract buyers. Problem solved." They stop there.
Skewing has a real cost that most candidates skip entirely. When you shift both your bid and ask downward to attract buyers, you're selling at a lower price than your fair value model says you should, so you earn less edge on every ask fill. Your lowered bid is also less competitive, so you give up spread-earning flow on that side. That's the tradeoff: you're spending P&L to reduce risk. It's often the right call, but it's a decision with a cost, not a magic fix.
There's also a second-order effect. If your skew is visible and predictable, sophisticated participants will learn to trade against it. They'll wait for your artificially cheap ask when you're long and lift it, pocketing the edge you gave up to get flat.
What to say instead: acknowledge the price concession and the lost flow explicitly. Then explain the decision boundary: passive skewing for small imbalances, switching to active hedging when the position exceeds a threshold where the skew cost exceeds the cost of crossing the spread in a hedge instrument.
This one is almost universal. A candidate walks through the quoting loop beautifully: market data comes in, fair value updates, spread gets applied, orders go out. Then the interviewer asks "what happens when an informed trader shows up?" and the candidate says "we'd widen the spread."
That's too slow. By the time you've detected the informed trader, computed a new spread, and sent a cancel-replace to the exchange, you've already been filled at a stale price. The cancel path isn't an afterthought. It's a first-class component of the system.
In production, quote pulling on a toxicity signal needs to happen in under a microsecond. That means hardware-level cancel triggers, pre-staged cancel messages sitting in kernel bypass buffers, and a Quote Guard that can fire without waiting for the main quoting loop to complete. The detection logic and the cancel mechanism are two separate systems, and the cancel side is often the harder engineering problem.
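The idea of pre-staging can be shown in miniature: build the cancel message when the order is sent, so that firing is nothing more than handing already-encoded bytes to the transport. The wire format, the `QuoteGuard` interface, and the `send` callback below are all hypothetical, and real systems do this in hardware or kernel-bypass code rather than Python.

```python
from __future__ import annotations

import struct
from typing import Callable

# Hypothetical wire format: 1-byte message type followed by an 8-byte order id.
CANCEL_FMT = "<cQ"


class QuoteGuard:
    """Keeps an encoded cancel ready for every live order."""

    def __init__(self, send: Callable[[bytes], None]) -> None:
        self._send = send
        self._staged: dict[int, bytes] = {}

    def on_order_sent(self, order_id: int) -> None:
        # Encode the cancel now, off the critical path.
        self._staged[order_id] = struct.pack(CANCEL_FMT, b"X", order_id)

    def on_order_done(self, order_id: int) -> None:
        # Filled or already cancelled: nothing left to pull.
        self._staged.pop(order_id, None)

    def fire(self) -> int:
        """Send every staged cancel without re-encoding; return how many were sent."""
        for msg in self._staged.values():
            self._send(msg)
        count = len(self._staged)
        self._staged.clear()
        return count
```

Note that `fire` does not depend on the quoting loop at all, which mirrors the separation between detection and cancel described above.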
Ask most candidates "how do you control risk?" and they'll say "position limits." That's fine for equities market making at a basic level. For options, it's almost meaningless.
A position limit tells you how many contracts you hold. It says nothing about your delta, your gamma exposure if the market moves two standard deviations, or your correlation to a broader market selloff. An options market maker can be flat in terms of raw contract count and still be catastrophically exposed if they're short gamma into a volatile open.
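A small worked example shows how a book can net to zero contracts and still carry large gamma. The sketch uses the standard Black-Scholes formulas for a European call's delta and gamma; the book itself (long deep in-the-money calls, short at-the-money calls), the volatility, and the expiry are made-up inputs.

```python
from __future__ import annotations

import math


def _norm_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_delta_gamma(spot: float, strike: float, vol: float, t_years: float,
                     rate: float = 0.0) -> tuple[float, float]:
    """Black-Scholes delta and gamma of a European call, per share."""
    sqrt_t = math.sqrt(t_years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t_years) / (vol * sqrt_t)
    return _norm_cdf(d1), _norm_pdf(d1) / (spot * vol * sqrt_t)


def book_greeks(book: list[tuple[int, float]], spot: float, vol: float,
                t_years: float) -> tuple[int, float, float]:
    """Return (net contracts, net delta, net gamma) for (quantity, strike) call positions."""
    net_qty, net_delta, net_gamma = 0, 0.0, 0.0
    for qty, strike in book:
        delta, gamma = call_delta_gamma(spot, strike, vol, t_years)
        net_qty += qty
        net_delta += qty * delta
        net_gamma += qty * gamma
    return net_qty, net_delta, net_gamma


# Long 100 deep in-the-money calls, short 100 at-the-money calls, one week to expiry.
book = [(+100, 80.0), (-100, 100.0)]
print(book_greeks(book, spot=100.0, vol=0.20, t_years=5 / 252))
```

The contract count nets to zero, yet the book is long delta and meaningfully short gamma, which is exactly the exposure a raw position limit cannot see. Greeks here are per share; multiply by the contract multiplier for notional figures.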
Risk limits at a serious shop are expressed in Greeks and scenario P&L. You care about delta-adjusted notional, gamma per unit of vol move, and what your book looks like if the underlying gaps 5% overnight. Position count is an input to those calculations, not a substitute for them.
If you're interviewing for an options market making role and you only mention position limits, the interviewer will assume you've never thought about a book with real convexity in it.
Instead, describe a layered risk framework: delta limits for directional exposure, gamma limits for convexity risk, and scenario P&L thresholds that trigger quote pulling or hedging regardless of whether any individual position limit is breached.
Listen for these cues. They're your signal to shift into market making territory:
If you're interviewing at Citadel, Virtu, HRT, or Jump and the conversation touches liquidity provision at all, these concepts belong in your answer.
This first exchange covers the core tension. Notice how the candidate doesn't stop at "the spread."
This second exchange covers the adverse selection probe. The key is to describe the signals before you describe the response.
"How wide should your spread be?" The optimal spread is a function of volatility and adverse selection probability, not a fixed number. In practice, it's calibrated so that spread revenue exceeds expected inventory losses over a large sample of trades.
"What's the difference between a position limit and a risk limit?" A position limit is a raw count of shares or contracts. A risk limit incorporates Greeks, correlation across instruments, and scenario P&L. For options market makers especially, delta and gamma exposure matter far more than raw position size.
"How does the quoting engine get inventory updates fast enough?" Shared memory between the OMS and quoting engine, not a message bus. Any network hop adds microseconds you can't afford when inventory state needs to be current before the next quote goes out.
"What happens to your quoting during a flash crash?" Spreads widen automatically as volatility signals spike, and quotes may be pulled entirely if P&L-at-risk limits are breached. The system should have a safe mode that reduces size and widens spreads rather than going completely dark, both for risk reasons and to avoid regulatory scrutiny.