Market Making Strategies and Inventory Risk

Virtu Financial filed to go public in 2014 and disclosed that it had recorded just one losing trading day in roughly the previous five years (1,238 trading days). Not most days. All but one. That's what a well-run market making operation looks like from the outside. What it looks like from the inside is a constant, microsecond-level fight against your own inventory.

Market making is simple to describe and brutal to execute. You post a price to buy and a price to sell simultaneously, collect the spread when both sides fill, and repeat millions of times a day across thousands of instruments. The spread is your revenue. The problem is that fills don't come in matched pairs. You buy 10,000 shares of SPY and the market drops before anyone buys them back from you. The spread you expected to earn is gone, and you're still holding the bag.

Every filled order creates a position, and every position is a liability until it's flat. The quoting logic, the inventory tracker, and the risk controls aren't separate concerns; they're a feedback loop that has to close in under a millisecond or you're quoting the wrong price into a moving market. Firms like Citadel, HRT, and Virtu expect you to reason about all three forces at once: spread capture (how you make money), inventory accumulation (how you take on risk), and adverse selection (how informed traders pick you off before you can react). Candidates who only describe the happy-path quoting loop don't make it far in these interviews.

How It Works

Every market making cycle starts the same way: a market data update arrives. Maybe the best bid on SPY just moved up a tick. The fair value engine sees that, recomputes its theoretical price, and the quoting engine immediately cancels the old orders and publishes fresh ones. Bid at fair value minus half the spread. Ask at fair value plus half the spread. Done. The whole loop, from incoming market data to new quotes resting on the exchange, needs to complete in microseconds.

Think of it like a store that reprices its shelves every time the wholesale cost changes, except the shelves update thousands of times per second and the store is simultaneously buying and selling the same item.

Here's what that flow looks like:

[Diagram: Market Making Core Loop: Quote, Fill, Skew]

The Inventory Feedback Loop

The diagram shows the happy path. Here's where it gets interesting.

Every time one of your quotes gets filled, you own something you didn't own before (or you're short something). A seller aggressively hits your bid, and now you're net long. That position is risk. It earns you nothing until you unwind it, and if the price moves against you while you're holding it, it can erase hours of spread income in seconds.
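To make the bookkeeping concrete, here is a minimal sketch of the position tracking described above. The `InventoryTracker` class and its `on_fill` method are hypothetical names chosen for illustration, and a real system would track this per instrument, in lower-level code.

```python
from dataclasses import dataclass


@dataclass
class InventoryTracker:
    """Tracks signed net position: positive = long, negative = short."""
    net_position: int = 0

    def on_fill(self, side: str, quantity: int) -> int:
        # `side` is the side of OUR resting quote that was filled:
        #   "bid" filled -> we bought -> position increases
        #   "ask" filled -> we sold   -> position decreases
        if side == "bid":
            self.net_position += quantity
        elif side == "ask":
            self.net_position -= quantity
        else:
            raise ValueError(f"unknown side: {side!r}")
        return self.net_position


tracker = InventoryTracker()
tracker.on_fill("bid", 10_000)  # a seller hit our bid: we are long 10,000
tracker.on_fill("ask", 4_000)   # a buyer lifted our ask: we are long 6,000
print(tracker.net_position)
```

The signed number this produces is the input that every later quoting decision depends on.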

The inventory tracker feeds this net position back into the quoting engine continuously. If you're long 10,000 shares, you shift both your bid and your ask down slightly. You're making your ask cheaper (more attractive to buyers, who take inventory off your hands) and your bid lower (less attractive to sellers, so you accumulate less). Order flow then tends to flatten your book. This is inventory skew, and it's the core feedback mechanism that keeps a market maker from drifting into dangerous one-sided exposure.

⚠️Common mistake
Candidates describe quoting as if it's stateless. It isn't. Every fill changes the inputs to the next quote. If you describe the quoting loop without mentioning inventory feedback, the interviewer will probe until you get there, and it's better to lead with it.

Fair Value Isn't Just the Mid-Price

When interviewers ask "how do you compute your quote prices?", the wrong answer is "take the midpoint of the best bid and ask." That's a starting point, not a fair value model.

A real fair value engine layers in order book imbalance (if 80% of the visible depth is on the bid side, the instrument is probably going to tick up), recent trade flow (directional aggression in the last 50 milliseconds is a signal), and cross-asset correlations (ES futures moved, so SPY fair value just changed even before SPY's own book updates). The output is a theoretical price that represents where the instrument should be trading right now, not just where it last traded.
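As a rough illustration of the first of those inputs, the sketch below nudges the mid-price by top-of-book imbalance. The function name, the linear form, and the `imbalance_weight` parameter are all assumptions made for this example; trade flow and cross-asset signals are left out, and a production model would be calibrated rather than hand-set.

```python
def fair_value(best_bid: float, best_ask: float,
               bid_size: int, ask_size: int,
               imbalance_weight: float = 0.5) -> float:
    """Mid-price nudged toward the side with less resting depth."""
    if bid_size < 0 or ask_size < 0 or bid_size + ask_size == 0:
        raise ValueError("need non-negative sizes with some visible depth")
    mid = (best_bid + best_ask) / 2.0
    half_spread = (best_ask - best_bid) / 2.0
    # Imbalance lies in [-1, 1]: +1 means all depth is on the bid (buy pressure).
    imbalance = (bid_size - ask_size) / (bid_size + ask_size)
    return mid + imbalance_weight * imbalance * half_spread


# 80% of visible depth on the bid: fair value sits above the mid of 100.01.
print(fair_value(100.00, 100.02, bid_size=800, ask_size=200))
```

With balanced depth the function returns the plain mid, which is the "starting point" the text describes.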

This matters in your interview because the sophistication of your fair value model is directly tied to your adverse selection risk. A naive mid-price model means informed traders know your fair value is stale. A richer model means you're harder to pick off.

The Risk Layer Above Everything

Sitting above the quoting engine is a risk layer that can override everything. Position limits cap how much inventory you can accumulate in a single instrument. For options market makers, Greeks exposure (delta, gamma, vega) matters more than raw share count because a large short-gamma position can blow up violently if the underlying moves sharply. Real-time P&L monitoring watches for drawdown thresholds that trigger automatic spread widening or full quote withdrawal.

This isn't a safety net you bolt on at the end. It's a first-class part of the architecture, running in the same latency budget as the quoting engine itself.
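A simplified sketch of that override logic might look like the following. The `RiskAction` enum, the `risk_check` function, and every threshold value are hypothetical; in particular, pulling all quotes on a position-limit breach is a simplification, since a real system would more likely pull only the side that adds to the position.

```python
from enum import Enum


class RiskAction(Enum):
    NORMAL = "normal"
    WIDEN = "widen_spreads"
    PULL = "pull_quotes"


def risk_check(net_position: int, drawdown: float,
               position_limit: int = 50_000,
               widen_drawdown: float = 25_000.0,
               pull_drawdown: float = 100_000.0) -> RiskAction:
    """Map current inventory and drawdown to an override for the quoting engine."""
    # Hard stops first: a breached position limit or a deep drawdown pulls quotes.
    if abs(net_position) >= position_limit or drawdown >= pull_drawdown:
        return RiskAction.PULL
    # A shallower drawdown only widens spreads.
    if drawdown >= widen_drawdown:
        return RiskAction.WIDEN
    return RiskAction.NORMAL


print(risk_check(net_position=10_000, drawdown=30_000.0))
```

The quoting engine would consult this result on every cycle, which is why it has to live inside the same latency budget.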

🔑Key insight
The risk layer communicates back to the quoting engine via shared memory, not a message bus. A 500-microsecond round-trip on a risk update means you're quoting stale inventory state into a live market. At HFT scale, that's not a minor inefficiency; it's a systematic source of losses.

Why State Consistency Is the Real Engineering Problem

The quoting engine, risk engine, and order management system all need to agree on the current inventory position at the same instant. If the OMS processes a fill confirmation 200 microseconds before the quoting engine sees it, you'll publish quotes that don't reflect your actual exposure. In a fast-moving market, that gap is enough to get picked off again on the same side.
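One possible guard against that gap, sketched here as an assumption rather than a description of any firm's design, is to stamp each inventory snapshot with the sequence number of the last fill it includes and refuse to quote from a snapshot that lags the OMS. `InventorySnapshot` and `safe_to_quote` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventorySnapshot:
    fill_seq: int      # sequence number of the last fill folded into this snapshot
    net_position: int


def safe_to_quote(snapshot: InventorySnapshot, last_oms_fill_seq: int) -> bool:
    """Quote only if the snapshot reflects every fill the OMS has confirmed."""
    return snapshot.fill_seq >= last_oms_fill_seq


snapshot = InventorySnapshot(fill_seq=41, net_position=10_000)
print(safe_to_quote(snapshot, last_oms_fill_seq=42))  # OMS is one fill ahead
print(safe_to_quote(snapshot, last_oms_fill_seq=41))  # snapshot is current
```

The check itself is trivial; the hard part is making both sequence numbers visible to the quoting engine without adding latency.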

This is why the architecture diagram isn't just a nice illustration. The connections between components, and the latency on each of those connections, are the engineering problem. Interviewers at firms like Citadel and Virtu care deeply about whether you understand that the strategy and the infrastructure are inseparable.

⏱️Your 30-second explanation
"A market maker continuously posts a bid and an ask around a theoretical fair value. Every fill creates inventory exposure, so the system skews subsequent quotes to attract offsetting flow. A risk layer monitors position limits and Greeks in real time and can widen spreads or pull quotes entirely if thresholds are breached. The whole loop, from market data in to new quotes out, runs in microseconds, and every component shares inventory state with sub-millisecond consistency because a stale read means quoting the wrong price into a moving market."

Patterns You Need to Know

In an interview, you'll usually need to pick a specific approach. Here are the ones worth knowing.

Symmetric Quoting

This is the baseline. You compute a fair value for the instrument, apply a half-spread in each direction, and post a bid and ask simultaneously. If your fair value is $100.00 and your half-spread is $0.05, you're bidding $99.95 and offering $100.05. Every fill earns you the spread, assuming the market doesn't move against you before you can rebalance.

The spread itself isn't arbitrary. A good answer connects it to realized volatility: wider when the instrument is moving fast (more adverse selection risk), tighter when it's calm and you want to capture more flow. The system component doing this work is the Spread Calculator, sitting between the Fair Value Engine and the Order Publisher.
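Here is a small sketch of the baseline, using integer cents to avoid floating-point price errors. The function names, the linear volatility scaling, and its coefficients are assumptions for illustration, not a calibrated model.

```python
from typing import Tuple


def half_spread_cents(realized_vol_bps: float,
                      base_cents: int = 2,
                      cents_per_vol_bp: float = 0.3,
                      max_cents: int = 50) -> int:
    """Half-spread widens linearly with realized volatility, up to a cap."""
    raw = base_cents + cents_per_vol_bp * realized_vol_bps
    return min(max_cents, max(base_cents, round(raw)))


def symmetric_quotes(fair_value_cents: int, half_spread: int) -> Tuple[int, int]:
    """Return (bid, ask) in integer cents, centered on fair value."""
    if half_spread <= 0:
        raise ValueError("half_spread must be positive")
    return fair_value_cents - half_spread, fair_value_cents + half_spread


# The example from the text: fair value $100.00, half-spread $0.05.
bid, ask = symmetric_quotes(10_000, 5)
print(bid, ask)  # prices in cents

# A calm market quotes tighter than a fast one.
print(half_spread_cents(0.0), half_spread_cents(10.0))
```

Keeping the spread calculation separate from the quote construction mirrors the Spread Calculator sitting between the Fair Value Engine and the Order Publisher.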

When to reach for this: any time the interviewer asks you to describe the basic quoting loop, or when they want you to establish a baseline before layering in complexity.

[Diagram: Pattern 1: Symmetric Quoting (Baseline)]

Inventory-Skewed Quoting

Symmetric quoting assumes you're flat. You won't be. Every fill moves your position, and a growing long or short creates directional risk that can dwarf your spread income if the market moves against you. Inventory skew is how you fight back passively: you shift both your bid and ask in the direction that attracts offsetting flow. Long 500 shares? Lower your bid and ask slightly so buyers find your offer more attractive and sellers find your bid less so. You're not changing your spread width; you're moving the whole quote down to mean-revert your position.

The tradeoff is real and the interviewer will probe it. Skewing means you sell closer to fair value on the side you're trying to unload, giving up edge on each of those fills, while your less competitive quote on the other side fills less often, so you complete fewer round trips. You're explicitly trading expected P&L for risk reduction. The Inventory Tracker feeds a skew magnitude (proportional to your net exposure) into the Quoting Engine on every quote cycle. The larger the position, the more aggressive the skew.
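A minimal sketch of proportional skew, again in integer cents: the function name, the `skew_cents_per_1000` coefficient, and the cap are illustrative assumptions. Note that the spread width stays constant; only the center of the quote moves.

```python
from typing import Tuple


def skewed_quotes(fair_value_cents: int, half_spread: int,
                  net_position: int,
                  skew_cents_per_1000: float = 1.0,
                  max_skew_cents: int = 10) -> Tuple[int, int]:
    """Shift both quotes against the current position; width is unchanged."""
    # Long inventory -> positive skew -> both quotes move DOWN.
    # Short inventory -> negative skew -> both quotes move UP.
    skew = round(skew_cents_per_1000 * net_position / 1000)
    skew = max(-max_skew_cents, min(max_skew_cents, skew))
    center = fair_value_cents - skew
    return center - half_spread, center + half_spread


print(skewed_quotes(10_000, 5, net_position=0))       # flat: symmetric
print(skewed_quotes(10_000, 5, net_position=3_000))   # long: quotes shift down
print(skewed_quotes(10_000, 5, net_position=-3_000))  # short: quotes shift up
```

The cap matters: once the skew saturates, passive quoting has done all it can, and that is the point where active hedging takes over.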

💡Interview tip
When asked "what happens when you accumulate a large long position?", don't just say "skew your quotes." Explain the fill rate tradeoff explicitly, then describe the threshold at which passive skewing isn't enough and you switch to active hedging. That progression is what separates a strong answer from a textbook one.

When to reach for this: whenever the interviewer introduces inventory accumulation as a problem, or asks how a market maker manages directional exposure without crossing the spread.

[Diagram: Pattern 2: Inventory-Skewed Quoting]

Adverse Selection Detection and Quote Pulling

The most dangerous fills you'll take are from informed traders who know something you don't. They see a large order about to move the market, they hit your stale quote, and by the time you realize what happened, you're holding inventory at the wrong price. Adverse selection is the single biggest threat to a market maker's P&L, and the defense is speed.

The Trade Flow Analyzer watches for toxicity signals: one-sided aggression across recent prints, fill rate spiking on one side of your book, correlated moves across venues. When the toxicity score crosses a threshold, the Quote Guard fires a cancel instruction to the OMS. This path must complete in under one microsecond. That's not a goal, it's a hard requirement. If you're using a message bus between the Quote Guard and the OMS, you've already lost. Shared memory or a direct function call is the only architecture that works here. The cancel reaches the exchange before the informed trader's next order does, or it doesn't matter.
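The scoring side of this can be sketched in a few lines. This is only an illustration of the logic: the `ToxicityMonitor` class, its one-sidedness score, and the window and threshold values are assumptions, and Python is nowhere near fast enough for the latency described above, where this check would sit in hardware or tightly optimized native code.

```python
from collections import deque


class ToxicityMonitor:
    """Scores one-sided aggression over the most recent fills against our quotes."""

    def __init__(self, window: int = 20, threshold: float = 0.8, min_fills: int = 5):
        # +qty when our ask was lifted, -qty when our bid was hit.
        self.signed_fills = deque(maxlen=window)
        self.threshold = threshold
        self.min_fills = min_fills

    def score(self) -> float:
        # |net signed volume| / total volume: 0 = balanced, 1 = entirely one-sided.
        total = sum(abs(q) for q in self.signed_fills)
        if total == 0:
            return 0.0
        return abs(sum(self.signed_fills)) / total

    def on_fill(self, side: str, quantity: int) -> bool:
        """Record a fill; return True if the Quote Guard should cancel."""
        self.signed_fills.append(quantity if side == "ask" else -quantity)
        if len(self.signed_fills) < self.min_fills:
            return False  # too few samples to call the flow toxic
        return self.score() >= self.threshold


monitor = ToxicityMonitor(window=6, threshold=0.8, min_fills=4)
balanced = [monitor.on_fill(side, 100) for side in ("ask", "bid", "ask", "bid")]
toxic = [monitor.on_fill("ask", 100) for _ in range(6)]
print(balanced, toxic)
```

In this run the signal fires only once the whole window is filled with hits on the ask, which shows the tradeoff in the threshold: set it too low and you over-pull, set it too high and you react late.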

⚠️Common mistake
Candidates describe the quoting logic in detail and skip this path entirely. In a production system, the ability to pull quotes is as important as the ability to post them. If you don't mention cancel-on-signal, the interviewer at HRT or Jump will notice.

When to reach for this: any time the interviewer asks about informed traders, quote protection, or what happens when someone consistently picks off your best prices.

[Diagram: Pattern 3: Adverse Selection Detection and Quote Pulling]

Cross-Asset Hedging

Passive skewing works when the market will naturally bring you offsetting flow. Sometimes it won't, especially in a trending market or after a large one-sided fill. When your inventory exposure grows past the point where skewing can realistically mean-revert it within your risk tolerance, you stop waiting for the market to come to you and actively take liquidity in a correlated instrument.

The Hedge Selector identifies which instrument gives you the best risk reduction per unit of transaction cost. For an equity options market maker, that's usually the underlying stock or a closely correlated ETF. For a futures market maker, it might be a related contract on another exchange. The Execution Engine then aggressively crosses the spread in the hedge instrument, accepting the cost of taking liquidity in exchange for immediate risk reduction. The fill confirmation flows back to the Consolidated Risk View, which confirms your net exposure is back within limits before the quoting engine resumes normal operation.
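One way to sketch the selection step is with the standard minimum-variance hedge, where the hedge ratio is correlation times the ratio of volatilities and the fraction of variance removed is correlation squared. The `HedgeCandidate` fields, the scoring rule, the symbols, and all numbers below are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HedgeCandidate:
    symbol: str
    correlation: float  # estimated correlation with the inventory's returns
    vol: float          # return volatility of the hedge instrument
    cost_bps: float     # cost of crossing the spread, in basis points (> 0)


def select_hedge(candidates: List[HedgeCandidate]) -> HedgeCandidate:
    """Pick the candidate that removes the most variance per basis point of cost."""
    # A minimum-variance hedge removes a fraction correlation**2 of the variance.
    return max(candidates, key=lambda c: c.correlation ** 2 / c.cost_bps)


def hedge_notional(inventory_notional: float, inventory_vol: float,
                   hedge: HedgeCandidate) -> float:
    """Signed notional to trade: -beta * inventory, beta = rho * vol_inv / vol_hedge."""
    beta = hedge.correlation * inventory_vol / hedge.vol
    return -beta * inventory_notional


candidates = [
    HedgeCandidate("HEDGE_A", correlation=0.98, vol=0.010, cost_bps=0.5),
    HedgeCandidate("HEDGE_B", correlation=0.99, vol=0.010, cost_bps=1.5),
]
best = select_hedge(candidates)
print(best.symbol, hedge_notional(1_000_000.0, 0.010, best))
```

The slightly less correlated instrument wins here because it is much cheaper to trade. The correlation input is also the fragile part of the sketch: if it is estimated from calm markets, it is the number most likely to be wrong in a stress event.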

This pattern is expensive. You're paying the spread in the hedge instrument, and if your correlation assumption breaks down during a stress event, you can end up with two bad positions instead of one. Interviewers at firms like Citadel will push on exactly this: what's your hedge ratio, how do you estimate it in real time, and what happens when it drifts?

When to reach for this: when the interviewer asks how you handle large inventory that passive skewing can't resolve, or when they introduce a scenario where the primary market has dried up and you can't attract offsetting flow.

[Diagram: Pattern 4: Cross-Asset Hedging to Offload Inventory]

Comparing the Four Patterns

Pattern | Trigger | Latency Budget | Primary Cost
Symmetric Quoting | Baseline, flat inventory | ~10 microseconds | Adverse selection on fills
Inventory-Skewed Quoting | Growing net position | ~10 microseconds | Reduced fill rate, worse prices
Adverse Selection Detection | Toxic flow signal | Under 1 microsecond | Missed fills from over-pulling
Cross-Asset Hedging | Inventory exceeds passive skew capacity | Milliseconds acceptable | Transaction cost, correlation risk

For most interview problems, you'll default to inventory-skewed quoting as the primary mechanism for managing position risk. Reach for adverse selection detection when the interviewer introduces informed traders or asks how you protect against being picked off. Cross-asset hedging is the answer when they push on what happens when passive approaches aren't enough and your risk limits are approaching.

These patterns run simultaneously in production, not sequentially. The interview signal that separates strong candidates is knowing the priority ordering when signals conflict: adverse selection detection always wins, because a toxic fill you can't cancel will cost more than any skew adjustment can recover.

What Trips People Up

Here's where candidates lose points, and it's almost always one of these.

The Mistake: Treating Spread Width as a Profit Dial

A candidate will say something like: "To make more money, you just widen the spread. More revenue per fill." The interviewer nods, waits, and then asks a follow-up. The candidate has nothing.

Wider spreads don't just mean more revenue per fill. They mean fewer fills, because you're no longer competitive with other market makers. Worse, the fills you do get are increasingly from informed traders who are willing to cross your wide spread because they know something you don't. You've filtered out the noise and kept the toxic flow. That's not a business model, that's a slow bleed.

The optimal spread is a function of volatility, your adverse selection rate, and your inventory carrying cost. In a low-volatility regime, you tighten to stay competitive. When realized vol spikes, you widen to protect yourself from being run over. The spread is a risk parameter, not a revenue lever.

💡Interview tip
When asked about spread setting, say something like: "The spread needs to cover expected adverse selection losses plus inventory carrying cost, while staying tight enough to attract uninformed flow. I'd model it as a function of realized volatility and order flow toxicity, not a fixed constant."
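A back-of-the-envelope version of that statement, under a deliberately simple model: every fill earns the half-spread relative to fair value, a fraction `p_informed` of fills are informed and are followed by an adverse move of `adverse_move`, and each fill carries a fixed inventory cost. The model and all parameter names are assumptions for illustration.

```python
def break_even_half_spread(p_informed: float, adverse_move: float,
                           carry_cost: float) -> float:
    """Smallest half-spread at which expected P&L per fill is zero.

    Expected P&L per fill = half_spread - p_informed * adverse_move - carry_cost
    """
    if not 0.0 <= p_informed <= 1.0:
        raise ValueError("p_informed must be a probability")
    return p_informed * adverse_move + carry_cost


# 20% informed flow, $0.10 adverse move, $0.005 carrying cost per share.
calm = break_even_half_spread(0.20, 0.10, 0.005)
# Same market when toxicity doubles: the break-even spread widens with it.
toxic = break_even_half_spread(0.40, 0.10, 0.005)
print(calm, toxic)
```

Anything you quote above the break-even is edge, but only while you stay competitive enough to keep attracting the uninformed flow that pays for it.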

The Mistake: Describing Inventory Skew as a Free Lunch

Candidates who know about skewing often present it like this: "When I'm long, I just skew my quotes down to attract buyers. Problem solved." They stop there.

Skewing has a real cost that most candidates skip entirely. When you shift both your bid and ask downward to attract buyers, you're selling at a lower ask, so you give up edge on every share you unload. Your bid also becomes less competitive, so its fill rate drops and you complete fewer round trips. That's the tradeoff: you're spending P&L to reduce risk. It's often the right call, but it's a decision with a cost, not a magic fix.

There's also a second-order effect. If your skew is visible and predictable, sophisticated participants will learn to trade against it. They can infer that you're long from your shifted quotes, lift your artificially cheap ask to capture the edge you gave up, or sell ahead of your unwind and push the price against the inventory you still hold.

⚠️Common mistake
Candidates describe skewing as the solution to inventory risk. The interviewer hears that you don't understand it's a tradeoff between P&L and risk reduction, not a free hedge.

What to say instead: acknowledge the fill rate penalty explicitly. Then explain the decision boundary: passive skewing for small imbalances, switching to active hedging when the position exceeds a threshold where the skew cost exceeds the cost of crossing the spread in a hedge instrument.


The Mistake: Ignoring the Cancel Path

This one is almost universal. A candidate walks through the quoting loop beautifully: market data comes in, fair value updates, spread gets applied, orders go out. Then the interviewer asks "what happens when an informed trader shows up?" and the candidate says "we'd widen the spread."

That's too slow. By the time you've detected the informed trader, computed a new spread, and sent a cancel-replace to the exchange, you've already been filled at a stale price. The cancel path isn't an afterthought. It's a first-class component of the system.

In production, quote pulling on a toxicity signal needs to happen in under a microsecond. That means hardware-level cancel triggers, pre-staged cancel messages sitting in kernel bypass buffers, and a Quote Guard that can fire without waiting for the main quoting loop to complete. The detection logic and the cancel mechanism are two separate systems, and the cancel side is often the harder engineering problem.

💡Interview tip
After describing your quoting logic, proactively say: "I'd also want to talk about the cancel path, because adverse selection protection is just as important as the quoting logic itself." That sentence alone signals you've thought about this at production depth.

The Mistake: Conflating Position Limits with Risk Limits

Ask most candidates "how do you control risk?" and they'll say "position limits." That's fine for equities market making at a basic level. For options, it's almost meaningless.

A position limit tells you how many contracts you hold. It says nothing about your delta, your gamma exposure if the market moves two standard deviations, or your correlation to a broader market selloff. An options market maker can be flat in terms of raw contract count and still be catastrophically exposed if they're short gamma into a volatile open.

Risk limits at a serious shop are expressed in Greeks and scenario P&L. You care about delta-adjusted notional, gamma per 1% move in the underlying, vega per vol point, and what your book looks like if the underlying gaps 5% overnight. Position count is an input to those calculations, not a substitute for them.

If you're interviewing for an options market making role and you only mention position limits, the interviewer will assume you've never thought about a book with real convexity in it.

⚠️Common mistake
Saying "we'd cap position size at X contracts" as your risk control answer. The interviewer at an options shop hears "this person doesn't understand Greeks exposure."

Instead, describe a layered risk framework: delta limits for directional exposure, gamma limits for convexity risk, and scenario P&L thresholds that trigger quote pulling or hedging regardless of whether any individual position limit is breached.
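The layered check can be sketched with the usual second-order (delta-gamma) approximation of P&L for a spot move. Function names, limits, and the shock set are illustrative assumptions; vega and higher-order terms are left out to keep the example short.

```python
from typing import List, Sequence


def scenario_pnl(delta: float, gamma: float, spot: float, move_pct: float) -> float:
    """Delta-gamma P&L estimate for a spot move of move_pct (e.g. -0.05 = -5%)."""
    d_spot = spot * move_pct
    return delta * d_spot + 0.5 * gamma * d_spot ** 2


def risk_breaches(delta: float, gamma: float, spot: float,
                  delta_limit: float, gamma_limit: float,
                  scenario_loss_limit: float,
                  shocks: Sequence[float] = (-0.05, 0.05)) -> List[str]:
    """Return the names of every limit layer the book currently breaches."""
    breaches = []
    if abs(delta) > delta_limit:
        breaches.append("delta")
    if abs(gamma) > gamma_limit:
        breaches.append("gamma")
    worst = min(scenario_pnl(delta, gamma, spot, s) for s in shocks)
    if -worst > scenario_loss_limit:
        breaches.append("scenario")
    return breaches


# Delta-flat but short gamma: no single Greek limit trips, the 5% gap scenario does.
print(risk_breaches(delta=0.0, gamma=-400.0, spot=100.0,
                    delta_limit=1_000.0, gamma_limit=500.0,
                    scenario_loss_limit=4_000.0))
```

The example book passes both Greek limits and still fails the scenario test, which is the case a pure position limit would miss.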

How to Talk About This in Your Interview

When to Bring It Up

Listen for these cues. They're your signal to shift into market making territory:

  • "How would you design a system that continuously quotes prices?" (obvious, but don't just describe the happy path)
  • "How does a market maker actually make money?" (this is an invitation to show depth, not just say "the spread")
  • "What happens when you take on too much inventory?" (they want to hear skew, hedging, and the latency constraints around both)
  • "How do you protect against informed traders?" (adverse selection probe; jump to toxicity signals before you mention cancels)
  • Any question about risk feedback loops, position limits, or latency budgets in a trading context

If you're interviewing at Citadel, Virtu, HRT, or Jump and the conversation touches liquidity provision at all, these concepts belong in your answer.


Sample Dialogue

This first exchange covers the core tension. Notice how the candidate doesn't stop at "the spread."

Interviewer: "Walk me through how a market maker actually makes money. It's always on both sides of the trade, so where does the profit come from?"
You: "The revenue source is the spread. You post a bid at 99.98 and an ask at 100.02, and if you get filled on both sides against uninformed flow, you pocket 4 cents per round trip. But that's the easy part. The real challenge is that every fill creates inventory. If I sell 10,000 shares and the market moves up, I'm short into a rising market and those spread gains disappear fast. So the quoting logic and the inventory management are inseparable."
Interviewer: "Okay, so what do you do when you start accumulating a large long position?"
You: "First response is passive skewing. I shift both my bid and ask down slightly, so I'm offering a better price to buyers and a worse price to sellers. That attracts offsetting buy flow against my ask without me having to cross the spread myself. The tradeoff is real though: I'm giving up edge on the sell side to reduce risk on the long side. If the position keeps growing and hits a threshold, passive skewing isn't fast enough. That's when I switch to active hedging, either in a correlated instrument or a futures contract, where I'm aggressively taking liquidity to flatten the delta quickly."
Interviewer: "How do you decide where that threshold is?"
You: "It's a function of volatility and the cost of hedging. If the instrument is moving 50 basis points a minute, my tolerance for a large long is much lower than in a calm market. In practice, the risk engine maintains a real-time P&L-at-risk estimate, and when that number crosses a pre-set limit, it triggers the active hedge path automatically. The quoting engine doesn't make that call; the risk manager does, and it feeds back into quoting as a hard constraint."
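
The threshold logic in that last answer can be sketched as follows. The P&L-at-risk formula (position value times per-minute volatility, scaled by the square root of the horizon and a sigma multiple) and every parameter name are assumptions for illustration, not a standard the dialogue prescribes.

```python
import math


def pnl_at_risk(net_position: int, price: float,
                vol_per_minute: float, horizon_minutes: float = 1.0,
                z: float = 2.0) -> float:
    """Approximate z-sigma loss on the position over the horizon, in dollars."""
    return (abs(net_position) * price * vol_per_minute
            * math.sqrt(horizon_minutes) * z)


def should_hedge(net_position: int, price: float,
                 vol_per_minute: float, limit: float) -> bool:
    """The risk engine, not the quoting engine, decides when to hedge actively."""
    return pnl_at_risk(net_position, price, vol_per_minute) > limit


# Same 10,000-share long at $100, same $5,000 limit, two volatility regimes.
print(should_hedge(10_000, 100.0, vol_per_minute=0.0005, limit=5_000.0))  # calm
print(should_hedge(10_000, 100.0, vol_per_minute=0.005, limit=5_000.0))   # 50 bps/min
```

The position is identical in both calls; only the volatility changes the decision, which is the point the candidate is making.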

This second exchange covers the adverse selection probe. The key is to describe the signals before you describe the response.

Interviewer: "How do you know when you're being picked off by an informed trader rather than just seeing normal flow?"
You: "A few signals. The most direct one is one-sided fill rate: if my ask keeps getting hit but my bid never trades, someone knows something I don't. You also watch order-to-trade ratio; informed traders tend to aggress rather than post and cancel. And cross-venue correlation is useful: if the same instrument is moving on another exchange before I see it on mine, that's a latency arbitrageur or an informed flow signal."
Interviewer: "And then what? You just cancel?"
You: "Cancel or widen, yes, but the timing is everything. The cancel has to reach the exchange before the informed trader's order does, which means the quote guard component needs to act in under a microsecond from signal detection. That's why this path can't go through a message bus. It has to be in-process, using shared memory, with the cancel logic sitting as close to the network card as possible. Some shops use kernel bypass here specifically for this path."
Interviewer: "Isn't there a regulatory issue with pulling quotes too aggressively?"
You: "Yes, and this is where it gets nuanced. Under MiFID II, registered market makers have quoting obligations during normal market conditions. Pulling quotes when you detect toxicity is generally defensible if you can show the market was disorderly or your risk limits were genuinely breached. But if you're pulling quotes opportunistically, just to avoid being on the wrong side of a move, that can create regulatory exposure. The system needs an audit trail that shows every cancel was triggered by a documented risk threshold, not a discretionary call."

Follow-Up Questions to Expect

"How wide should your spread be?" The optimal spread is a function of volatility and adverse selection probability, not a fixed number. In practice, it's calibrated so that spread revenue exceeds expected inventory losses over a large sample of trades.

"What's the difference between a position limit and a risk limit?" A position limit is a raw count of shares or contracts. A risk limit incorporates Greeks, correlation across instruments, and scenario P&L. For options market makers especially, delta and gamma exposure matter far more than raw position size.

"How does the quoting engine get inventory updates fast enough?" Shared memory between the OMS and quoting engine, not a message bus. Any network hop adds microseconds you can't afford when inventory state needs to be current before the next quote goes out.

"What happens to your quoting during a flash crash?" Spreads widen automatically as volatility signals spike, and quotes may be pulled entirely if P&L-at-risk limits are breached. The system should have a safe mode that reduces size and widens spreads rather than going completely dark, both for risk reasons and to avoid regulatory scrutiny.


What Separates Good from Great

  • A mid-level answer describes the quoting loop and mentions inventory skew. A senior answer connects every component choice to a latency budget and explains why the risk feedback path is architecturally separate from the quoting path.
  • Mid-level candidates treat adverse selection as an afterthought. Senior candidates describe the toxicity signal stack first, then the cancel mechanism, then the regulatory constraints on when pulling quotes is actually permissible.
  • The best answers acknowledge the tradeoffs explicitly: skewing costs edge, hedging costs transaction fees, widening spreads costs fill rate. Framing every decision as a tradeoff rather than a solution shows you've thought about this in production terms, not just theory.
🎯Key takeaway
Market making isn't just about quoting a spread; it's about managing the feedback loop between fills, inventory, and risk controls fast enough that stale state never reaches the quoting engine.