The Monty Hall Problem and Variants

Problem Statement

The Question

Interview question: You're a contestant on a game show. There are three doors. Behind one is a car; behind the other two are goats. You pick door 1. The host, who knows what's behind every door, opens door 3 to reveal a goat. He then offers you a choice: stick with door 1, or switch to door 2. What is the probability you win the car if you switch? If you stay? What should you do, and why?

This problem shows up constantly at Jane Street, Citadel, Two Sigma, and IMC, typically in first-round interviews for quantitative researcher and trading roles. It's a warm-up in the sense that the answer is well-known, but interviewers aren't asking because they want to hear "two-thirds." They're asking because they want to watch you reason under uncertainty, apply Bayes' theorem cleanly, and explain why a smart person's first instinct is wrong.

The skill being tested is Bayesian updating. Can you identify when an observation carries information, formalize that information mathematically, and update a prior correctly? That's the same reasoning that drives signal processing, options pricing, and market microstructure. The Monty Hall problem is just a clean, contained version of it.


Clarifying the Problem

Before computing anything, ask these questions. They aren't stalling tactics; they're how you demonstrate that you understand which assumptions are doing the mathematical work.

  1. Does the host always open a goat door, or could he open the car door by accident? Answer: The host always reveals a goat. He knows the layout and will never open the car door. This constraint is non-negotiable in the classic problem.
  2. Does the host always offer the switch, or only sometimes? Answer: The host always offers the switch, regardless of which door you originally picked. He never withholds the offer strategically.
  3. If the car is behind my chosen door, how does the host decide which goat door to open? Answer: Uniformly at random between the two remaining goat doors. This matters for the Bayesian calculation and becomes critical in variants.

These three answers together define what's sometimes called "the constrained host" model. Change any one of them and the answer changes. Interviewers will sometimes deliberately leave one ambiguous to see if you notice.


Key Observations

The host's action is not random. Monty cannot open your door, and he cannot reveal the car. That double constraint means his choice of which door to open is a function of where the car actually is. When the car is behind door 2, he is forced to open door 3. When the car is behind your door, he has a free choice. That asymmetry is where the information lives.

Your initial pick partitions the probability space into two unequal regions. Before Monty does anything, your door holds probability $P(C = 1) = \frac{1}{3}$ and the other two doors together hold $\frac{2}{3}$. Monty's reveal doesn't touch your door's probability. It just collapses the $\frac{2}{3}$ mass from two doors onto one.

The 50/50 intuition is wrong for a precise reason, not just "because math." Most people reason: "Two doors remain, one has the car, so it's 50/50." That argument would be correct if Monty had opened one of the other two doors uniformly at random, without knowing where the car is, and merely happened to reveal a goat. He didn't. His constrained behavior breaks the symmetry between the two remaining doors, and that's exactly what Bayes' theorem will quantify.

Let $C \in \{1, 2, 3\}$ denote the door hiding the car and $H \in \{1, 2, 3\}$ denote the door Monty opens. The quantities you need to compute are $P(\text{win} \mid \text{switch})$ and $P(\text{win} \mid \text{stay})$, which reduce to $P(C = 2 \mid H = 3)$ and $P(C = 1 \mid H = 3)$ respectively, given that you picked door 1 and Monty opened door 3.

💡Interview tip
Before writing a single equation, say out loud which technique you're reaching for. Something like: "I'll use Bayes' theorem, conditioning on the door Monty opened, because his action is informative and I need to update the prior on each door." That one sentence tells the interviewer you have a plan, not just a memorized answer.

Solution

Approach

Three approaches work here, and you should know all of them. The cleanest starting point is direct enumeration: just list every possible state of the world and count. Once you've built that intuition, you back it up with Bayes' theorem, which is what a quant interviewer actually wants to see. The Bayesian derivation makes explicit why switching works, not just that it works.

The key mathematical insight driving everything is that Monty's action is not random. He is a constrained agent, and constrained agents leak information. Bayes' theorem is the right tool for formalizing exactly how much information leaks, and in which direction.


Step-by-Step Derivation

Part 1: Direct Enumeration

This is your sanity check. Run through every equally likely scenario.

Setup: You pick door 1. The car is equally likely to be behind any door, so there are three cases, each with probability $\frac{1}{3}$.

| Car location | Monty opens | Stay wins? | Switch wins? |
|---|---|---|---|
| Door 1 | Door 2 or 3 (random) | Yes | No |
| Door 2 | Door 3 (forced) | No | Yes |
| Door 3 | Door 2 (forced) | No | Yes |

Counting directly:

$$P(\text{win} \mid \text{stay}) = \frac{1}{3}, \qquad P(\text{win} \mid \text{switch}) = \frac{2}{3}$$

Notice that when the car is behind door 1, Monty has a choice. That doesn't affect the win probabilities here, but it matters enormously in the Bayesian derivation below.
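
The table can also be checked mechanically. The sketch below enumerates every (car, opened door) pair with exact fractions, assuming the uniform prior and the uniform tie-break stated earlier; the function name `enumerate_outcomes` and its parameters are illustrative, not part of the original problem.

```python
from fractions import Fraction

def enumerate_outcomes(pick=1, doors=(1, 2, 3)):
    """Exact win probabilities for stay and switch, by listing every case."""
    stay = Fraction(0)
    switch = Fraction(0)
    for car in doors:
        p_car = Fraction(1, len(doors))  # uniform prior on the car's door
        # Doors Monty may open: not the contestant's pick, not the car.
        options = [d for d in doors if d != pick and d != car]
        for opened in options:
            p = p_car * Fraction(1, len(options))  # assumed uniform tie-break
            switched_to = next(d for d in doors if d not in (pick, opened))
            if pick == car:
                stay += p
            if switched_to == car:
                switch += p
    return stay, switch

stay, switch = enumerate_outcomes()
print(stay, switch)  # 1/3 2/3
```

Using `Fraction` rather than floats keeps the result exact, so it can be compared directly with the hand count.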


Part 2: Bayes' Theorem Derivation

This is the version that separates candidates. Work through it carefully.

1. Define events.

Let $C \in \{1, 2, 3\}$ be the door hiding the car, and let $H \in \{2, 3\}$ be the door Monty opens. You pick door 1. Suppose Monty opens door 3 (by symmetry, the analysis is identical if he opens door 2).

2. State the prior.

$$P(C = 1) = P(C = 2) = P(C = 3) = \frac{1}{3}$$

3. Compute the likelihoods $P(H = 3 \mid C = j)$ for each $j$.

This is where Monty's constraint does its work.

  • If $C = 1$: Monty can open door 2 or door 3 freely. Assuming he picks uniformly among available goat doors:

$$P(H = 3 \mid C = 1) = \frac{1}{2}$$

  • If $C = 2$: Monty cannot open door 2 (car) or door 1 (your pick), so he must open door 3:

$$P(H = 3 \mid C = 2) = 1$$

  • If $C = 3$: Monty cannot open door 3 (car) or door 1 (your pick), so he must open door 2 and can never open door 3:

$$P(H = 3 \mid C = 3) = 0$$

4. Compute the marginal probability $P(H = 3)$ via total probability.

$$P(H = 3) = \sum_{j=1}^{3} P(H = 3 \mid C = j) \cdot P(C = j)$$

$$= \frac{1}{2} \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} + 0 \cdot \frac{1}{3} = \frac{1}{6} + \frac{1}{3} = \frac{1}{2}$$

5. Apply Bayes' theorem to get the posterior for each door.

$$P(C = j \mid H = 3) = \frac{P(H = 3 \mid C = j) \cdot P(C = j)}{P(H = 3)}$$

For $j = 1$ (your door, staying):

$$P(C = 1 \mid H = 3) = \frac{\frac{1}{2} \cdot \frac{1}{3}}{\frac{1}{2}} = \frac{1}{3}$$

For $j = 2$ (the switch door):

$$P(C = 2 \mid H = 3) = \frac{1 \cdot \frac{1}{3}}{\frac{1}{2}} = \frac{2}{3}$$

For $j = 3$ (the opened door):

$$P(C = 3 \mid H = 3) = \frac{0 \cdot \frac{1}{3}}{\frac{1}{2}} = 0$$

6. Read off the decision.

Staying keeps you on door 1, which has posterior probability $\frac{1}{3}$. Switching moves you to door 2, which has posterior probability $\frac{2}{3}$. Switch.
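
For readers who want to check the arithmetic in steps 2 through 5, here is a minimal sketch that mirrors them with exact fractions. The variable names (`prior`, `likelihood_h3`, `marginal`, `posterior`) are chosen for this illustration only.

```python
from fractions import Fraction

# Step 2: prior over the car's location C (doors 1..3).
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Step 3: likelihoods P(H = 3 | C = j) for the constrained host,
# with the contestant on door 1 and a uniform tie-break when C = 1.
likelihood_h3 = {1: Fraction(1, 2), 2: Fraction(1), 3: Fraction(0)}

# Step 4: total probability, P(H = 3).
marginal = sum(likelihood_h3[j] * prior[j] for j in prior)

# Step 5: Bayes' theorem, door by door.
posterior = {j: likelihood_h3[j] * prior[j] / marginal for j in prior}

print(marginal)  # 1/2
for door, p in posterior.items():
    print(door, p)
# 1 1/3
# 2 2/3
# 3 0
```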


Part 3: The Symmetry Argument

When you picked door 1, the prior split was:

$$P(C = 1) = \frac{1}{3}, \qquad P(C \in \{2, 3\}) = \frac{2}{3}$$

Monty's reveal doesn't change the probability mass on your door. He can't open your door, he can't open the car door, and he breaks ties uniformly when he has a choice, so seeing door 3 opened carries zero information about whether door 1 has the car. Your door stays at $\frac{1}{3}$.

The $\frac{2}{3}$ that was spread across doors 2 and 3 now has nowhere to go except door 2, because door 3 has been eliminated. Switching captures the entire $\frac{2}{3}$.

This argument is fast and clean. Use it to build intuition, then back it up with the Bayesian derivation if the interviewer pushes.


The Answer

Answer: $P(\text{win} \mid \text{switch}) = \dfrac{2}{3}$, and $P(\text{win} \mid \text{stay}) = \dfrac{1}{3}$. You should always switch.

Verification

Simulation. Write this mentally (or in Python if asked):

```python
import random

def monty_hall(switch, trials=10_000):
    wins = 0
    for _ in range(trials):
        car = random.randint(0, 2)
        pick = random.randint(0, 2)
        # Monty opens a goat door that isn't the pick or the car
        available = [d for d in range(3) if d != pick and d != car]
        monty_opens = random.choice(available)
        if switch:
            # Switch to the door that is neither pick nor monty_opens
            new_pick = next(d for d in range(3) if d != pick and d != monty_opens)
            wins += (new_pick == car)
        else:
            wins += (pick == car)
    return wins / trials

# Expect ~0.333 and ~0.667
print(monty_hall(switch=False))
print(monty_hall(switch=True))
```

After 10,000 trials, you'll see the stay strategy hovering around 0.333 and switch around 0.667. Showing you'd run this check signals to the interviewer that you validate analytic results.

Boundary check. Push the problem to an extreme: 100 doors, you pick one, Monty opens 98 goat doors. Your door still has probability $\frac{1}{100}$, exactly as before the reveal. The one remaining door has probability $\frac{99}{100}$. Obviously switch. The extreme case makes the intuition undeniable, which is why it's a useful sanity check on the three-door result.
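
The 100-door case is easy to simulate as well. This is a rough sketch under the same constrained-host assumptions; `many_doors`, the trial count, and the seed are arbitrary choices for illustration.

```python
import random

def many_doors(n_doors=100, trials=20_000, seed=0):
    """Estimate the switch win rate when Monty leaves one other door closed."""
    rng = random.Random(seed)
    switch_wins = 0
    for _ in range(trials):
        car = rng.randrange(n_doors)
        pick = rng.randrange(n_doors)
        # Monty opens every door except the pick and one other. That other
        # door is the car if the pick was wrong, else a random goat door.
        if pick != car:
            other = car
        else:
            other = rng.choice([d for d in range(n_doors) if d != pick])
        switch_wins += (other == car)
    return switch_wins / trials

print(many_doors())  # should land close to 0.99
```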

Posteriors sum to 1. A quick internal check: $P(C=1 \mid H=3) + P(C=2 \mid H=3) + P(C=3 \mid H=3) = \frac{1}{3} + \frac{2}{3} + 0 = 1$. The posteriors are coherent.

[Figure: Monty Hall Probability Flow: Three-Door Sample Space]

🔑Key insight
Monty's constraint is the entire problem. When $C = 2$, he must open door 3, giving that event a likelihood of 1. When $C = 1$, he opens door 3 only half the time, giving it a likelihood of $\frac{1}{2}$. That 2:1 likelihood ratio is exactly what Bayes' theorem converts into the 2:1 posterior odds favoring the switch door. A random host who happened to reveal a goat would have equal likelihoods in both cases, and the posterior would be 50/50. The host's knowledge, not the reveal itself, is what moves the probability.
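
The 2:1 ratio can be written out in odds form. This is only a restatement of the Bayes computation from the derivation, using the same symbols:

$$\frac{P(C = 2 \mid H = 3)}{P(C = 1 \mid H = 3)} = \frac{P(H = 3 \mid C = 2)}{P(H = 3 \mid C = 1)} \cdot \frac{P(C = 2)}{P(C = 1)} = \frac{1}{\frac{1}{2}} \cdot \frac{\frac{1}{3}}{\frac{1}{3}} = 2$$

Posterior odds of 2:1 correspond to probabilities $\frac{2}{3}$ and $\frac{1}{3}$. If both likelihoods were $\frac{1}{2}$, the same expression would give odds of 1:1.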

Alternative Approaches

Approach 2: The N-Door Generalization

This is the approach that separates candidates who memorized the answer from candidates who actually understand it. Start with $N$ doors, one car, and a contestant who picks door 1. Monty then opens $k$ goat doors (never yours, never the car's). You switch to one of the remaining $N - 1 - k$ unopened doors.

The probability of winning by staying is still $\frac{1}{N}$, because your initial pick was uniformly random and Monty's constrained reveal gives you no information about your own door. The remaining $\frac{N-1}{N}$ is spread evenly, by symmetry, over the $N - 1 - k$ unopened doors you could switch to. Formally, for a switch to any one of them:

$$P(\text{win} \mid \text{switch}) = \frac{N-1}{N(N-1-k)}$$

Verify the classic case: $N = 3$, $k = 1$ gives $\frac{2}{3(1)} = \frac{2}{3}$. Correct.

The edge case is instructive. Set $k = N - 2$, meaning Monty opens every door except yours and one other. Then:

$$P(\text{win} \mid \text{switch}) = \frac{N-1}{N \cdot 1} = \frac{N-1}{N}$$

As $N$ grows large, this approaches 1. Monty has essentially pointed at the car. That limiting behavior is a great intuition pump: the more doors Monty eliminates, the more information he's forced to reveal, and the more valuable switching becomes.
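
A small helper makes it easy to spot-check the formula at different $N$ and $k$. The function name `switch_win_probability` and its argument names are illustrative; the body is just the formula above evaluated with exact fractions.

```python
from fractions import Fraction

def switch_win_probability(n_doors, k_opened):
    """P(win) when switching to one uniformly chosen unopened door."""
    if not 0 <= k_opened <= n_doors - 2:
        raise ValueError("Monty can open between 0 and N - 2 doors")
    return Fraction(n_doors - 1, n_doors * (n_doors - 1 - k_opened))

print(switch_win_probability(3, 1))     # 2/3
print(switch_win_probability(100, 98))  # 99/100
print(switch_win_probability(100, 1))   # 99/9800
```

The last line shows the other end of the range: with 100 doors and a single reveal, switching is only marginally better than the $\frac{1}{100}$ you get by staying.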

💡Interview tip
Interviewers at Jane Street and Citadel love this generalization. If you derive it unprompted, you signal that you're thinking about the structure of the problem, not just the answer. Mention it after your main derivation: "The same argument generalizes cleanly to $N$ doors."

Approach 3: Law of Total Probability (The Mechanical Method)

This approach is slower but more general. It's what you reach for when the host's behavior is non-uniform or partially specified.

Write out $P(\text{win} \mid \text{switch})$ by conditioning on where the car actually is. Suppose you picked door 1 and Monty opened door 3. You're considering switching to door 2.

$$P(\text{win} \mid \text{switch to 2}, H=3) = \frac{P(H=3 \mid C=2) \cdot P(C=2)}{P(H=3)}$$

Compute each piece:

  • $P(C=2) = \frac{1}{3}$
  • $P(H=3 \mid C=2) = 1$, because if the car is behind door 2, Monty must open door 3 (the only remaining goat door)
  • $P(H=3 \mid C=1) = \frac{1}{2}$, because Monty can open either door 2 or 3 when the car is behind your door
  • $P(H=3 \mid C=3) = 0$, Monty never opens the car door

So:

$$P(H=3) = \frac{1}{2} \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} + 0 \cdot \frac{1}{3} = \frac{1}{2}$$

$$P(\text{win} \mid \text{switch to 2}, H=3) = \frac{1 \cdot \frac{1}{3}}{\frac{1}{2}} = \frac{2}{3}$$

Same answer, more algebra. The payoff is that this machinery handles the "random Monty" variant immediately: if Monty picks uniformly at random and happens to open a goat door, $P(H=3 \mid C=1) = \frac{1}{2}$ and $P(H=3 \mid C=2) = \frac{1}{2}$ as well, so the posterior on door 2 becomes $\frac{1}{2}$. The host's constraint is exactly what breaks the symmetry in the original problem.
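
To illustrate how the mechanical method absorbs different host behaviors, the sketch below takes the two relevant likelihoods as inputs and returns the posterior on door 2. The function `switch_posterior` is hypothetical, and the third call (a knowing host who opens door 3 whenever he is allowed to) is an assumed non-uniform policy added purely for illustration.

```python
from fractions import Fraction

def switch_posterior(p_open3_given_c1, p_open3_given_c2):
    """P(C = 2 | H = 3) with a uniform prior and the contestant on door 1."""
    prior = Fraction(1, 3)
    # P(H = 3 | C = 3) is 0 whenever a goat is observed behind door 3.
    marginal = prior * (p_open3_given_c1 + p_open3_given_c2 + 0)
    return prior * p_open3_given_c2 / marginal

half = Fraction(1, 2)
print(switch_posterior(half, Fraction(1)))         # constrained host: 2/3
print(switch_posterior(half, half))                # random host: 1/2
print(switch_posterior(Fraction(1), Fraction(1)))  # always prefers door 3: 1/2
```

Only the likelihoods change between the three calls; the prior and the Bayes step are untouched.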

🔑Key insight
The reason the 50/50 intuition fails is that the two remaining doors are not exchangeable. Door 1 was chosen before Monty acted; door 2 was not. Exchangeability requires identical joint distributions, and these two doors have different histories relative to Monty's decision rule. The total probability approach makes this concrete: $P(H=3 \mid C=1) \neq P(H=3 \mid C=2)$, so the posterior weights are different.

Comparing Approaches

The enumeration argument (listing all cases) is fastest for convincing a skeptic. The symmetry argument ("your door holds 1/3, the other two hold 2/3 collectively, Monty collapses that mass") is most elegant and most memorable. The total probability / Bayes approach is the most generalizable, and it's the one you need when the problem gets twisted.

In a live interview, lead with symmetry because it's the clearest signal that you understand why the answer is what it is. Then offer the generalization to $N$ doors to show you can extend it. Keep the full Bayesian machinery in your back pocket for when the interviewer asks "what if Monty picks randomly?" That's when the mechanical method earns its keep.

⚠️Common mistake
Candidates who only know the enumeration argument get stuck the moment the interviewer changes the setup. If you can't re-derive from first principles using conditional probability, you're one follow-up question away from losing the thread.

Variations and Follow-Ups

The classic three-door problem is just the entry point. Every strong quant interviewer has a follow-up loaded and ready. Here's what they'll throw at you.


Variation 1: Random Monty (The 50/50 Case)

This is the most common follow-up, and it's a trap for candidates who think they understand the original problem.

Suppose Monty doesn't know where the car is. He picks one of the two remaining doors uniformly at random, and it happens to reveal a goat. Now what's the probability of winning if you switch?

Apply Bayes' theorem properly. Let $C$ be the car's location, and let $H = 3$ be the event that Monty opens door 3 and reveals a goat.

$$P(H=3 \mid C=1) = \frac{1}{2}, \quad P(H=3 \mid C=2) = \frac{1}{2}, \quad P(H=3 \mid C=3) = 0$$

Now compute the posterior on door 2 (the switch door):

$$P(C=2 \mid H=3) = \frac{P(H=3 \mid C=2) \cdot P(C=2)}{P(H=3)} = \frac{\frac{1}{2} \cdot \frac{1}{3}}{\frac{1}{2} \cdot \frac{1}{3} + \frac{1}{2} \cdot \frac{1}{3} + 0 \cdot \frac{1}{3}} = \frac{1}{2}$$

It genuinely is 50/50. Switching gives no advantage.

The intuition: constrained Monty is forced to reveal information about where the car is not. Random Monty's reveal is weaker evidence because he was equally likely to open door 3 and find a goat whether the car is behind door 1 or door 2. The two remaining doors become symmetric, and your posterior is uniform over them.
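
A simulation makes the conditioning explicit. In this sketch the host opens one of the two other doors at random, and rounds where he exposes the car are thrown away, which is what "it happens to reveal a goat" means. The name `random_monty`, the trial count, and the seed are illustrative choices.

```python
import random

def random_monty(trials=60_000, seed=0):
    """Switch win rate under an ignorant host, given that a goat was shown."""
    rng = random.Random(seed)
    kept = 0
    switch_wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # Ignorant host: opens one of the two other doors uniformly at random.
        opened = rng.choice([d for d in range(3) if d != pick])
        if opened == car:
            continue  # car revealed: discard, we condition on seeing a goat
        kept += 1
        remaining = next(d for d in range(3) if d not in (pick, opened))
        switch_wins += (remaining == car)
    return switch_wins / kept

print(random_monty())  # should land close to 0.5
```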

🔑Key insight
The host's selection rule is the entire mechanism. Same observation, different host behavior, different posterior. This is a clean illustration of why likelihood functions matter in Bayesian inference.

Variation 2: N Doors, Monty Opens k

Generalize to $N$ doors, where you pick one, Monty opens $k$ goat doors (with $1 \leq k \leq N-2$), and you decide whether to switch to one of the $N - 1 - k$ remaining doors.

By the same symmetry argument: your chosen door retains its prior probability $\frac{1}{N}$. The remaining $N - 1 - k$ doors share the probability mass that was originally spread over $N - 1$ doors. By symmetry among those doors, each one holds:

$$P(\text{win} \mid \text{switch to a specific door}) = \frac{N-1}{N(N-1-k)}$$

Verify this against the original: $N = 3$, $k = 1$ gives $\frac{2}{3 \cdot 1} = \frac{2}{3}$. Correct.

Two edge cases worth knowing cold:

Monty opens all but one other door ($k = N - 2$): switching gives $\frac{N-1}{N}$. As $N$ grows, this approaches 1. With a million doors, Monty opening 999,998 of them and leaving one closed is essentially pointing at the car.

$k = 0$ (Monty opens nothing, which relaxes the $k \geq 1$ bound above): switching gives $\frac{N-1}{N(N-1)} = \frac{1}{N}$, the same as staying. No information, no advantage. Makes sense.
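
The general formula can be checked by simulation for any valid $N$ and $k$. This is a sketch under the constrained-host assumptions, with the contestant switching to a uniformly chosen unopened door; `simulate_switch` and its defaults are illustrative.

```python
import random

def simulate_switch(n_doors, k_opened, trials=50_000, seed=0):
    """Monte Carlo estimate of P(win | switch) with k goat doors opened."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(n_doors)
        pick = rng.randrange(n_doors)
        # Monty opens k goat doors, never the pick and never the car.
        goats = [d for d in range(n_doors) if d != pick and d != car]
        opened = set(rng.sample(goats, k_opened))
        # Switch to one of the N - 1 - k doors that are still closed.
        closed = [d for d in range(n_doors) if d != pick and d not in opened]
        wins += (rng.choice(closed) == car)
    return wins / trials

n, k = 5, 2
formula = (n - 1) / (n * (n - 1 - k))  # 0.4 for N = 5, k = 2
print(simulate_switch(n, k), formula)  # the estimate should be close to 0.4
```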

⚠️Common mistake
Candidates sometimes write $P(\text{win} \mid \text{switch}) = \frac{N-1}{N}$ without checking how many doors Monty opened. That formula holds only when $k = N - 2$, so that exactly one alternative door remains. For smaller $k$, the $\frac{N-1}{N}$ mass is split across $N - 1 - k$ doors. Get the $k$ parameter right before writing anything down.

Variation 3: Adversarial Monty

This one shows up at Jane Street and similar shops. It's less about computation and more about reasoning under strategic uncertainty.

Suppose Monty is adversarial: he wants you to lose, and he only offers you the switch when you've already chosen the car. When you've chosen a goat, he either doesn't offer the switch or the problem ends differently.

Under this policy:

$$P(\text{win} \mid \text{switch offered, you switch}) = 0$$

Monty only reveals a goat and offers the switch when switching would cost you the car. So switching is strictly dominated.

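The deeper question the interviewer is probing: what should you do when you don't know Monty's policy? A strict minimax player stays, since staying wins with probability $\frac{1}{3}$ against either host while switching can be driven to 0. A Bayesian player weighs the two policies instead. If you assign prior probability $p$ to Monty being adversarial and $(1-p)$ to Monty being constrained, the ex-ante win probability of the policy "switch whenever offered" is: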

$$E[\text{win} \mid \text{switch}] = p \cdot 0 + (1-p) \cdot \frac{2}{3} = \frac{2(1-p)}{3}$$

Switching beats staying ($\frac{1}{3}$) when $\frac{2(1-p)}{3} > \frac{1}{3}$, i.e., when $p < \frac{1}{2}$. If you think Monty is more likely benign than adversarial, switch. If you think he's probably adversarial, stay.
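
The threshold can be tabulated directly. The sketch below compares two policies, "switch whenever offered" and "always stay", before the game starts. It assumes that when an adversarial Monty makes no offer you simply keep your original door; `policy_values` is a name made up for this illustration.

```python
from fractions import Fraction

def policy_values(p_adversarial):
    """Ex-ante win probabilities for 'switch if offered' vs 'always stay'."""
    p = Fraction(p_adversarial)
    # Adversarial host: the offer appears only when the first pick is the
    # car, so switching always loses. Constrained host: switching wins 2/3.
    switch = p * 0 + (1 - p) * Fraction(2, 3)
    # Staying wins exactly when the first pick was right, under either host.
    stay = p * Fraction(1, 3) + (1 - p) * Fraction(1, 3)
    return switch, stay

for p in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    switch, stay = policy_values(p)
    verdict = "switch" if switch > stay else "stay or indifferent"
    print(p, switch, stay, verdict)
```

At $p = \frac{1}{2}$ the two policies tie at $\frac{1}{3}$, which matches the threshold derived above.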

This connects directly to how quant traders think about informed counterparties. The selection rule of whoever is offering you a trade carries information about whether the trade is good for you.


Variation 4: The Repeated Game

You play $M$ independent rounds of Monty Hall. Each round you win \$1 for a car and \$0 for a goat. You must commit to a fixed strategy (always switch, always stay, or randomize) before the game starts.

The single-round analysis extends trivially here. Since rounds are independent and identically distributed, linearity of expectation gives expected total winnings of $\frac{2M}{3}$ for always switching versus $\frac{M}{3}$ for always staying (the law of large numbers then says realized winnings concentrate near those values as $M$ grows). No dynamic programming needed; the optimal per-round strategy is globally optimal.

The interesting wrinkle comes if you introduce a cost to switching (say, $c$ dollars per switch). Then switching is optimal when $\frac{2}{3} - c > \frac{1}{3}$, i.e., $c < \frac{1}{3}$. Interviewers sometimes add this to test whether you can translate a probability result into a decision threshold.
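
As a worked example of the decision threshold, the sketch below computes expected net winnings over $M$ rounds with a \$1 prize and a per-switch cost $c$. The function `expected_net_winnings` and the choice of 30 rounds are illustrative.

```python
from fractions import Fraction

def expected_net_winnings(rounds, switch, cost=Fraction(0)):
    """Expected total payoff over independent rounds with a $1 prize."""
    win_prob = Fraction(2, 3) if switch else Fraction(1, 3)
    # The cost is paid only in rounds where the contestant switches.
    per_round = win_prob - (cost if switch else 0)
    return rounds * per_round

print(expected_net_winnings(30, switch=True))                       # 20
print(expected_net_winnings(30, switch=False))                      # 10
print(expected_net_winnings(30, switch=True, cost=Fraction(1, 4)))  # 25/2
print(expected_net_winnings(30, switch=True, cost=Fraction(1, 3)))  # 10
```

At $c = \frac{1}{3}$ the switching policy drops to the same expected value as staying, which is the break-even point in the inequality above.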


Common Follow-Up Questions

"What if Monty sometimes forgets which door hides the car?" This is the random Monty variant in disguise. Condition on the event that a goat was revealed and re-derive the posterior. If he always opens at random, you get 50/50; if he forgets only some of the time, the switch posterior lands between $\frac{1}{2}$ and $\frac{2}{3}$, depending on how often. The key phrase to say out loud: "I need to condition on what was actually observed, not just that a door was opened."

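For the forgetful-host question, one way to make "sometimes" precise is to assume Monty forgets with probability $q$ and, when he does, opens one of the two other doors uniformly at random. That model, the parameter $q$, and the function name `switch_posterior_forgetful` are assumptions for this sketch, not part of the question as posed.

```python
from fractions import Fraction

def switch_posterior_forgetful(q_forget):
    """P(C = 2 | door 3 opened, goat seen) when the host forgets w.p. q."""
    q = Fraction(q_forget)
    half = Fraction(1, 2)
    # Car behind door 1: door 3 is opened half the time either way.
    like_c1 = (1 - q) * half + q * half
    # Car behind door 2: forced when he remembers, a coin flip when he forgets.
    like_c2 = (1 - q) * 1 + q * half
    # Car behind door 3 contributes 0: a goat was observed behind door 3.
    # The uniform prior cancels in the ratio.
    return like_c2 / (like_c1 + like_c2)

print(switch_posterior_forgetful(0))               # 2/3
print(switch_posterior_forgetful(1))               # 1/2
print(switch_posterior_forgetful(Fraction(1, 2)))  # 3/5
```

Under this model the posterior simplifies to $\frac{2-q}{3-q}$, which recovers the classic answer at $q = 0$ and the random-Monty answer at $q = 1$.
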
"Does it matter which door you originally picked?" No. By symmetry, the problem is identical regardless of which door you choose first. The labels are arbitrary. A candidate who re-derives everything for door 2 instead of door 1 is wasting time; state the symmetry and move on.

"What if there are two cars and one goat?" This one has a subtle twist that catches people off guard. If Monty must open a goat door, then the event that he successfully does so is itself informative. If you had initially picked the goat, both remaining doors would be cars and Monty would have no goat to reveal. So the fact that Monty opens a goat door tells you with certainty that you picked a car. After that reveal, the remaining unopened door must also be a car (there are two cars total and you're holding one). Both staying and switching win with probability 1. The problem doesn't flip so much as it collapses: Monty's action eliminates the only scenario where you lose.

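The two-cars question can be settled by the same kind of enumeration. The sketch below assumes a uniform prior on the goat's location and conditions on Monty actually being able to reveal a goat; `two_cars_posteriors` is an illustrative name.

```python
from fractions import Fraction

def two_cars_posteriors(pick=1, doors=(1, 2, 3)):
    """Win probabilities for stay and switch, given a goat was revealed."""
    total = stay = switch = Fraction(0)
    for goat in doors:  # location of the single goat, uniform prior
        if goat == pick:
            continue  # Monty has no goat door to open, so no reveal happens
        p = Fraction(1, len(doors))
        other = next(d for d in doors if d not in (pick, goat))
        total += p
        if pick != goat:   # always true here: the goat is the opened door
            stay += p
        if other != goat:  # the remaining closed door also hides a car
            switch += p
    return stay / total, switch / total

stay, switch = two_cars_posteriors()
print(stay, switch)  # 1 1
```
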

🎯Key takeaway
Every variant of Monty Hall reduces to the same question: what is the host's selection rule, and what does the observation tell you about the car's location? A constrained host leaks information. A random host leaks less. An adversarial host's offer is itself a signal. Nail that framing and you can derive any variant from scratch.

What is Expected at Each Level

Junior Trader / Analyst

The bar here is correctness and clarity, not elegance. You need to arrive at $P(\text{win} \mid \text{switch}) = \frac{2}{3}$ and be able to defend it.

  • State the problem setup precisely before solving. Identify that Monty's constraint (always reveals a goat, always offers the switch) is not flavor text. It is the entire reason the math works.
  • Walk through the enumeration argument without prompting: three equally likely initial states, trace what Monty must do in each, count wins under both strategies. This is the minimum viable proof.
  • Get the arithmetic right. $\frac{2}{3}$ vs $\frac{1}{3}$, not $\frac{1}{2}$ vs $\frac{1}{2}$. If you say "50/50 after the reveal," the interview is effectively over.
  • Recognize, even if only when asked, that a random Monty produces a different answer. You don't need to derive it on the spot at this level, but you should know the distinction exists.
⚠️Common mistake
Candidates who memorized "switch, it's 2/3" but can't explain why get caught immediately. Interviewers at Jane Street and Citadel will ask "but why isn't it 50/50?" and expect a real answer, not a shrug.

Senior Quant / Researcher

At this level, the enumeration argument is just the warm-up. You're expected to move fluidly between approaches and handle variants without being led.

  • Derive the result via Bayes' theorem with full notation. Compute $P(C=1 \mid H=3)$, $P(C=2 \mid H=3)$, and $P(C=3 \mid H=3)$ explicitly, showing how the constrained reveal redistributes probability mass. No hints needed.
  • Generalize to $N$ doors and $k$ reveals. The general formula is $P(\text{win} \mid \text{switch}) = \frac{N-1}{N(N-1-k)}$, but be precise about what "switch" means here: you are switching to one of the $N-1-k$ remaining doors chosen uniformly at random. The most common variant in interviews is the one where Monty opens $N-2$ doors, leaving exactly one alternative. In that case $k = N-2$, the formula reduces to $P(\text{win} \mid \text{switch}) = \frac{N-1}{N}$, and switching is nearly certain to win for large $N$. Verify the $N=3, k=1$ case recovers $\frac{2}{3}$ on the spot.
  • Handle the random-Monty variant on the fly. Explain why conditioning on "a goat happened to be revealed" under a uniform host policy yields a genuine $\frac{1}{2}$ posterior, and articulate exactly where the information difference lies.
  • Articulate the information-theoretic intuition without prompting: the constrained host's action carries more information precisely because it is constrained. A host who could have opened any door tells you less than a host who had no choice.

Lead / Portfolio Manager Level

The problem itself takes maybe two minutes. The rest of the conversation is what distinguishes a top-level candidate.

  • Pattern recognition is instant. You recognize this as a Bayesian posterior update problem under an asymmetric information structure, and you frame it that way from the first sentence.
  • You handle the adversarial Monty variant without hesitation. If Monty only offers the switch when you've already picked the car, $P(\text{win} \mid \text{switch}) = 0$. You note that the optimal response to an unknown host policy depends on your prior over Monty's intent, which is itself a game-theoretic inference problem.
  • Connect the posterior update structure to real applications. The constrained-vs-random Monty distinction maps directly onto how you should interpret signals from informed vs. uninformed counterparties in market microstructure. An order from a market maker who must quote you carries different information than one from a participant who chose to trade.
  • Identify which assumptions are load-bearing and what breaks when you relax them. Uniform prior over door positions, Monty's policy being known and fixed, a single decision point. Each assumption, if changed, produces a different problem with a different answer.
🎯Key takeaway
The Monty Hall problem is not a trick question. It is a test of whether you can correctly apply conditional probability when an informed agent takes a constrained action. The candidates who fail do so because they treat the two remaining doors as symmetric after the reveal. They are not symmetric. One was chosen before Monty acted; one was chosen by Monty's constraint. That asymmetry is the entire problem, and being able to state it precisely, in Bayesian terms, is what separates a passing answer from a great one.