Common Probability Traps to Avoid

Most candidates who fail quant probability interviews at Jane Street, Citadel, and Two Sigma know the material. They can recite Bayes' theorem, derive the geometric distribution, and explain conditional expectation. They fail because they apply the right formula to the wrong problem, and they don't catch it until the interviewer's expression tells them something has gone sideways.

The traps fall into three categories. Cognitive traps are where your intuition overrides your math: you "feel" that two events are independent, so you treat them as independent without checking. Mechanical traps are subtler: your reasoning is sound but your setup is wrong, like counting ordered outcomes when the problem requires unordered ones. Communication traps are the cruelest: you get the right answer but explain it in a way that reveals you don't know why it's right, which is often worse than a wrong answer with clean reasoning.

Interviewers at top quant firms are not just testing whether you know the tools. They are specifically probing for these failure modes. A Jane Street interviewer who asks a Monty Hall variant isn't curious whether you've seen it before. They want to watch how you handle a problem designed to make your intuition lie to you. This guide gives you a detection framework to run before you write a single equation, worked examples of each trap in action, and a pre-answer checklist you can execute in your head in under thirty seconds.

The Framework

Most candidates dive straight into calculation. That's exactly when traps bite. The STOP-CHECK-SOLVE-SANITY framework forces a 60-second pause before you write anything, and that pause is where interviews are won.

Here's the structure. Memorize this table.

| Phase | Time | Goal |
|---|---|---|
| STOP | 15-20 sec | Classify the problem: discrete or continuous, counting or expectation, conditional or joint |
| CHECK | 30-40 sec | Run all five trap signatures; flag any that apply |
| SOLVE | Remaining time | Execute with a verified setup: notation first, derivation second, number last |
| SANITY | 15 sec | Boundary checks before you say your final answer out loud |

The whole pre-computation phase takes under a minute. The interviewer won't notice the pause. They'll notice the rigor.

STOP-CHECK-SOLVE-SANITY: The Probability Trap Detection Loop

STOP: Classify Before You Touch the Problem

What to do:

  1. Identify whether the problem is asking for a probability, an expected value, or a distribution. These require different setups and different sanity checks at the end.
  2. Ask yourself: is this a counting problem (how many outcomes?) or a conditioning problem (given something happened, what's the probability of something else)? Conflating these two is the single most common source of wrong setups.
  3. Write down the sample space in words before you write any symbols. Not $\Omega = \{H, T\}^n$, but "each outcome is a sequence of $n$ coin flips." The act of writing it forces precision.

What to say:

"Before I set up anything, let me make sure I understand what kind of problem this is. We're looking for a probability, and the event depends on the outcome of multiple draws, so I want to be careful about whether order matters here."

How the interviewer is evaluating you:

They're watching whether you rush. A candidate who immediately writes $P(A) = \frac{\text{favorable}}{\text{total}}$ without pausing has already signaled that they're pattern-matching, not reasoning. Interviewers at Jane Street and Two Sigma specifically look for the moment you slow down and classify, because it predicts whether you'll catch your own errors later.


CHECK: Run the Five Trap Signatures

This is the core of the framework. Before you commit to a setup, run through each of the following five checks in order. You're looking for any that apply, not just the first one.

Trap 1: Conditional vs. unconditional confusion. Ask yourself: is the problem giving me information that updates the sample space? If the problem says "given that..." or "you observe that..." or "a player reveals...", you're in conditional probability territory. The trap is computing $P(A \cap B)$ when you need $P(A \mid B)$, or worse, computing $P(A \mid B)$ when the problem is actually asking for $P(B \mid A)$.

Trap 2: Unverified independence. Before you multiply probabilities, ask: are these events actually independent, or am I assuming they are because it's convenient? Draws without replacement are never independent. Events defined on overlapping outcomes are rarely independent. If you can't state why independence holds, don't assume it.

Trap 3: Sample space miscounting. Ask: does order matter in my sample space? Am I sampling with or without replacement? A common error is counting ordered outcomes in the numerator and unordered outcomes in the denominator, or vice versa. Both need to be consistent. When in doubt, enumerate small cases explicitly.

Trap 4: Base rate neglect. Any time you see a conditional probability problem with a rare event (a disease, a signal, an anomaly), ask: what is the prior probability of the event I'm conditioning on? Ignoring the prior and computing only the likelihood is the defining error of base rate neglect. The posterior is never just the likelihood.

Trap 5: False symmetry. Symmetry arguments are powerful but fragile. Before invoking symmetry, ask: are the outcomes I'm treating as equivalent actually equally likely? Sequences of the same length are not always equally likely to appear first. Positions in a game are not always interchangeable. If you can't write down the formal argument for why symmetry holds, don't use it.

What to say:

"Let me just run through a quick check before I set up the calculation. I want to confirm whether these draws are independent, and I want to make sure I have the conditioning direction right."

That's it. You don't need to narrate every check. One sentence signals that you're doing it.

How the interviewer is evaluating you:

They may have designed the problem specifically to trigger one of these traps. When you say "let me check whether order matters in my sample space," you've demonstrated that you know the trap exists. Even if you then proceed correctly, that verbalization earns credit. If you skip the check and walk into the trap, no amount of correct algebra afterward fully recovers the impression.

Do this: Say one sentence out loud during the CHECK phase. It doesn't have to cover all five signatures. It just has to show you paused and thought about setup before solving.

Don't do this: Run the CHECK phase silently and then present your solution as if the setup were obvious. The interviewer can't see your internal process. If you don't verbalize it, it didn't happen, as far as they're concerned.

SOLVE: Execute With a Verified Setup

Once you've classified the problem and flagged any traps, you can actually compute. The order matters here too.

What to do:

  1. Write down your notation explicitly before any algebra. Define your events: "Let $A$ be the event that the first card is an ace. Let $B$ be the event that the second card is also an ace." This forces precision and gives the interviewer something to correct if you've misread the problem.
  2. State the formula you're applying and why. "I'll use Bayes' theorem here because I need to reverse the conditioning direction." One sentence. Then derive.
  3. Carry the algebra to a numerical answer. Leaving the answer as $\frac{P(B \mid A) \cdot P(A)}{P(B)}$ is not an answer. Plug in the numbers.

What to say:

"Okay, I'm confident in the setup. Let me define the events formally and then work through the calculation."

How the interviewer is evaluating you:

They want to see clean, auditable reasoning. If your notation is sloppy, they can't tell whether a wrong answer came from a conceptual error or an arithmetic slip. Explicit notation protects you: if the setup is right and the arithmetic is wrong, a good interviewer will tell you and let you continue. If the setup is wrong and the notation is ambiguous, they can't help you.


SANITY: Check Before You Commit

Never say your final answer the moment you compute it. Take 15 seconds.

What to do:

  1. Check that your answer is in $[0, 1]$ if it's a probability. If it's greater than 1 or negative, you have an error, and it's better to catch it yourself than to have the interviewer point it out.
  2. Test an extreme case. If $p = 0$ or $p = 1$, does your formula give the right answer? If the number of draws goes to infinity, does the probability approach 1 as expected?
  3. Ask whether the answer is plausible given the problem. If you're computing the probability of a rare event and you get 0.7, pause.

What to say:

"Let me just do a quick sanity check. If $p = 1$, this should give probability 1, and it does. The answer looks right to me: $\boxed{2/3}$."

Example: "I'm getting $\frac{4}{3}$ here, which can't be right for a probability. Let me go back and check my denominator in the Bayes calculation."

That kind of self-correction, done calmly, is not a failure. It's exactly what the interviewer wants to see. The candidates who impress at Citadel and DE Shaw are not the ones who never make errors; they're the ones who catch their own errors before being prompted.


The framework is a real-time tool. You run it before you write a single equation, not after you've already committed to a setup. Treating it as a post-hoc check defeats the purpose entirely. The trap is already sprung by the time you're verifying a completed solution.

Putting It Into Practice

Each trap below comes with a full worked example: the problem, the wrong setup most candidates reach for, and the correct derivation with a numerical answer. Run the STOP-CHECK-SOLVE loop on each one before reading the solution.


Trap 1: Conditional Probability Confusion

The problem: You have two children. At least one is a boy. What is the probability both are boys?

Most candidates answer 1/2 immediately. That's wrong, and here's exactly why.

The mistake is treating "the other child is a boy" as an independent event with probability 1/2. But the problem has already conditioned on the family having at least one boy. You're computing $P(\text{both boys} \mid \text{at least one boy})$, not $P(\text{second child is boy})$.

Correct setup using Bayes' theorem:

Let $A$ = both children are boys. Let $B$ = at least one child is a boy.

$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$$

The sample space for two children, equally likely: $\{BB, BG, GB, GG\}$.

$$P(A) = \frac{1}{4}, \quad P(B) = \frac{3}{4}, \quad P(B \mid A) = 1$$

$$P(A \mid B) = \frac{1 \cdot \frac{1}{4}}{\frac{3}{4}} = \frac{1}{3}$$

Answer: 1/3, not 1/2.
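If the 1/3 still feels wrong, a quick Monte Carlo settles it. A minimal sketch, assuming each child is independently a boy or girl with probability 1/2; the function name is illustrative:

```python
# Estimate P(both boys | at least one boy) by simulation.
import random

def both_given_at_least_one(trials=100_000, seed=0):
    rng = random.Random(seed)
    at_least_one = both = 0
    for _ in range(trials):
        kids = (rng.choice("BG"), rng.choice("BG"))
        if "B" in kids:                 # condition: discard GG families
            at_least_one += 1
            if kids == ("B", "B"):
                both += 1
    return both / at_least_one

print(both_given_at_least_one())  # close to 1/3, not 1/2
```

The key line is the conditioning: GG families are discarded entirely, which is exactly what $P(A \mid B)$ does with its denominator.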

The flip that kills candidates: they set up $P(A \mid B)$ but compute $P(B \mid A)$ instead, or they forget to normalize by $P(B)$ entirely. Candidates fail the Monty Hall problem the same way. You're told the host opened door 3 (a goat). The posterior probability that door 1 hides the car is $P(\text{car at 1} \mid \text{host opens 3})$, which requires knowing the host's door-selection rule. Candidates who say "now it's 50/50" are ignoring the conditioning event and falling back to a uniform guess over the two remaining doors.
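A short simulation makes the host's rule concrete. A sketch, assuming the standard setup where the host always opens a goat door the player didn't pick; when the player's pick hides the car, this version breaks the host's tie deterministically, which doesn't change the overall stay/switch totals:

```python
# Monty Hall: compare staying with the first pick vs. switching.
import random

def monty(trials=100_000, seed=2):
    rng = random.Random(seed)
    stay = switch = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # Host opens a door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        other = next(d for d in range(3) if d != pick and d != opened)
        stay += (pick == car)
        switch += (other == car)
    return stay / trials, switch / trials

print(monty())  # roughly (1/3, 2/3)
```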

Do this: Before writing any conditional probability, write out $P(_ \mid _)$ explicitly with both slots filled. If you can't name what's in each slot, you haven't set up the problem yet.

Trap 2: Independence Assumption Errors

The problem: A standard deck has 52 cards. You draw 5 without replacement. What is the probability that exactly 2 are aces?

The wrong setup: treat each draw as independent with $p = 4/52 = 1/13$, then apply the binomial formula.

$$P(\text{wrong}) = \binom{5}{2} \left(\frac{1}{13}\right)^2 \left(\frac{12}{13}\right)^3 \approx 0.0465$$

This is incorrect. Draws without replacement are not independent. Removing an ace changes the probability of drawing an ace on the next draw. The binomial model requires independent, identically distributed trials. Neither condition holds here.

Correct setup using the hypergeometric distribution:

You're sampling $n = 5$ cards from a population of $N = 52$, where $K = 4$ are "successes" (aces).

$$P(X = 2) = \frac{\binom{4}{2}\binom{48}{3}}{\binom{52}{5}}$$

Computing each term:

$$\binom{4}{2} = 6, \quad \binom{48}{3} = \frac{48 \cdot 47 \cdot 46}{6} = 17296, \quad \binom{52}{5} = 2598960$$

$$P(X = 2) = \frac{6 \cdot 17296}{2598960} = \frac{103776}{2598960} \approx 0.0399$$

Answer: approximately 3.99%.

The binomial gave 4.65%. Close enough that you might not catch the error on intuition alone, which is exactly why interviewers use this problem. The numerical difference is small; the conceptual error is large.

⚠️ Common mistake
Candidates reach for the binomial because it's familiar. The trigger to switch to hypergeometric is simple: if the problem says "without replacement" and you're counting successes, the draws are dependent. Full stop.
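A few lines of Python make the comparison concrete. A sketch using the standard library's `math.comb`, with defaults mirroring the deck problem above:

```python
# Wrong setup (binomial, treats draws as independent) vs.
# correct setup (hypergeometric, draws without replacement).
from math import comb

def binomial_wrong(k=2, n=5, p=4/52):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeometric(k=2, n=5, K=4, N=52):
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

print(f"binomial (wrong): {binomial_wrong():.4f}")  # 0.0465
print(f"hypergeometric:   {hypergeometric():.4f}")  # 0.0399
```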

Trap 3: Sample Space Miscounting

The problem: You roll two fair six-sided dice. What is the probability the sum equals 4?

Here's where candidates go wrong. They list the "ways to get 4" as $\{(1,3), (2,2), (3,1)\}$ and say there are 3 outcomes. Then they count total outcomes as 21 (the number of unordered pairs from $\{1, \ldots, 6\}$, counting doubles like $\{2,2\}$). That gives $3/21 = 1/7$.

Wrong.

The sample space of two dice is ordered pairs. $(1,3)$ and $(3,1)$ are distinct outcomes because die 1 and die 2 are distinguishable objects. The correct sample space has $6 \times 6 = 36$ equally likely outcomes.

Correct enumeration:

Outcomes summing to 4: $(1,3), (2,2), (3,1)$. That's 3 outcomes.

$$P(\text{sum} = 4) = \frac{3}{36} = \frac{1}{12} \approx 0.0833$$

The wrong answer was $1/7 \approx 0.1429$. A 70% relative error from a sample space mistake.

The unordered approach fails because unordered pairs are not equally likely. The pair $\{1,3\}$ corresponds to two ordered outcomes, $(1,3)$ and $(3,1)$, while $\{2,2\}$ corresponds to only one. Assigning equal probability to unordered pairs violates the uniform distribution assumption.

🔑 Key insight
Whenever you're about to divide by a count of outcomes, ask yourself: are all these outcomes equally likely? If you're using unordered pairs, multisets, or combinations, the answer is almost certainly no. Default to ordered outcomes and divide by the full ordered sample space.

For more complex dice problems, multinomial coefficients handle the counting cleanly. The number of ordered outcomes of rolling $k$ dice that produce a specific multiset of faces with multiplicities $n_1, n_2, \ldots$ is $\frac{k!}{n_1! \, n_2! \cdots}$, and you weight each multiset by this factor before summing.
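For small cases, brute-force enumeration over ordered outcomes is the fastest way to verify a sample-space count. A sketch of the two-dice check:

```python
# Enumerate the full ordered sample space of two dice and check P(sum = 4).
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))      # 36 ordered pairs
favorable = [o for o in outcomes if sum(o) == 4]

print(len(favorable), len(outcomes))                 # 3 36
print(f"{len(favorable) / len(outcomes):.4f}")       # 0.0833
```

Because every ordered pair is equally likely, the ratio of counts is the probability; no weighting needed.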


Trap 4: Base Rate Neglect

The problem: Your trading signal predicts an up-move correctly 90% of the time when an up-move occurs. It also fires a false positive 30% of the time on non-up-move days. Up-moves happen on 5% of trading days. The signal fires today. What is the probability there's actually an up-move?

Candidates who anchor on "90% accurate" say the answer is around 90%. It's not even close.

Correct Bayes setup:

Let $U$ = up-move occurs. Let $S$ = signal fires.

$$P(U \mid S) = \frac{P(S \mid U) \cdot P(U)}{P(S)}$$

Expanding the denominator using total probability:

$$P(S) = P(S \mid U)P(U) + P(S \mid U^c)P(U^c)$$

$$P(S) = (0.90)(0.05) + (0.30)(0.95) = 0.045 + 0.285 = 0.330$$

$$P(U \mid S) = \frac{(0.90)(0.05)}{0.330} = \frac{0.045}{0.330} \approx 0.136$$

Answer: approximately 13.6%.

A signal that's "90% accurate" gives you a posterior of 13.6% because up-moves are rare. The false positive rate of 30% on 95% of days swamps the true positive rate of 90% on 5% of days. This is the base rate neglect trap in its purest form.
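The calculation is short enough to script. A sketch with the problem's numbers as defaults; the explicit total-probability line is exactly the step candidates drop:

```python
# Posterior P(up-move | signal fires) via Bayes with an explicit denominator.
def posterior(p_s_given_u=0.90, p_u=0.05, p_s_given_not_u=0.30):
    # Law of total probability: P(S) = P(S|U)P(U) + P(S|U^c)P(U^c)
    p_s = p_s_given_u * p_u + p_s_given_not_u * (1 - p_u)
    return p_s_given_u * p_u / p_s

print(round(posterior(), 4))  # 0.1364
```

Changing the prior `p_u` is instructive: at `p_u=0.5` the same signal yields a posterior of 0.75, which is why writing the prior down first matters.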

Do this: In any Bayes problem, write down $P(\text{prior})$ before you write anything else. If the prior is extreme (close to 0 or 1), your posterior will be dominated by it. If you haven't accounted for the prior, your answer is wrong regardless of how carefully you computed the likelihood.

In a live interview, say this out loud: "Before I set up Bayes, let me note the base rate is 5%, which is low, so I'd expect the posterior to be much lower than the raw accuracy figure suggests." That one sentence signals you know exactly which trap is in play.


Trap 5: False Symmetry

The problem: What is the expected number of fair coin flips to see the pattern HH? What about HT? Are they the same?

Most candidates say they're the same. Both are length-2 sequences. The coin is fair. Feels symmetric. It's not.

Setting up the Markov chain for HH:

Define states by progress toward HH:

  • $S_0$: start (or last flip was T)
  • $S_1$: last flip was H
  • $S_2$: HH seen (absorbing)

Let $e_i$ = expected flips to absorption from state $S_i$.

From $S_0$: flip once. With probability 1/2 go to $S_1$, with probability 1/2 stay in $S_0$.

$$e_0 = 1 + \frac{1}{2}e_1 + \frac{1}{2}e_0$$

From $S_1$: flip once. With probability 1/2 go to $S_2$ (done), with probability 1/2 go to $S_0$.

$$e_1 = 1 + \frac{1}{2}(0) + \frac{1}{2}e_0$$

Solving: from the second equation, $e_1 = 1 + \frac{1}{2}e_0$. Substituting into the first:

$$e_0 = 1 + \frac{1}{2}\left(1 + \frac{1}{2}e_0\right) + \frac{1}{2}e_0 = 1 + \frac{1}{2} + \frac{1}{4}e_0 + \frac{1}{2}e_0$$

$$e_0 - \frac{3}{4}e_0 = \frac{3}{2} \implies \frac{1}{4}e_0 = \frac{3}{2} \implies e_0 = 6$$

Expected flips to HH: 6.

Setting up the Markov chain for HT:

  • $S_0$: start (or last flip was T)
  • $S_1$: last flip was H
  • $S_2$: HT seen (absorbing)

From $S_0$: flip once. With probability 1/2 go to $S_1$ (got H), with probability 1/2 stay in $S_0$ (got T).

$$e_0 = 1 + \frac{1}{2}e_1 + \frac{1}{2}e_0$$

From $S_1$: flip once. With probability 1/2 go to $S_2$ (got T, done), with probability 1/2 stay in $S_1$ (got H, still have H as last flip).

$$e_1 = 1 + \frac{1}{2}(0) + \frac{1}{2}e_1$$

Solving: $e_1 - \frac{1}{2}e_1 = 1 \implies e_1 = 2$. Then:

$$e_0 = 1 + \frac{1}{2}(2) + \frac{1}{2}e_0 \implies \frac{1}{2}e_0 = 2 \implies e_0 = 4$$

Expected flips to HT: 4.

The answers are 6 and 4. Not equal.

Why does symmetry fail? When you're hunting for HH and you see an H followed by a T, you've wasted progress: you're back to $S_0$ with nothing. When you're hunting for HT and you see an H followed by another H, you haven't wasted it: you're still in $S_1$ because you have a fresh H. The overlap structure of the target pattern determines how much "credit" you carry after a mismatch, and that's fundamentally asymmetric between HH and HT.
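A simulation confirms the asymmetry without any algebra. A sketch, assuming a fair coin; `expected_flips` is an illustrative helper, and the trailing window just tracks the last `len(pattern)` flips:

```python
# Estimate the expected number of flips until a pattern first appears.
import random

def expected_flips(pattern, trials=100_000, seed=1):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        window, flips = "", 0
        while window != pattern:
            # Keep only the last len(pattern) flips.
            window = (window + rng.choice("HT"))[-len(pattern):]
            flips += 1
        total += flips
    return total / trials

print(expected_flips("HH"))  # near 6
print(expected_flips("HT"))  # near 4
```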

🔑 Key insight
Symmetry arguments are powerful but dangerous. Before claiming two quantities are equal by symmetry, ask: is there any structural difference in how the two cases interact with the process generating them? For sequence problems, the answer is almost always yes. Draw the Markov chain. The asymmetry will be visible in the state transitions.

Common Mistakes

These aren't edge cases. Every one of these shows up constantly in quant interviews, and every one of them is avoidable if you know what to watch for.


Committing to an Answer Before Verifying the Sample Space

You hear the problem, you recognize the pattern, and you start calculating. That's the trap.

Here's what it looks like in practice:

Interviewer: "Two dice are rolled. What's the probability the sum is 7?"

Candidate: "There are 6 ways to get a sum of 7, and 36 total outcomes, so 6/36 = 1/6."

Interviewer: "What if I told you one of the dice already shows a 3?"

Candidate: "...still 1/6?"

Interviewer: "Are you sure? Which outcomes are still in your sample space?"

The candidate answered the unconditional question. The interviewer was asking a conditional one. Once you know one die shows a 3, the sample space shrinks from 36 to 11: the six outcomes where the first die is 3, plus the five outcomes where the second die is 3 and the first isn't (to avoid double-counting the (3,3) case). The favorable outcomes for a sum of 7 are (3,4) and (4,3), so the correct answer is $P(\text{sum}=7 \mid \text{one die shows } 3) = 2/11$, not $1/6$. The candidate was wrong, and confidently so.
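Enumerating the conditional sample space takes four lines and settles the argument:

```python
# Two dice, conditioned on at least one die showing a 3.
from itertools import product

space = [o for o in product(range(1, 7), repeat=2) if 3 in o]
favorable = [o for o in space if sum(o) == 7]

print(len(favorable), len(space))  # 2 11 -> P = 2/11
```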

Interviewers plant this trap deliberately. They want to see if you'll commit before you've confirmed what you're computing.

Do this: Before writing a single symbol, say out loud: "Let me confirm the sample space. Are we conditioning on anything here?" It takes five seconds and it signals exactly the kind of rigor these firms hire for.

Confusing "At Least One" With "Exactly One"

Read the problem again. Slowly. Does it say "at least one" or "exactly one"? These are different calculations, and candidates swap them constantly.

"At least one" means one or more. The clean way to compute it is almost always the complement:

$$P(\text{at least one}) = 1 - P(\text{none})$$

"Exactly one" requires inclusion-exclusion or direct counting, and it's messier. If you set up the complement method on an "exactly one" problem, you'll get the wrong answer with no obvious error to catch.

The failure mode looks like this: the problem says "at least one of three components fails," the candidate computes $1 - P(\text{all three work})$, gets a clean number, and moves on. That's actually correct for "at least one." But if the problem said "exactly one fails," that same setup produces a number that's too large and the candidate has no idea.
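A brute-force sum over the $2^3$ failure patterns shows how far apart the two quantifiers are; $q = 0.1$ is an illustrative failure probability, not from any specific problem:

```python
# "At least one" vs. "exactly one" failure among three independent components,
# each failing with probability q.
from itertools import product

q = 0.1

def prob(event):
    total = 0.0
    for fails in product([True, False], repeat=3):
        p = 1.0
        for f in fails:
            p *= q if f else (1 - q)
        if event(fails):
            total += p
    return total

at_least_one = prob(lambda f: sum(f) >= 1)
exactly_one = prob(lambda f: sum(f) == 1)
complement = 1 - (1 - q) ** 3

print(at_least_one, complement)  # both ~0.271
print(exactly_one)               # ~0.243
```

The complement shortcut matches "at least one" exactly; using it for "exactly one" would overshoot by the two-failure and three-failure terms.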

Don't do this: Skim the quantifier. "At least," "exactly," "at most," and "more than" are four different setups.

The fix: underline or repeat the quantifier back to the interviewer before you start. "So we want the probability that at least one of these events occurs, right?" You'll catch misreads before they cost you.


Dropping the Denominator in Bayes' Theorem

This one produces answers greater than 1. That should be an automatic alarm, but under time pressure, candidates miss it.

Bayes' theorem is:

$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$$

What happens in interviews is that candidates correctly identify $P(B \mid A) \cdot P(A)$ as the numerator, write it down, and then either forget $P(B)$ entirely or wave at it with "and we normalize." That's not a derivation. That's a placeholder where a calculation should be.

The algebraic consequence is concrete. Suppose $P(B \mid A) = 0.9$ and $P(A) = 0.4$. The numerator is $0.36$. If $P(B) = 0.3$, then $P(A \mid B) = 1.2$. That's not a probability. If you'd computed $P(B)$ via the law of total probability, you'd have caught it:

$$P(B) = P(B \mid A)P(A) + P(B \mid A^c)P(A^c) = 0.9(0.4) + P(B \mid A^c)(0.6)$$

You need that second term. Skipping it is what produces nonsense.

Don't do this: Write the numerator and call it done. If your answer isn't in $[0, 1]$, you dropped the denominator.

Always expand $P(B)$ explicitly using the law of total probability. It's one extra line and it's the line that makes the answer valid.


Over-Applying the Law of Total Expectation Without Verifying the Partition

The law of total expectation is powerful: $E[X] = \sum_i E[X \mid A_i] P(A_i)$. But it only works when the $A_i$ are mutually exclusive and exhaustive. Candidates forget the second condition constantly.

The failure looks like this: a candidate conditions on two events, $A$ and $B$, computes $E[X \mid A]P(A) + E[X \mid B]P(B)$, and presents that as $E[X]$. But if $A$ and $B$ overlap, or if $A \cup B \neq \Omega$, this double-counts or misses probability mass. The answer can be off by a lot, and it'll look completely reasonable on the page.

Interviewers at firms like Two Sigma and DE Shaw specifically probe this because it reveals whether you understand the theorem or just pattern-match to it.

Do this: Before conditioning, explicitly state your partition. "I'm going to condition on whether the first card is a heart or not a heart. These are mutually exclusive and exhaustive, so the law of total expectation applies." That sentence protects you.

If you can't name a clean partition, you don't have one yet. Find it before you start computing.
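A toy example shows what goes wrong with a non-partition. A sketch using a fair die, with exact `Fraction` arithmetic so the mismatch isn't a rounding artifact; the two conditioning events are illustrative:

```python
# Law of total expectation: {even, odd} partitions the die's outcomes,
# but {even, >= 4} overlaps on {4, 6} and double-counts.
from fractions import Fraction as F

die = [F(k) for k in range(1, 7)]

def cond_exp(pred):
    vals = [x for x in die if pred(x)]
    return sum(vals) / len(vals), F(len(vals), len(die))

e_even, p_even = cond_exp(lambda x: x % 2 == 0)
e_odd, p_odd = cond_exp(lambda x: x % 2 == 1)
e_big, p_big = cond_exp(lambda x: x >= 4)

true_mean = sum(die) / len(die)             # 7/2
good = e_even * p_even + e_odd * p_odd      # valid partition
bad = e_even * p_even + e_big * p_big       # overlapping events

print(good == true_mean, bad == true_mean)  # True False
```

The overlapping version returns 9/2 instead of 7/2: the outcomes {4, 6} carry probability mass in both terms.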


Anchoring on the First Solution Method

You find an approach that works. You run with it. You get an answer. The interviewer says, "Interesting. Is there another way to see this?"

If your answer is a blank stare, you've already lost points.

At Jane Street and Optiver, finding a second solution path isn't a bonus. It's part of the evaluation. A direct counting argument that takes 12 steps is objectively worse than a symmetry argument that takes 2, and interviewers know which one you should have seen. When you anchor on the first method that comes to mind, you often miss the elegant approach entirely.

The practical cost is also real: direct counting on a complex problem is slow and error-prone. A recursion or generating function argument is faster and harder to mess up. Candidates who only know one gear make arithmetic mistakes under pressure that candidates with multiple approaches avoid.

Don't do this: Finish your calculation and stop thinking. The first method you find is a floor, not a ceiling.

After reaching an answer, spend 30 seconds asking yourself: "Could I solve this with symmetry? With a recursion? With a clever conditioning argument?" Say that out loud. Even if you don't pursue the second method fully, naming it shows the interviewer you're thinking like a quant, not just executing a procedure.

Quick Reference

The Five Traps at a Glance

| Trap | Warning Sign in Problem Wording | Corrective Action | Key Formula/Technique |
|---|---|---|---|
| Conditional probability confusion | "Given that," "knowing that," "after observing" | Identify which event is conditioned on; write $P(A \mid B)$ explicitly before computing | $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ |
| Independence assumption error | Drawing cards, sampling people, sequential events | Ask: is this with or without replacement? If without, draws are dependent | Hypergeometric, not binomial |
| Sample space miscounting | Dice rolls, arrangements, combinations | Decide ordered vs. unordered, with vs. without replacement before counting | Multinomial coefficients; explicit enumeration for small cases |
| Base rate neglect | "The signal is accurate X% of the time," any Bayesian setup | Write down the prior $P(H)$ before touching the likelihood | $P(H \mid E) = \frac{P(E \mid H)P(H)}{P(E)}$; compute $P(E)$ via total probability |
| False symmetry | "Expected time until sequence," patterns of equal length | Build Markov states explicitly; symmetry only holds when state spaces match | State equations: $E[T] = 1 + p \cdot E[\text{next}] + (1-p) \cdot E[\text{restart}]$ |

STOP-CHECK-SOLVE: Phase Timing

| Phase | When | What You're Doing |
|---|---|---|
| STOP | First 30 seconds | Classify: discrete or continuous, counting or expectation, conditional or joint |
| CHECK | Before writing any equation | Run the five trap signatures above |
| FLAG | As soon as a trap is detected | Name it out loud; state the corrective action |
| SOLVE | After setup is verified | Notation first, then derivation, then numerical answer |
| SANITY | After you have a number | Boundary checks (see below) before saying "my answer is" |

Phrases to Use

These signal rigor. Say them before committing to a setup, not after.

  1. "Let me confirm whether these events are independent before I decide on a counting method."
  2. "I want to check if order matters in my sample space, because that changes whether I use permutations or combinations."
  3. "Before I apply Bayes, let me write down the prior explicitly, since base rate neglect is easy to miss under pressure."
  4. "I'm going to condition on the first step and set up a recursion, then verify the states are mutually exclusive and exhaustive."
  5. "Let me sanity-check this answer at the boundary cases before I commit: if $p = 0$ the answer should be X, and if $p = 1$ it should be Y."
  6. "I think I see a symmetry argument here, but let me verify it holds before I use it, because sequence problems often break symmetry."

Boundary Sanity Checks

Run these on every answer before you say it out loud.

  • Range check. Any probability must satisfy $P \in [0, 1]$. An answer of 1.3 means you dropped the denominator in Bayes.
  • Limiting case check. Plug in $p = 0$ and $p = 1$ (or $n = 0$ and $n \to \infty$). If the answer doesn't collapse to something obvious, your setup is wrong.
  • Sign check. Expected values of non-negative random variables must be non-negative. Variances must be non-negative.
  • Complement check. If you computed $P(A)$, verify $P(A^c) = 1 - P(A)$ makes intuitive sense.
  • Symmetry spot-check. If two outcomes feel equally likely by symmetry, confirm your formula gives equal probabilities for both before moving on.
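If you drill with code, the mechanical checks are easy to automate. A minimal sketch; the helper name and tolerance are illustrative:

```python
# Tiny sanity-check helper mirroring the range and complement checks above.
def sanity(p, complement=None, tol=1e-9):
    return {
        "range": 0.0 <= p <= 1.0,
        "complement": complement is None or abs((1 - p) - complement) < tol,
    }

print(sanity(2/11, complement=9/11))  # {'range': True, 'complement': True}
print(sanity(1.3))                    # range check fails: not a probability
```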

Problem Archetype to Trap Map

| Problem Type | Most Likely Trap |
|---|---|
| Dice and coin games | Sample space miscounting (ordered vs. unordered) |
| Bayesian inference / signal accuracy | Base rate neglect |
| Card drawing, urn models | Independence assumption (without replacement) |
| Coin sequence / pattern waiting times | False symmetry between sequences |
| Conditional expectation, tower property | Partition error (non-exhaustive or overlapping conditioning events) |
| "At least one" problems | Forgetting the complement; setting up $P(\text{exactly one})$ instead |

Recovery Scripts

You will catch yourself mid-calculation sometimes. That's fine. Here's exactly what to say.

"Actually, I set up the conditioning backwards. Let me rewrite this with $P(B \mid A)$ on the correct side of Bayes and redo the denominator."

"I think I treated these draws as independent, but they're without replacement, so let me switch to the hypergeometric setup."

"I counted ordered pairs but the problem is asking for unordered outcomes. Let me divide through by the number of orderings."

"I conditioned on event $A$, but I haven't checked that my partition covers the full sample space. Let me list the cases explicitly before continuing."

Saying these things clearly, without panic, is what separates a strong candidate from a great one. Interviewers at Jane Street and Citadel are not penalizing you for catching your own error. They're penalizing you for not catching it.

🎯 Key takeaway
The candidates who fail probability interviews aren't missing knowledge; they're missing the habit of verifying their setup before they solve, and these traps are the exact places where that habit pays off.