
Derive the Black-Scholes Formula

Problem Statement

The Question

Interview question: Derive the Black-Scholes formula for the price of a European call option on a non-dividend-paying stock. Start from the stock price dynamics, apply Ito's lemma, construct a delta-hedging argument, and arrive at both the governing PDE and its closed-form solution.

This question shows up at Goldman Sachs, JP Morgan, Citadel, and Two Sigma, typically in first or second-round interviews for Quantitative Researcher, Derivatives Pricing, and Stochastic Modeling roles. Rates desks and equity vol desks will also ask variants of it, sometimes focusing purely on the PDE derivation, sometimes asking you to go all the way to $N(d_1)$ and $N(d_2)$.

Mathematically, it tests stochastic calculus (Ito's lemma, SDEs), no-arbitrage pricing, measure theory (Girsanov's theorem), and PDE boundary value problems. A junior candidate is expected to reproduce the formula cleanly in about 20 minutes. A senior candidate should derive it from scratch, including the measure change, in 30 to 40 minutes while fielding questions throughout.


Mathematical Setup

You are pricing a European call option on a non-dividend-paying stock. The option gives its holder the right, but not the obligation, to buy one share at strike price $K$ at maturity $T$. Your goal is to find the fair price $C(S, t)$ at any time $t \in [0, T]$.

Asset dynamics under the physical measure $\mathbb{P}$:

$$dS_t = \mu S_t \, dt + \sigma S_t \, dW_t$$

where:

  - $S_t$ is the stock price at time $t$
  - $\mu \in \mathbb{R}$ is the drift (expected return under $\mathbb{P}$)
  - $\sigma > 0$ is the volatility, assumed constant
  - $W_t$ is a standard Brownian motion on a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}, \mathbb{P})$

This SDE has the explicit solution:

$$S_T = S_0 \exp\!\left[\left(\mu - \frac{\sigma^2}{2}\right)T + \sigma W_T\right]$$

so $\log(S_T/S_0)$ is normally distributed, meaning $S_T$ is log-normally distributed.
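
The log-normal claim can be checked numerically. The sketch below samples $S_T$ directly from the explicit solution and compares sample moments with theory; the function name `sample_terminal_price` and the parameter values are illustrative assumptions, not part of the original setup.

```python
import numpy as np

def sample_terminal_price(S0, mu, sigma, T, n_paths, seed=0):
    """Draw S_T from the closed-form GBM solution (no time stepping needed)."""
    rng = np.random.default_rng(seed)
    W_T = np.sqrt(T) * rng.standard_normal(n_paths)  # W_T ~ N(0, T)
    return S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)

# Illustrative parameters (assumed for this example)
S0, mu, sigma, T = 100.0, 0.08, 0.2, 1.0
S_T = sample_terminal_price(S0, mu, sigma, T, n_paths=200_000)
log_ret = np.log(S_T / S0)

print(f"mean log-return: {log_ret.mean():.4f}  (theory {(mu - 0.5 * sigma**2) * T:.4f})")
print(f"std  log-return: {log_ret.std(ddof=1):.4f}  (theory {sigma * np.sqrt(T):.4f})")
print(f"mean of S_T    : {S_T.mean():.2f}  (theory {S0 * np.exp(mu * T):.2f})")
```

Note that the mean log-return is $\mu - \sigma^2/2$, not $\mu$, while the mean of $S_T$ itself grows at $\mu$.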

Market structure:

Alongside the stock, there exists a risk-free money market account $B_t$ satisfying:

$$dB_t = r B_t \, dt \implies B_t = B_0 e^{rt}$$

where $r \geq 0$ is the constant, continuously compounded risk-free rate.

Payoff at maturity:

$$C(S_T, T) = \max(S_T - K, \, 0) = (S_T - K)^+$$

The problem is to find $C(S, t)$ for all $S > 0$ and $t < T$, ideally in closed form.


Key Assumptions

State these before you write a single equation in your interview. Interviewers notice when candidates skip straight to Ito's lemma without grounding the model.

  1. Log-normal stock price dynamics. $S_t$ follows geometric Brownian motion, so returns are normally distributed and $S_t$ is always positive. Relaxing this, for example by adding jumps (Merton 1976) or making $\sigma$ stochastic (Heston 1993), breaks the closed-form solution and requires numerical methods.
  2. Constant volatility $\sigma$. The same $\sigma$ governs the stock at every price level and every time. This is the assumption that fails most visibly in practice: real markets exhibit a volatility smile, where implied vol varies across strikes. It motivates every stochastic vol model written since 1993.
  3. No arbitrage. There is no self-financing trading strategy that starts with zero wealth, has non-negative terminal wealth almost surely, and has positive probability of strictly positive terminal wealth. This is the condition that forces the delta-hedged portfolio to earn exactly $r$.
  4. Continuous trading with no transaction costs. You can rebalance the hedging portfolio at every instant. In reality, discrete rebalancing introduces hedge error proportional to $\sigma^2 \Gamma \, (\Delta t)$, and transaction costs make continuous hedging infinitely expensive.
  5. Constant risk-free rate $r$. A single flat, constant rate governs all discounting. Relaxing this requires interest rate models (Hull-White, LMM) and is the entry point into rates derivatives pricing.
💡Interview tip
Interviewers want to see you state assumptions clearly before diving into math. This shows you understand model limitations. A particularly strong move is to flag assumption 1 and then immediately note the non-obvious consequence: the drift $\mu$ does not appear in the final formula. The hedging argument eliminates it entirely, which surprises most candidates the first time they see it.

The disappearance of $\mu$ is worth dwelling on for a moment. You might expect that a stock with higher expected return should have a more valuable call option. It doesn't. The replicating portfolio argument shows that any two investors, regardless of their beliefs about $\mu$, must agree on the option price or create an arbitrage. That insight is the conceptual heart of the entire derivation.

Mathematical Derivation

Approach

There are two roads to Black-Scholes, and you should know both. The first is the PDE approach: construct a portfolio that eliminates all randomness, invoke no-arbitrage to pin down its return, and solve the resulting PDE. The second is the probabilistic approach: change the probability measure using Girsanov's theorem so the stock grows at the risk-free rate, then price the option as a discounted expectation under that new measure. Both roads lead to the same formula. The PDE approach is more mechanical and easier to follow step by step; the measure-change approach is more elegant and generalizes better to exotic products.

The key insight connecting them is Feynman-Kac: every well-posed linear parabolic PDE of the Black-Scholes type has a probabilistic representation as a conditional expectation. So when you solve the PDE, you are implicitly computing $\mathbb{E}^Q[\text{payoff}]$. We will walk the PDE road first, then show how the measure change closes the loop and lets you evaluate that expectation in closed form.


Step-by-Step Derivation

Step 1: Apply Itô's Lemma to the option price

The stock follows geometric Brownian motion under the physical measure $\mathbb{P}$:

$$dS = \mu S \, dt + \sigma S \, dW_t$$

The option price $C(S, t)$ is a smooth function of two variables. Itô's lemma says that for any twice-differentiable $f(S, t)$, the stochastic differential picks up an extra second-order correction term that has no classical analogue. Applied to $C$:

$$dC = \frac{\partial C}{\partial t} dt + \frac{\partial C}{\partial S} dS + \frac{1}{2} \frac{\partial^2 C}{\partial S^2} (dS)^2$$

The $(dS)^2$ term is what makes stochastic calculus different from ordinary calculus. Using the quadratic variation rule $(dW_t)^2 = dt$ and discarding higher-order terms:

$$(dS)^2 = \sigma^2 S^2 \, dt$$

Substituting $dS = \mu S \, dt + \sigma S \, dW_t$:

$$dC = \left( \frac{\partial C}{\partial t} + \mu S \frac{\partial C}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 C}{\partial S^2} \right) dt + \sigma S \frac{\partial C}{\partial S} \, dW_t$$

The $dW_t$ term is the stochastic part. Everything multiplying $dt$ is deterministic given the current state. The goal of the next step is to kill that $dW_t$.

🔑Key insight
The $\frac{1}{2}\sigma^2 S^2 \frac{\partial^2 C}{\partial S^2}$ term comes entirely from Itô's correction. It is not a typo and it is not optional. Forgetting it in an interview is a fast way to get a failing mark.
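
As a quick worked check of the lemma (an added example, not part of the pricing argument itself), apply it to $f(S, t) = \ln S$. The partial derivatives are:

$$\frac{\partial f}{\partial t} = 0, \qquad \frac{\partial f}{\partial S} = \frac{1}{S}, \qquad \frac{\partial^2 f}{\partial S^2} = -\frac{1}{S^2}$$

Substituting $dS = \mu S \, dt + \sigma S \, dW_t$ and $(dS)^2 = \sigma^2 S^2 \, dt$:

$$d(\ln S) = \frac{1}{S} \, dS - \frac{1}{2} \frac{1}{S^2} (dS)^2 = \left( \mu - \frac{\sigma^2}{2} \right) dt + \sigma \, dW_t$$

Integrating from $0$ to $T$ recovers the explicit solution stated in the setup. The $-\sigma^2/2$ in the exponent is exactly the Itô correction.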

Step 2: Construct the delta-hedging portfolio

Define the portfolio:

$$\Pi = C - \Delta \cdot S$$

This represents being long the option and short $\Delta$ shares of stock. The change in portfolio value over an infinitesimal interval is:

$$d\Pi = dC - \Delta \, dS$$

Substituting the expressions for $dC$ and $dS$:

$$d\Pi = \left( \frac{\partial C}{\partial t} + \mu S \frac{\partial C}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 C}{\partial S^2} \right) dt + \sigma S \frac{\partial C}{\partial S} \, dW_t - \Delta \left( \mu S \, dt + \sigma S \, dW_t \right)$$

Collecting the $dW_t$ terms:

$$d\Pi = (\ldots) \, dt + \sigma S \left( \frac{\partial C}{\partial S} - \Delta \right) dW_t$$

Choose $\Delta = \frac{\partial C}{\partial S}$. The stochastic term vanishes exactly:

$$d\Pi = \left( \frac{\partial C}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 C}{\partial S^2} \right) dt$$

The portfolio is now instantaneously risk-free. Notice that $\mu$ has completely disappeared. The physical drift of the stock is irrelevant to the option price, which is one of the most surprising results in all of quantitative finance.
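
The irrelevance of $\mu$ can be seen in simulation. The sketch below takes the mirror-image position (short one call, long $\Delta$ shares, remainder in cash), simulates the stock under the physical measure with several drifts, and rebalances daily. It uses the closed-form price and delta derived later in this article; the function names and parameter values are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

def call_price_and_delta(S, K, tau, r, sigma):
    """Closed-form Black-Scholes call price and delta for time to expiry tau > 0."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))
    d2 = d1 - sigma * np.sqrt(tau)
    price = S * norm.cdf(d1) - K * np.exp(-r * tau) * norm.cdf(d2)
    return price, norm.cdf(d1)

def hedged_pnl(mu, S0=100.0, K=100.0, T=1.0, r=0.05, sigma=0.2,
               n_steps=252, n_paths=10_000, seed=0):
    """Terminal P&L of: short one call, hold delta shares, rest in cash.

    The stock is simulated under the physical measure with drift mu.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    S = np.full(n_paths, S0)
    price0, delta = call_price_and_delta(S, K, T, r, sigma)
    cash = price0 - delta * S            # premium received funds the share purchase
    for i in range(1, n_steps + 1):
        Z = rng.standard_normal(n_paths)
        S = S * np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z)
        cash = cash * np.exp(r * dt)     # cash accrues at the risk-free rate
        if i < n_steps:                  # rebalance to the new delta
            _, new_delta = call_price_and_delta(S, K, T - i * dt, r, sigma)
            cash -= (new_delta - delta) * S
            delta = new_delta
    return delta * S + cash - np.maximum(S - K, 0.0)

for mu in (-0.10, 0.05, 0.30):
    pnl = hedged_pnl(mu)
    print(f"mu={mu:+.2f}  mean P&L={pnl.mean():+.4f}  std P&L={pnl.std(ddof=1):.4f}")
```

Whatever drift you choose, the mean hedged P&L should stay close to zero, with residual noise coming only from the discrete (daily) rebalancing.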


Step 3: Apply no-arbitrage to get the Black-Scholes PDE

A risk-free portfolio must earn the risk-free rate. If it earned more, you could borrow at $r$ and buy the portfolio for a riskless profit. If it earned less, you would short it and lend at $r$. Either way, arbitrage. So:

$$d\Pi = r \Pi \, dt$$

Substituting $\Pi = C - \Delta S = C - \frac{\partial C}{\partial S} S$ on the right, and the expression for $d\Pi$ on the left:

$$\left( \frac{\partial C}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 C}{\partial S^2} \right) dt = r \left( C - \frac{\partial C}{\partial S} S \right) dt$$

Rearranging and dropping the $dt$:

$$\boxed{\frac{\partial C}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 C}{\partial S^2} + r S \frac{\partial C}{\partial S} - rC = 0}$$

This is the Black-Scholes PDE. It holds for any derivative on $S$, not just European calls. The terminal condition that distinguishes the call is:

$$C(S, T) = \max(S - K, 0)$$

⚠️Common mistake
Candidates sometimes write the PDE with $\mu$ instead of $r$ in the drift term. The whole point of the delta-hedging argument is that $\mu$ drops out and gets replaced by $r$. If you have $\mu$ in your PDE, you have made an error somewhere.
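
A useful self-check is to plug the closed-form call price (derived in Step 5) back into the PDE and confirm the left-hand side vanishes. The sketch below does this with central finite differences; `bs_call`, `pde_residual`, the step sizes, and the parameter defaults are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import norm

def bs_call(S, t, K=100.0, T=1.0, r=0.05, sigma=0.2):
    """Closed-form call price C(S, t) for t < T."""
    tau = T - t
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))
    d2 = d1 - sigma * np.sqrt(tau)
    return S * norm.cdf(d1) - K * np.exp(-r * tau) * norm.cdf(d2)

def pde_residual(S, t, r=0.05, sigma=0.2, hS=1e-2, ht=1e-4):
    """Left-hand side of the Black-Scholes PDE, using central differences."""
    C = bs_call(S, t, r=r, sigma=sigma)
    C_up = bs_call(S + hS, t, r=r, sigma=sigma)
    C_dn = bs_call(S - hS, t, r=r, sigma=sigma)
    C_t = (bs_call(S, t + ht, r=r, sigma=sigma)
           - bs_call(S, t - ht, r=r, sigma=sigma)) / (2 * ht)
    C_S = (C_up - C_dn) / (2 * hS)
    C_SS = (C_up - 2 * C + C_dn) / hS**2
    return C_t + 0.5 * sigma**2 * S**2 * C_SS + r * S * C_S - r * C

for S in (80.0, 100.0, 120.0):
    print(f"S={S:6.1f}  PDE residual = {pde_residual(S, t=0.3):+.2e}")
```

The residuals should be zero up to finite-difference error. Replacing `r * S * C_S` with a term in $\mu$ would break this, which is the numerical version of the mistake described above.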

Step 4: Change measure via Girsanov's theorem

Rather than solving the PDE directly (which requires a heat-equation substitution and several pages of algebra), it is cleaner to switch to the risk-neutral measure $\mathbb{Q}$ and price by expectation.

Define the market price of risk:

$$\theta = \frac{\mu - r}{\sigma}$$

Girsanov's theorem says you can define a new Brownian motion $\widetilde{W}_t = W_t + \theta t$ under a new measure $\mathbb{Q}$, where the Radon-Nikodym derivative is:

$$\frac{d\mathbb{Q}}{d\mathbb{P}}\bigg|_{\mathcal{F}_t} = \exp\left( -\theta W_t - \frac{1}{2}\theta^2 t \right)$$

Under $\mathbb{Q}$, the stock SDE becomes:

$$dS = rS \, dt + \sigma S \, d\widetilde{W}_t$$

The drift has changed from $\mu$ to $r$. Intuitively, under $\mathbb{Q}$ every asset grows at the risk-free rate, which is why it is called the risk-neutral measure. By the Feynman-Kac theorem, the solution to the Black-Scholes PDE with terminal condition $C(S,T) = \max(S-K, 0)$ is exactly:

$$C(S, t) = e^{-r(T-t)} \, \mathbb{E}^{\mathbb{Q}}\!\left[ \max(S_T - K, 0) \,\big|\, S_t = S \right]$$
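
The measure change can be illustrated numerically. In the sketch below, paths are simulated under $\mathbb{P}$ with drift $\mu$, and each payoff is reweighted by the Radon-Nikodym derivative $d\mathbb{Q}/d\mathbb{P}$ evaluated at $T$. The parameter values (in particular $\mu = 0.12$) are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

S0, K, T, r, sigma, mu = 100.0, 100.0, 1.0, 0.05, 0.2, 0.12
theta = (mu - r) / sigma                          # market price of risk

rng = np.random.default_rng(0)
W_T = np.sqrt(T) * rng.standard_normal(400_000)   # Brownian motion under P

# Terminal price under P (drift mu) and the Radon-Nikodym weight dQ/dP
S_T = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)
weight = np.exp(-theta * W_T - 0.5 * theta**2 * T)

payoff = np.maximum(S_T - K, 0.0)
price_reweighted = np.exp(-r * T) * np.mean(weight * payoff)  # E^Q via reweighting
price_naive = np.exp(-r * T) * np.mean(payoff)                # E^P, not a valid price

# Closed-form reference (the formula derived in Step 5)
d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
d2 = d1 - sigma * np.sqrt(T)
closed_form = S0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

print(f"mean of dQ/dP    : {weight.mean():.4f}")
print(f"reweighted price : {price_reweighted:.4f}")
print(f"closed form      : {closed_form:.4f}")
print(f"naive P-average  : {price_naive:.4f}")
```

The weights average to one (they define a probability measure), the reweighted estimate lands near the closed-form price, and the naive discounted $\mathbb{P}$-expectation overstates it because it keeps the drift $\mu > r$.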


Step 5: Evaluate the expectation in closed form

Under $\mathbb{Q}$, the stock at maturity is log-normally distributed. Integrating the SDE from $t$ to $T$:

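$$S_T = S \exp\!\left[ \left(r - \frac{\sigma^2}{2}\right)(T - t) + \sigma \left(\widetilde{W}_T - \widetilde{W}_t\right) \right]$$

Let $\tau = T - t$. The increment $\widetilde{W}_T - \widetilde{W}_t$ is $\mathcal{N}(0, \tau)$ under $\mathbb{Q}$, so it has the same distribution as $\sqrt{\tau} Z$ with $Z \sim \mathcal{N}(0,1)$. The discounted expectation becomes an integral against the standard normal density: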

$$C = e^{-r\tau} \int_{-\infty}^{\infty} \max\!\left( S e^{(r - \frac{\sigma^2}{2})\tau + \sigma\sqrt{\tau} z} - K,\, 0 \right) \frac{e^{-z^2/2}}{\sqrt{2\pi}} \, dz$$

The max is positive when $S_T > K$, i.e., when:

$$S e^{(r - \frac{\sigma^2}{2})\tau + \sigma\sqrt{\tau} z} > K \implies z > -d_2$$

where:

$$d_2 = \frac{\ln(S/K) + (r - \frac{\sigma^2}{2})\tau}{\sigma\sqrt{\tau}}$$

So the integral becomes:

$$C = e^{-r\tau} \int_{-d_2}^{\infty} \left( S e^{(r - \frac{\sigma^2}{2})\tau + \sigma\sqrt{\tau} z} - K \right) \frac{e^{-z^2/2}}{\sqrt{2\pi}} \, dz$$

Split into two parts. The second integral is straightforward:

$$-e^{-r\tau} K \int_{-d_2}^{\infty} \frac{e^{-z^2/2}}{\sqrt{2\pi}} \, dz = -K e^{-r\tau} N(d_2)$$

For the first integral, complete the square in the exponent. The integrand contains $e^{-z^2/2 + \sigma\sqrt{\tau} z}$. Completing the square:

$$-\frac{z^2}{2} + \sigma\sqrt{\tau} z = -\frac{(z - \sigma\sqrt{\tau})^2}{2} + \frac{\sigma^2 \tau}{2}$$

So the first integral, including the discount factor $e^{-r\tau}$, becomes:

$$e^{-r\tau} \, S e^{(r - \frac{\sigma^2}{2})\tau} \cdot e^{\frac{\sigma^2\tau}{2}} \int_{-d_2}^{\infty} \frac{e^{-(z-\sigma\sqrt{\tau})^2/2}}{\sqrt{2\pi}} \, dz = S \int_{-d_2}^{\infty} \frac{e^{-(z-\sigma\sqrt{\tau})^2/2}}{\sqrt{2\pi}} \, dz$$

Substitute $u = z - \sigma\sqrt{\tau}$. The lower limit shifts from $-d_2$ to $-d_2 - \sigma\sqrt{\tau} = -d_1$, where:

$$d_1 = d_2 + \sigma\sqrt{\tau} = \frac{\ln(S/K) + (r + \frac{\sigma^2}{2})\tau}{\sigma\sqrt{\tau}}$$

The integral is just $N(d_1)$. Putting it together:

$$C = S \cdot N(d_1) - K e^{-r\tau} N(d_2)$$
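
The integral can also be evaluated by numerical quadrature as a check on the algebra. The sketch below integrates the payoff against the normal density over $z > -d_2$ and compares with the closed form; the function names are illustrative, and the exponents are combined before exponentiating so the integrand stays finite for large $z$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def call_closed_form(S, K, tau, r, sigma):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))
    d2 = d1 - sigma * np.sqrt(tau)
    return S * norm.cdf(d1) - K * np.exp(-r * tau) * norm.cdf(d2)

def call_by_quadrature(S, K, tau, r, sigma):
    """Discounted risk-neutral expectation, integrated over z > -d2."""
    d2 = (np.log(S / K) + (r - 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))

    def integrand(z):
        # (S_T - K) * phi(z), with the stock term's exponents merged
        stock = S * np.exp((r - 0.5 * sigma**2) * tau
                           + sigma * np.sqrt(tau) * z - 0.5 * z**2) / np.sqrt(2 * np.pi)
        return stock - K * norm.pdf(z)

    value, _ = quad(integrand, -d2, np.inf)
    return np.exp(-r * tau) * value

args = dict(S=100.0, K=105.0, tau=0.5, r=0.05, sigma=0.25)
print(f"quadrature : {call_by_quadrature(**args):.8f}")
print(f"closed form: {call_closed_form(**args):.8f}")
```

The two numbers should agree to within the quadrature tolerance.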

[Figure: Black-Scholes derivation flow]

Result

Result: The Black-Scholes price of a European call option is:

$$C(S, t) = S \cdot N(d_1) - K e^{-r(T-t)} N(d_2)$$

where:

$$d_1 = \frac{\ln(S/K) + \left(r + \frac{\sigma^2}{2}\right)(T-t)}{\sigma\sqrt{T-t}}, \qquad d_2 = d_1 - \sigma\sqrt{T-t}$$

The two terms have clean interpretations. $N(d_2)$ is the risk-neutral probability that the option expires in the money, i.e., $\mathbb{Q}(S_T > K)$, so $K e^{-r(T-t)} N(d_2)$ is the present value of the strike payment weighted by the probability of exercise. $N(d_1)$ is the option's delta, the number of shares you need to hold in the replicating portfolio, and $S \cdot N(d_1) = e^{-r(T-t)} \, \mathbb{E}^{\mathbb{Q}}[S_T \mathbf{1}_{\{S_T > K\}}]$ is the present value of the stock you receive in the states where you exercise.

So the formula is saying: the call is worth the present value of the stock you receive on exercise minus the present value of the strike you pay on exercise. The two weights differ: $N(d_2)$ is the exercise probability under $\mathbb{Q}$, while $N(d_1)$ is larger because the stock is worth more precisely in the states where the option is exercised.
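
These interpretations are easy to verify by simulation under $\mathbb{Q}$. The following sketch, with assumed illustrative parameters, estimates the exercise probability and the present value of each leg and compares them with $N(d_2)$ and $S \cdot N(d_1)$.

```python
import numpy as np
from scipy.stats import norm

S0, K, T, r, sigma = 100.0, 100.0, 1.0, 0.05, 0.2
d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
d2 = d1 - sigma * np.sqrt(T)

rng = np.random.default_rng(0)
Z = rng.standard_normal(400_000)
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)  # under Q

exercised = S_T > K
prob_exercise = exercised.mean()                            # estimates N(d2)
pv_stock_leg = np.exp(-r * T) * np.mean(S_T * exercised)    # estimates S * N(d1)
pv_strike_leg = K * np.exp(-r * T) * prob_exercise          # estimates K e^{-rT} N(d2)

print(f"Q(S_T > K)   : {prob_exercise:.4f}   N(d2)     = {norm.cdf(d2):.4f}")
print(f"PV stock leg : {pv_stock_leg:.3f}   S * N(d1) = {S0 * norm.cdf(d1):.3f}")
print(f"call estimate: {pv_stock_leg - pv_strike_leg:.3f}")
```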

For the put, the same derivation with payoff $\max(K - S_T, 0)$ gives:

$$P(S, t) = K e^{-r(T-t)} N(-d_2) - S \cdot N(-d_1)$$

Or you can just use put-call parity, which is faster in an interview.

🔑Key insight
The physical drift $\mu$ appears nowhere in the final formula. The delta-hedging argument eliminates it at Step 2, and Girsanov replaces it with $r$ at Step 4. This is not a coincidence; it reflects the fact that in a complete market, the option price is pinned down by no-arbitrage alone, regardless of what you believe about the stock's expected return.

Numerical Implementation

Start with the closed-form pricer. The math is clean; your code should be too. Interviewers at Goldman and Citadel will ask you to implement this on a whiteboard or in a Jupyter notebook, so every line needs to be intentional.

Python
import numpy as np
from scipy import stats
from scipy.optimize import brentq
import matplotlib.pyplot as plt

def bs_price(S: float, K: float, T: float, r: float, sigma: float,
             option_type: str = "call") -> float:
    """
    Black-Scholes closed-form price for a European option.

    Parameters
    ----------
    S     : current stock price
    K     : strike price
    T     : time to expiry in years
    r     : continuously compounded risk-free rate
    sigma : annualised volatility
    option_type : 'call' or 'put'

    Returns
    -------
    float : option price
    """
    if T <= 0:
        # At expiry, return intrinsic value immediately
        if option_type == "call":
            return max(S - K, 0.0)
        return max(K - S, 0.0)

    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)

    N = stats.norm.cdf  # standard normal CDF

    if option_type == "call":
        return S * N(d1) - K * np.exp(-r * T) * N(d2)
    elif option_type == "put":
        return K * np.exp(-r * T) * N(-d2) - S * N(-d1)
    else:
        raise ValueError("option_type must be 'call' or 'put'")

Boundary Conditions to Verify

Before moving to Greeks or Monte Carlo, sanity-check your pricer against cases where the answer is obvious.

Python
# Deep in-the-money call: price approaches S - K*exp(-rT)
print(bs_price(S=200, K=100, T=1, r=0.05, sigma=0.2, option_type="call"))
# Expected: ~104.88  (S - K*exp(-rT) = 104.88; intrinsic value S - K is 100)

# Deep out-of-the-money call: price approaches zero
print(bs_price(S=50, K=200, T=1, r=0.05, sigma=0.2, option_type="call"))
# Expected: ~0.000 (essentially worthless)

# At-the-money, zero time: returns intrinsic value
print(bs_price(S=100, K=100, T=0, r=0.05, sigma=0.2, option_type="call"))
# Expected: 0.0

# Put-call parity check: C - P = S - K*exp(-rT)
S, K, T, r, sigma = 100, 100, 1, 0.05, 0.2
C = bs_price(S, K, T, r, sigma, "call")
P = bs_price(S, K, T, r, sigma, "put")
parity_lhs = C - P
parity_rhs = S - K * np.exp(-r * T)
print(f"Parity check: {parity_lhs:.6f} == {parity_rhs:.6f}")  # must match
💡Interview tip
Checking put-call parity is a fast, model-free sanity test. If your pricer violates it, something is wrong. Mentioning this proactively signals you think like a quant, not just a coder.

Analytical Greeks

The Greeks come directly from differentiating the closed-form formula. You should know the sign and rough magnitude of each one without computing anything.

Python
def bs_greeks(S: float, K: float, T: float, r: float,
              sigma: float, option_type: str = "call") -> dict:
    """
    Analytical Black-Scholes Greeks.
    All Greeks are with respect to the call unless option_type='put'.
    """
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)

    N  = stats.norm.cdf
    n  = stats.norm.pdf   # standard normal PDF

    # Gamma and Vega are identical for calls and puts
    gamma = n(d1) / (S * sigma * np.sqrt(T))
    vega  = S * n(d1) * np.sqrt(T)           # per unit of sigma (not per 1%)

    if option_type == "call":
        delta = N(d1)
        theta = (-(S * n(d1) * sigma) / (2 * np.sqrt(T))
                 - r * K * np.exp(-r * T) * N(d2))
        rho   = K * T * np.exp(-r * T) * N(d2)
    else:
        delta = N(d1) - 1                    # negative: put loses value as S rises
        theta = (-(S * n(d1) * sigma) / (2 * np.sqrt(T))
                 + r * K * np.exp(-r * T) * N(-d2))
        rho   = -K * T * np.exp(-r * T) * N(-d2)

    return {
        "delta": delta,
        "gamma": gamma,
        "vega":  vega,
        "theta": theta,   # per year; divide by 365 for daily theta
        "rho":   rho,
    }

# Example
greeks = bs_greeks(S=100, K=100, T=1, r=0.05, sigma=0.2)
for name, val in greeks.items():
    print(f"{name:6s}: {val:.6f}")

A few things to have ready when the interviewer asks you to interpret these. Delta for an ATM call sits somewhat above 0.5 (about 0.64 with the example parameters, because the $(r + \frac{1}{2}\sigma^2)T$ term pushes $d_1$ above zero), meaning the option moves roughly 50 to 65 cents for every dollar move in the stock. Gamma peaks near the money and blows up near expiry, which is why short-gamma positions are dangerous into expiration. Theta is negative for a long call on a non-dividend-paying stock; you are paying time decay every day you hold. Vega is always positive for long options because more uncertainty means more potential upside.

⚠️Common mistake
Candidates quote vega as "per 1% move in vol" without specifying units. In the formula above, vega is per unit of sigma (e.g., per 1.0, not per 0.01). When you report it, say "vega of 20 means the option gains $0.20 per 1 vol point (0.01) increase." Be explicit.
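
To make the unit conversion concrete, the sketch below computes vega per unit of sigma, rescales it to one vol point, and compares against an explicit bump-and-reprice. The helper name `call_price` and the parameter values are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

def call_price(S, K, T, r, sigma):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

S, K, T, r, sigma = 100.0, 100.0, 1.0, 0.05, 0.2
d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))

vega_per_unit = S * norm.pdf(d1) * np.sqrt(T)      # dC/dsigma, sigma in decimals
vega_per_vol_point = vega_per_unit * 0.01          # per 1 vol point (0.01)

# Bump-and-reprice: move sigma from 20% to 21% and reprice
bump = call_price(S, K, T, r, sigma + 0.01) - call_price(S, K, T, r, sigma)

print(f"vega per unit sigma : {vega_per_unit:.4f}")
print(f"vega per vol point  : {vega_per_vol_point:.4f}")
print(f"1-point bump reprice: {bump:.4f}")
```

The bump result differs slightly from the scaled analytical vega because the price is not exactly linear in sigma; the gap is the second-order (vomma) effect.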

Monte Carlo Cross-Check

The Monte Carlo pricer simulates stock paths under the risk-neutral measure $\mathbb{Q}$, where the drift is $r$ rather than $\mu$. This is the key move: you do not need to know the real-world drift at all.

Python
def bs_monte_carlo(S: float, K: float, T: float, r: float, sigma: float,
                   n_paths: int = 100_000, seed: int = 42,
                   option_type: str = "call") -> tuple[float, float]:
    """
    Monte Carlo pricer for a European option under GBM.

    Returns
    -------
    (price, standard_error)
    """
    rng = np.random.default_rng(seed)

    # Sample terminal stock price under Q in one vectorised step
    # S_T = S * exp((r - 0.5*sigma^2)*T + sigma*sqrt(T)*Z), Z ~ N(0,1)
    Z   = rng.standard_normal(n_paths)
    S_T = S * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

    # Compute payoffs
    if option_type == "call":
        payoffs = np.maximum(S_T - K, 0)
    else:
        payoffs = np.maximum(K - S_T, 0)

    # Discount to present value
    discounted = np.exp(-r * T) * payoffs
    price = discounted.mean()
    # Sample standard deviation (ddof=1) gives the usual standard error of the mean
    se    = discounted.std(ddof=1) / np.sqrt(n_paths)

    return price, se

Now show convergence. This is the plot that makes interviewers nod.

Python
closed_form = bs_price(S=100, K=100, T=1, r=0.05, sigma=0.2)

path_counts = [100, 500, 1_000, 5_000, 10_000, 50_000, 100_000, 500_000]
mc_prices   = []
mc_errors   = []

for n in path_counts:
    p, se = bs_monte_carlo(100, 100, 1, 0.05, 0.2, n_paths=n)
    mc_prices.append(p)
    mc_errors.append(se)

# Print convergence table
print(f"{'Paths':>10} {'MC Price':>12} {'Std Err':>12} {'Error vs BS':>14}")
print("-" * 52)
for n, p, se in zip(path_counts, mc_prices, mc_errors):
    print(f"{n:>10,} {p:>12.4f} {se:>12.4f} {abs(p - closed_form):>14.4f}")

print(f"\nClosed-form BS price: {closed_form:.4f}")
Illustrative output (exact figures depend on the seed and NumPy version; the closed-form price for these inputs is approximately 10.4506, and that is the value a real run converges to):

Text
     Paths     MC Price      Std Err    Error vs BS
----------------------------------------------------
       100       9.8231       0.7821         0.6682
       500      10.3245       0.3612         0.1668
     1,000      10.2891       0.2551         0.1314
     5,000      10.4318       0.1141         0.0141
    10,000      10.4127       0.0807         0.0050
    50,000      10.4197       0.0361         0.0020
   100,000      10.4175       0.0255         0.0002
   500,000      10.4177       0.0114         0.0000

Monte Carlo error shrinks as $1/\sqrt{N}$. To halve the error, you need four times as many paths. That is the fundamental cost of simulation, and it comes up constantly in quant interviews when discussing variance reduction techniques like antithetic variates or control variates.
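
As a concrete instance of variance reduction, here is a minimal sketch of antithetic variates: each normal draw $Z$ is paired with $-Z$, and the two discounted payoffs are averaged before computing the standard error. The function name `mc_call` and its arguments are assumptions; both estimators use the same number of payoff evaluations so the comparison is fair.

```python
import numpy as np

def mc_call(S, K, T, r, sigma, n_pairs, antithetic, seed=0):
    """European call by Monte Carlo using 2 * n_pairs payoff evaluations."""
    rng = np.random.default_rng(seed)
    drift = (r - 0.5 * sigma**2) * T
    vol = sigma * np.sqrt(T)
    disc = np.exp(-r * T)

    if antithetic:
        Z = rng.standard_normal(n_pairs)
        pay_plus = np.maximum(S * np.exp(drift + vol * Z) - K, 0.0)
        pay_minus = np.maximum(S * np.exp(drift - vol * Z) - K, 0.0)
        samples = disc * 0.5 * (pay_plus + pay_minus)   # one sample per pair
    else:
        Z = rng.standard_normal(2 * n_pairs)
        samples = disc * np.maximum(S * np.exp(drift + vol * Z) - K, 0.0)

    price = samples.mean()
    se = samples.std(ddof=1) / np.sqrt(len(samples))
    return price, se

plain_price, plain_se = mc_call(100, 100, 1, 0.05, 0.2, 100_000, antithetic=False)
anti_price, anti_se = mc_call(100, 100, 1, 0.05, 0.2, 100_000, antithetic=True)
print(f"plain     : {plain_price:.4f} +/- {plain_se:.4f}")
print(f"antithetic: {anti_price:.4f} +/- {anti_se:.4f}")
```

Antithetic sampling helps here because the call payoff is monotone in $Z$, so the paired payoffs are negatively correlated. The gain is modest compared with a good control variate.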

Implied Volatility Inversion

Black-Scholes is not really a pricing model in practice. It is a quoting convention. Traders observe market prices and back out the implied volatility, then trade that number. Every quant desk does this.

Python
def implied_vol(C_mkt: float, S: float, K: float, T: float, r: float,
                option_type: str = "call",
                vol_bounds: tuple = (1e-6, 10.0)) -> float:
    """
    Invert Black-Scholes to find the implied volatility.

    Uses Brent's method, which is bracketed and guaranteed to converge
    as long as the market price lies within the no-arbitrage bounds.
    """
    # Objective: find sigma such that BS(sigma) - C_mkt = 0
    def objective(sigma):
        return bs_price(S, K, T, r, sigma, option_type) - C_mkt

    # Check that the market price is within arbitrage bounds
    intrinsic = max(S - K * np.exp(-r * T), 0) if option_type == "call" else max(K * np.exp(-r * T) - S, 0)
    if C_mkt < intrinsic:
        raise ValueError("Market price below intrinsic value: arbitrage opportunity.")

    return brentq(objective, *vol_bounds, xtol=1e-8, maxiter=500)

# Example: recover sigma from a known BS price
true_sigma = 0.25
market_price = bs_price(S=100, K=105, T=0.5, r=0.05, sigma=true_sigma)
recovered_sigma = implied_vol(market_price, S=100, K=105, T=0.5, r=0.05)
print(f"True sigma: {true_sigma:.4f}, Recovered: {recovered_sigma:.4f}")
# True sigma: 0.2500, Recovered: 0.2500

brentq is the right tool here because it is bracketed (you give it a lower and upper bound on vol) and does not require a derivative. Newton-Raphson is faster per iteration but can diverge if your initial guess is bad. For a production system you would use Newton with a Brent fallback.
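
A minimal sketch of the Newton-with-Brent-fallback idea for calls follows. The names `call_price_vega` and `implied_vol_newton`, the starting guess, the tolerances, and the fallback triggers are all assumptions for illustration, not a production recipe.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def call_price_vega(S, K, T, r, sigma):
    """Black-Scholes call price and vega (per unit of sigma)."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    price = S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)
    return price, S * norm.pdf(d1) * np.sqrt(T)

def implied_vol_newton(C_mkt, S, K, T, r, sigma0=0.2, tol=1e-10,
                       max_iter=50, bounds=(1e-6, 10.0)):
    """Newton-Raphson on sigma; falls back to brentq if Newton misbehaves."""
    sigma = sigma0
    for _ in range(max_iter):
        price, vega = call_price_vega(S, K, T, r, sigma)
        diff = price - C_mkt
        if abs(diff) < tol:
            return sigma
        if vega < 1e-8:                      # flat objective: Newton step unreliable
            break
        sigma_next = sigma - diff / vega
        if not (bounds[0] < sigma_next < bounds[1]):
            break                            # step left the bracket: give up on Newton
        sigma = sigma_next
    # Bracketed fallback; raises ValueError if the price is outside the bracket's range
    return brentq(lambda s: call_price_vega(S, K, T, r, s)[0] - C_mkt,
                  *bounds, xtol=1e-10)

target, _ = call_price_vega(100.0, 105.0, 0.5, 0.05, 0.25)
print(f"recovered sigma: {implied_vol_newton(target, 100.0, 105.0, 0.5, 0.05):.6f}")
```

For deep out-of-the-money options the vega is tiny, the first Newton step overshoots the bracket, and the function drops through to `brentq`.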

The Volatility Smile

If Black-Scholes were literally true, implied vol would be flat across strikes. It never is.

Python
def plot_vol_smile(S: float, T: float, r: float, base_sigma: float = 0.20):
    """
    Simulate a volatility smile by adding a quadratic skew to market prices,
    then recovering implied vols. This mimics real equity market behaviour.
    """
    strikes = np.linspace(70, 130, 50)

    # Simulate "true" market vols with a smile: higher vol for OTM options
    # This is a toy model; real smiles come from jump risk and skewness
    moneyness    = np.log(strikes / S)
    market_sigma = base_sigma + 0.05 * moneyness**2 - 0.02 * moneyness  # skew + smile

    # Generate market prices using these varying vols
    market_prices = np.array([
        bs_price(S, K, T, r, sig) for K, sig in zip(strikes, market_sigma)
    ])

    # Now recover implied vols using flat BS (as if we didn't know the smile)
    implied_vols = np.array([
        implied_vol(C, S, K, T, r) for C, K in zip(market_prices, strikes)
    ])

    plt.figure(figsize=(9, 5))
    plt.plot(strikes, implied_vols * 100, color="steelblue", linewidth=2)
    plt.axhline(base_sigma * 100, color="gray", linestyle="--", label=f"Flat BS vol ({base_sigma:.0%})")
    plt.axvline(S, color="salmon", linestyle=":", label=f"ATM (S={S})")
    plt.xlabel("Strike")
    plt.ylabel("Implied Volatility (%)")
    plt.title(f"Volatility Smile  |  S={S}, T={T}y, r={r}")
    plt.legend()
    plt.tight_layout()
    plt.savefig("vol_smile.png", dpi=150)
    plt.show()

plot_vol_smile(S=100, T=1.0, r=0.05)

The smile tells you something real: out-of-the-money puts are expensive relative to what Black-Scholes predicts, because the market prices in crash risk. Black-Scholes assumes log-normal returns with no fat tails. The market knows better.

🔑Key insight
When an interviewer asks "what are the limitations of Black-Scholes?", the smile is your first answer. But go further: say that practitioners use BS as a quoting tool (implied vol surface) and then build stochastic vol models like Heston or local vol to actually price and hedge consistently across the surface.

Sensitivity Analysis

A quick parameter sweep makes the model's behaviour concrete.

Python
base = dict(S=100, K=100, T=1.0, r=0.05, sigma=0.20)

# Vary one parameter at a time
params = {
    "sigma": np.linspace(0.05, 0.80, 8),
    "T":     np.linspace(0.1, 3.0, 8),
    "S":     np.linspace(70, 130, 8),
}

print(f"{'sigma':>8} {'Call Price':>12}   {'T':>6} {'Call Price':>12}   {'S':>6} {'Call Price':>12}")
print("-" * 65)
for s, t, sp in zip(params["sigma"], params["T"], params["S"]):
    c_sig = bs_price(**{**base, "sigma": s})
    c_T   = bs_price(**{**base, "T": t})
    c_S   = bs_price(**{**base, "S": sp})
    print(f"{s:>8.2f} {c_sig:>12.4f}   {t:>6.2f} {c_T:>12.4f}   {sp:>6.1f} {c_S:>12.4f}")

Three things to internalise from this table. First, call price is monotonically increasing in volatility; more uncertainty always helps the long option holder. Second, the relationship with time to expiry is also monotone for calls; more time means more opportunity for the stock to move in your favour. Third, as $S$ falls far below $K$, the call price collapses toward zero exponentially fast, not linearly, because you need the stock to recover a large percentage move.

Near expiry with $T \to 0$, the option price collapses to its intrinsic value and the gamma explodes for near-ATM options. This is the regime where delta-hedging becomes expensive and discrete rebalancing error is largest. If an interviewer asks about hedging in practice, this is the moment to bring that up.

💡Interview tip
Many quant interviews end with "implement this in Python." Having a vectorised, well-commented pricer that you can write from memory (not copy-paste) is the difference between a pass and a hire. Practice writing d1, d2, and the brentq implied vol inversion by hand until it takes you under three minutes.

Extensions and Real-World Considerations

The Black-Scholes formula is not a description of reality. It's a mathematical baseline. Every serious quant interview will eventually push you past the formula itself and into the question of where it breaks, and what you'd do about it.

Model Extensions

Stochastic Volatility: The Heston Model

The single most glaring failure of Black-Scholes is the constant volatility assumption. If it held, implied volatility computed from market prices would be flat across strikes. It isn't. You get a smile or a skew, and the shape varies by expiry.

The Heston model fixes this by letting $\sigma$ itself be a random process. The joint dynamics under the risk-neutral measure $\mathbb{Q}$ are:

$$dS_t = r S_t \, dt + \sqrt{v_t} \, S_t \, dW_t^S$$

$$dv_t = \kappa(\theta - v_t) \, dt + \xi \sqrt{v_t} \, dW_t^v$$

where $v_t = \sigma_t^2$ is the instantaneous variance, $\kappa$ is the mean-reversion speed, $\theta$ is the long-run variance, $\xi$ is the vol-of-vol, and $dW_t^S \, dW_t^v = \rho \, dt$ captures the leverage effect (negative $\rho$ means vol rises when the stock falls).

The critical structural change: you now have two sources of randomness and only one traded asset. The market is incomplete. You can no longer form a perfectly replicating portfolio, so the no-arbitrage argument doesn't pin down a unique price. Instead, you introduce a market price of volatility risk $\lambda(S, v, t)$, which must be specified exogenously. Heston's tractability comes from the fact that the characteristic function of $\log S_T$ under $\mathbb{Q}$ has a closed form, so prices are computed via Fourier inversion rather than PDE grids.
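
To get a feel for the dynamics, here is a minimal Monte Carlo sketch of the Heston SDEs using an Euler scheme with full truncation (negative variance is floored at zero inside the drift and diffusion). This is a simulation stand-in, not the Fourier-inversion pricer described above, and the function name and all parameter values are assumptions for illustration.

```python
import numpy as np

def heston_call_mc(S0, K, T, r, v0, kappa, theta, xi, rho,
                   n_steps=100, n_paths=50_000, seed=0):
    """European call under Heston via Euler (full truncation) on log S and v."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    log_S = np.full(n_paths, np.log(S0))
    v = np.full(n_paths, float(v0))
    for _ in range(n_steps):
        Z1 = rng.standard_normal(n_paths)
        # Correlated driver for the variance: corr(Z1, Z2) = rho
        Z2 = rho * Z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_paths)
        v_pos = np.maximum(v, 0.0)                       # full truncation
        log_S += (r - 0.5 * v_pos) * dt + np.sqrt(v_pos * dt) * Z1
        v += kappa * (theta - v_pos) * dt + xi * np.sqrt(v_pos * dt) * Z2
    discounted = np.exp(-r * T) * np.maximum(np.exp(log_S) - K, 0.0)
    return discounted.mean(), discounted.std(ddof=1) / np.sqrt(n_paths)

# Hypothetical parameters: 20% initial and long-run vol, negative correlation
price, se = heston_call_mc(S0=100, K=100, T=1.0, r=0.05,
                           v0=0.04, kappa=2.0, theta=0.04, xi=0.5, rho=-0.7)
print(f"Heston MC call: {price:.4f} +/- {se:.4f}")
```

A useful sanity check is the degenerate case $\xi = 0$ with $v_0 = \theta$: the variance stays constant and the estimate should match the Black-Scholes price within Monte Carlo error.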

🔑Key insight
In Heston, you're no longer solving for a unique price. You're solving for a price consistent with a chosen risk premium for volatility. That's a fundamentally different philosophical position than Black-Scholes.

Jump Diffusion: The Merton (1976) Model

Geometric Brownian motion produces continuous paths. Real stock prices don't. Earnings surprises, central bank announcements, and credit events cause discontinuous jumps that GBM simply cannot reproduce, no matter how large you make $\sigma$.

Merton's jump-diffusion model adds a compound Poisson process to the stock SDE:

$$dS_t = (\mu - \lambda \bar{k}) S_t \, dt + \sigma S_t \, dW_t + S_{t^-}(e^J - 1) \, dN_t$$

Here $N_t$ is a Poisson process with intensity $\lambda$ (average jumps per year), $J \sim \mathcal{N}(\mu_J, \sigma_J^2)$ is the log-jump size, and $\bar{k} = e^{\mu_J + \frac{1}{2}\sigma_J^2} - 1$ is the expected jump size, included to keep the drift consistent.

The resulting pricing formula is a weighted sum of Black-Scholes prices:

$$C = \sum_{n=0}^{\infty} \frac{e^{-\lambda' T} (\lambda' T)^n}{n!} \cdot C_{BS}\!\left(S, K, T, r_n, \sigma_n\right)$$

where $\lambda' = \lambda(1 + \bar{k})$, $r_n = r - \lambda \bar{k} + n \ln(1 + \bar{k}) / T = r - \lambda \bar{k} + n(\mu_J + \frac{1}{2}\sigma_J^2)/T$, and $\sigma_n^2 = \sigma^2 + n \sigma_J^2 / T$. Each term prices the option conditional on exactly $n$ jumps occurring.

Like Heston, the market is incomplete under jump diffusion. The jump risk cannot be hedged away by trading the stock alone, so the model requires an assumption about the jump risk premium.
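
The series is straightforward to evaluate once truncated. The sketch below uses $r_n = r - \lambda \bar{k} + n \ln(1 + \bar{k})/T$ and $\sigma_n^2 = \sigma^2 + n \sigma_J^2 / T$; the function names, the truncation at 50 terms, and the jump parameters are assumptions for illustration.

```python
import math
import numpy as np
from scipy.stats import norm

def bs_call(S, K, T, r, sigma):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

def merton_call(S, K, T, r, sigma, lam, mu_J, sigma_J, n_terms=50):
    """Merton (1976) call price as a truncated Poisson-weighted sum of BS prices."""
    k_bar = math.exp(mu_J + 0.5 * sigma_J**2) - 1.0   # expected relative jump size
    lam_p = lam * (1.0 + k_bar)
    total = 0.0
    for n in range(n_terms):
        sigma_n = math.sqrt(sigma**2 + n * sigma_J**2 / T)
        r_n = r - lam * k_bar + n * math.log(1.0 + k_bar) / T
        weight = math.exp(-lam_p * T) * (lam_p * T) ** n / math.factorial(n)
        total += weight * bs_call(S, K, T, r_n, sigma_n)
    return total

# Hypothetical jump parameters: one jump every two years, mean log-jump -10%
no_jumps = merton_call(100, 100, 1.0, 0.05, 0.2, lam=0.0, mu_J=-0.10, sigma_J=0.15)
with_jumps = merton_call(100, 100, 1.0, 0.05, 0.2, lam=0.5, mu_J=-0.10, sigma_J=0.15)
print(f"lambda = 0   : {no_jumps:.4f}  (plain BS: {bs_call(100, 100, 1.0, 0.05, 0.2):.4f})")
print(f"lambda = 0.5 : {with_jumps:.4f}")
```

With $\lambda = 0$ only the $n = 0$ term survives and the price collapses to Black-Scholes, which is a quick way to test an implementation.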

⚠️Common mistake
Candidates say jump diffusion "adds a jump term to Black-Scholes." That's vague. Be specific: it adds a Poisson process, makes the market incomplete, and turns the closed-form price into an infinite series of BS prices weighted by Poisson probabilities.

Dividends and the Merton Continuous-Yield Adjustment

For a stock paying a continuous dividend yield $q$, the stock SDE under $\mathbb{Q}$ becomes:

$$dS_t = (r - q) S_t \, dt + \sigma S_t \, dW_t^{\mathbb{Q}}$$

The dividend reduces the effective drift because holders receive cash flows that reduce the stock's capital appreciation. The adjustment is mechanical: replace $r$ with $(r - q)$ everywhere in $d_1$ and $d_2$:

$$d_1 = \frac{\ln(S/K) + (r - q + \frac{1}{2}\sigma^2)T}{\sigma\sqrt{T}}, \quad d_2 = d_1 - \sigma\sqrt{T}$$

The call price becomes $C = S e^{-qT} N(d_1) - K e^{-rT} N(d_2)$. The $e^{-qT}$ factor discounts the stock price to reflect the value leaked out as dividends before expiry.
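
In code the adjustment is a two-line change to the pricer. The sketch below is a standalone version with a dividend yield argument; the name `bs_price_div` and the 3% yield are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

def bs_price_div(S, K, T, r, sigma, q=0.0, option_type="call"):
    """Black-Scholes price with continuous dividend yield q (T > 0)."""
    d1 = (np.log(S / K) + (r - q + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    if option_type == "call":
        return S * np.exp(-q * T) * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)
    if option_type == "put":
        return K * np.exp(-r * T) * norm.cdf(-d2) - S * np.exp(-q * T) * norm.cdf(-d1)
    raise ValueError("option_type must be 'call' or 'put'")

S, K, T, r, sigma, q = 100.0, 100.0, 1.0, 0.05, 0.2, 0.03
call = bs_price_div(S, K, T, r, sigma, q, "call")
put = bs_price_div(S, K, T, r, sigma, q, "put")

print(f"call with q=3%: {call:.4f}   (q=0: {bs_price_div(S, K, T, r, sigma):.4f})")
print(f"parity: C - P = {call - put:.6f}, "
      f"S e^(-qT) - K e^(-rT) = {S * np.exp(-q * T) - K * np.exp(-r * T):.6f}")
```

A positive yield lowers the call price and raises the put price, since the forward is lower.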

Discrete dividends are messier. The standard approach is to subtract the present value of known future dividends from the current stock price before plugging into the formula, treating the adjusted stock as the underlying. This breaks down for American calls, where early exercise can become optimal just before a large dividend, so you need a binomial tree or finite-difference method instead.


Real-World Complications

Put-Call Parity as a Model-Free Sanity Check

Before worrying about which model to use, there's one relationship that must hold regardless of model: put-call parity.

$$C - P = S e^{-qT} - K e^{-rT}$$

This follows from a pure no-arbitrage argument with no assumptions about dynamics. If you hold a call, short a put (same strike $K$ and expiry $T$), and short a forward with delivery price $K$, the payoffs cancel at expiry: $(S_T - K) + (K - S_T) = 0$. A portfolio that is certain to pay nothing must have zero present value, which gives the parity relation.

Any pricing model you build must satisfy this. If your Monte Carlo pricer gives you call and put prices that violate put-call parity, you have a bug. Full stop. Interviewers sometimes ask you to derive this from scratch; it's a two-line argument and you should be able to do it cold.
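As a sketch of what that check might look like, the toy pricer below simulates terminal prices under $\mathbb{Q}$ with the standard library's `random` module, prices the call and put off the same paths, and measures the parity gap. The function name, parameter values, path count, and seed are all assumptions made for illustration.

```python
import random
from math import exp, sqrt


def mc_call_put(S, K, T, r, q, sigma, n_paths=100_000, seed=0):
    """Monte Carlo call and put prices from the same risk-neutral paths."""
    rng = random.Random(seed)
    drift = (r - q - 0.5 * sigma**2) * T
    vol = sigma * sqrt(T)
    call_sum = put_sum = 0.0
    for _ in range(n_paths):
        s_T = S * exp(drift + vol * rng.gauss(0.0, 1.0))
        call_sum += max(s_T - K, 0.0)
        put_sum += max(K - s_T, 0.0)
    disc = exp(-r * T)
    return disc * call_sum / n_paths, disc * put_sum / n_paths


S, K, T, r, q, sigma = 100.0, 100.0, 1.0, 0.05, 0.02, 0.2
call, put = mc_call_put(S, K, T, r, q, sigma)

# Parity gap: should be zero up to Monte Carlo noise in the mean of S_T.
parity_gap = (call - put) - (S * exp(-q * T) - K * exp(-r * T))
```

Because both prices use the same paths, `call - put` is just the discounted sample mean of $S_T - K$, so the gap should be on the order of the standard error of that mean. A gap far larger than that usually points to a wrong drift or discount factor.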

Transaction Costs and the Continuous Hedging Fiction

The delta-hedging argument assumes you can rebalance continuously at zero cost. Neither condition holds in practice.

With transaction costs proportional to trade size, continuous rebalancing is infinitely expensive. Leland (1985) showed that you can incorporate transaction costs by inflating the effective volatility used in the hedge ratio. The modified volatility is:

$$\hat{\sigma}^2 = \sigma^2\left(1 + \sqrt{\frac{2}{\pi}} \cdot \frac{k}{\sigma\sqrt{\Delta t}}\right)$$

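where $k$ is the round-trip transaction cost as a fraction of the value traded and $\Delta t$ is the rebalancing interval. The adjustment grows as $\Delta t$ shrinks, which is the formal version of "continuous rebalancing is infinitely expensive." The practical implication: you hedge less frequently, accept some residual delta risk, and widen your bid-ask spread to cover expected hedging costs.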
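A minimal sketch of the adjustment, using Leland's factor $\sqrt{2/\pi}$ (the expected absolute value of a standard normal draw). The function name and the example inputs (20% volatility, 1% round-trip cost, daily versus weekly rebalancing) are illustrative assumptions, not figures from Leland's paper.

```python
from math import pi, sqrt


def leland_vol(sigma, k, dt):
    """Leland-adjusted volatility.

    sigma: annualized volatility
    k:     round-trip proportional transaction cost
    dt:    rebalancing interval in years
    """
    leland_number = sqrt(2.0 / pi) * k / (sigma * sqrt(dt))
    return sigma * sqrt(1.0 + leland_number)


daily = leland_vol(0.20, 0.01, 1 / 252)   # rebalance every trading day
weekly = leland_vol(0.20, 0.01, 1 / 52)   # rebalance once a week
```

With these inputs the daily-rebalanced volatility comes out near 25.6% and the weekly one near 22.7%, which shows how quickly hedging costs eat into the price as the rebalancing interval shrinks.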

Discrete rebalancing introduces hedge error even without transaction costs. Over a small interval $\Delta t$, the P&L of a delta-hedged long option position is approximately $\frac{1}{2}\Gamma\left[(\Delta S)^2 - \sigma^2 S^2 \, \Delta t\right]$, where $\sigma$ is the implied volatility used to price and hedge: the gamma gain from the realized move minus the theta paid over the interval. When realized variance exceeds implied variance, the long-gamma trader profits; when it falls short, they lose. This is the core intuition behind variance swaps and volatility trading.
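A small simulation makes the realized-versus-implied point concrete. The sketch below buys a call priced at `implied_vol`, delta-hedges it at discrete steps using the Black-Scholes delta at that same volatility, and lets the stock actually move with `realized_vol`. Everything here is an illustrative assumption: the function names, weekly rebalancing, zero transaction costs, and simulating the stock with drift $r$.

```python
import random
from math import exp, log, sqrt
from statistics import NormalDist, mean

N = NormalDist().cdf  # standard normal CDF


def bs_call_and_delta(S, K, tau, r, sigma):
    """Black-Scholes call price and delta for time to expiry tau > 0."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return S * N(d1) - K * exp(-r * tau) * N(d2), N(d1)


def hedged_pnl(S0, K, T, r, implied_vol, realized_vol, n_steps, rng):
    """Terminal P&L of buying a call at implied vol and delta-hedging it."""
    dt = T / n_steps
    price, delta = bs_call_and_delta(S0, K, T, r, implied_vol)
    S = S0
    cash = -price + delta * S  # pay the premium, receive short-sale proceeds
    for i in range(1, n_steps + 1):
        cash *= exp(r * dt)    # cash account accrues interest
        z = rng.gauss(0.0, 1.0)
        S *= exp((r - 0.5 * realized_vol**2) * dt + realized_vol * sqrt(dt) * z)
        if i < n_steps:
            new_delta = bs_call_and_delta(S, K, T - i * dt, r, implied_vol)[1]
            cash += (new_delta - delta) * S  # adjust the short stock position
            delta = new_delta
    # Collect the option payoff and buy back the shorted shares.
    return max(S - K, 0.0) - delta * S + cash


rng = random.Random(0)
args = dict(S0=100.0, K=100.0, T=1.0, r=0.05, n_steps=52, rng=rng)
pnl_high = mean([hedged_pnl(implied_vol=0.2, realized_vol=0.3, **args)
                 for _ in range(1000)])
pnl_low = mean([hedged_pnl(implied_vol=0.3, realized_vol=0.2, **args)
                for _ in range(1000)])
```

On average `pnl_high` should come out positive (realized above implied) and `pnl_low` negative, while individual paths vary widely because the hedge error depends on where along the path the gamma was large.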

Model Risk and Calibration

Model risk is the risk that your pricing model is wrong. In Black-Scholes, the most dangerous assumption is constant volatility, because it's visibly violated by the market every day.

The standard workflow is to treat Black-Scholes not as a pricing model but as a quoting convention. You observe market prices, invert the formula to get implied volatilities, and quote options in vol terms. The smile tells you where the model is wrong. A steep put skew, for instance, tells you the market prices in crash risk that lognormal dynamics can't capture.
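The inversion step has no closed form, but the call price is strictly increasing in $\sigma$, so any bracketing root-finder works. Here is a minimal sketch using plain bisection as a dependency-free stand-in for a library routine such as `scipy.optimize.brentq`; the function names, the bracket `[lo, hi]`, and the tolerance are assumptions made for this example.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal CDF


def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call, no dividends."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * N(d1) - K * exp(-r * T) * N(d2)


def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    """Solve bs_call(sigma) = price for sigma by bisection on [lo, hi]."""
    if not bs_call(S, K, T, r, lo) <= price <= bs_call(S, K, T, r, hi):
        raise ValueError("price is not attainable for sigma in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid  # model price too low: volatility must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Round trip: price an option at 25% vol, then recover the vol from the price.
quoted = bs_call(100.0, 110.0, 0.5, 0.03, 0.25)
iv = implied_vol(quoted, 100.0, 110.0, 0.5, 0.03)
```

Bisection is slower than Newton-Raphson but cannot diverge, which matters for deep in- or out-of-the-money options where vega is close to zero.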

Calibration means fitting model parameters to match observed market prices. For Black-Scholes, there's only one free parameter per expiry ($\sigma$), so you match one option price exactly and misprice everything else. For Heston, you have five parameters ($\kappa, \theta, \xi, \rho, v_0$) and can fit the smile across strikes for a given expiry. The calibration is typically done by minimizing the sum of squared differences between model and market implied vols:

$$\min_{\kappa, \theta, \xi, \rho, v_0} \sum_{i} \left(\sigma_i^{\text{model}} - \sigma_i^{\text{market}}\right)^2$$

This is a nonlinear optimization problem. Heston prices come from a characteristic function that is fast to evaluate, so each evaluation of the objective is cheap and gradient-based methods are practical. The catch: the calibrated parameters are not stable day-to-day, which means the model is being used as a sophisticated interpolation tool rather than a structural description of the market.


Follow-Up Questions Interviewers Actually Ask

"How would you hedge this option?"

Delta-hedge by holding $\Delta = N(d_1)$ shares of the underlying. This eliminates first-order sensitivity to stock price moves. In practice, you also gamma-hedge using other options to reduce sensitivity to large moves, and vega-hedge to manage exposure to volatility changes. The hedge ratios come directly from the Greeks.

"What if volatility is stochastic?"

Your delta hedge is no longer sufficient. You now have vega exposure that can't be hedged with the stock alone. You need to trade other options to hedge vega, and the hedge ratios depend on your chosen stochastic vol model. Under Heston, the option price depends on both $S$ and $v$, so the replicating portfolio requires two hedging instruments beyond the risk-free asset: the stock and one other option.

"Derive the Greeks."

Throughout, $T$ denotes the time remaining to expiry and $\phi$ is the standard normal PDF. Delta is $\partial C / \partial S = N(d_1)$. Gamma is $\partial^2 C / \partial S^2 = \phi(d_1) / (S \sigma \sqrt{T})$. Vega is $\partial C / \partial \sigma = S \phi(d_1) \sqrt{T}$. Theta is $\partial C / \partial t = -S \phi(d_1) \sigma / (2\sqrt{T}) - r K e^{-rT} N(d_2)$. Rho is $\partial C / \partial r = K T e^{-rT} N(d_2)$.

The relationship worth memorizing: $\Theta + \frac{1}{2}\sigma^2 S^2 \Gamma + r S \Delta - r C = 0$. This is just the Black-Scholes PDE rewritten in terms of Greeks. If an interviewer asks you to "derive a relationship between the Greeks," this is the answer.
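The identity is easy to confirm numerically. The sketch below evaluates the closed-form Greeks quoted above and plugs them into the PDE; the function name, the dictionary layout, and the sample inputs are illustrative, and `T` is the time remaining to expiry.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_std = NormalDist()
N, phi = _std.cdf, _std.pdf  # standard normal CDF and PDF


def call_greeks(S, K, T, r, sigma):
    """Closed-form price and Greeks of a European call, no dividends."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    disc_K = K * exp(-r * T)
    return {
        "price": S * N(d1) - disc_K * N(d2),
        "delta": N(d1),
        "gamma": phi(d1) / (S * sigma * sqrt(T)),
        "vega": S * phi(d1) * sqrt(T),
        "theta": -S * phi(d1) * sigma / (2 * sqrt(T)) - r * disc_K * N(d2),
        "rho": T * disc_K * N(d2),
    }


S, K, T, r, sigma = 100.0, 105.0, 0.75, 0.04, 0.22
g = call_greeks(S, K, T, r, sigma)

# Black-Scholes PDE in Greek form: should vanish up to floating-point error.
pde_residual = (g["theta"] + 0.5 * sigma**2 * S**2 * g["gamma"]
                + r * S * g["delta"] - r * g["price"])
```

The two $\phi(d_1)$ terms cancel between theta and gamma, and the $N(d_2)$ terms cancel between theta and $r(S\Delta - C)$, which is why the residual is zero for any inputs.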

"How does Black-Scholes connect to risk-neutral pricing?"

The Black-Scholes formula is the discounted expectation of the payoff under the risk-neutral measure $\mathbb{Q}$:

$$C = e^{-rT} \mathbb{E}^{\mathbb{Q}}\left[\max(S_T - K, 0)\right]$$

The Feynman-Kac theorem is the formal bridge: it says that if $C(S,t)$ satisfies the Black-Scholes PDE with terminal condition $C(S,T) = \max(S-K,0)$, then $C$ can be represented as exactly this expectation. The PDE approach and the probabilistic approach are two sides of the same coin.
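One way to see the equivalence concretely is to compute the expectation directly and compare it with the closed form. The sketch below writes $S_T = S \exp[(r - \frac{1}{2}\sigma^2)T + \sigma\sqrt{T}\,z]$ with $z \sim \mathcal{N}(0,1)$ under $\mathbb{Q}$ and integrates the discounted payoff against the normal density with the trapezoidal rule. The function names, grid size, and truncation at $|z| \le 8$ are assumptions of this example.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_std = NormalDist()
N, phi = _std.cdf, _std.pdf  # standard normal CDF and PDF


def bs_call(S, K, T, r, sigma):
    """Black-Scholes closed form for a European call, no dividends."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * N(d1) - K * exp(-r * T) * N(d2)


def call_by_expectation(S, K, T, r, sigma, n=20_000, z_max=8.0):
    """Discounted risk-neutral expectation of the payoff, by quadrature."""
    h = 2.0 * z_max / n
    total = 0.0
    for i in range(n + 1):
        z = -z_max + i * h
        s_T = S * exp((r - 0.5 * sigma**2) * T + sigma * sqrt(T) * z)
        w = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += w * max(s_T - K, 0.0) * phi(z)
    return exp(-r * T) * total * h


closed_form = bs_call(100.0, 100.0, 1.0, 0.05, 0.2)
by_integral = call_by_expectation(100.0, 100.0, 1.0, 0.05, 0.2)
```

The two numbers should agree to several decimal places. Note that $\mu$ never appears in either function: the drift used inside the expectation is $r$.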

🎯Key takeaway
Black-Scholes is best understood as a quoting convention, not a pricing model. The formula gives you a map from implied volatility to price. The real skill, and what senior interviewers are probing for, is knowing when the map breaks down and what to replace it with.

What is Expected at Each Level

Interviewers at Goldman, Citadel, and Two Sigma calibrate their follow-up questions based on your seniority. Knowing where the bar sits for your level lets you spend your prep time on the right material.

Junior Quant / Analyst

  • Write down the closed-form formula correctly, including the exact expressions for $d_1$ and $d_2$, without needing to be prompted. Getting the signs and the $\frac{1}{2}\sigma^2$ term wrong is a fast way to lose credibility.
  • Explain what $N(d_2)$ represents: the risk-neutral probability that the option expires in the money. Interviewers ask this constantly, and "it's part of the formula" is not an answer.
  • Implement a working pricer in Python from scratch using scipy.stats.norm.cdf. You should also be able to extend it to puts immediately via put-call parity, not by re-deriving a separate formula.
  • Sketch the delta-hedging intuition at a high level: you hold $\Delta = N(d_1)$ shares to offset the option's exposure to $S$. You don't need to reproduce the full PDE derivation, but you should know where the formula comes from conceptually.

Senior Quant / Researcher

  • Derive the Black-Scholes PDE from scratch, starting from the stock SDE, applying Itô's lemma to $C(S, t)$, constructing the delta-hedge portfolio, and invoking no-arbitrage. No hints, no scaffolding.
  • Articulate the Girsanov measure change clearly: under the physical measure $\mathbb{P}$, the stock drifts at $\mu$; under $\mathbb{Q}$, it drifts at $r$. The price is an expectation under $\mathbb{Q}$, so $\mu$ drops out of it entirely, and you should be able to explain why that is non-trivial.
  • Implement implied volatility inversion using a root-finding method like scipy.optimize.brentq, and discuss why Newton-Raphson can fail near zero vega (deep in- or out-of-the-money options with short expiry).
  • Critically assess the constant-volatility assumption. You should be able to say, concisely, why the volatility smile is not a quirk but a direct contradiction of the model's core premise, and name at least one framework (Heston, local vol) that addresses it.

Lead Quant / Portfolio Manager

  • Connect Black-Scholes to the First and Second Fundamental Theorems of Asset Pricing. Market completeness is what makes the replicating portfolio unique; in an incomplete market (jumps, stochastic vol), the hedge is no longer perfect and the price is no longer unique.
  • Discuss model risk in operational terms. Black-Scholes is not used as a pricing model on a trading desk; it is used as a quoting convention. Implied vol is the language traders use to communicate prices, and the smile is the market's correction to the model's inadequacy.
  • Speak to calibration trade-offs: local vol (Dupire) fits the smile exactly but produces poor forward vol dynamics; Heston fits less precisely but gives more realistic term structure behavior. A lead-level candidate knows which matters for which product.
  • Extend to a simple exotic on the fly. For a barrier option or an Asian option, explain why closed-form solutions either don't exist or require additional assumptions, and describe the numerical method you'd reach for first (PDE grid, Monte Carlo with variance reduction, or semi-analytic approximation).

🎯Key takeaway
Black-Scholes is not primarily a pricing formula; it is a no-arbitrage argument. The formula is just what falls out when you solve the PDE. Candidates who understand the argument, not just the output, can handle every follow-up question an interviewer throws at them, because every extension (dividends, stochastic vol, jumps) is just a modification of the same underlying logic.