Warner Bros. Data Scientist Interview Guide

Dan Lee, Data & AI Lead
Last update: February 27, 2026
Warner Bros. Data Scientist Interview

Warner Bros. Data Scientist at a Glance

Total Compensation

$131k - $260k/yr

Interview Rounds

6 rounds

Difficulty

Levels

P1 - P5

Education

PhD

Experience

0–15+ yrs

SQL · Python · R · time-series-forecasting · causal-inference · subscription-streaming · product-analytics · media-entertainment · marketing-analytics

From hundreds of mock interviews for media-company DS roles, one pattern stands out at Warner Bros. specifically: candidates who nail the LightGBM churn model but can't walk a VP of Content through the SHAP plot explaining why single-franchise viewers are highest risk. WBD treats communication as a gating skill, not a soft bonus, and their interview loop reflects it.

Warner Bros. Data Scientist Role

Primary Focus

time-series-forecasting · causal-inference · subscription-streaming · product-analytics · media-entertainment · marketing-analytics

Skill Profile

Math & Stats · Software Eng · Data & SQL · Machine Learning · Applied AI · Infra & Cloud · Business · Viz & Comms

Math & Stats

High

Strong grounding in statistical analysis, experimental design/measurement, and predictive modeling to solve business problems; role explicitly calls for statistical models, explainable AI methods (e.g., SHAP), and analytical rigor. Evidence primarily from third-party posting (GetClela) and interview-prep blog; official WBD reqs for this exact title were not accessible in provided sources, so level is a conservative estimate.

Software Eng

High

Expected to build production-oriented frameworks and automation, implement robust anomaly detection/alerting systems, and use deployment packages (e.g., Streamlit/Shiny). Implies strong coding practices beyond notebooks, though not necessarily full SWE ownership. Evidence: GetClela role responsibilities and requirements.

Data & SQL

High

Clear requirement to build data pipelines, advance data automation, implement data-quality monitoring, and work with big-data stack options (Spark/Kafka/Hive). Evidence: GetClela responsibilities and preferred qualifications; interview-review notes mention system design and data flow architecture focus (user-generated, less reliable).

Machine Learning

Expert

Explicit emphasis on advanced ML algorithms and predictive modeling (Random Forest, XGBoost, LightGBM), deep learning, and explainability. Role includes designing and implementing ML models to inform strategy. Evidence: GetClela 'What To Bring' section.

Applied AI

Low

No explicit GenAI/LLM, RAG, prompt engineering, or foundation-model deployment requirements mentioned in provided sources for this role. Any GenAI expectation would be speculative for 2026, so scored low with uncertainty.

Infra & Cloud

Medium

Cloud experience is preferred (AWS/Azure/GCP) rather than required; deployment experience referenced via app deployment packages (Streamlit/Shiny). Data platforms like Databricks/Snowflake suggest cloud-adjacent work but not heavy infra ownership. Evidence: GetClela requirements and preferred qualifications.

Business

High

Role is positioned to influence data strategy for WB Studios, partner with Product/Business stakeholders, and deliver insights tied to customers, products, and business strategy (digital marketing/CRM measurement/testing). Evidence: GetClela responsibilities.

Viz & Comms

High

Strong communication required to convey complex insights; familiarity with BI tools (Looker/Tableau) requested. Stakeholder-facing insights delivery is central. Evidence: GetClela requirements; datainterview.com blog aligns but is secondary.

What You Need

  • Statistical modeling and predictive analytics
  • Machine learning model development (e.g., tree-based models; deep learning exposure)
  • Explainable AI techniques (e.g., SHAP) and model interpretability
  • Exploratory data analysis and feature engineering
  • SQL proficiency
  • Python or R proficiency
  • Building/maintaining data pipelines and automation for data prep
  • Data quality assurance: anomaly detection, alerting, and remediation
  • Stakeholder collaboration (Product/Business) and translating problems into analytical solutions
  • Strong written/verbal communication

Nice to Have

  • Marketing measurement/testing and CRM analytics experience
  • Marketing technology ecosystem experience (CDPs, identity spine vendors like LiveRamp/Neustar, ad platforms Google/Facebook, Salesforce Marketing Cloud, data clean rooms)
  • Big data technologies (Spark, Kafka, Hive)
  • Cloud experience (AWS, Azure, or GCP)
  • Agile workflow experience

Languages

SQL · Python · R

Tools & Technologies

Databricks, Snowflake, Looker, Tableau, XGBoost, LightGBM, Streamlit, Shiny, Spark, Kafka, Hive, AWS, Azure, GCP, Salesforce Marketing Cloud, LiveRamp, Neustar, Google Ads/Marketing Platform, Meta Ads (Facebook)

Want to ace the interview?

Practice with real questions.

Start Mock Interview

You're sitting between two businesses that run on different clocks: a profitable but shrinking linear TV portfolio (HBO cable, Discovery networks, CNN) and Max, the streaming platform still scaling toward profitability. Your core mandate is time series forecasting and causal modeling for subscription KPIs, things like projecting subscriber growth around an HBO original premiere versus a new sports rights deal, then reporting those projections to leadership in ways that actually shape budget and content decisions.

A Typical Week

A Week in the Life of a Warner Bros. Data Scientist

Typical L5 workweek · Warner Bros.

Weekly time split

Analysis 22% · Coding 20% · Meetings 18% · Writing 15% · Infrastructure 10% · Research 8% · Break 7%

Culture notes

  • Pace is steady but ramps up significantly around tentpole content launches (new HBO series, Max live sports events) and quarterly business reviews where leadership wants fresh subscriber health numbers.
  • Warner Bros. Discovery operates on a hybrid model with most NYC-based data science roles expected in-office three days a week at 30 Hudson Yards, with flexibility on which days as long as key syncs are covered.

What jumps out isn't the modeling time. It's how much of your week goes to writing experiment design docs in Confluence, updating SHAP documentation on model cards, and chasing down broken Snowflake loads that feed someone else's Tableau dashboard on the linear TV side. If you picture this role as heads-down Databricks notebooks all day, recalibrate: stakeholder syncs and pipeline firefighting eat real hours.

Projects & Impact Areas

Subscription forecasting anchors the work, where you're building time series models that project Max subscriber counts across different content release cycles and feeding those numbers into leadership reporting. That forecasting naturally connects to causal inference problems: when marketing launches a win-back campaign or a bundle offer, you're designing the uplift analysis to separate incremental conversions from organic resubscribers. Experimentation rounds it out, with A/B test design for Max product changes (ad-tier funnel tweaks, homepage layout shifts) requiring you to own everything from randomization unit selection to minimum detectable effect calculations alongside product managers and staff engineers.

Skills & What's Expected

ML is rated "expert" level, but the skill most candidates under-prepare for is data pipelines, which WBD rates equally high. You'll build and maintain production workflows across Snowflake and Databricks, implement anomaly detection for data quality, and troubleshoot upstream schema changes from the linear TV side. Meanwhile, GenAI/LLM experience is scored low. WBD wants clean SHAP summary plots you can explain to a non-technical exec, not prompt engineering demos.

Levels & Career Growth

Warner Bros. Data Scientist Levels

Each level has different expectations, compensation, and interview focus.

Base

$114k

Stock/yr

$0k

Bonus

$17k

0–2 yrs · BS in a quantitative field (CS, Statistics, Math, Engineering, Economics) or equivalent experience; MS preferred for some teams.

What This Level Looks Like

Executes well-scoped analyses and builds initial versions of models/metrics that impact a single product area or business function (e.g., marketing/adtech). Impact is primarily team-level; works with guidance and established patterns.

Day-to-Day Focus

  • Strong fundamentals in SQL, Python, and statistics
  • Data cleaning, validation, and reproducible analysis
  • Clear communication and stakeholder-ready storytelling
  • Learning company data models/definitions and analytics tooling
  • Delivering reliable results for a well-defined business problem

Interview Focus at This Level

Emphasizes SQL (joins, windows, aggregation), Python for data analysis, statistics/experimentation basics, interpreting metrics, and structured case-style product/marketing analytics questions; assesses ability to communicate assumptions and produce clean, reproducible work.

Promotion Path

Promotion to the next level requires independently owning small-to-medium analyses end-to-end, consistently delivering accurate and actionable insights, improving a metric/model/pipeline beyond baseline, demonstrating strong stakeholder management, and showing good engineering hygiene (testing/monitoring/documentation) with decreasing supervision.

Find your level

Practice with questions tailored to your target level.

Start Practicing

Most external hires land at P2 or P3. The P3-to-P4 jump is where people stall, because the bar shifts from "owns a project end-to-end" to "sets measurement standards across teams and influences roadmaps you don't directly control." If you're eyeing Principal, the path rewards domain ownership (becoming the subscriber forecasting authority) over generalist breadth.

Work Culture

WBD operates hybrid, with in-office expectations varying by location. The pace feels media-industry, not Silicon Valley: steady most weeks, then intense around tentpole launches like a new HBO original premiere or Max picking up live sports rights. Genuine cross-brand exposure (HBO, Discovery, CNN teams who think about audiences in fundamentally different ways) is a real perk, though legacy data silos from the merger still create friction when you're reconciling schemas that were never designed to coexist.

Warner Bros. Data Scientist Compensation

WBD's equity component is worth scrutinizing. Stock grants don't appear until P2, and the vesting schedule isn't publicly documented, so ask your recruiter for the exact cliff, back-loading structure, and refresh grant cadence before you evaluate any offer. The safest move is to negotiate for a higher base or a guaranteed first-year sign-on bonus rather than accepting a larger equity allocation you can't fully model.

The biggest negotiation lever most candidates overlook is leveling, not dollars within a band. WBD's P3 and P4 bands overlap significantly, so framing your experience around the scope descriptors in the next level up (owning measurement frameworks end-to-end, mentoring junior scientists, driving cross-functional adoption of model outputs) can shift you into a higher band entirely. Confirm bonus payout timing, year-one bonus guarantee, and hybrid expectations for your office location in writing before you sign.

Warner Bros. Data Scientist Interview Process

6 rounds · ~3 weeks end to end

Initial Screen

1 round
Round 1

Recruiter Screen

30m · Phone

First, you’ll do a short phone screen with a recruiter or HR coordinator to confirm role fit, logistics, and motivation for entertainment/streaming analytics. Expect a resume walkthrough, questions about your preferred tech stack (SQL/Python/R), and a quick check on availability, location/remote expectations, and compensation range.

general · behavioral

Tips for this round

  • Prepare a 60–90 second narrative connecting your past projects to media/streaming use cases (audience growth, churn, content performance).
  • Have a crisp inventory of your toolkit: SQL (window functions/CTEs), Python (pandas/sklearn), BI (Tableau/Looker), and experimentation exposure.
  • Clarify your preferred domain (marketing analytics, recommendations, content, finance) and the kinds of stakeholders you’ve supported.
  • Bring a compensation anchor based on level and market (base + bonus), but frame it as a range contingent on scope/leveling.
  • Ask what the next steps are (HireVue vs live Zoom, take-home or not) and expected timeline, since candidates report mixed communication cadence.

Technical Assessment

4 rounds
Round 2

Behavioral

30m · Video Call

Next comes a recorded video interview (often HireVue) where you respond to prompts on camera without a live interviewer. You’ll typically face situational and behavioral questions, and the awkwardness is part of the test—clarity and structure matter as much as content.

general · behavioral

Tips for this round

  • Use a tight STAR format with explicit metrics (e.g., lift, AUC, MAE, churn reduction) to avoid rambling on recorded answers.
  • Practice with a timer and one retake rule; aim for 60–120 seconds per answer with a clear takeaway sentence.
  • Expect prompts like conflict resolution, prioritization, and stakeholder influence—prepare 5–6 stories mapped to these themes.
  • Optimize the recording setup: eye-level camera, good lighting, quiet room, and short notes off-screen with bullet reminders.
  • When asked about failures, include how you validated assumptions (holdouts, backtests, sensitivity checks) and what you changed.

Onsite

1 round
Round 6

Behavioral

180m · Video Call

Finally, you’ll typically have a panel-style set of Zoom or in-person interviews with cross-functional partners and/or the hiring manager. Expect a mix of culture fit, stakeholder management, and scenario questions about turning ambiguous business needs into measurable analyses and dashboards.

behavioral · product_sense · visualization

Tips for this round

  • Prepare stakeholder stories showing how you influenced decisions without authority (product/marketing/finance/creative partners).
  • Use a repeatable framework for ambiguous asks: clarify objective → define metric tree → data audit → method → recommendation → risks.
  • Bring a portfolio-ready example of a dashboard or visualization; be able to justify chart choices and metric definitions.
  • Demonstrate prioritization: talk through tradeoffs between speed vs rigor, and what you’d ship in week 1 vs month 1.
  • Close by asking calibrated questions about team ownership (streaming vs studio, marketing vs personalization), data maturity, and expectations for 30/60/90 days.

Tips to Stand Out

  • Prepare for HireVue-style prompts. Rehearse concise recorded answers with STAR, include numbers, and end each response with the decision impact to offset the awkward one-way format candidates mention.
  • Anchor your examples in entertainment/streaming. Translate your DS work into Warner Bros.-relevant outcomes like engagement, retention/churn, content discoverability, marketing attribution, and audience segmentation.
  • Be explicit about your analytics methodology. Interviewers tend to reward how you think—state assumptions, validation approach, and how you’d handle confounding, missingness, and biased logging.
  • Show end-to-end ownership. Connect SQL extraction, Python modeling, and stakeholder delivery (dashboards, readouts, PRDs) to prove you can drive projects beyond a notebook.
  • Expect uneven communication and manage it professionally. Ask for timelines at each step, send crisp follow-ups, and keep other processes moving so delays don’t derail your search.
  • Practice recommendation-system thinking even for general DS roles. Many media teams lean on personalization; being able to design candidates/ranking/evaluation makes you stand out.

Common Reasons Candidates Don't Pass

  • Shallow experimentation fundamentals. Candidates get screened out when they can’t choose metrics, discuss power/MDE, or explain why a result is or isn’t practically meaningful.
  • Modeling without rigor. Over-indexing on algorithms while missing leakage, proper splits, calibration, or baseline comparisons signals weak real-world ML judgment.
  • Unstructured communication. Rambling (especially in recorded video interviews) or failing to summarize implications for stakeholders can read as poor partner-facing readiness.
  • Weak product/industry translation. Strong technical skill but no clear mapping to streaming/content/marketing decisions makes it hard to trust your recommendations will drive action.
  • Hand-wavy system design. Vague answers that ignore data pipelines, evaluation, monitoring, or latency/freshness constraints can hurt, particularly if asked to design a recommender.

Offer & Negotiation

Data Scientist offers at a large media/entertainment company like Warner Bros. typically combine base salary plus an annual bonus target; equity may be limited or role/level dependent, and sign-on bonuses can sometimes substitute for equity. The most negotiable levers are base (within band), sign-on, bonus target/guarantee for year 1, title/leveling, and remote/hybrid flexibility. Use your loop performance to ask for level alignment (e.g., DS vs Senior DS) and negotiate based on scope (ownership of recommendations/experimentation), not just years of experience. Get the full package in writing and confirm bonus eligibility, payout timing, and any relocation or return-to-office requirements before accepting.

Candidates report uneven communication between rounds, and the process can drag past the expected window when recruiters go quiet. Send a short follow-up after each stage referencing something specific you discussed, because WBD's hiring coordination across its streaming and linear segments isn't always tightly synced.

Shallow experimentation fundamentals are among the most common reasons candidates get cut. People over-index on ML prep and stumble when asked to walk through power analysis or explain practical vs. statistical significance for something like a Max free-trial offer test. The other quiet killer: the final panel includes non-technical partners from content, marketing, or product who need to believe you can turn a model output into a decision about, say, whether to renew a Discovery+ original. Strong technicals won't save you if those panelists aren't convinced.
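Since power analysis comes up repeatedly, it helps to have the minimum-detectable-effect arithmetic at your fingertips. A minimal sketch for a two-arm proportion test (e.g., a free-trial offer), using the standard normal approximation; the function name and example numbers are illustrative, not from WBD:

```python
import math
from statistics import NormalDist


def mde_two_proportions(p_baseline: float, n_per_arm: int,
                        alpha: float = 0.05, power: float = 0.8) -> float:
    """Minimum detectable absolute lift for a two-arm proportion test
    (normal approximation, equal allocation across arms)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # target power
    se = math.sqrt(2 * p_baseline * (1 - p_baseline) / n_per_arm)
    return (z_alpha + z_beta) * se


# With a 20% baseline conversion and 50k users per arm, the MDE is roughly
# 0.7 percentage points -- useful for arguing practical vs. statistical significance.
mde = mde_two_proportions(0.20, 50_000)
```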

Warner Bros. Data Scientist Interview Questions

Time Series Forecasting for Subscription KPIs

Expect questions that force you to forecast subscriber adds/churn and revenue-like KPIs under seasonality, shocks (content drops), and shifting acquisition mix. You’ll be evaluated on model choice, validation strategy, and how you communicate uncertainty for leadership projections.

You own a weekly forecast for HBO Max net adds, and a major franchise premiere causes a one-week spike plus a higher churn rate starting two weeks later. How do you model the spike and the lagged churn effect, and how do you validate that your forecast is not leaking future information?

Medium · Intervention Modeling and Backtesting

Sample Answer

Most candidates default to a plain seasonal ARIMA or Prophet with holiday flags, but that fails here because the premiere creates both an immediate level shock and a delayed churn effect with different dynamics. You need explicit intervention features, for example an impulse for launch week and a distributed lag for churn uplift over weeks $t+2$ to $t+k$, or a state-space model with regressors. Validate with rolling-origin backtests that only use features available at forecast time; freeze content calendars and marketing plans as of each cutoff. Compare error by horizon, not just one overall MAPE, and check residual autocorrelation around the event window to confirm the intervention absorbed the shock.
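The intervention-feature idea can be sketched in a few lines of pandas; this is illustrative only, with hypothetical `week`/`net_adds` columns rather than any real WBD schema:

```python
import pandas as pd


def add_premiere_features(df: pd.DataFrame, premiere_week: pd.Timestamp,
                          churn_lags: tuple[int, ...] = (2, 3, 4)) -> pd.DataFrame:
    """Add an impulse dummy for the launch week plus distributed-lag dummies
    for the delayed churn effect in weeks t+2..t+k after the premiere."""
    df = df.sort_values("week").copy()
    # Level shock: fires only in the premiere week itself.
    df["premiere_impulse"] = (df["week"] == premiere_week).astype(int)
    # Lagged churn effect: one dummy per post-premiere week in the lag window.
    for lag in churn_lags:
        lag_week = premiere_week + pd.Timedelta(weeks=lag)
        df[f"premiere_lag{lag}"] = (df["week"] == lag_week).astype(int)
    return df
```

These columns would then enter a regression-with-ARIMA-errors or state-space model as exogenous regressors; in a rolling-origin backtest they are leakage-safe because the content calendar is known before the premiere airs.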

Practice more Time Series Forecasting for Subscription KPIs questions

Causal Inference & Marketing/Product Measurement

Most candidates underestimate how much you’ll be pushed to separate correlation from impact when marketing, pricing, or product changes drive HBO Max outcomes. The focus is on identification, assumptions, and practical designs like DiD, synthetic control, and uplift/incrementality framing.

HBO Max runs a 2-week email reactivation campaign to lapsed subscribers, and you see higher re-subscribe rates among emailed users than non-emailed users. What design and assumptions let you claim incremental lift, and what is the core estimand?

Easy · Incrementality and Identification

Sample Answer

Use a randomized holdout (or as close to random assignment as you can get) and estimate the average treatment effect on re-subscribe, defined as $E[Y(1)-Y(0)]$. Randomization makes treatment independent of potential outcomes, so the difference in mean outcomes between treated and control identifies the causal effect. Without that, selection bias dominates because marketing targets higher-intent users. You also need stable exposure (no spillovers) and consistent outcome measurement across groups.
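Under a clean randomized holdout, the estimator is just a difference in means with a normal-approximation interval; a minimal sketch (function name and counts are illustrative):

```python
import math
from statistics import NormalDist


def incremental_lift(conv_t: int, n_t: int, conv_c: int, n_c: int,
                     alpha: float = 0.05) -> tuple[float, tuple[float, float]]:
    """Difference in re-subscribe rates between the emailed (treatment) group
    and a randomized holdout (control), with a (1 - alpha) Wald CI."""
    p_t, p_c = conv_t / n_t, conv_c / n_c
    ate = p_t - p_c  # estimate of E[Y(1) - Y(0)] under randomization
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ate, (ate - z * se, ate + z * se)
```

If the interval excludes zero you have evidence of incremental lift; without randomization, this same arithmetic just measures targeting bias.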

Practice more Causal Inference & Marketing/Product Measurement questions

Machine Learning Modeling & Interpretability

Your ability to reason about applied ML is tested through problems like propensity/churn prediction and driver modeling using tree-based methods (XGBoost/LightGBM) and explainability (e.g., SHAP). Interviewers look for tradeoffs (leakage, calibration, drift), not model-serving architecture.

You are building an XGBoost model to predict 7-day HBO Max churn using daily engagement and marketing touch data. What are two common leakage paths in this setup, and how do you redesign features and splits to prevent them?

Easy · Leakage and Validation Design

Sample Answer

The first leakage path is the split itself: random row splits can look great offline but leak future behavior (for example, post-cancel events or later-day engagement) into training; time-based, user-level splits win here because churn is temporal and you care about forward-looking performance. The second is feature windows that cross the prediction cutoff (like using $t+1$ to $t+7$ engagement to predict churn at $t$); fix it with strict lookback windows anchored at an as-of date and a label horizon that starts after the cutoff. Add explicit as-of joins in your feature pipeline, plus unit tests that fail if max(feature_timestamp) $>$ as_of_timestamp.
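The as-of anchoring and the leakage guard can be sketched together; the `events` schema (`user_id`, `event_ts`) and the 28-day lookback are hypothetical:

```python
import pandas as pd


def build_features_asof(events: pd.DataFrame, as_of: pd.Timestamp,
                        lookback_days: int = 28) -> pd.DataFrame:
    """Aggregate engagement strictly before the as-of cutoff. The churn label
    window (t+1..t+7) starts after the cutoff, so features cannot see it."""
    window = events[(events["event_ts"] < as_of) &
                    (events["event_ts"] >= as_of - pd.Timedelta(days=lookback_days))]
    feats = (window.groupby("user_id")
                   .agg(n_events=("event_ts", "size"),
                        last_event_ts=("event_ts", "max"))
                   .reset_index())
    # Leakage unit test: fail loudly if any feature timestamp crosses the cutoff.
    assert feats.empty or feats["last_event_ts"].max() < as_of, \
        "feature window leaks past as_of"
    return feats
```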

Practice more Machine Learning Modeling & Interpretability questions

Data Pipelines, Automation & Data Quality Monitoring

The bar here isn’t whether you know what Spark/Databricks/Snowflake are, it’s whether you can design reliable inputs to forecasting and reporting without silent failures. You’ll need to articulate approaches to anomaly detection, freshness checks, backfills, and metric consistency across dashboards.

Your HBO Max net adds forecast depends on daily paid starts and paid cancels from Snowflake, and the dashboard shows a sudden 20% drop in cancels only for iOS. What exact data quality checks do you add (freshness, completeness, distribution), and what is your triage order to decide data issue versus real product change?

Easy · Data Quality Monitoring

Sample Answer

Triage in order. Start with freshness: confirm the iOS cancels table partition for $t$ arrived and row counts are nonzero versus typical. Then completeness: compare cancels by platform across upstream events (app events) and downstream facts (subscription ledger) to see where the drop appears. Finally distribution: check shifts in key fields (cancel_reason, plan_id, country) and join rates; a spike in nulls or a join-key change usually explains a platform-only cliff.
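The three checks can be sketched as a single triage function against a trailing baseline; the schema (`day`, `platform`, `cancel_reason`) and the 28-day window are illustrative assumptions, not WBD's actual tables:

```python
import pandas as pd


def triage_checks(cancels: pd.DataFrame, day: pd.Timestamp,
                  platform: str = "ios") -> dict:
    """Freshness, completeness, and distribution checks for one platform-day,
    compared against a trailing 28-day baseline for the same platform."""
    today = cancels[(cancels["day"] == day) & (cancels["platform"] == platform)]
    base = cancels[(cancels["day"] < day) &
                   (cancels["day"] >= day - pd.Timedelta(days=28)) &
                   (cancels["platform"] == platform)]
    daily = base.groupby("day").size()
    return {
        "freshness_ok": len(today) > 0,  # did the partition arrive at all?
        "row_count": len(today),
        "baseline_mean": float(daily.mean()) if len(daily) else None,
        # ~0.8 here would match the observed 20% drop in cancels.
        "completeness_ratio": len(today) / daily.mean() if len(daily) else None,
        # Distribution signal: a spike in null cancel_reason suggests a data issue.
        "null_reason_share": float(today["cancel_reason"].isna().mean()) if len(today) else None,
    }
```

If completeness fails upstream (app events) too, lean toward a real product change; if only the downstream fact table dips, suspect the pipeline.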

Practice more Data Pipelines, Automation & Data Quality Monitoring questions

SQL for Product & Subscription Analytics

In practice, you’ll be asked to compute streaming subscription metrics from event and subscription tables (cohorts, churn, reactivations, trial conversion) with clean, performant SQL. Common pitfalls include double-counting users, mishandling effective-dated subscriptions, and defining KPIs inconsistently.

Given tables subscriptions(user_id, plan_id, status, start_ts, end_ts) and watch_events(user_id, event_ts, title_id), compute daily HBO Max active paid subscribers for the last 30 days where a user is active if they have an active paid subscription on that day and at least one watch event that day. Return day, active_paid_subs.

Easy · Window Functions

Sample Answer

This question checks whether you can prevent double counting and handle effective-dated subscriptions cleanly. De-duplicate watch activity to user-day, then intersect it with paid coverage on that day. Get the join condition right: inclusive start and exclusive end is the usual safe choice. If you join raw events directly to subscription rows, counts will explode.

SQL
/*
Daily active paid subscribers over the last 30 days.
Assumptions:
- subscriptions.status = 'paid' means paid entitlement (exclude trials).
- A subscription is active for timestamps in [start_ts, end_ts); end_ts is NULL for ongoing.
- watch_events can have many rows per user per day, so de-duplicate to user-day.
- Written for Snowflake; other warehouses need minor syntax changes (date functions, generator).
*/

WITH params AS (
  SELECT
    CURRENT_DATE AS as_of_date,
    DATEADD(day, -29, CURRENT_DATE) AS start_date
),
days AS (
  -- ROW_NUMBER over the generator is safer than raw SEQ4(), which can contain gaps.
  SELECT DATEADD(day, ROW_NUMBER() OVER (ORDER BY SEQ4()) - 1, p.start_date) AS day
  FROM params p,
       TABLE(GENERATOR(ROWCOUNT => 30))
),
watch_user_day AS (
  SELECT
    CAST(event_ts AS DATE) AS day,
    user_id
  FROM watch_events
  WHERE CAST(event_ts AS DATE) BETWEEN (SELECT start_date FROM params)
                                   AND (SELECT as_of_date FROM params)
  GROUP BY 1, 2
),
paid_coverage_user_day AS (
  SELECT
    d.day,
    s.user_id
  FROM days d
  JOIN subscriptions s
    ON s.status = 'paid'
   AND d.day >= CAST(s.start_ts AS DATE)
   AND d.day < CAST(COALESCE(s.end_ts, DATEADD(day, 1, (SELECT as_of_date FROM params))) AS DATE)
  GROUP BY 1, 2
)
SELECT
  d.day,
  COUNT(DISTINCT w.user_id) AS active_paid_subs
FROM days d
JOIN watch_user_day w
  ON w.day = d.day
JOIN paid_coverage_user_day p
  ON p.day = d.day
 AND p.user_id = w.user_id
GROUP BY 1
ORDER BY 1;
Practice more SQL for Product & Subscription Analytics questions

Python/R ML Coding (EDA, Feature Engineering, Metrics)

You’ll likely code through data cleaning, feature creation, and model-ready dataset assembly that mirrors real forecasting/causal workflows. What trips people up is writing robust, testable transformations and metric calculations (including time-based splits) rather than fancy algorithms.

You have daily HBO Max subs data with columns date, country, plan, trials_started, paid_starts, cancels, active_subs, and marketing_spend. Write Python to create leakage-safe features (7-day rolling mean of paid_starts and cancels, 7-day lag of marketing_spend, day-of-week), then compute WAPE for a next-28-days forecast per (country, plan) given y_true and y_pred columns.

Easy · Feature Engineering, Time-Based Metrics

Sample Answer

The standard move is to sort by keys and date, then build lags and rolling windows using past-only data, and compute WAPE as $\frac{\sum |y-\hat{y}|}{\sum |y|}$. But here, grouping by (country, plan) matters because cross-series rolling windows quietly leak information across markets, and you also need a zero-denominator guard when actuals sum to $0$.

Python
import numpy as np
import pandas as pd


def add_features_and_wape(df: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Create leakage-safe time-series features per (country, plan) and compute 28-day WAPE.

    Expected columns:
      - date, country, plan
      - trials_started, paid_starts, cancels, active_subs, marketing_spend
      - y_true, y_pred (for metric)

    Returns:
      - df_fe: original df with added feature columns
      - wape_28d: WAPE aggregated per (country, plan) over the last 28 dates in the frame
    """
    df = df.copy()

    # Parse and sort for deterministic rolling/lag behavior.
    df["date"] = pd.to_datetime(df["date"])
    df = df.sort_values(["country", "plan", "date"]).reset_index(drop=True)

    grp = df.groupby(["country", "plan"], sort=False)

    # Calendar feature.
    df["dow"] = df["date"].dt.dayofweek.astype("int16")  # 0=Mon ... 6=Sun

    # Leakage-safe rolling means: use shift(1) so today's target does not enter today's features.
    for col in ["paid_starts", "cancels"]:
        df[f"{col}_roll7_mean"] = (
            grp[col]
            .transform(lambda s: s.shift(1).rolling(window=7, min_periods=1).mean())
            .astype("float32")
        )

    # Pure lag feature.
    df["marketing_spend_lag7"] = grp["marketing_spend"].transform(lambda s: s.shift(7)).astype("float32")

    # Metric: WAPE over the last 28 days per series.
    # Keep only the last 28 dates per (country, plan) based on ordering.
    df["_row_num"] = grp.cumcount()
    df["_n"] = grp["_row_num"].transform("max") + 1
    df["_is_last_28"] = df["_row_num"] >= (df["_n"] - 28)

    metric_df = df.loc[df["_is_last_28"] & df["y_true"].notna() & df["y_pred"].notna(),
                       ["country", "plan", "y_true", "y_pred"]].copy()

    metric_df["abs_err"] = (metric_df["y_true"] - metric_df["y_pred"]).abs()
    metric_df["abs_true"] = metric_df["y_true"].abs()

    agg = metric_df.groupby(["country", "plan"], as_index=False).agg(
        sum_abs_err=("abs_err", "sum"),
        sum_abs_true=("abs_true", "sum"),
        n=("abs_err", "size"),
    )

    # Guard against divide-by-zero when the 28-day actual sum is zero.
    agg["wape_28d"] = np.where(agg["sum_abs_true"] > 0, agg["sum_abs_err"] / agg["sum_abs_true"], np.nan)

    # Cleanup helper cols.
    df_fe = df.drop(columns=["_row_num", "_n", "_is_last_28"])
    return df_fe, agg


# Example usage (df must already exist):
# df_fe, wape_28d = add_features_and_wape(df)
# print(wape_28d.sort_values("wape_28d"))
Practice more Python/R ML Coding (EDA, Feature Engineering, Metrics) questions

The distribution skews heavily toward forecasting and causal reasoning, which tells you WBD wants people who can answer "what will happen to churn when we drop a new HBO original?" and then prove whether that content drop actually caused the retention lift. The compounding trap is that WBD's subscription data carries structural breaks (price tier launches, sports rights deals like the Olympics) that make both forecasting and causal identification harder simultaneously, so prepping these as separate textbook topics leaves you exposed when an interviewer hands you a messy Max scenario that demands both. Pipeline and SQL questions may look like the lighter slice, but they're flavored around subscription event schemas with reactivation edge cases and silent data failures, not the generic warehouse design problems most candidates drill.

Practice Warner Bros. questions across all six topic areas at datainterview.com/questions.

How to Prepare for Warner Bros. Data Scientist Interviews

Know the Business

Updated Q1 2026

Official mission

To be the world's best storytellers, creating world-class products for consumers.

What it actually means

Warner Bros. Discovery aims to be a global content powerhouse by creating world-class entertainment across film, television, sports, news, and games, while strategically transitioning to streaming dominance and driving profitability.

New York, New York · Hybrid - Flexible

Key Business Metrics

Revenue

$38B

-6% YoY

Market Cap

$72B

+159% YoY

Employees

35K

-1% YoY

Business Segments and Where DS Fits

Global Linear Networks

Operates traditional television channels and linear properties, including brands like Adult Swim, Bleacher Report, CNN, Discovery, Food Network, HGTV, Investigation Discovery (ID), Magnolia, OWN, TBS, TLC, TNT Sports, and Eurosport. It also represents domestic advertising inventory for Warner Bros. linear properties.

DS focus: Advanced targeting strategies, ad tech innovation, data-driven solutions for advertisers

Streaming & Studios

Manages streaming platforms such as HBO Max and discovery+, and content production studios including Warner Bros. Television, Warner Bros. Motion Picture Group, and DC Studios.

DS focus: Streaming engagement features (e.g., Olympics Multiview, Gold Medal Alerts, Timeline Markers, personalized watch lists), plus advanced targeting and data-driven solutions for advertisers

Current Strategic Priorities

  • Affirm position as a one-stop shop for advertisers heading into the 2026/2027 marketplace
  • Deepen connections between people and the world through bold, engaging storytelling
  • Deliver innovative, data-driven solutions that help brands engage meaningfully with a passionate global audience
  • Enhance strategic flexibility and create potential value creation opportunities through a new corporate structure comprising Global Linear Networks and Streaming & Studios divisions
  • Expand the Harry Potter universe through licensed toys & games and a new HBO Original series
  • Achieve substantial streaming viewership and engagement growth for major sports events, building on the foundation set by the 2026 Winter Olympics

Competitive Moat

Vast content catalogue · Blockbuster films · Prestige television · Factual programming · Iconic franchises

Warner Bros. Discovery (the full corporate entity behind the "Warner Bros." brand) formally split into two divisions in 2025: Global Linear Networks and Streaming & Studios. That restructuring is the single most important context for your prep, because DS teams now operate within business units that have different data schemas, different KPIs, and different stakeholders. The streaming side posted record viewership during the 2026 Winter Olympics, while the linear side is focused on positioning itself as a one-stop shop for advertisers heading into the 2026/2027 marketplace.

Your "why Warner Bros. Discovery?" answer should center on that dual-segment data problem. Talk about how the DAISY text-to-SQL tool was built to let analysts query across heterogeneous sources, or how the recommendation engine described in their Stack Overflow podcast has to serve both Max originals and Discovery+ unscripted catalogs. That specificity signals you've studied the actual infrastructure, not just the content library.

Try a Real Interview Question

7-day holdout retention after a price change (DiD-ready cohorting)

sql

Given subscriber events and a price change date per region, compute for each region the 7-day holdout retention: among users who had an active subscription on the day before the price change, return the share who are still active on day +7. Output columns: region, price_change_date, cohort_size, retained_7d, retention_rate, where retention_rate = retained_7d / cohort_size.

subscription_events

user_id  region  event_date  event_type  plan_type
101      US      2024-01-14  subscribe   ad_free
101      US      2024-01-22  cancel      ad_free
102      US      2024-01-01  subscribe   ad_free
103      US      2024-01-10  subscribe   ad_lite
103      US      2024-01-21  cancel      ad_lite

price_changes

region  change_date
US      2024-01-15
LATAM   2024-02-01
EMEA    2024-03-10
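Before (or after) writing the SQL, it helps to trace the logic in pandas to confirm your activity semantics. The sketch below assumes one reasonable definition (a user is active on a date if their latest event on or before that date is a subscribe); the function names are illustrative, and the interviewer's reference answer may define activity differently:

```python
import pandas as pd

def active_on(events: pd.DataFrame, region: str, day: pd.Timestamp) -> set:
    """Users whose latest event in `region` on or before `day` is a subscribe."""
    e = events[(events["region"] == region) & (events["event_date"] <= day)]
    last = e.sort_values("event_date").groupby("user_id").tail(1)
    return set(last.loc[last["event_type"] == "subscribe", "user_id"])

def retention_7d(events: pd.DataFrame, changes: pd.DataFrame) -> pd.DataFrame:
    rows = []
    for _, ch in changes.iterrows():
        d = ch["change_date"]
        cohort = active_on(events, ch["region"], d - pd.Timedelta(days=1))
        retained = cohort & active_on(events, ch["region"], d + pd.Timedelta(days=7))
        rows.append({
            "region": ch["region"],
            "price_change_date": d,
            "cohort_size": len(cohort),
            "retained_7d": len(retained),
            "retention_rate": len(retained) / len(cohort) if cohort else None,
        })
    return pd.DataFrame(rows)

# Sample data from the prompt (US region only).
events = pd.DataFrame({
    "user_id": [101, 101, 102, 103, 103],
    "region": ["US"] * 5,
    "event_date": pd.to_datetime(
        ["2024-01-14", "2024-01-22", "2024-01-01", "2024-01-10", "2024-01-21"]),
    "event_type": ["subscribe", "cancel", "subscribe", "subscribe", "cancel"],
    "plan_type": ["ad_free", "ad_free", "ad_free", "ad_lite", "ad_lite"],
})
changes = pd.DataFrame({"region": ["US"], "change_date": pd.to_datetime(["2024-01-15"])})
result = retention_7d(events, changes)
# US: cohort of 3 (users 101, 102, 103 active on 2024-01-14);
# by day +7 (2024-01-22) users 101 and 103 have cancelled, so retention_rate = 1/3.
```

Note the edge cases this surfaces: user 101 subscribes the day before the change and cancels exactly on day +7, and reactivations would flip the "latest event" logic back to active. Those are exactly the silent-failure traps WBD's subscription schemas tend to probe.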

700+ ML coding problems with a live Python executor.

Practice in the Engine

Warner Bros. Discovery's DS interviews lean into subscription and engagement analytics, so the SQL you'll write looks like real Max business logic: cohort retention, rolling churn windows, funnel breakdowns segmented by content type. Sharpen that muscle at datainterview.com/coding, focusing on queries where the business context matters as much as the syntax.

Test Your Readiness

How Ready Are You for Warner Bros. Data Scientist?

1 / 10
Time Series Forecasting

Can you build a weekly forecast for subscription KPIs like net adds, churn, and ARPU, choosing between ARIMA/ETS/Prophet/state space models, and justify your choice using residual diagnostics and backtesting?
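"Justify via backtesting" means rolling-origin evaluation, not a single train/test split. Here is a minimal sketch comparing two classical baselines (naive and drift) on a toy trending series; the function and series are illustrative, and in practice you would slot ARIMA/ETS/Prophet forecasters into the same harness:

```python
import numpy as np

def rolling_origin_mae(y, horizon, n_folds, forecaster):
    """Expanding-window, rolling-origin backtest: mean MAE across folds."""
    errs = []
    for k in range(n_folds):
        cut = len(y) - horizon * (n_folds - k)      # forecast origin for fold k
        train, test = y[:cut], y[cut:cut + horizon]
        errs.append(np.mean(np.abs(test - forecaster(train, horizon))))
    return float(np.mean(errs))

def naive(train, h):
    # Last observed value carried forward.
    return np.repeat(train[-1], h)

def drift(train, h):
    # Extrapolate the average historical slope.
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return train[-1] + slope * np.arange(1, h + 1)

y = 100 + 1.5 * np.arange(60)                       # toy trending weekly net-adds series
mae_naive = rolling_origin_mae(y, 4, 5, naive)      # ~3.75 on this pure trend
mae_drift = rolling_origin_mae(y, 4, 5, drift)      # ~0: drift nails a linear trend
```

The point the interviewer is after is the comparison, not the models: on a trending KPI the drift baseline dominates naive, and any ARIMA/ETS candidate has to beat both across folds before its extra complexity is justified.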

Identify your blind spots before the real loop does. datainterview.com/questions lets you drill the specific topic areas Warner Bros. Discovery weights most heavily.

Frequently Asked Questions

How long does the Warner Bros. Data Scientist interview process take?

Expect roughly 4 to 6 weeks from initial recruiter screen to offer. The process typically starts with a recruiter call, moves to a technical phone screen (SQL and Python), then a multi-round onsite or virtual loop. Scheduling can stretch longer depending on the team and hiring manager availability, so don't panic if there's a quiet week in between rounds.

What technical skills are tested in the Warner Bros. Data Scientist interview?

SQL is non-negotiable at every level. You'll also need solid Python (or R) skills for data analysis and modeling. Beyond that, they test statistical modeling, predictive analytics, machine learning model development (tree-based models especially), feature engineering, and data pipeline work. At senior levels and above, expect questions on explainable AI techniques like SHAP, model interpretability, and data quality assurance including anomaly detection.

How should I tailor my resume for a Warner Bros. Data Scientist role?

Lead with measurable impact. Warner Bros. cares about translating business problems into analytical solutions, so frame your bullets around outcomes: revenue lifted, engagement improved, churn reduced. Mention SQL, Python, and any ML model development explicitly since those get keyword-scanned. If you've worked in media, entertainment, streaming, or content analytics, put that front and center. Even tangential experience like marketing analytics or recommendation systems will resonate given their streaming focus.

What is the total compensation for a Warner Bros. Data Scientist?

At the junior level (P1, 0-2 years experience), total comp averages around $131,250 with a base of $114,000. Mid-level (P2) jumps to about $165,000 TC on a $145,000 base. Senior (P3) averages $205,000 TC, Staff (P4) hits $230,000, and Principal (P5) reaches roughly $260,000 in total comp. Ranges are wide though. A P4 can go anywhere from $170,000 to $300,000 depending on the team and negotiation.

How do I prepare for the behavioral interview at Warner Bros. Discovery?

Study their core values: Act as One Team, Create What's Next, Empower Storytelling, Champion Inclusion, and Dream It & Own It. I've seen candidates get tripped up because they prep generic behavioral answers without connecting to the company's culture. Prepare stories about cross-functional collaboration (product and business stakeholders especially), handling ambiguity, and championing new ideas. At senior levels, they really dig into how you influence stakeholders and communicate complex findings to non-technical audiences.

How hard are the SQL questions in the Warner Bros. Data Scientist interview?

For junior roles, expect medium-difficulty SQL covering joins, window functions, and aggregations. Nothing obscure, but you need to be fast and accurate. Mid-level and above, the questions layer in more analytical problem solving, so you might need to compute retention metrics or build cohort analyses in SQL on the spot. Practice at datainterview.com/questions to get comfortable with the media and entertainment style of analytics problems.

What machine learning and statistics concepts should I know for Warner Bros.?

Statistics and experimentation come up at every level. Know A/B testing fundamentals: power analysis, statistical significance, common pitfalls like peeking. For ML, focus on tree-based models (random forests, gradient boosting) and be ready to discuss model evaluation tradeoffs. Senior candidates should understand causal reasoning, end-to-end ML system design, and explainable AI techniques like SHAP values. Deep learning exposure is a plus but not the main focus.
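For the power-analysis piece specifically, it is worth being able to derive a sample size by hand rather than quoting a calculator. A sketch for a two-sided two-proportion z-test using only the standard library; the function name and the 5% → 4% churn scenario are illustrative assumptions, and this uses the common normal approximation rather than an exact test:

```python
import math
from statistics import NormalDist

def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Per-group sample size for a two-sided two-proportion z-test.

    Normal approximation: pooled variance under H0, unpooled under H1.
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)        # critical value, two-sided
    z_b = NormalDist().inv_cdf(power)                # quantile for target power
    p_bar = (p1 + p2) / 2
    term = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
            + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return math.ceil(term ** 2 / (p1 - p2) ** 2)

# E.g., detecting monthly churn dropping from 5% to 4% at 80% power
# needs several thousand subscribers per arm:
n = n_per_group(0.05, 0.04)
```

Being able to say why a 1-point churn effect needs thousands of users per arm, and what that implies for test duration on a given subscriber base, is exactly the kind of practical judgment these rounds reward.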

What format should I use to answer behavioral questions at Warner Bros.?

Use the STAR format (Situation, Task, Action, Result) but keep it tight. Don't spend two minutes on context. I recommend a 30-second setup, then spend most of your time on what you specifically did and the measurable result. Warner Bros. values storytelling (it's literally one of their core values), so make your answers compelling. Quantify outcomes whenever possible, and always tie back to business impact rather than just technical achievement.

What happens during the Warner Bros. Data Scientist onsite interview?

The onsite (or virtual loop) typically includes a SQL and coding round, a statistics and experimentation round, a case-style product or business analytics discussion, and a behavioral round. For senior and staff levels, add a deep dive into a past project where you'll walk through end-to-end decisions, ambiguity handling, and measurable impact. Expect 4 to 5 sessions total, each around 45 to 60 minutes. Multiple interviewers will assess both technical depth and communication skills.

What business metrics and product concepts should I know for a Warner Bros. Data Scientist interview?

Think streaming and content. Know metrics like subscriber growth, churn rate, engagement (watch time, completion rate), content performance, and lifetime value. Warner Bros. Discovery is in the middle of a major streaming transition, so understanding acquisition funnels, retention drivers, and content recommendation logic is valuable. At senior levels, they'll test your ability to define the right metric for an ambiguous business problem, not just compute one you're given.

What education do I need to get hired as a Data Scientist at Warner Bros.?

A BS in a quantitative field like CS, Statistics, Math, Engineering, or Economics is the baseline. For mid-level and above, an MS or PhD is often preferred, especially for modeling-heavy roles. That said, strong industry experience can substitute for advanced degrees at most levels. If you don't have a graduate degree, make sure your resume clearly demonstrates applied ML and statistical work with real business outcomes.

What are common mistakes candidates make in the Warner Bros. Data Scientist interview?

The biggest one I see is going too deep on technical details without connecting to business value. Warner Bros. explicitly looks for people who can translate problems into analytical solutions and communicate findings to non-technical stakeholders. Another common mistake is underpreparing for the product sense and case-style questions, which are a real part of the loop, not just filler. Finally, don't neglect SQL practice. Candidates sometimes over-index on ML prep and then stumble on a window function question. Get reps in at datainterview.com/coding.

Dan Lee's profile image

Written by

Dan Lee

Data & AI Lead

Dan is a seasoned data scientist and ML coach with 10+ years of experience at Google, PayPal, and startups. He has helped candidates land top-paying roles and offers personalized guidance to accelerate your data career.

Connect on LinkedIn