Cruise Data Scientist Interview Guide

Dan Lee, Data & AI Lead
Last updated: February 26, 2026

Cruise Data Scientist at a Glance

Total Compensation

$230k - $520k/yr

Interview Rounds

6 rounds

Difficulty

Levels

L3 - L7

Education

PhD

Experience

0–20+ yrs

Languages: Python · SQL · R (acceptable alternative) · Java, C++, Scala (a plus per some postings)
Focus areas: applied ML · predictive modeling · operations analytics · customer analytics · pricing and promotions · demand forecasting · data pipelines/ETL · data visualization/BI

From hundreds of mock interviews, one pattern stands out with Cruise candidates: they prep like it's a pure modeling role and get blindsided by how much pipeline and infrastructure work the job demands. Cruise rates software engineering, data architecture, and ML all as "high," and the interview loop tests all three with equal seriousness.

Cruise Data Scientist Role

Primary Focus

Applied ML · predictive modeling · operations analytics · customer analytics · pricing and promotions · demand forecasting · data pipelines/ETL · data visualization/BI

Skill Profile


Math & Stats

High

Strong applied statistics and operations research foundations: time series forecasting, regression, hierarchical forecasting, hypothesis testing/ANOVA, experiment design, power analysis, quasi-experimental methods (e.g., diff-in-diff, propensity scoring), and optimization techniques (linear/mixed-integer programming, heuristics).
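To make the power-analysis bullet concrete, here is a minimal stdlib-only sketch of the standard sample-size approximation for a two-sided two-proportion z-test. The baseline rate and effect size below are hypothetical, chosen only for illustration:

```python
from statistics import NormalDist

def n_per_arm(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate sample size per arm for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the test
    z_beta = NormalDist().inv_cdf(power)            # quantile for desired power
    var = p1 * (1 - p1) + p2 * (1 - p2)             # sum of Bernoulli variances
    return (z_alpha + z_beta) ** 2 * var / (p1 - p2) ** 2

# Hypothetical: detect a 0.6pp lift off a 10% base rate at 80% power.
n = n_per_arm(0.10, 0.106)  # roughly 40k sessions per arm
```

The point interviewers look for is the sensitivity: halving the detectable effect roughly quadruples the required sample size.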

Software Eng

High

Software-centric analytics with production expectations: modular coding, unit testing, Git-based versioning, reproducibility/documentation standards, CI/CD for model code, building APIs/batch processes, and integrating models into business workflows (often with engineering/IT).

Data & SQL

High

Designing analytics-ready datasets from multiple enterprise sources (ERP, planning/logistics platforms, external signals), SQL/Python-based data wrangling, repeatable production-grade preparation workflows, data quality checks, and training/inference/scoring/monitoring pipelines (e.g., Databricks/Spark).

Machine Learning

High

End-to-end ML development: EDA, feature engineering (leakage prevention), model building/tuning across regression/classification/clustering/ensembles and time-series, validation (cross-validation/bootstrapping/hyperparameter search), error analysis, and interpretability (e.g., SHAP/LIME).

Applied AI

Medium

Some roles include NLP, embeddings, and mention of generative AI agents; exposure to deep learning (PyTorch/TensorFlow) and NLP tooling (spaCy, Hugging Face) is listed as a plus. GenAI depth may vary by team; estimate is conservative due to limited Cruise-specific evidence in provided sources.

Infra & Cloud

High

Cloud and MLOps-oriented deployment: Azure ML (preferred in sources), Databricks, MLflow tracking/lineage, model monitoring/retraining, and supporting deployment surfaces such as containerized apps and lightweight model-serving frameworks (e.g., FastAPI/Flask/Streamlit/Dash).

Business

High

Strong business partnership and value orientation: translating Finance/Supply Chain or broader business problems into DS work, defining success metrics, quantifying financial/operational impact (cost savings, working capital, service levels), supporting scenario planning, and driving adoption/change management.

Viz & Comms

High

Clear stakeholder communication and decision support: structured reports/dashboards, model performance monitoring and drift reporting, executive-ready readouts/memos, and visualization tools (Power BI, matplotlib/seaborn/Plotly) with the ability to explain trade-offs to non-technical audiences.

What You Need

  • Python for data science (pandas/NumPy) and model development
  • SQL for extraction, transformation, and analysis
  • Machine learning model development (regression, classification, clustering, ensembles, time series)
  • Feature engineering and data quality validation on messy, real-world datasets
  • Model validation and experimentation (cross-validation, backtesting; A/B testing when applicable)
  • Optimization/operations research for supply chain/operations use cases (e.g., linear/mixed-integer programming) where applicable
  • Production mindset: reproducibility, documentation, Git-based version control
  • Stakeholder partnership: problem framing, success metrics, and impact measurement

Nice to Have

  • MLOps practices (CI/CD, monitoring, retraining, model lineage; MLflow)
  • Databricks/Spark for distributed data processing
  • Azure ML (or similar cloud ML platforms)
  • Explainable AI tooling (SHAP, LIME, partial dependence plots)
  • NLP, embeddings, and/or deep learning (PyTorch/TensorFlow; spaCy/Hugging Face)
  • API development and lightweight model serving (FastAPI/Flask) and/or app prototyping (Streamlit/Dash)
  • Operations/supply chain domain experience (forecasting, inventory, procurement, logistics)
  • Agile collaboration tooling and practices (Jira/Confluence; Agile/Waterfall exposure)

Languages

Python · SQL · R (acceptable alternative) · Java (a plus per some postings) · C++ (a plus per some postings) · Scala (a plus per some postings)

Tools & Technologies

Azure ML · Databricks · Apache Spark · MLflow · Git/GitHub · scikit-learn · XGBoost · LightGBM · statsmodels · PyTorch (plus) · TensorFlow (plus) · spaCy (plus) · Hugging Face (plus) · Power BI · matplotlib · seaborn · Plotly · FastAPI (plus) · Flask (plus) · Streamlit (plus)


You're joining a team that builds models and measurement systems for operations, demand forecasting, and customer analytics across Cruise's business. The tech stack centers on Databricks and Spark for distributed data processing, Azure ML for compute and deployment, and MLflow for experiment tracking and model lineage. Success after year one looks like owning a model end-to-end, from feature engineering through production monitoring, and having it adopted by cross-functional partners in operations or finance.

A Typical Week

A Week in the Life of a Cruise Data Scientist

Typical L5 workweek · Cruise

Weekly time split

Analysis 22% · Coding 20% · Meetings 18% · Writing 17% · Research 8% · Break 8% · Infrastructure 7%

Culture notes

  • Cruise operates at a fast but deliberate pace — the safety-critical nature of autonomous vehicles means rigorous validation is expected, but timelines are still ambitious and the work feels urgent.
  • The team is hybrid with a strong pull toward in-office days at the SF headquarters (typically 3+ days per week), especially for cross-functional collaboration with engineering teams.

The thing that catches people off guard isn't the modeling. It's how much of your week goes to activities that don't feel like "data science": writing stakeholder decks, grooming Jira backlogs, patching upstream data issues before they poison a training set. If you're coming from a role where an engineer productionized your work, expect an adjustment period.

Projects & Impact Areas

Demand forecasting and pricing optimization are core workstreams, feeding directly into revenue and fleet utilization decisions. Those prediction problems sit alongside segmentation and cohort analysis for understanding customer behavior and informing geographic or operational expansion. Data quality and pipeline reliability tie everything together, because models built on enterprise sources (ERP, logistics platforms, external signals) break when schemas drift or upstream joins go stale.

Skills & What's Expected

Software engineering is the most underrated skill for this role. The skill profile above rates it "high," the same as ML and statistics, but candidates consistently under-prepare for it. Cruise expects production-quality Python with unit tests, Git workflows, and reproducibility standards. GenAI knowledge is rated only "medium," so don't burn prep time on LLM architectures; classical ML, time-series forecasting, and strong experimental design will serve you far better here.

Levels & Career Growth

Cruise Data Scientist Levels

Each level has different expectations, compensation, and interview focus.

Base

$145k

Stock/yr

$55k

Bonus

$30k

0–2 yrs · Typically a BS in a quantitative field (CS/Stats/Math/Engineering/Economics) or equivalent practical experience; an MS/PhD is common but not required at entry level.

What This Level Looks Like

Owns well-scoped analyses and small modeling components that influence a feature, metric, or operational decision for a team; impact is local to a product area with clear guidance and defined success metrics.

Day-to-Day Focus

  • Data wrangling and correctness (SQL proficiency, definitions, instrumentation)
  • Sound statistical reasoning and experiment analysis
  • Basic modeling and evaluation fundamentals
  • Clear written communication and stakeholder management for scoped work
  • Reproducibility (versioned code, documentation, peer review)

Interview Focus at This Level

Emphasis on fundamentals: SQL and data manipulation, basic statistics/experiment design, practical analytics case studies, and ability to explain tradeoffs and assumptions; some roles may include light coding (Python) and introductory ML concepts rather than deep system design.

Promotion Path

Promotion to L4 requires consistently delivering end-to-end analyses/models with minimal guidance, improving metric definitions/data quality, demonstrating strong statistical judgment, and driving measurable impact beyond a single task (owning a small project, influencing roadmap decisions, and reliably communicating to cross-functional partners).


The jump to L6 (Staff) is where people stall, because it requires cross-team influence and setting measurement standards that other pods adopt, not just building better models. Cruise's post-2023 restructuring made the org leaner, which cuts both ways: broader ownership and faster visibility for those who stay, but less certainty about long-term team stability. Ask your hiring manager directly about headcount plans for the next 12 months.

Work Culture

Cruise's own culture notes describe a hybrid setup with a strong pull toward in-office days, particularly for cross-functional collaboration. Working on safety-adjacent systems creates a more rigorous review culture than you'd find at a typical e-commerce or ad-tech DS team: code reviews are thorough, validation expectations are high, and documentation isn't optional. Worth asking your recruiter about current in-office expectations, since the physical footprint has shifted post-restructuring.

Cruise Data Scientist Compensation

Equity is the least transparent line item in a Cruise offer. The levels table shows annual stock figures, but the underlying instrument (RSUs, options, phantom equity, or something else) isn't publicly documented, and neither are the vesting schedule, cliff structure, or refresh grant cadence. Before you evaluate any offer, ask your recruiter to spell out exactly what form the equity takes and how it vests, then get those details confirmed in the offer letter.

Negotiation at Cruise comes down to knowing which levers move. The offer notes suggest level, base salary, sign-on bonus, and equity refresh are where you have room, while bonus targets are tied to level and rarely flex. If the equity instrument carries uncertainty you're not comfortable with, ask to shift value toward a larger sign-on or higher base instead.

Cruise Data Scientist Interview Process

6 rounds · ~4 weeks end to end

Initial Screen

2 rounds

Round 1: Recruiter Screen

30m · Phone

To start, you’ll have a recruiter conversation focused on role fit, team alignment, and your motivation for autonomous vehicles and safety-critical work. Expect resume walkthrough, timeline/leveling discussion, and compensation band alignment. You may also get light calibration questions on your DS toolkit (SQL, experimentation, ML) to route you to the right interview loop.

general · behavioral · product_sense · machine_learning

Tips for this round

  • Prepare a 90-second narrative tying your past work to autonomy themes (safety, reliability, real-world constraints, edge cases).
  • Have crisp examples of impact with metrics (e.g., latency reduced, false positive rate improved, incident rate reduced) and your exact role.
  • Be ready to state preferred teams/domains (mapping/localization, perception analytics, fleet ops, safety, product analytics) and why.
  • Confirm logistics early: remote vs onsite, number of rounds, whether there is a coding screen, and expected decision timeline.
  • Share a realistic compensation range anchored to level and location, and ask what components are in-scope (base, bonus, equity/RSUs).

Technical Assessment

3 rounds

Round 3: SQL & Data Modeling

60m · Live

Expect a live SQL session where you write queries under time pressure and explain your reasoning out loud. You’ll likely handle joins, aggregations, window functions, and building metrics from event-style data. A portion may test how you model tables or reason about data quality issues common in telemetry and logging systems.

database · data_modeling · statistics · product_sense

Tips for this round

  • Drill window functions (ROW_NUMBER, LAG/LEAD, rolling averages) and be able to justify partition/order choices.
  • Practice join hygiene: detect fanout, handle NULLs, choose correct join type (inner/left/right) and explain why.
  • Use CTEs to keep logic readable; narrate intermediate outputs you expect at each step to show debugging skill.
  • Be explicit about time zones, deduping events, and late-arriving data—common pitfalls in vehicle/event telemetry.
  • If asked to model data, propose a clear grain (trip, segment, disengagement, intervention) and state primary keys and invariants.
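As a warm-up for the window-function and dedup drills above, here is a self-contained sketch using Python's bundled sqlite3 module (SQLite 3.25+ supports window functions). The table, columns, and values are invented for illustration, not taken from any real Cruise schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (trip_id TEXT, ts INTEGER, speed REAL)")
con.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("t1", 1, 10.0), ("t1", 2, 12.0), ("t1", 2, 12.0),  # duplicated telemetry row
     ("t2", 1, 8.0), ("t2", 2, 9.0)],
)

# Dedupe with ROW_NUMBER over the (trip, timestamp) grain,
# then compute per-trip deltas with LAG.
rows = con.execute("""
    WITH deduped AS (
        SELECT trip_id, ts, speed,
               ROW_NUMBER() OVER (PARTITION BY trip_id, ts ORDER BY ts) AS rn
        FROM events
    )
    SELECT trip_id, ts, speed,
           speed - LAG(speed) OVER (PARTITION BY trip_id ORDER BY ts) AS speed_delta
    FROM deduped
    WHERE rn = 1
    ORDER BY trip_id, ts
""").fetchall()
```

Being able to say out loud why the partition is `(trip_id, ts)` for dedup but `(trip_id)` for the delta is exactly the "justify partition/order choices" skill the round tests.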

Onsite

1 round

Round 6: Behavioral

60m · Video Call

In the final stage, you’ll go through a behavioral interview that emphasizes collaboration, ownership, and judgment in high-stakes contexts. Expect deep dives on conflict, prioritization, and how you communicate tradeoffs to technical and non-technical stakeholders. Some questions may be reflective (e.g., what you would change about the company/team) and assess your ability to give constructive, specific feedback.

behavioral · general · product_sense · engineering

Tips for this round

  • Use STAR with quantified outcomes; include what you personally did, not just what the team delivered.
  • Have a crisp story about handling ambiguity (missing labels, shifting requirements, incomplete telemetry) and how you de-risked it.
  • Prepare examples of pushing back on flawed metrics or unsafe conclusions, and how you influenced the decision.
  • Show strong cross-functional communication: writeups, dashboards, review meetings, and aligning on definitions.
  • Practice a thoughtful “what would you change” answer: pick a real theme (process clarity, experimentation rigor) and propose a measured improvement.

Tips to Stand Out

  • Anchor your stories in autonomy realities. Tie projects to safety, reliability, latency, and rare-edge-case performance; explain how you evaluated and monitored models/metrics in production-like conditions.
  • Demonstrate SQL fluency with telemetry-style data. Be comfortable with event logs, window functions, deduping, and defining metric grains (per mile/per trip/per intervention) while calling out data quality pitfalls.
  • Use a rigorous metrics framework. For any product/ops question, lead with Goal → North Star metric → guardrails → segments → decision rule; highlight tradeoffs (precision/recall, false positives vs false negatives).
  • Show experimental and causal maturity. Clearly state when A/B testing is feasible, when it’s not, and what causal alternatives you’d use along with the assumptions you’d validate.
  • Communicate like a cross-functional DS partner. Narrate your thinking, define terms, and propose lightweight artifacts (one-pagers, dashboards, alerts) that help engineering/product/safety act on insights.
  • Practice coding for correctness under pressure. Prioritize clean Python, edge-case handling, and quick testing; don’t over-optimize prematurely unless the interviewer asks.

Common Reasons Candidates Don't Pass

  • Shallow problem framing. Candidates jump into modeling or querying without clarifying the decision, metric definitions, constraints, and safety/operational guardrails.
  • Weak SQL fundamentals. Incorrect joins, missed fanout, inability to use window functions, or failure to reason about table grain and data quality leads to low confidence in day-to-day execution.
  • Statistics without judgment. Knowing formulas but missing assumptions, confounders, power considerations, or rare-event uncertainty makes results seem unreliable for safety-critical decisions.
  • Coding that doesn’t reach a correct solution. Poor edge-case coverage, unreadable code, or inability to test/debug in real time signals execution risk.
  • Cross-functional friction. Blaming stakeholders, unclear communication, or inability to influence decisions without authority suggests poor fit for interdisciplinary autonomy teams.
  • Overconfidence about model performance. Not addressing monitoring, drift, labeling/ground truth limitations, and offline-vs-online gaps raises concerns about production readiness.

Offer & Negotiation

For Data Scientist offers at a company like Cruise, compensation is typically a mix of base salary + annual bonus target + equity (often RSUs with multi-year vesting, commonly 4 years with a 1-year cliff and periodic vest thereafter). The most negotiable levers are level/title, base within band, sign-on bonus, and equity refresh amount; bonus target is usually less flexible. Use competing offers or calibrated market data to justify the level you’re targeting, and ask to optimize for your priorities (cash via sign-on/base vs long-term upside via equity), while confirming vesting schedule and any performance/retention conditions.

Four weeks is the typical timeline from recruiter call to offer. The most common rejection pattern is shallow problem framing: candidates jump into a query or model without first clarifying the decision at stake, the metric definitions, or the safety guardrails that constrain the answer space. At Cruise, where an analytical misstep can ripple into vehicle behavior decisions, skipping the scoping step lands harder than an unfinished solution.

The standalone Statistics & Probability round catches people off guard. Most DS loops at other companies fold stats questions into a case study or ML discussion, but Cruise dedicates a full 60 minutes to probability, experimental design, and causal reasoning tied to AV scenarios like rare-event detection and geo-based experimentation where rider-level randomization isn't feasible. "Statistics without judgment" shows up repeatedly as a rejection reason, so if your prep plan splits time evenly across rounds, shift more toward stats.

Cruise Data Scientist Interview Questions

Applied Machine Learning (Forecasting, Prediction, Segmentation)

Expect questions that force you to choose and defend modeling approaches for messy operational and guest data (forecasting, regression/classification, clustering). The challenge is demonstrating you can prevent leakage, pick the right metrics, and explain tradeoffs in ways operators can act on.

You need a weekly forecast of guest-service ticket volume per ship for the next 8 weeks, but new ships have only 3 to 6 weeks of history and sailing calendars change mid-season. What model family and backtesting scheme do you use, and how do you prevent leakage from schedule and staffing fields that are updated after the week ends?

MediumForecasting and Leakage

Sample Answer

Most candidates default to a single global XGBoost regressor with random cross-validation, but that fails here because it leaks future information and overweights long-history ships. You need time-based backtesting (rolling-origin) with ship-aware splits so each validation fold mimics forecasting unseen weeks. Treat schedule and staffing as versioned snapshots, only features known at forecast creation time are allowed. For sparse ships, use a hierarchical or pooled model (global with ship random effects or embeddings) so cold-start ships borrow strength without pretending you have history you do not.
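The rolling-origin backtest described above can be sketched in a few lines of pandas. This is a minimal illustration with hypothetical column names (`week`, `ship`, `tickets`), not a production splitter:

```python
import pandas as pd

def rolling_origin_splits(df: pd.DataFrame, time_col: str, n_folds: int = 3, horizon: int = 2):
    """Yield (train, valid) pairs where each validation fold is the `horizon`
    periods immediately after the training cutoff -- no future rows in train."""
    periods = sorted(df[time_col].unique())
    for k in range(n_folds):
        cutoff = len(periods) - (n_folds - k) * horizon
        train_periods = periods[:cutoff]
        valid_periods = periods[cutoff:cutoff + horizon]
        yield (df[df[time_col].isin(train_periods)],
               df[df[time_col].isin(valid_periods)])

# Toy weekly panel: two ships, 10 weeks each.
toy = pd.DataFrame({"week": list(range(10)) * 2,
                    "ship": ["A"] * 10 + ["B"] * 10,
                    "tickets": range(20)})
for train, valid in rolling_origin_splits(toy, "week"):
    assert train["week"].max() < valid["week"].min()  # no leakage across the origin
```

Any feature joined onto `train` must be a snapshot as of the fold's cutoff; joining the latest schedule or staffing values reintroduces exactly the leakage the question is probing for.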


Statistics & Experimentation

Most candidates underestimate how much rigor you need around uncertainty: hypothesis tests, power, variance, and interpreting noisy results. You’ll be evaluated on turning business questions into testable statistical statements and avoiding common pitfalls like peeking, multiple testing, and Simpson’s paradox.

You A/B test a new rider pickup UI in the Cruise guest app and see conversion from "open app" to "request ride" increase from $10.0\%$ to $10.6\%$ with $n=200{,}000$ sessions per arm. What hypothesis test do you run, what is the test statistic, and what assumptions must hold for the $p$-value to be valid?

EasyHypothesis Testing for Proportions

Sample Answer

Run a two-sample $z$-test for proportions using a pooled variance estimate under $H_0: p_T=p_C$. The statistic is $$z=\frac{\hat p_T-\hat p_C}{\sqrt{\hat p(1-\hat p)\left(\frac{1}{n_T}+\frac{1}{n_C}\right)}}$$ where $\hat p$ is the pooled conversion across arms. You need random assignment, independent sessions (or you cluster by user if sessions repeat), and enough effective sample size for the normal approximation. If you peeked or ran many metrics, the raw $p$-value is no longer interpretable without correction.
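The pooled z statistic for the numbers in the prompt can be computed directly with the Python stdlib as a sanity check (a sketch of the formula above, not a full testing library):

```python
from statistics import NormalDist

def pooled_two_prop_z(x_t: int, n_t: int, x_c: int, n_c: int):
    """Two-sample z-test for proportions with a pooled variance estimate."""
    p_t, p_c = x_t / n_t, x_c / n_c
    p_pool = (x_t + x_c) / (n_t + n_c)
    se = (p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c)) ** 0.5
    z = (p_t - p_c) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    return z, p_value

# 10.6% vs 10.0% conversion with 200k sessions per arm.
z, p = pooled_two_prop_z(x_t=21_200, n_t=200_000, x_c=20_000, n_c=200_000)
```

With this sample size, z comes out above 6, so the observed 0.6pp lift is far outside noise, provided the independence and randomization assumptions actually hold.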


SQL Analytics (Joins, Window Functions, Cohorts)

In practice you’ll be asked to extract truth from multiple enterprise tables, so fluency with joins, window functions, and cohort-style logic matters. Getting this right is often about careful definitions (active guest, booking, cancellation, voyage) and handling duplicates, late-arriving data, and grain mismatches.

For each voyage_id, return the top 3 shore excursions by attach rate, where attach rate is unique bookings that purchased the excursion divided by unique bookings on that voyage (exclude cancelled bookings). Use a deterministic tie break by excursion_id.

EasyJoins and Window Functions

Sample Answer

You could compute attach rate with a pre-aggregated excursion-by-voyage table joined to a booking-by-voyage table, or by joining everything at the line-item grain and using distinct counts. The pre-aggregation approach wins here because it avoids fanout bugs and makes the denominator (bookings per voyage) unambiguous before ranking.

SQL
WITH active_bookings AS (
  -- Keep the booking grain clean: one row per active booking on a voyage
  SELECT
    b.booking_id,
    b.voyage_id
  FROM bookings b
  WHERE b.status <> 'CANCELLED'
),
bookings_per_voyage AS (
  SELECT
    ab.voyage_id,
    COUNT(*) AS active_bookings
  FROM active_bookings ab
  GROUP BY ab.voyage_id
),
excursion_bookings AS (
  -- Reduce purchases to the booking-excursion grain to avoid duplicates
  SELECT DISTINCT
    ab.voyage_id,
    ep.excursion_id,
    ab.booking_id
  FROM excursion_purchases ep
  JOIN active_bookings ab
    ON ab.booking_id = ep.booking_id
  WHERE ep.purchase_status = 'PURCHASED'
),
excursion_attach AS (
  SELECT
    eb.voyage_id,
    eb.excursion_id,
    COUNT(*) AS bookings_with_excursion
  FROM excursion_bookings eb
  GROUP BY eb.voyage_id, eb.excursion_id
),
scored AS (
  SELECT
    ea.voyage_id,
    ea.excursion_id,
    ea.bookings_with_excursion,
    bpv.active_bookings,
    (ea.bookings_with_excursion * 1.0) / NULLIF(bpv.active_bookings, 0) AS attach_rate,
    ROW_NUMBER() OVER (
      PARTITION BY ea.voyage_id
      ORDER BY (ea.bookings_with_excursion * 1.0) / NULLIF(bpv.active_bookings, 0) DESC,
               ea.excursion_id ASC
    ) AS rn
  FROM excursion_attach ea
  JOIN bookings_per_voyage bpv
    ON bpv.voyage_id = ea.voyage_id
)
SELECT
  voyage_id,
  excursion_id,
  attach_rate,
  bookings_with_excursion,
  active_bookings
FROM scored
WHERE rn <= 3
ORDER BY voyage_id, rn;

Data Pipelines & Data Quality for Analytics

Your ability to reason about how raw operational systems become reliable model-ready datasets is a core differentiator. You’ll need to show you can design repeatable transformations, validate data quality, and build monitoring checks that prevent silent metric/model regressions.

Cruise launches a new Power BI dashboard for on-time departure rate by port and sailing date, but the metric spikes to 120% after a schema change in the source ops table. What data quality checks do you add to the daily pipeline so this cannot silently ship again?

EasyData Quality Monitoring

Sample Answer

Reason through it: Start by locking down metric definitions, then enumerate the minimum invariants that must always hold. Add freshness checks (expected partitions for each sailing date), volume checks (row count within a band), and validity checks (rate bounded in $[0, 1]$ and denominators greater than $0$). Add referential integrity checks (each departure event joins to exactly one sailing and port), plus uniqueness on the event id to catch accidental duplication from schema changes. Finally, wire these checks to fail the job and alert owners, and backfill only after the root cause is fixed.
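The invariant checks above can be sketched as a small validation function. This is an illustrative pandas version with hypothetical column names (`event_id`, `sailing_date`, `port`, `on_time_rate`), not Cruise's actual pipeline tooling:

```python
import pandas as pd

def check_departure_metrics(df: pd.DataFrame) -> list[str]:
    """Return a list of failed invariants for an on-time-departure table."""
    failures = []
    if df.empty:
        failures.append("no rows for expected partition")          # freshness
    if df["event_id"].duplicated().any():
        failures.append("duplicate event_id")                      # uniqueness
    if not df["on_time_rate"].between(0, 1).all():
        failures.append("on_time_rate outside [0, 1]")             # validity
    return failures

good = pd.DataFrame({"event_id": [1, 2], "sailing_date": ["2025-01-01"] * 2,
                     "port": ["MIA", "FLL"], "on_time_rate": [0.90, 1.00]})
bad = good.assign(on_time_rate=[0.90, 1.20])  # the 120% regression from the prompt
```

In a real daily job these checks would gate the load step: a non-empty failure list fails the run and pages the table owner instead of silently publishing a broken metric.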


ML Coding (Python/pandas, Modeling Workflow)

The bar here isn’t whether you know Python syntax, it’s whether you can write clean, testable analysis code under time pressure. You’ll likely implement feature/label construction, metric computations, and a small modeling loop while keeping reproducibility and edge cases in mind.

You have ride request logs for Cruise robotaxis with columns [request_id, city, requested_at, pickup_at, canceled_at, status]. Build a pandas function that returns daily per-city metrics: request_count, completed_count, cancel_rate, and p90_wait_seconds where wait is pickup_at minus requested_at for completed rides only.

Easypandas Feature Engineering and Metrics

Sample Answer

This question is checking whether you can turn messy event logs into stable, leakage-free features and metrics. It also checks whether you handle null timestamps, status logic, and percentile math correctly. Most people fail on time parsing and accidentally include canceled rides in wait-time percentiles.

Python
import pandas as pd
import numpy as np


def daily_city_ride_metrics(df: pd.DataFrame) -> pd.DataFrame:
    """Compute daily, per-city operational metrics from ride request logs.

    Expected columns:
      - request_id: unique id
      - city: str
      - requested_at: timestamp-like
      - pickup_at: timestamp-like, null if not picked up
      - canceled_at: timestamp-like, null if not canceled
      - status: e.g., 'completed', 'canceled', 'no_show', etc.

    Returns a DataFrame with columns:
      - date
      - city
      - request_count
      - completed_count
      - cancel_rate
      - p90_wait_seconds (completed rides only)
    """

    required = {"request_id", "city", "requested_at", "pickup_at", "canceled_at", "status"}
    missing = required - set(df.columns)
    if missing:
        raise ValueError(f"Missing required columns: {sorted(missing)}")

    x = df.copy()

    # Parse timestamps defensively.
    for col in ["requested_at", "pickup_at", "canceled_at"]:
        x[col] = pd.to_datetime(x[col], errors="coerce", utc=True)

    # Define the daily grain on request time to avoid post-treatment leakage.
    x["date"] = x["requested_at"].dt.floor("D")

    # Base counts.
    grp = ["date", "city"]
    out = x.groupby(grp, dropna=False).agg(
        request_count=("request_id", "nunique"),
        completed_count=("status", lambda s: (s == "completed").sum()),
        canceled_count=("status", lambda s: (s == "canceled").sum()),
    )

    # Cancel rate over all requests.
    out["cancel_rate"] = np.where(
        out["request_count"] > 0, out["canceled_count"] / out["request_count"], np.nan
    )

    # Wait time for completed rides only. Require pickup_at and requested_at.
    completed = x.loc[
        (x["status"] == "completed") & x["pickup_at"].notna() & x["requested_at"].notna(),
        grp + ["pickup_at", "requested_at"],
    ].copy()
    completed["wait_seconds"] = (completed["pickup_at"] - completed["requested_at"]).dt.total_seconds()

    # Guard against negative waits from clock skew or bad ingestion.
    completed.loc[completed["wait_seconds"] < 0, "wait_seconds"] = np.nan

    p90 = (
        completed.groupby(grp)["wait_seconds"]
        .quantile(0.9, interpolation="linear")
        .rename("p90_wait_seconds")
        .to_frame()
    )

    out = out.join(p90, how="left")

    # Clean final columns.
    out = out.reset_index()
    out = out.drop(columns=["canceled_count"])

    return out

Business Problem Framing & Stakeholder Communication

When you translate ambiguous operator or commercial asks into measurable goals, you show senior-level impact. Interviewers will look for crisp success metrics, back-of-the-envelope ROI thinking (cost, utilization, revenue), and a clear narrative that drives decisions rather than just reporting results.

Ops asks, "Reduce ride cancellations this quarter," for a Cruise driverless ride product in Phoenix. What 3 success metrics do you lock in, and how do you translate them into an expected weekly dollar impact?

EasyMetrics and ROI framing

Sample Answer

The standard move is to anchor on a primary outcome metric (cancellation rate per eligible ride request), plus two guardrails (ETA p95, incident and safety intervention rate). But here, capacity and demand mix matter because cancellations can drop by simply rejecting more requests, so you also lock in acceptance rate or completed rides per vehicle-hour. Dollar impact comes from $\Delta\text{completed rides} \times \text{net contribution per ride}$, plus any call-center or re-routing costs avoided.
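The dollar-impact translation at the end is pure back-of-the-envelope arithmetic, which is worth being able to do live. All of the numbers below are hypothetical placeholders:

```python
def weekly_dollar_impact(delta_completed_rides: float,
                         net_contribution_per_ride: float,
                         support_cost_avoided: float = 0.0) -> float:
    """Expected weekly impact = extra completed rides x net contribution
    per ride, plus any operational costs avoided (call center, re-routing)."""
    return delta_completed_rides * net_contribution_per_ride + support_cost_avoided

# Hypothetical: 400 extra completed rides/week at $9 net contribution,
# plus $600/week of call-center handling avoided.
impact = weekly_dollar_impact(400, 9.0, 600.0)  # -> 4200.0
```

In the interview, state each input's source (experiment readout, finance-provided unit economics, ops ticket volume) so the estimate reads as grounded rather than invented.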


Applied ML and Stats & Experimentation together form the heaviest slice, but the compounding difficulty comes from how they overlap at Cruise: you'll propose a forecasting model for ride demand across SF zones, then immediately need to design a geo-based experiment to validate it when rider-level randomization isn't feasible. The biggest prep mistake is treating Data Pipelines & Data Quality as an afterthought, because Cruise's Terra platform and petabyte-scale sensor ingestion mean interviewers will probe schema evolution and late-arriving event handling alongside your modeling answers, not in isolation.

Practice questions across all six areas at datainterview.com/questions.

How to Prepare for Cruise Data Scientist Interviews

Know the Business

Updated Q1 2026

Cruise's mission is to develop and deploy self-driving technology for autonomous vehicle services, primarily robotaxis, with the aim of transforming urban transportation.

San Francisco, California · Hybrid, flexible

Key Business Metrics

Revenue

$10B

+5% YoY

Market Cap

$11B

-2% YoY

Employees

42K

+2% YoY

Current Strategic Priorities

  • Scaling driverless robotaxi service in core markets such as San Francisco and Phoenix
  • Strengthening safety validation and incident monitoring ahead of wider deployment
  • Investing in the Terra platform to handle petabyte-scale sensor and ride telemetry
  • Managing costs, including a reduced physical footprint after the 2024 SoMa sublease

Cruise is GM's autonomous vehicle subsidiary, and everything a data scientist touches here ties back to one question: is the car safe enough to carry a passenger? Their in-house Spark-based platform, Terra, ingests petabytes of sensor and ride telemetry, so pipeline fluency isn't optional. You'll spend real hours on schema evolution and data quality, not just modeling.

Most candidates answer "why Cruise?" with some version of "self-driving cars are the future." That's forgettable. Show instead that you grasp the constraint that separates this role from every other DS job: a model error here isn't a bad recommendation, it's a safety incident. Connect Cruise's emphasis on people and safety culture to a concrete example of how you've handled high-consequence decisions, or explain specifically how your validation process would change when the cost of a false negative is a collision. One logistics note: after Cruise subleased its SoMa office space in 2024, its physical footprint shrank, so ask your recruiter about current in-office expectations before your first screen.

Try a Real Interview Question

Promo uplift vs baseline bookings by sailing

sql

Compute incremental bookings attributed to a promo for each sailing by comparing observed promo-period daily bookings $y$ to a baseline defined as the mean daily bookings $\bar{x}$ over the 7 days immediately before the promo starts for the same sailing. Output one row per sailing with promo start and end dates, baseline mean, promo-period total bookings, and uplift defined as $\sum y - \bar{x} \cdot n$ where $n$ is the number of promo days.

daily_bookings

booking_date  sailing_id  bookings
2025-01-01    S1          10
2025-01-02    S1          12
2025-01-03    S1          11
2025-01-08    S1          20
2025-01-09    S1          22

promos

promo_id  sailing_id  start_date  end_date
P1        S1          2025-01-08  2025-01-10
P2        S2          2025-02-05  2025-02-07
P3        S3          2025-03-12  2025-03-14
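One way the query might be sketched, run here against the sample tables through Python's built-in sqlite3 so the example is self-contained. The SQLite date functions (`date(..., '-7 days')`, `julianday`) are a dialect assumption; translate to your warehouse's equivalents as needed.

```python
import sqlite3

# Build the sample tables from the prompt in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_bookings (booking_date TEXT, sailing_id TEXT, bookings INT);
INSERT INTO daily_bookings VALUES
  ('2025-01-01','S1',10), ('2025-01-02','S1',12), ('2025-01-03','S1',11),
  ('2025-01-08','S1',20), ('2025-01-09','S1',22);
CREATE TABLE promos (promo_id TEXT, sailing_id TEXT, start_date TEXT, end_date TEXT);
INSERT INTO promos VALUES
  ('P1','S1','2025-01-08','2025-01-10'),
  ('P2','S2','2025-02-05','2025-02-07'),
  ('P3','S3','2025-03-12','2025-03-14');
""")

query = """
WITH baseline AS (
  -- mean daily bookings over the 7 days immediately before each promo
  SELECT p.promo_id, p.sailing_id, p.start_date, p.end_date,
         AVG(d.bookings) AS baseline_mean
  FROM promos p
  LEFT JOIN daily_bookings d
    ON d.sailing_id = p.sailing_id
   AND d.booking_date >= date(p.start_date, '-7 days')
   AND d.booking_date <  p.start_date
  GROUP BY p.promo_id
),
promo_period AS (
  -- total bookings during each promo window (inclusive)
  SELECT p.promo_id, SUM(d.bookings) AS promo_total
  FROM promos p
  JOIN daily_bookings d
    ON d.sailing_id = p.sailing_id
   AND d.booking_date BETWEEN p.start_date AND p.end_date
  GROUP BY p.promo_id
)
SELECT b.sailing_id, b.start_date, b.end_date, b.baseline_mean,
       COALESCE(pp.promo_total, 0) AS promo_total,
       COALESCE(pp.promo_total, 0)
         - b.baseline_mean
           * (julianday(b.end_date) - julianday(b.start_date) + 1) AS uplift
FROM baseline b
LEFT JOIN promo_period pp ON pp.promo_id = b.promo_id
ORDER BY b.sailing_id;
"""
rows = conn.execute(query).fetchall()
print(rows[0])  # S1: baseline 11.0, promo total 42, uplift 42 - 11*3 = 9.0
```

In the sample data only S1 has any bookings, so S2 and S3 come back with a NULL baseline; a strong answer mentions how missing baseline days should be handled (treat as zero bookings versus exclude the promo).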

700+ ML coding problems with a live Python executor.

Practice in the Engine

Cruise's coding round, from what candidates report, leans toward end-to-end data workflows in pandas and scikit-learn rather than pure algorithm puzzles. Expect to load messy data, engineer features, and evaluate a model in a single notebook. Build that muscle at datainterview.com/coding by practicing full modeling pipelines, not isolated functions.

Test Your Readiness

How Ready Are You for Cruise Data Scientist?

Question 1 of 10 · Machine Learning

Can you design a demand forecasting approach for rides per zone and hour (including seasonality, holidays, and event effects), choose an evaluation metric, and explain how you would backtest it without leakage?
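To sanity-check your answer to the backtesting part, here is a minimal rolling-origin backtest sketch on synthetic hourly counts with a seasonal-naive baseline; the data, horizon, and step size are illustrative assumptions. The leakage guard is that each fold forecasts only hours strictly after its training cutoff.

```python
import math

HOURS_PER_WEEK = 24 * 7

# Synthetic hourly ride counts for one zone: daily seasonality plus a
# slow upward trend (illustrative data, not real demand).
series = [50 + 20 * math.sin(2 * math.pi * h / 24) + 0.01 * h
          for h in range(6 * HOURS_PER_WEEK)]

def seasonal_naive(history, horizon, season=HOURS_PER_WEEK):
    # Forecast each future hour with the value one season (week) earlier.
    return [history[len(history) - season + (i % season)] for i in range(horizon)]

def rolling_origin_mae(series, initial, horizon, step):
    # Walk the forecast origin forward; each fold trains on series[:cutoff]
    # and scores on the next `horizon` hours, so test hours never leak
    # into the training window.
    errors, cutoff = [], initial
    while cutoff + horizon <= len(series):
        preds = seasonal_naive(series[:cutoff], horizon)
        actual = series[cutoff:cutoff + horizon]
        errors += [abs(p - y) for p, y in zip(preds, actual)]
        cutoff += step
    return sum(errors) / len(errors)

mae = rolling_origin_mae(series, initial=4 * HOURS_PER_WEEK,
                         horizon=24, step=HOURS_PER_WEEK)
```

A real answer would layer holiday and event regressors on top of the baseline, but the evaluation scaffold stays the same: expanding training window, fixed forecast horizon, no refitting on scored hours.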

Sharpen your weakest topic area, particularly the probability and experimental design questions that catch people off guard, at datainterview.com/questions.

Frequently Asked Questions

How long does the Cruise Data Scientist interview process take?

Most candidates report the Cruise Data Scientist process taking around 4 to 6 weeks from first recruiter call to offer. You'll typically start with a recruiter screen, move to a technical phone screen (SQL and Python focused), and then an onsite or virtual onsite loop. Scheduling can stretch things out, especially if the team is busy, so don't be surprised if it takes a bit longer.

What technical skills are tested in the Cruise Data Scientist interview?

SQL and Python are non-negotiable. You'll be tested on data manipulation with pandas and NumPy, SQL extraction and transformation, and machine learning model development (regression, classification, clustering, ensembles, time series). Feature engineering on messy real-world data comes up a lot, which makes sense given Cruise deals with autonomous vehicle sensor data. Some roles also test optimization and operations research concepts like linear or mixed-integer programming. Knowing Git and having a production mindset around reproducibility will help you stand out.

How should I tailor my resume for a Cruise Data Scientist role?

Lead with impact metrics, not just tools. Cruise cares about problem framing and stakeholder partnership, so show examples where you defined success metrics and measured real business impact. Highlight experience with messy, real-world datasets and any work in experimentation or A/B testing. If you've done anything related to autonomous vehicles, robotics, or sensor data, put that front and center. Keep it to one page for L3/L4, two pages max for senior levels.

What is the total compensation for a Cruise Data Scientist?

Compensation at Cruise is strong, especially given the San Francisco location. At L3 (Junior, 0-2 years), total comp averages $230K with a range of $190K to $270K and base around $145K. L4 (Mid, 3-7 years) averages $283K total with base around $171K. L5 (Senior, 5-9 years) averages about $332K total with base around $190K. At the Principal level (L7, 10-20 years), total comp can hit $520K on average, ranging from $380K to $700K. Equity is included in these numbers, though specific vesting details aren't publicly documented.

How do I prepare for the behavioral interview at Cruise?

Cruise values collaboration, continuous learning, and innovation, so your stories should reflect those themes. Prepare 5 to 6 stories that show you working across teams, learning from failure, and driving impact in ambiguous situations. For senior levels (L5+), expect deep questions about stakeholder management and cross-functional decision-making. I'd recommend framing answers around Cruise's mission of transforming urban transportation with self-driving technology. Show genuine excitement about autonomous vehicles.

How hard are the SQL questions in the Cruise Data Scientist interview?

At L3, expect solid intermediate SQL: window functions, CTEs, joins across multiple tables, and aggregation with filtering. By L4 and L5, you'll see harder data wrangling problems that require you to clean and transform messy datasets, not just query clean tables. The questions tend to be practical rather than tricky for the sake of being tricky. I've seen candidates underestimate the SQL round and regret it. Practice with realistic data problems at datainterview.com/questions to get comfortable with the style.

What ML and statistics concepts should I know for a Cruise Data Scientist interview?

Statistics is huge here. At every level, you need hypothesis testing, confidence intervals, statistical power, and experiment design. For L4+, add causal inference and causal reasoning to that list. On the ML side, know regression, classification, clustering, ensemble methods, and time series modeling. Be ready to explain tradeoffs between models, not just how they work. At senior levels (L5, L6, L7), expect questions on structuring ambiguous problems and making pragmatic modeling choices with real constraints.

What is the best format for answering behavioral questions at Cruise?

Use the STAR format (Situation, Task, Action, Result) but keep it tight. Spend about 20% on setup and 80% on what you actually did and what happened. Quantify results whenever possible. For Cruise specifically, tie your answers back to collaboration and continuous learning since those are core values. At senior levels, emphasize how you influenced decisions across teams. Don't ramble. Two minutes per answer is the sweet spot.

What happens during the Cruise Data Scientist onsite interview?

The onsite (often virtual) typically includes 4 to 5 rounds. Expect a SQL and data wrangling round, a Python coding or modeling round, a statistics and experimentation round, and at least one behavioral round. For senior candidates, there's usually a problem framing or case study round where you structure an ambiguous data science problem end to end. At L6 and L7, expect deeper probing on cross-team impact and methodological depth. Each round is roughly 45 to 60 minutes.

What metrics and business concepts should I know for the Cruise Data Scientist interview?

Think about metrics that matter for an autonomous vehicle company. Ride completion rates, safety metrics, fleet utilization, cost per mile, and rider satisfaction are all fair game for case study questions. You should understand how to define success metrics for a product or feature, how to measure incremental impact, and how to frame tradeoffs between competing metrics. At Cruise, the ability to connect a data science problem to real business or safety outcomes is what separates good candidates from great ones.

What education do I need for a Cruise Data Scientist role?

For L3 (Junior), a BS in a quantitative field like CS, Statistics, Math, Engineering, or Economics is typical. An MS or PhD is common but not required at entry level. By L4 and L5, an MS or PhD is often preferred, especially for modeling-heavy or autonomy-critical teams. At L7 (Principal), you'll typically see MS/PhD in CS, Statistics, Math, or Robotics, though equivalent deep industry experience in applied ML and experimentation at scale can substitute.

What are common mistakes candidates make in the Cruise Data Scientist interview?

The biggest mistake I see is treating it like a generic data science interview. Cruise operates in autonomous vehicles, so you need to show you can handle messy, real-world data and think about safety-critical systems. Another common mistake is weak problem framing. Jumping straight to a model without defining the problem, success metrics, and assumptions will hurt you at any level. Finally, don't skip SQL prep. Candidates who focus only on ML and ignore SQL consistently underperform. Practice both at datainterview.com/coding.


Written by

Dan Lee

Data & AI Lead

Dan is a seasoned data scientist and ML coach with 10+ years of experience at Google, PayPal, and startups. He has helped candidates land top-paying roles and offers personalized guidance to accelerate your data career.

Connect on LinkedIn