Airbnb Data Scientist at a Glance
Total Compensation: $237k - $801k/yr
Interview Rounds: 7 rounds
Levels: L3 - L7
Education: Bachelor's / Master's / PhD
Experience: 0–20+ yrs
Most candidates who bomb Airbnb's DS loop fail on the product sense, not the stats. The interview weights Product Sense & Metrics and Experimentation at a combined 40%, and the questions are grounded in two-sided marketplace tradeoffs (guest conversion vs. host earnings, search ranking vs. listing diversity). Pure ML depth won't save you if you can't reason about why a pricing nudge on the checkout page might cannibalize bookings in a different market segment.
Airbnb Data Scientist Role
Skill Profile
Math & Stats (Expert): Deep expertise in statistical modeling, advanced causal inference techniques, and experimental design is critical for understanding user behavior and measuring impact in a marketplace setting. A Master's or PhD in a quantitative field is typically required.
Software Eng (High): Strong ability to prototype models, write production-quality code for data science outputs, and collaborate effectively with software engineers to integrate solutions into complex systems. Comfort with developing proof-of-concept prototypes is essential.
Data & SQL (Medium): Understanding of data products and inference frameworks, with the ability to leverage and contribute to data strategy. While not primarily a data engineering role, familiarity with maintaining data pipelines and data management tools is expected.
Machine Learning (Expert): Extensive experience with various machine learning techniques, including causal ML, computer vision, recommendations, and predictive modeling, to drive business outcomes and enhance user experience. This is a core competency for the role.
Applied AI (High): Strong interest and agility with modern AI, Large Language Models (LLMs), and topic modeling tools, with an expectation to apply a learner's mindset to dynamic AI systems and explore cutting-edge techniques.
Infra & Cloud (Low): Basic understanding of how data science outputs are integrated into broader systems; direct cloud deployment or infrastructure management is not a primary responsibility for this role.
Business (Expert): A product-oriented mindset with a deep understanding of marketplace dynamics, user behavior, and the ability to translate data insights into impactful business strategies and solutions that improve guest and host experiences.
Viz & Comms (High): Proven ability to communicate complex analytical findings clearly and effectively to diverse audiences, including senior leadership and cross-functional partners, through compelling storytelling and data visualization.
What You Need
- 9+ years of relevant industry experience (for Staff/Senior Staff roles)
- Master’s degree or PhD in a quantitative field (e.g., Computer Science, Statistics, Econometrics)
- Strong fluency in Python for hands-on IC work and advanced data analysis
- Advanced data analysis in SQL at scale
- Experience with causal inference and machine learning techniques (ideally in a marketplace setting)
- Comfort collaborating with software engineers to understand complex systems and abstracted logs
- Proven ability to communicate clearly and effectively to audiences of varying technical levels
- Demonstrated ability to create solutions with a product-oriented mindset focused on the user experience
- Ability to analyze and interpret large, complex, often unstructured data sets
- Analytical and problem-solving capabilities
- Ability to operate both independently and collaboratively in a team environment
Nice to Have
- Experience in recommendations, computer vision, information retrieval, or merchandising domains
- Strong interest and agility with related AI, LLM and topic modeling tools
- Domain experience in search, UX discovery, personalized evidence systems
At Airbnb, data scientists sit embedded within product pods like Search & Matching, Guest Growth, Trust & Safety, and Payments rather than in a centralized analytics org. You own the full lifecycle: scoping an experiment on dynamic pricing nudges, building a doubly-robust estimator for heterogeneous treatment effects across guest segments, then presenting a recommendation to product leadership. Success after year one means you've influenced a major product decision with causal evidence that changed what shipped.
A Typical Week
A Week in the Life of an Airbnb Data Scientist
[Chart: weekly time split for a typical L5 workweek at Airbnb]
Culture notes
- Airbnb operates at a deliberate but high-accountability pace — weeks are structured around experiment cycles and cross-functional readouts rather than constant firefighting, but the expectation to ship rigorous, well-communicated insights is real.
- Since 2022 Airbnb has embraced a 'Live and Work Anywhere' policy so most data scientists work remotely with optional in-office days, though San Francisco-based teams often cluster in-person on Tuesdays and Wednesdays for cross-functional syncs.
The surprise in the breakdown isn't any single category. It's how much time goes to written artifacts: experiment design docs, readout decks, code documentation. If your mental model of a DS role is "query, model, repeat," the volume of cross-functional storytelling at Airbnb will catch you off guard. Expect upstream schema changes from Data Engineering to break your joins on random afternoons, and nobody else will patch your query for you.
Projects & Impact Areas
Search ranking and pricing algorithms are where DS work most directly shapes Airbnb's marketplace balance, influencing metrics like booking-through rate and listing diversity across 220+ countries. The forecasting team, meanwhile, builds models that finance leadership relies on for revenue planning, a very different flavor of DS work from the experimentation-heavy search pod. Trust & Safety data scientists use LLM-based topic modeling to cluster negative guest reviews and surface emerging pain points around check-in experiences, an area where Airbnb is actively expanding its DS headcount.
Skills & What's Expected
The most underrated skill for this role is software engineering. Airbnb rates it high, meaning you're expected to write production-quality Python with proper docstrings and push reproducible scripts to shared repos. Cloud deployment and infrastructure management are low priority, but data pipelines and data management tools sit at medium, so don't confuse "you won't deploy to prod" with "you can ignore how data flows." The overrated prep move? Over-indexing on MLOps tooling when the interview leans hard on causal inference and product thinking.
Levels & Career Growth
Airbnb Data Scientist Levels
Each level has different expectations, compensation, and interview focus.
$159k
$65k
$17k
What This Level Looks Like
Works on well-defined problems within a single project or feature area. Scope is typically limited to assigned tasks with clear deliverables and requires significant guidance from senior team members. Note: This is an estimate as sources lack specific data.
Day-to-Day Focus
- Execution of assigned analytical tasks.
- Learning the team's technical stack, data sources, and business context.
- Developing core data science skills in areas like SQL, Python/R, statistical analysis, and data visualization.
Interview Focus at This Level
Interviews focus on foundational technical skills, including SQL, probability and statistics, basic machine learning concepts, and product sense. Candidates are assessed on their ability to solve well-defined problems and communicate their thought process clearly. Note: This is an estimate as sources lack specific data.
Promotion Path
Promotion to L4 requires demonstrating the ability to independently own and deliver on small to medium-sized projects, showing a deeper understanding of the business domain, and consistently producing high-quality analytical work with less supervision. Note: This is an estimate as sources lack specific data.
The L5-to-L6 jump is where careers stall, because Staff isn't about shipping better analyses. It's about setting org-wide methodology, like defining experimentation standards or building the causal inference framework other teams adopt. The IC track reaches Principal (L7) without managing people, but the scope expectations at that level (per Airbnb's own descriptions) involve driving technical strategy across multiple product areas and partnering with Directors and VPs.
Work Culture
Airbnb's live-and-work-anywhere policy lets you work remotely within your country, with San Francisco teams clustering in-person on Tuesdays and Wednesdays for cross-functional syncs rather than following a fixed RTO schedule. The pace is deliberate but high-accountability: weeks revolve around experiment cycles and readout deadlines, not constant firefighting. Shared experiment registries, collaborative code reviews, and a culture of documenting decisions mean your written communication matters as much as your Python.
Airbnb Data Scientist Compensation
The one-year cliff is the detail that matters most here. Leave before month 12 and you walk away with zero equity, which makes that first year a real retention lock. After the cliff, quarterly vesting smooths things out, but you should evaluate any RSU grant conservatively since share prices can move a lot between offer day and vest day.
When negotiating, the source data confirms that base, RSUs, and sign-on bonus are all movable pieces. RSU count and sign-on bonus tend to have more room than base salary, so lead with those. A sign-on bonus specifically helps offset the cliff year where you're earning no equity, and the offer negotiation process explicitly supports asking for one. Bring a competing offer if you have one; it gives the recruiter concrete numbers to work with when going back to the comp team.
Airbnb Data Scientist Interview Process
7 rounds · ~5 weeks end to end
Initial Screen
1 round: Recruiter Screen
This initial conversation with a recruiter will cover your background, experience, and career aspirations. You'll discuss why you're interested in Airbnb and the Data Scientist role, as well as your salary expectations and availability.
Tips for this round
- Clearly articulate your motivation for joining Airbnb and how your skills align with the Data Scientist role.
- Be prepared to discuss your resume highlights and how they demonstrate relevant experience.
- Research Airbnb's mission and recent product developments to show genuine interest.
- Have a clear understanding of your salary expectations and be ready to communicate them.
- Prepare a few questions to ask the recruiter about the role, team, or company culture.
Technical Assessment
1 round: Coding & Algorithms
Expect a technical phone assessment designed to evaluate your foundational coding and data manipulation skills. This round typically involves solving problems using SQL or Python, focusing on data retrieval, transformation, and basic algorithmic thinking.
Tips for this round
- Practice intermediate SQL queries, including joins, aggregations, window functions, and subqueries.
- Review fundamental Python data structures (lists, dictionaries) and common algorithms (sorting, searching).
- Be ready to explain your thought process out loud as you solve the problem.
- Consider edge cases and potential errors in your code or query logic.
- Ensure your code is clean, readable, and well-commented.
Take Home
1 round: Take Home Assignment
You'll be given a real-world dataset and a business problem to solve within a set timeframe, typically 3 hours. This challenge assesses your ability to perform end-to-end data analysis, from cleaning and exploration to modeling and presenting insights.
Tips for this round
- Structure your solution clearly, including problem understanding, data exploration, methodology, results, and conclusions.
- Focus on delivering actionable insights relevant to the business problem, not just technical correctness.
- Demonstrate strong data cleaning, feature engineering, and basic modeling skills.
- Use clear visualizations to support your findings and make your presentation compelling.
- Document your code thoroughly and explain your assumptions and decisions.
Onsite
4 rounds
Coding & Algorithms
A 60-minute live session where you'll solve more complex coding problems, often involving data manipulation or algorithmic challenges. The interviewer will observe your problem-solving approach, coding proficiency, and ability to write efficient and clean code.
Tips for this round
- Practice medium-difficulty coding problems (for example at datainterview.com/coding), especially those involving arrays, strings, and hash maps.
- Clearly communicate your approach before writing any code, discussing data structures and algorithms.
- Think out loud throughout the coding process, explaining your choices and handling edge cases.
- Test your code with various inputs and be prepared to debug if necessary.
- Aim for optimal time and space complexity, and be ready to discuss trade-offs.
Product Sense & Metrics
You'll be presented with a product scenario and asked to apply your analytical skills to define metrics, design A/B tests, and interpret results. This round evaluates your ability to translate business problems into data questions and drive product decisions.
System Design
The interviewer will probe your ability to design an end-to-end machine learning system for a given problem, such as a recommendation engine or fraud detection system. You'll need to discuss data sources, model selection, feature engineering, deployment, and monitoring.
Behavioral
This is Airbnb's version of assessing your cultural fit, leadership potential, and how you handle past challenges, successes, and teamwork scenarios. You'll be asked about your experiences, problem-solving approaches, and how you align with Airbnb's values.
Tips to Stand Out
- Master the Fundamentals. Ensure a strong grasp of statistics, probability, SQL, and Python. These are the bedrock of data science at Airbnb and will be tested across multiple rounds.
- Practice Product Sense and A/B Testing. Airbnb is a product-driven company. Be ready to analyze product features, define success metrics, design experiments, and interpret results to drive business impact.
- Sharpen Your ML System Design Skills. For more senior roles, understanding how to design, deploy, and monitor machine learning systems at scale is crucial. Focus on end-to-end thinking, trade-offs, and MLOps concepts.
- Refine Your Communication. Clearly articulate your thought process, assumptions, and conclusions in all technical and behavioral rounds. Data scientists at Airbnb need to effectively communicate complex ideas to diverse audiences.
- Prepare Behavioral Stories. Use the STAR method to craft compelling narratives about your past experiences, highlighting problem-solving, teamwork, leadership, and how you embody Airbnb's values.
- Understand Airbnb's Business. Research Airbnb's products, recent news, and strategic initiatives. Show how your data science skills can contribute to their specific challenges and opportunities.
- Code Cleanly and Efficiently. Whether in a live coding session or a take-home, demonstrate good coding practices, including readability, modularity, and efficiency.
Common Reasons Candidates Don't Pass
- Insufficient Core Technical Skills. Candidates often struggle with foundational statistics, probability, SQL, or Python coding, indicating a lack of readiness for the technical demands of the role.
- Weak Machine Learning Fundamentals. A poor understanding of ML concepts, model validation, regularization, or inability to justify model choices frequently leads to rejection.
- Poor Communication & Product Fit. Inability to clearly articulate thoughts, translate business problems into data questions, or demonstrate product intuition is a significant red flag.
- Inadequate Data Wrangling & ETL Skills. Difficulty cleaning messy data, weak SQL abilities for complex joins/aggregations, or limited experience with real-world data at scale can hinder progress.
- Shallow Experimentation/MLOps Knowledge. Lack of experience in designing robust A/B tests, understanding experiment tracking, or knowledge of ML system deployment and monitoring processes.
- Lack of Real-World Impact/Scale Experience. Candidates who have only worked on toy problems or small datasets and cannot discuss challenges related to data pipelines, latency, or feature stores in a production environment.
Offer & Negotiation
Airbnb's compensation packages typically include a competitive base salary, annual performance bonus, and Restricted Stock Units (RSUs) that vest over a four-year period, often with a one-year cliff. When negotiating, focus on the total compensation package. You can often negotiate the base salary, a sign-on bonus, and the number of RSUs. Leverage any competing offers you may have to strengthen your position, and be prepared to articulate your value and market worth based on your skills and experience.
The loop runs about five weeks end to end. Candidates who get rejected most often share a pattern: strong technical output with weak storytelling. The take-home is where this kills you. Airbnb's take-home asks you to explore a messy dataset and present findings, and reviewers weight narrative clarity and actionable recommendations as heavily as code quality. A notebook full of pandas transformations with no written "so what" about guest conversion or host retention won't advance.
The behavioral round carries real veto power in the final decision. Airbnb's interview materials explicitly ask you to connect your past work to their mission of creating belonging through travel, and interviewers probe for specific examples of building inclusive teams or designing for diverse user populations across their 220+ country footprint. Candidates from pure quant backgrounds sometimes treat this as a formality, then get a rejection despite strong technical scores.
Airbnb Data Scientist Interview Questions
Product Sense & Metrics
Expect questions that force you to translate a fuzzy product goal (e.g., improve guest booking, host quality, or Trust & Safety) into crisp metrics, guardrails, and decision criteria. You’ll be evaluated on marketplace thinking (two-sided incentives, leakage, seasonality) and whether your metric choices survive real-world edge cases.
Airbnb adds an LLM-generated listing summary on the listing page to improve guest understanding and conversion. What is your primary success metric and two guardrails, and how do you handle substitution effects across Search, Listing, and Checkout funnels?
Sample Answer
Most candidates default to overall conversion rate, but that fails here because you will misread funnel substitution and quality degradation (for example, more clicks to checkout but worse post-stay outcomes). Use an incremental funnel metric like booking conversion per search session (or per unique visitor) with consistent attribution, plus guardrails on cancellation rate and customer support contact rate per trip. Add a Trust and Safety guardrail for report rate on misrepresentation, since LLM text can hallucinate amenities. Segment by market, length of stay, and device; otherwise seasonality will swamp the signal.
You suspect Instant Book increased bookings but also increased host cancellations due to calendar conflicts. What metric would you optimize, what are your top two guardrails, and what decision rule would you use if bookings go up but cancellations also rise?
Airbnb changes search ranking to push cheaper listings higher to improve affordability. How do you measure impact on marketplace health when guest conversion improves but host earnings and long-term supply might drop?
Experimentation & A/B Testing
Most candidates underestimate how much rigor is expected in experiment design for a marketplace: unit of randomization, interference, exposure logging, and guardrail definition matter as much as p-values. You’ll need to justify power/MDE, interpret messy outcomes, and recommend next actions under ambiguity.
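To make the power/MDE justification concrete, here is a minimal sketch of the standard two-proportion sample-size approximation. The function name `sample_size_per_arm` and the example rates are illustrative assumptions, not Airbnb figures, and the formula assumes independent units with no interference, which marketplace experiments often violate. Treat the result as a lower bound.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(
    baseline_rate: float,
    mde_abs: float,
    alpha: float = 0.05,
    power: float = 0.80,
) -> int:
    """Approximate units per arm for a two-sided two-proportion z-test.

    baseline_rate: control conversion rate (hypothetical input).
    mde_abs: minimum detectable effect as an absolute change in rate.
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    if not (0 < p1 < 1 and 0 < p2 < 1 and mde_abs > 0):
        raise ValueError("rates must lie in (0, 1) and mde_abs must be positive")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of per-arm Bernoulli variances
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


if __name__ == "__main__":
    # Illustrative only: 10% baseline conversion, 1 percentage point MDE.
    print(sample_size_per_arm(0.10, 0.01))
```

Halving the MDE roughly quadruples the required sample, which is usually the fastest way to show an interviewer why a small expected lift forces a longer test or a coarser randomization unit.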
Airbnb tests a new default sort on Search results (Guests), but only sessions with at least one eligible listing are exposed and logged. What is the correct estimand and analysis population, and how do you handle bias from exposure-only logging?
Sample Answer
Use an intent-to-treat estimand on the randomized population, then treat exposure as a post-randomization variable and fix logging so all assigned sessions are measurable. Exposure-only logging breaks comparability because treatment can change who becomes exposed, so conditioning on exposure induces selection bias. Analyze outcomes for all assigned units using missingness-aware instrumentation (for example, log zero results and non-render events) and add a sanity check on assignment balance and missing-rate differences by arm. If you cannot backfill, report the exposure-conditioned estimate as descriptive only, not causal.
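The gap between the intent-to-treat estimate and an exposure-conditioned one can be illustrated with a small sketch. The `Unit` record, the helper names, and the toy counts are all hypothetical and chosen only to show the mechanics: treatment changes who gets exposed, so filtering on exposure manufactures a lift that the assigned population does not show.

```python
from typing import Dict, Iterable, List, NamedTuple


class Unit(NamedTuple):
    arm: str         # "control" or "treatment" (assignment, fixed at randomization)
    exposed: bool    # post-randomization: did the unit reach the logged surface?
    converted: bool  # outcome


def conversion_by_arm(units: Iterable[Unit], exposed_only: bool = False) -> Dict[str, float]:
    """Conversion rate per arm over all assigned units, or over exposed units only."""
    totals: Dict[str, int] = {}
    conversions: Dict[str, int] = {}
    for u in units:
        if exposed_only and not u.exposed:
            continue
        totals[u.arm] = totals.get(u.arm, 0) + 1
        conversions[u.arm] = conversions.get(u.arm, 0) + int(u.converted)
    return {arm: conversions[arm] / totals[arm] for arm in totals}


def lift(rates: Dict[str, float]) -> float:
    """Absolute difference in conversion rate, treatment minus control."""
    return rates["treatment"] - rates["control"]


def make_units(arm: str, n_assigned: int, n_exposed: int, n_converted: int) -> List[Unit]:
    # Toy generator: converted units are assumed to be among the exposed ones.
    return [Unit(arm, i < n_exposed, i < n_converted) for i in range(n_assigned)]


if __name__ == "__main__":
    units = make_units("control", 10, 8, 4) + make_units("treatment", 10, 5, 4)
    print("ITT lift:", lift(conversion_by_arm(units)))
    print("Exposure-conditioned lift:", lift(conversion_by_arm(units, exposed_only=True)))
```

In this toy data both arms convert 4 of 10 assigned units, so the intent-to-treat lift is zero, while the exposed-only comparison reports 4/5 versus 4/8. That difference comes entirely from the treatment shrinking the exposed population.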
You roll out a pricing recommendation badge to Hosts, but the metric is Guest booking conversion and there is interference via shared listings and market-level price competition. How do you design the experiment to get a causal estimate, specify the unit of randomization, and define a primary metric and guardrails?
Causal Inference (Beyond A/B Tests)
Your ability to reason about causality when randomization isn’t possible is a core differentiator at staff levels. Interviews often probe identification strategy choices (DiD, IV, matching/uplift, regression discontinuity), assumptions, and how to diagnose violations in travel/marketplace data.
Airbnb rolls out a new cancellation policy that applies only to listings with flexible cancellation and only in specific EU countries, and you need the causal impact on booking conversion and host earnings. What identification strategy do you use, and what are the top two assumption checks you run before trusting the estimate?
Sample Answer
You could do a difference-in-differences using treated countries and untreated countries, or a matching plus regression adjustment using similar listings across all markets. DiD wins here because you have a clear policy shock and rich pre-period data, so you can difference out stable cross-country and listing-level confounding. You then check parallel trends using pre-policy event-time coefficients, and you probe composition changes (like listings entering, leaving, or changing cancellation settings) that can fake treatment effects.
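As a rough illustration of the difference-in-differences logic in the answer above, the sketch below computes the simplest two-group, two-period estimate from cell means. The function name and the toy conversion rates are hypothetical. A real analysis would use a regression with listing and time fixed effects, clustered standard errors, and event-time coefficients for the parallel-trends check.

```python
from statistics import mean
from typing import Dict, Iterable, List, Tuple

Observation = Tuple[bool, bool, float]  # (treated_group, post_period, outcome)


def did_estimate(observations: Iterable[Observation]) -> float:
    """2x2 difference-in-differences from cell means.

    Returns (treated_post - treated_pre) - (control_post - control_pre).
    """
    cells: Dict[Tuple[bool, bool], List[float]] = {}
    for treated, post, y in observations:
        cells.setdefault((treated, post), []).append(y)

    required = [(True, True), (True, False), (False, True), (False, False)]
    missing = [cell for cell in required if cell not in cells]
    if missing:
        raise ValueError(f"missing group/period cells: {missing}")

    m = {cell: mean(values) for cell, values in cells.items()}
    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change


if __name__ == "__main__":
    # Toy booking-conversion rates, invented for illustration only.
    toy = [
        (True, False, 0.10), (True, False, 0.12),    # treated countries, pre
        (True, True, 0.15), (True, True, 0.17),      # treated countries, post
        (False, False, 0.08), (False, False, 0.10),  # untreated countries, pre
        (False, True, 0.10), (False, True, 0.12),    # untreated countries, post
    ]
    print(did_estimate(toy))
```

The same function doubles as a placebo check: feed it two pre-policy periods labelled as "pre" and "post", and an estimate far from zero suggests the parallel-trends assumption is in trouble.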
Trust & Safety introduces an automated identity verification flow, but it is triggered only when a risk score exceeds a threshold and the score also drives manual review intensity. How do you estimate the causal effect of verification on chargebacks while separating it from the risk score and manual review effects?
You want the causal effect of enabling Smart Pricing on host revenue, but hosts self-select and higher-quality listings are more likely to enable it. Propose a credible non-experimental design, state the key identifying assumption, and name one falsification test you would run using Airbnb data.
Statistics (Modeling, Uncertainty, Forecasting Basics)
The bar here isn’t whether you know statistical concepts, it’s whether you can deploy them to make correct product decisions under noise and selection bias. Be ready to discuss variance decomposition, confidence/credible intervals, multiple testing, and practical forecasting considerations like seasonality and cannibalization.
You run an A/B test on a new search ranking change and measure guest conversion (booking sessions divided by search sessions) daily for 14 days, with strong weekend seasonality. How do you compute a 95% interval for lift that is valid under day-to-day correlation and seasonality, and what unit of analysis do you choose?
Sample Answer
Reason through it: You need an interval that matches the randomization unit, so start by checking whether assignment is at user, session, or market level, then aggregate metrics to that unit before inference. Daily ratios are autocorrelated and seasonality breaks i.i.d., so treating 14 days as 14 independent samples underestimates variance. Use a cluster-robust approach (cluster by randomized unit, and optionally block by day-of-week) or a block bootstrap that resamples randomized units while preserving the calendar structure. If you must use time as the unit, use a paired design by day-of-week (or a regression with day-of-week fixed effects and robust SEs) so weekends do not inflate or deflate the lift estimate.
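One way to sketch the bootstrap described above, assuming randomization is at the user level and each user's 14-day totals have already been aggregated into a `(booking_sessions, search_sessions)` pair. The names and the percentile-interval choice are assumptions for illustration. Resampling whole users keeps each user's full calendar of activity together, so within-user day-to-day correlation and weekend effects are preserved in every replicate.

```python
import random
from typing import List, Sequence, Tuple

UserTotals = Tuple[int, int]  # (booking_sessions, search_sessions) for one randomized user


def ratio(users: Sequence[UserTotals]) -> float:
    """Ratio-of-sums conversion: total booking sessions over total search sessions."""
    bookings = sum(b for b, _ in users)
    searches = sum(s for _, s in users)
    return bookings / searches


def bootstrap_lift_interval(
    control: Sequence[UserTotals],
    treatment: Sequence[UserTotals],
    n_boot: int = 2000,
    level: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile interval for relative lift, resampling users within each arm."""
    rng = random.Random(seed)
    lifts: List[float] = []
    for _ in range(n_boot):
        c = rng.choices(control, k=len(control))      # resample users with replacement
        t = rng.choices(treatment, k=len(treatment))
        lifts.append(ratio(t) / ratio(c) - 1.0)
    lifts.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return lifts[lo_idx], lifts[hi_idx]


if __name__ == "__main__":
    # Toy per-user totals, invented for illustration only.
    control = [(1, 10)] * 50 + [(2, 10)] * 50
    treatment = [(2, 10)] * 50 + [(3, 10)] * 50
    print(bootstrap_lift_interval(control, treatment))
```

The sketch divides by the resampled control ratio, so it assumes every replicate contains at least one control booking. With sparse conversions, a delta-method interval on the ratio is a common alternative.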
You forecast next month’s total nights booked for a set of cities to plan customer support staffing, and you know price changes and host cancellations can cause structural breaks. Describe a forecasting approach that outputs both a point forecast and a calibrated 80% prediction interval, and how you would detect and handle cannibalization across nearby cities.
SQL at Scale (Analytics Queries)
In practice, you’ll be judged on whether you can pull the right dataset reliably from complex event logs and marketplace tables. Common pitfalls include double-counting due to joins, incorrect windowing for cohorts, and writing queries that don’t align with the experiment exposure definition.
Given guest search events (with a session_id) and subsequent booking events, compute daily search-to-booking conversion rate where a booking is attributed to the most recent search in the same session within 7 days.
Sample Answer
This question is checking whether you can do correct attribution at scale without double counting. You need to pick exactly one search per booking (the latest eligible one) and then aggregate by the search day. Most people fail by joining searches to bookings directly and inflating conversions.
-- Compute daily search-to-booking conversion with last-touch attribution within session and 7 days
-- Assumed tables:
--   analytics.search_events(search_id, user_id, session_id, searched_at, checkin_date, checkout_date, city_id)
--   analytics.booking_events(booking_id, user_id, session_id, booked_at, status)
WITH eligible_bookings AS (
    SELECT
        b.booking_id,
        b.user_id,
        b.session_id,
        b.booked_at
    FROM analytics.booking_events b
    WHERE b.status = 'confirmed'
),
search_booking_pairs AS (
    SELECT
        b.booking_id,
        b.booked_at,
        s.search_id,
        s.searched_at,
        -- rank searches per booking by recency
        ROW_NUMBER() OVER (
            PARTITION BY b.booking_id
            ORDER BY s.searched_at DESC
        ) AS rn
    FROM eligible_bookings b
    JOIN analytics.search_events s
        ON s.session_id = b.session_id
        AND s.user_id = b.user_id
        AND s.searched_at <= b.booked_at
        AND s.searched_at >= b.booked_at - INTERVAL '7 days'
),
last_touch_attribution AS (
    SELECT
        booking_id,
        search_id,
        searched_at,
        booked_at
    FROM search_booking_pairs
    WHERE rn = 1
),
searches_by_day AS (
    SELECT
        DATE_TRUNC('day', searched_at) AS search_day,
        COUNT(DISTINCT search_id) AS searches
    FROM analytics.search_events
    GROUP BY 1
),
bookings_attributed_by_search_day AS (
    SELECT
        DATE_TRUNC('day', searched_at) AS search_day,
        COUNT(DISTINCT booking_id) AS attributed_bookings
    FROM last_touch_attribution
    GROUP BY 1
)
SELECT
    s.search_day,
    s.searches,
    COALESCE(b.attributed_bookings, 0) AS attributed_bookings,
    CASE
        WHEN s.searches = 0 THEN 0
        ELSE COALESCE(b.attributed_bookings, 0)::DECIMAL / s.searches
    END AS search_to_booking_cvr
FROM searches_by_day s
LEFT JOIN bookings_attributed_by_search_day b
    ON b.search_day = s.search_day
ORDER BY 1;
You run an experiment on the checkout page; using an exposure log, compute per-variant booking conversion where conversion is a confirmed booking within 14 days after first exposure, and users can be exposed multiple times.
Compute monthly host cancellation rate by listing city, where the denominator is bookings created in the month and the numerator is bookings that later become host-cancelled, but count each booking once even if its status changes multiple times in logs.
Applied Machine Learning (Recommendations/Ranking/Product ML)
Rather than optimizing fancy architectures, you’ll be pushed to choose the right modeling approach for a product lever and defend tradeoffs. Expect conversations about offline vs online evaluation, bias/feedback loops in ranking, feature leakage, and how ML ties back to marketplace metrics.
You are launching a new home feed ranker that mixes “probability of booking” with “expected booking value” (nightly price times nights). What offline evaluation setup would you use to compare two rankers when the labels are censored by exposure and users rarely scroll past the first 10 results?
Sample Answer
The standard move is to use counterfactual offline evaluation with propensity weights, for example IPS or SNIPS using logged impression propensities, and report a top-$k$ metric like NDCG@$k$ or expected bookings@$k$. But here, position bias and support mismatch matter because the new policy changes what gets exposed, so your propensities can be near zero and the variance of the estimate can explode. You need clipping, doubly robust estimators, or a small online holdout to validate calibration and directionality.
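A minimal sketch of clipped IPS and SNIPS, to show why near-zero logging propensities matter. The log format `(reward, logging_propensity, new_policy_probability)`, the function name, and the toy values are assumptions for illustration. A production evaluator would also model position bias and report variance.

```python
from typing import Iterable, Tuple

LogRow = Tuple[float, float, float]  # (reward, logging_propensity, new_policy_probability)


def ips_and_snips(logs: Iterable[LogRow], clip: float = 10.0) -> Tuple[float, float]:
    """Return (clipped IPS, clipped SNIPS) estimates of the new policy's value."""
    weights = []
    weighted_rewards = []
    for reward, p_log, p_new in logs:
        if p_log <= 0:
            raise ValueError("logging propensity must be positive (no support otherwise)")
        w = min(p_new / p_log, clip)  # clipping trades a little bias for lower variance
        weights.append(w)
        weighted_rewards.append(w * reward)

    if not weights or sum(weights) == 0:
        raise ValueError("need at least one log row with non-zero weight")

    ips = sum(weighted_rewards) / len(weights)  # unnormalized estimator
    snips = sum(weighted_rewards) / sum(weights)  # self-normalized estimator
    return ips, snips


if __name__ == "__main__":
    # Toy logs, invented for illustration; the third row has a tiny logging propensity.
    toy_logs = [
        (1.0, 0.5, 0.25),
        (0.0, 0.5, 0.75),
        (1.0, 0.01, 0.5),
        (0.0, 0.25, 0.25),
    ]
    print("clipped:", ips_and_snips(toy_logs, clip=10.0))
    print("unclipped:", ips_and_snips(toy_logs, clip=float("inf")))
```

In the toy logs a single row with propensity 0.01 carries a raw weight of 50 and dominates the unclipped IPS estimate. That is the variance problem the answer describes, and clipping or self-normalization reduces it at the cost of some bias.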
Your search ranker uses an embedding feature built from the past 30 days of guest-to-listing interactions, and offline AUC jumps 8 points but online bookings drop and cancellation rate rises. What specific leakage or feedback-loop checks do you run, and what redesign would you propose to prevent the issue while keeping personalization?
Coding & Algorithms (Python for Data Work)
You’ll encounter timed coding that’s less about obscure CS tricks and more about being correct, fast, and clean with data-oriented logic. Strong performance comes from writing readable functions, handling edge cases, and reasoning about complexity when manipulating arrays/maps and computing metrics.
You receive Airbnb search impression logs as a stream of events (listing_id, ts, session_id, is_click). Return the top $k$ listings by click-through rate computed as total clicks divided by total impressions, but only for listings with at least $m$ impressions, breaking ties by higher impression count then smaller listing_id.
Sample Answer
Get this wrong in production and ranking features drift: you end up promoting noisy listings with 1 click on 1 impression. The right call is to aggregate impressions and clicks per listing, filter by the minimum impression threshold, then sort by CTR with deterministic tie-breaks. Guard divide-by-zero by filtering before division. Complexity stays linear in events plus sorting the filtered set.
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, List, Tuple, Dict


def top_k_listings_by_ctr(
    events: Iterable[Tuple[int, int, str, bool]],
    k: int,
    m: int,
) -> List[Tuple[int, float, int]]:
    """Compute top-k listings by CTR.

    Args:
        events: Iterable of (listing_id, ts, session_id, is_click).
            Each event is an impression; is_click indicates whether it was clicked.
        k: number of listings to return.
        m: minimum number of impressions required to be eligible.

    Returns:
        List of tuples (listing_id, ctr, impressions) sorted by:
        1) ctr descending
        2) impressions descending
        3) listing_id ascending
    """
    if k <= 0:
        return []

    impressions: Dict[int, int] = defaultdict(int)
    clicks: Dict[int, int] = defaultdict(int)

    for listing_id, _ts, _session_id, is_click in events:
        impressions[listing_id] += 1
        if is_click:
            clicks[listing_id] += 1

    eligible: List[Tuple[int, float, int]] = []
    for listing_id, imp in impressions.items():
        if imp >= m:
            clk = clicks.get(listing_id, 0)
            ctr = clk / imp
            eligible.append((listing_id, ctr, imp))

    eligible.sort(key=lambda x: (-x[1], -x[2], x[0]))
    return eligible[:k]


if __name__ == "__main__":
    sample_events = [
        (101, 1, "s1", False),
        (101, 2, "s1", True),
        (102, 3, "s2", True),
        (102, 4, "s2", False),
        (102, 5, "s3", False),
        (103, 6, "s4", True),
    ]
    # m=2 removes listing 103 (only 1 impression)
    print(top_k_listings_by_ctr(sample_events, k=2, m=2))
Given booking events (host_id, guest_id, ts, is_cancelled), count how many guests are "repeat bookers" for the same host, meaning they have at least 2 non-cancelled bookings with that host that are at least $d$ days apart in time.
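One possible approach to the repeat-booker question, offered as a sketch rather than a reference answer: for each (host_id, guest_id) pair, track the earliest and latest non-cancelled booking timestamps plus a booking count. Two bookings at least $d$ days apart exist exactly when the pair has two or more bookings and the latest minus the earliest is at least $d$ days. The function name `count_repeat_bookers` is hypothetical, and the code assumes `ts` is a Unix timestamp in seconds and that the answer is the number of distinct guests who qualify with at least one host; the prompt does not specify either.

```python
from typing import Dict, Iterable, Set, Tuple

SECONDS_PER_DAY = 86_400  # assumption: ts is a Unix timestamp in seconds


def count_repeat_bookers(
    events: Iterable[Tuple[int, int, int, bool]],
    d: int,
) -> int:
    """Count distinct guests with >= 2 non-cancelled bookings at one host, >= d days apart."""
    # (host_id, guest_id) -> (earliest ts, latest ts, number of bookings)
    spans: Dict[Tuple[int, int], Tuple[int, int, int]] = {}
    for host_id, guest_id, ts, is_cancelled in events:
        if is_cancelled:
            continue  # cancelled bookings never count
        key = (host_id, guest_id)
        if key in spans:
            lo, hi, n = spans[key]
            spans[key] = (min(lo, ts), max(hi, ts), n + 1)
        else:
            spans[key] = (ts, ts, 1)

    threshold = d * SECONDS_PER_DAY
    repeat_guests: Set[int] = set()
    for (_host_id, guest_id), (lo, hi, n) in spans.items():
        # n >= 2 keeps a single booking from qualifying when d == 0.
        if n >= 2 and hi - lo >= threshold:
            repeat_guests.add(guest_id)
    return len(repeat_guests)
```

This runs in a single pass over the events with memory proportional to the number of distinct host-guest pairs. If the interviewer wants (host, guest) pairs counted instead of distinct guests, collect the pair keys rather than `guest_id`.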
You have daily gross booking value (GBV) per city as a list of records (city, date_yyyy_mm_dd, gbv). For each city, compute the maximum 7-day rolling sum of GBV, treating missing dates as $0$, and return the city with the highest such rolling sum (break ties by city name ascending).
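A hedged sketch for the rolling-GBV question: sum GBV per (city, date), sort each city's dates, and slide a two-pointer window so that only dates within the last 7 calendar days contribute. Missing dates add nothing, which is equivalent to treating them as $0$. The function name `best_city_by_rolling_gbv` is hypothetical, and the sketch assumes GBV is non-negative, so the maximum window always ends on a date that has data, and that dates are ISO `YYYY-MM-DD` strings.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Optional, Tuple


def best_city_by_rolling_gbv(
    records: Iterable[Tuple[str, str, float]],
    window_days: int = 7,
) -> Optional[Tuple[str, float]]:
    """Return (city, max rolling-window GBV sum), or None if there are no records."""
    # city -> {date ordinal -> total GBV that day}; duplicate rows are summed.
    daily: Dict[str, Dict[int, float]] = defaultdict(lambda: defaultdict(float))
    for city, day, gbv in records:
        daily[city][date.fromisoformat(day).toordinal()] += gbv

    best: Optional[Tuple[str, float]] = None
    # Visiting cities in ascending name order plus a strict ">" comparison
    # resolves ties in favor of the alphabetically smaller city.
    for city in sorted(daily):
        days = sorted(daily[city].items())
        left = 0
        running = 0.0
        city_max = 0.0
        for ordinal, gbv in days:
            running += gbv
            # Drop dates that fall outside the window ending on `ordinal`.
            while days[left][0] <= ordinal - window_days:
                running -= days[left][1]
                left += 1
            city_max = max(city_max, running)
        if best is None or city_max > best[1]:
            best = (city, city_max)
    return best
```

If GBV can be negative (for example, refunds netted into the daily figure), this shortcut no longer holds and you would need to evaluate every calendar-day window explicitly.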
The distribution above tells a clear story, but what it doesn't show is how the categories bleed into each other during actual interviews. A question that starts as "define a success metric for Instant Book" can pivot mid-conversation into designing an experiment with marketplace interference, then into "randomization isn't possible because the policy only launched in EU markets, so how do you estimate the causal effect?" That chain (product sense into experimentation into causal inference) is where Airbnb's loop gets uniquely punishing, because you can't compartmentalize your prep into neat buckets.
The most common misallocation of prep time, from what candidates report, is drilling Python algorithms and ML model tuning while treating experimentation and causal inference as "stats I already know." Airbnb's questions in those areas aren't textbook. They're grounded in specific marketplace headaches: interference between hosts and guests in the same experiment, risk-score thresholds that create selection bias in Trust & Safety rollouts, cancellation policy changes that vary by country and listing type.
Practice questions grounded in Airbnb's guest/host dynamics and marketplace experimentation challenges at datainterview.com/questions.
How to Prepare for Airbnb Data Scientist Interviews
Know the Business
Official mission
“Airbnb’s mission is to create a world where anyone can belong anywhere.”
What it actually means
Airbnb's real mission is to facilitate human connection and a sense of belonging globally by providing a platform for unique accommodations and experiences. It aims to build a trusted community that enables people to travel, live, and work anywhere, fostering cultural understanding and local economic opportunities.
Key Business Metrics
$12B
+12% YoY
$77B
-24% YoY
8K
+12% YoY
Current Strategic Priorities
- Achieve more than 1 billion annual guests by 2028
Competitive Moat
Airbnb has publicly committed to reaching more than a billion annual guests by 2028, a goal the company describes as bringing "magical travel to everyone." That ambition, sitting on top of $12.2B in revenue (12% YoY growth), means DS teams are deeply embedded in the mechanics of scaling a two-sided marketplace. The engineering blog's "From Data to Action" series on Airbnb Plus gives you a window into how DSs here move from quality scoring to product decisions, and it's worth reading before your loop.
Don't answer "why Airbnb" with a story about loving travel. Instead, pick a specific DS challenge visible in Airbnb's own public materials. The company is hiring staff-level experimentation roles and forecasting roles embedded in finance, which tells you where the org is stretching. If your background involves causal inference for marketplace dynamics, or building forecasting systems where supply and demand pull in opposite directions, say that. Anchor your answer to a real team you'd want to join, not a vibe.
Try a Real Interview Question
First-time host conversion within 14 days of signup
SQL: Compute the conversion rate to first booking for hosts within $14$ days of their signup date, grouped by signup week (week starts Monday). A host is converted if they have at least one booking with status $\text{confirmed}$ and a booking start date in $[\text{signup\_date}, \text{signup\_date} + 14]$. Output: $\text{signup\_week}$, $\text{hosts\_signed\_up}$, $\text{hosts\_converted}$, $\text{conversion\_rate}$.
hosts

| host_id | signup_date | country | acquisition_channel |
|---------|-------------|---------|---------------------|
| 101 | 2024-01-02 | US | seo |
| 102 | 2024-01-05 | US | paid_search |
| 103 | 2024-01-08 | FR | referral |
| 104 | 2024-01-10 | US | seo |

listings

| listing_id | host_id | created_date |
|------------|---------|--------------|
| 201 | 101 | 2024-01-03 |
| 202 | 102 | 2024-01-06 |
| 203 | 103 | 2024-01-09 |
| 204 | 104 | 2024-01-20 |

bookings

| booking_id | listing_id | start_date | status |
|------------|------------|-------------|------------|
| 301 | 201 | 2024-01-12 | confirmed |
| 302 | 202201 | 2024-01-13 | confirmed |
| 303 | 202 | 2024-01-25 | cancelled |
| 304 | 203 | 2024-01-18 | confirmed |
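One way to sanity-check an answer to this question is to run the query against the sample rows in an in-memory SQLite database from Python. The sketch below is an illustration, not an official solution: the helper names `build_sample_db` and `conversion_by_signup_week` are hypothetical, the date arithmetic uses SQLite's `date()` modifiers (other dialects differ), and the `listing_id` for booking 302 is assumed to be 202 because the value in the sample table is unreadable.

```python
import sqlite3
from typing import List, Tuple

QUERY = """
WITH converted AS (
    -- Hosts with at least one confirmed booking starting within 14 days of signup.
    SELECT DISTINCT h.host_id
    FROM hosts h
    JOIN listings l ON l.host_id = h.host_id
    JOIN bookings b ON b.listing_id = l.listing_id
    WHERE b.status = 'confirmed'
      AND b.start_date BETWEEN h.signup_date AND date(h.signup_date, '+14 days')
)
SELECT
    -- 'weekday 0' moves forward to Sunday (or stays put); minus 6 days is that week's Monday.
    date(h.signup_date, 'weekday 0', '-6 days') AS signup_week,
    COUNT(*) AS hosts_signed_up,
    COUNT(c.host_id) AS hosts_converted,
    ROUND(1.0 * COUNT(c.host_id) / COUNT(*), 4) AS conversion_rate
FROM hosts h
LEFT JOIN converted c ON c.host_id = h.host_id
GROUP BY signup_week
ORDER BY signup_week
"""


def build_sample_db() -> sqlite3.Connection:
    """Load the sample rows into an in-memory database (dates stored as ISO text)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE hosts (host_id INTEGER, signup_date TEXT, country TEXT, acquisition_channel TEXT);
        CREATE TABLE listings (listing_id INTEGER, host_id INTEGER, created_date TEXT);
        CREATE TABLE bookings (booking_id INTEGER, listing_id INTEGER, start_date TEXT, status TEXT);
        """
    )
    conn.executemany("INSERT INTO hosts VALUES (?, ?, ?, ?)", [
        (101, "2024-01-02", "US", "seo"),
        (102, "2024-01-05", "US", "paid_search"),
        (103, "2024-01-08", "FR", "referral"),
        (104, "2024-01-10", "US", "seo"),
    ])
    conn.executemany("INSERT INTO listings VALUES (?, ?, ?)", [
        (201, 101, "2024-01-03"),
        (202, 102, "2024-01-06"),
        (203, 103, "2024-01-09"),
        (204, 104, "2024-01-20"),
    ])
    conn.executemany("INSERT INTO bookings VALUES (?, ?, ?, ?)", [
        (301, 201, "2024-01-12", "confirmed"),
        (302, 202, "2024-01-13", "confirmed"),  # assumption: listing_id read as 202
        (303, 202, "2024-01-25", "cancelled"),
        (304, 203, "2024-01-18", "confirmed"),
    ])
    return conn


def conversion_by_signup_week(conn: sqlite3.Connection) -> List[Tuple[str, int, int, float]]:
    """Run the conversion query and return one row per signup week."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in conversion_by_signup_week(build_sample_db()):
        print(row)
```

The `LEFT JOIN` keeps hosts with no qualifying booking in the denominator, and the `DISTINCT` in the CTE prevents a host with several confirmed bookings from being counted twice. Under the stated assumption about booking 302, both hosts in the week of 2024-01-01 convert and one of two converts in the week of 2024-01-08.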
Airbnb's take-home and coding rounds reward candidates who combine clean, readable code with a clear narrative about what the numbers mean. Practicing with product-flavored data problems (not abstract algorithm puzzles) is the highest-ROI use of your prep time. Build that muscle at datainterview.com/coding.
Test Your Readiness
How Ready Are You for Airbnb Data Scientist?
Question 1 of 10: Can I define a north star metric for Airbnb search and booking, explain its tradeoffs, and break it into input metrics (for example, search to listing view rate, booking conversion, cancellation rate, host acceptance rate)?
Ground every practice answer in Airbnb-specific context (guest vs. host incentives, seasonality, the Reserve Now Pay Later feature) rather than giving a generic textbook response. Drill more questions at datainterview.com/questions.
Frequently Asked Questions
How long does the Airbnb Data Scientist interview process take?
From first recruiter screen to offer, expect roughly 4 to 6 weeks. The process typically starts with a recruiter call, then a technical phone screen (usually SQL and stats), followed by a full onsite loop. Scheduling the onsite can take a week or two depending on availability. If things go well, you might hear back within a week after the onsite, but Airbnb's hiring committee review can add a few extra days.
What technical skills are tested in the Airbnb Data Scientist interview?
SQL and Python are non-negotiable. You'll be tested on advanced SQL at scale, probability and statistics, A/B testing, causal inference, and machine learning. For senior levels (L5+), expect deeper questions on experimentation design, causal inference methods, and ML modeling. Product sense comes up at every level. Airbnb operates a two-sided marketplace, so understanding supply/demand dynamics and user experience tradeoffs is important. You can practice SQL and Python problems at datainterview.com/coding.
How should I tailor my resume for an Airbnb Data Scientist role?
Lead with measurable impact. Airbnb cares a lot about product-oriented thinking, so frame your experience around how your work improved a product or user experience, not just that you built a model. Mention marketplace experience if you have it. Highlight A/B testing, causal inference, and Python/SQL fluency explicitly. If you have a Master's or PhD in a quantitative field like statistics, CS, or econometrics, make sure that's prominent. For Staff (L6) and above, emphasize cross-functional leadership and projects you scoped from scratch.
What is the total compensation for Airbnb Data Scientists by level?
Airbnb pays well. At L3 (Junior, 0-3 years experience), total comp averages around $240K with a base of about $159K. L4 (Mid, 2-6 years) is similar at roughly $237K total. L5 (Senior, 2-8 years) jumps to about $334K total with a $210K base. L6 (Staff, 5-12 years) averages $502K total, and L7 (Principal, 12-20 years) hits around $801K. RSUs vest over 4 years with a 1-year cliff, then quarterly after that.
How do I prepare for Airbnb's behavioral and culture-fit interview?
Airbnb takes culture seriously. Their core values are Champion the Mission, Be a Host, Embrace the Adventure, and Be a Cereal Entrepreneur. You need stories that map to these. 'Be a Host' means you put others first and think about belonging. 'Embrace the Adventure' means you've taken risks or handled ambiguity. 'Be a Cereal Entrepreneur' is about scrappiness and resourcefulness (it's a reference to Airbnb's founding story). Prepare 4 to 5 stories that show these values in action, and be genuine about why Airbnb's mission of human connection resonates with you.
How hard are the SQL questions in the Airbnb Data Scientist interview?
They're of above-average difficulty. Airbnb specifically tests advanced SQL at scale, so expect window functions, complex joins, CTEs, and questions involving large, messy datasets. You won't get simple SELECT-FROM-WHERE problems. At L4 and above, you might be asked to write queries that handle edge cases like NULL values or duplicate records in marketplace data. I'd recommend practicing with realistic multi-table problems at datainterview.com/questions to get comfortable with the complexity.
What machine learning and statistics concepts does Airbnb test?
At junior levels (L3), expect probability, basic hypothesis testing, and foundational ML concepts. By L4 and L5, you need strong A/B testing knowledge, causal inference techniques (think difference-in-differences, instrumental variables), and practical ML modeling experience. For Staff level (L6), they go deep into your specialized area, whether that's experimentation, causal inference, or ML. Airbnb is a marketplace, so understanding how to run experiments when there are network effects or interference between treatment groups is a real differentiator.
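To make the difference-in-differences idea concrete, here is a minimal arithmetic sketch. Every number is hypothetical and chosen only for illustration, the function name `diff_in_diff` is made up, and the estimate is only meaningful under the parallel-trends assumption: that, without the change, the treated and comparison markets would have moved together.

```python
def diff_in_diff(
    treated_pre: float,
    treated_post: float,
    control_pre: float,
    control_post: float,
) -> float:
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical booking conversion rates before and after a policy change
# that launched in one region only (illustrative values, not real data).
estimate = diff_in_diff(
    treated_pre=0.100,
    treated_post=0.130,
    control_pre=0.100,
    control_post=0.105,
)
print(f"Estimated effect: {estimate:+.3f}")
```

In an interview, the arithmetic matters less than the follow-up: how you would check pre-period trends, and whether guests or hosts moving between markets could contaminate the comparison group.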
What format should I use for behavioral answers at Airbnb?
Use the STAR format (Situation, Task, Action, Result) but keep it tight. Airbnb interviewers want specifics, not vague generalities. Spend about 20% on context, 60% on what you actually did, and the rest on the result. Always quantify results. I've seen candidates lose points by telling long stories with no clear outcome. Also, tie your answers back to Airbnb's values when it feels natural. If you led a scrappy initiative with limited resources, that's a perfect 'Be a Cereal Entrepreneur' moment.
What happens during the Airbnb Data Scientist onsite interview?
The onsite is typically a full loop of 4 to 5 interviews spread across a day. Expect a SQL/coding round, a statistics and experimentation round, a product sense or business case round, and at least one behavioral interview focused on Airbnb's core values. For senior roles (L5+), there may be a system design or project deep-dive round where you walk through past work and how you handled ambiguity. Each interviewer submits independent feedback, and a hiring committee makes the final call.
What metrics and business concepts should I know for the Airbnb Data Scientist interview?
Know Airbnb's business inside and out. Understand key metrics like nights booked, guest-to-host ratio, booking conversion rate, search-to-book funnel, and host activation/retention. Think about both sides of the marketplace. Airbnb generates $12.2B in revenue, so be ready to discuss how pricing, availability, and trust (reviews, verification) drive that. Product sense questions often ask you to define success metrics for a new feature or diagnose a drop in a KPI. Practice breaking down metrics into components and identifying root causes.
What are common mistakes candidates make in the Airbnb Data Scientist interview?
The biggest one I see is ignoring the marketplace context. Airbnb isn't a simple B2C product. If you propose an A/B test without considering how it affects both hosts and guests, that's a red flag. Another common mistake is being too theoretical in stats questions without connecting back to practical business decisions. Also, don't underestimate the behavioral rounds. Candidates who clearly haven't researched Airbnb's values or can't articulate why they want to work there get filtered out, even with strong technical performance.
What education do I need for an Airbnb Data Scientist position?
A Bachelor's in a quantitative field like statistics, CS, economics, or math is the minimum at L3 and L4. A Master's or PhD is common but not strictly required at those levels. By L6 (Staff) and above, most candidates have an advanced degree, and for L7 (Principal), a PhD or MS is typical, though extensive industry experience (12-20 years) can substitute. For Staff roles specifically, Airbnb lists 9+ years of relevant experience as a requirement, so the bar is high regardless of degree.




