Target Data Scientist at a Glance
Total Compensation
$115k - $280k/yr
Interview Rounds
6 rounds
Levels
P2 - P5
Education
PhD
Experience
0–18+ yrs
Certain Target teams are hiring for recommender systems, reinforcement learning, and LLM-powered search ranking. That makes specific pods within this org feel more like applied ML shops than what you'd expect from a retailer. If you're prepping only for SQL and basic classification questions, you'll be underprepared for the ML depth some of these teams demand.
Target Data Scientist Role
Primary Focus
Skill Profile
Math & Stats
High: Strong foundations expected in linear algebra, probability, statistics, optimization, and analytical thinking; applied to data exploration, anomaly detection, and model development (including recommender systems, reinforcement learning, deep learning).
Software Eng
High: Emphasis on production-quality, maintainable code and engineering rigor, including unit testing, code organization, CI/CD basics, modular architecture, code reviews, documentation, troubleshooting, and working with large collaborative codebases.
Data & SQL
High: Experience creating and operating data/feature pipelines and working with big data ecosystems (Hadoop, Hive, PySpark); ability to query large databases (SQL/HQL) and generate features/labels at scale for ML systems.
Machine Learning
Expert: Core expectation to design, develop, deploy, and maintain ML models in production at scale; includes deep learning, recommender systems, reinforcement learning, NLP/search relevance, evaluation/selection of techniques, and advancing state-of-the-art approaches.
Applied AI
High: For GenAI-focused roles, requires hands-on exposure to programmatic use of LLMs/multimodal LLMs, fine-tuning, prompt engineering, embeddings, and RAG (vector DB/Elastic); responsible AI standards are referenced. Scope varies by team, so this is role-dependent.
Infra & Cloud
Medium: Working knowledge of cloud AI/ML platforms is expected, and deployment/productionization experience is emphasized. Containerization (Docker/Kubernetes) appears as preferred rather than required in the GenAI posting.
Business
High: Must translate evolving business needs into data science solutions; leverage retail domain knowledge, understand Target priorities/strategy, identify opportunities from analyses, and produce actionable insights with business partners/product teams.
Viz & Comms
High: Strong written/verbal communication and storytelling required, including clear narratives, appropriate visualizations/graphs, and documents that drive actionable insights; regular collaboration across global teams.
What You Need
- Production ML model development and deployment at scale (ML Ops orientation)
- Applied ML expertise in at least one of: recommender systems, reinforcement learning, search/ranking/NLP (team-dependent)
- Large-scale data analysis: cleaning, transforming, manipulating and analyzing large datasets
- Big data experience: Hadoop, Hive, PySpark
- Strong programming and software engineering best practices (unit tests, CI/CD basics, modular design, documentation, code reviews)
- SQL (and/or HQL) for querying large databases
- Deep learning model development (PyTorch or TensorFlow noted in GenAI role)
- Collaboration with product/business partners; agile development practices
Nice to Have
- GenAI/LLM ecosystem experience (programmatic use); multimodal LLMs, fine-tuning
- Prompt engineering, embeddings, and Retrieval-Augmented Generation (RAG)
- Vector databases and/or Elasticsearch (explicitly mentioned as plus)
- Docker and Kubernetes (containerization)
- Computer vision methods (preferred in GenAI posting)
- Information retrieval/search indexing/ranking experience (search role)
You're embedded with business teams across personalization, supply chain, search, and pricing. The work ranges from building production ML models (recommender systems for Target Circle, ranking models for Target.com search) to designing experiments and translating results for merchandising partners. What "success" looks like depends heavily on your team and level: a P3 on the MarTech team might ship a production recommender model and show measurable lift in click-through rate, while a P3 on an experimentation-focused team might own the measurement framework for a major product launch.
A Typical Week
A Week in the Life of a Target Data Scientist
Weekly time split (typical L5 workweek)
Culture notes
- Target runs at a steady corporate pace with genuine respect for work-life balance — most data scientists log off by 5:30 PM and weekend Slack is rare.
- The hybrid policy requires three days per week in the Minneapolis headquarters on Nicollet Mall, with most teams anchoring Tuesday through Thursday in-office.
The split between writing code and writing words is closer to even than most candidates expect. Target's engineering culture treats documentation (feature design docs, experiment readout decks, Confluence writeups) as a first-class artifact, so your ability to narrate results matters as much as your ability to produce them. If you're someone who ships a model and moves on without writing anything down, you'll feel friction here.
Projects & Impact Areas
The MarTech applied ML team works on recommender systems and reinforcement learning to personalize what Target Circle members see across app and web. Search and ranking is a separate group applying NLP and LLMs to product discovery on Target.com, a problem that compounds in difficulty every time the assortment expands. Supply chain optimization (including last-mile delivery) and pricing/promotions round things out, and job postings for these teams emphasize production-grade, cost-efficient systems rather than just offline model accuracy.
Skills & What's Expected
ML expertise is the headline (rated expert in the skill dimensions), but the skill that quietly separates successful candidates is software engineering discipline. Target's postings call out unit testing, CI/CD, modular architecture, and code reviews alongside the ML requirements. You're expected to write production Python, build data pipelines in PySpark and Hive, and maintain your models rather than prototype in a notebook and hand off. Communication is rated high too, and the source data defines it broadly: clear narratives, appropriate visualizations, and storytelling that drives action with non-technical partners.
Levels & Career Growth
Target Data Scientist Levels
Each level has different expectations, compensation, and interview focus.
What This Level Looks Like
Contributes as an individual contributor on a well-defined problem within a single team or product area. Owns parts of analyses/models/pipelines with close guidance, and impacts local team decisions via reliable insights and measurable improvements to a metric/KPI.
Day-to-Day Focus
- Analytical rigor (data quality checks, clear causal/experimental thinking, sound statistics)
- Strong SQL and practical data manipulation skills
- Clear communication of findings to non-technical partners
- Learning Target-specific data, metrics, and retail domain context
- Execution reliability (on-time delivery, reproducible work, version control basics)
Interview Focus at This Level
Emphasis on fundamentals: SQL querying, basic statistics/experimentation, data interpretation, and clear problem solving. Expect a case-style analytics question (retail/customer/product metrics), a SQL exercise, and discussion of past projects focusing on how you validated data, chose metrics, and communicated tradeoffs.
Promotion Path
Promotion to the next level typically requires consistently delivering end-to-end analyses or small modeling projects with limited guidance, demonstrating strong ownership of a problem area, proactively identifying opportunities, influencing stakeholder decisions with clear recommendations, and improving the team’s standard practices (reusable code, better metrics definitions, stronger experiment analysis).
The widget shows scope and responsibilities at each level. The key insight it doesn't capture is what blocks promotion. P3 to P4 is the hardest jump because it requires shifting from "I delivered this model" to "I set technical direction for a domain and influence partner roadmaps." One realistic growth lever at Target: the retail scale means lateral moves across domains (personalization to pricing to supply chain) are a common way to build the breadth P4 and P5 roles demand.
Work Culture
According to internal culture notes, the hybrid policy has most teams in-office around three days per week at the Minneapolis headquarters on Nicollet Mall, though some senior roles (like the last-mile supply chain lead) are listed as remote-eligible. The pace, from what candidates and employees report, is steady rather than frenetic, with an emphasis on documentation, design reviews, and knowledge-sharing through internal paper reading groups. If you want rigor and a Minneapolis cost of living that makes your comp stretch further than a coastal equivalent, it's worth a serious look.
Target Data Scientist Compensation
Target's RSU vesting is a flat 25% per year across all four years. That's unusually even compared to the backloaded schedules you'll find at many big tech companies, which means your Year 1 TC won't look dramatically different from Year 2 or 3 (though stock price swings and bonus variability will still move the number). The real risk is that TGT stock has been volatile alongside broader retail, so your actual realized comp could drift meaningfully from the grant-date estimate.
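The flat-vest arithmetic is easy to sanity-check yourself. A minimal sketch, using entirely made-up figures (the base, bonus target, and grant value below are hypothetical, not actual Target numbers):

```python
# All figures below are hypothetical, chosen only to illustrate the vesting math.
base = 150_000          # annual base salary
bonus_target = 0.15     # annual bonus target as a fraction of base
rsu_grant = 80_000      # total RSU value at grant

years = 4
flat_vest = rsu_grant / years   # 25% per year -> identical equity each year

for year in range(1, years + 1):
    tc = base + base * bonus_target + flat_vest
    print(f"Year {year} target TC: ${tc:,.0f}")

# A backloaded 10/20/30/40 schedule on the same grant would instead vest
# $8k, $16k, $24k, $32k, so Year 1 TC would trail Year 4 by $24k.
```

With a flat schedule every year prints the same target number; only stock-price movement and bonus payout move the realized figure.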
No refresh grant details appear in any public source for Target's data science roles, which makes the offer stage your only window to ask about refresh cadence, size, and eligibility criteria. Push on this directly, because the answer determines whether your comp grows or stagnates between promotions.
The single biggest negotiation lever most candidates miss is level calibration. If your experience puts you on the boundary between two levels, anchoring the conversation on the higher level changes the entire comp structure (equity weight shifts dramatically at senior levels, as the widget shows). Beyond that, sign-on bonuses are a realistic ask when you're holding a competing offer from a company recruiting for similar applied ML work in recommender systems or search ranking. Bundle everything into one coordinated counter: level, base, sign-on, and relocation if you're moving to the Minneapolis area.
Target Data Scientist Interview Process
6 rounds · ~2 weeks end to end
Initial Screen
2 rounds
Recruiter Screen
First, a brief call focuses on role fit, location/remote expectations, timeline, and compensation guardrails. You’ll be asked to summarize your experience, the kinds of models/analyses you’ve shipped, and what you want next. Expect a few culture and collaboration questions framed around working with business partners in a retail environment.
Tips for this round
- Prepare a 60–90 second walkthrough that maps your last 1–2 projects to Target-like domains (pricing, inventory, personalization, fulfillment).
- Have a crisp answer for your preferred stack (Python, SQL, Spark) and the scale you’ve worked with (rows, partitions, latency).
- Bring 2–3 role-aligned keywords from the job description (forecasting, experimentation, optimization) and tie each to a measurable outcome.
- Clarify work authorization, location, and start date early to avoid late-stage friction.
- State a realistic compensation range with a rationale (leveling, geography, competing offers) and keep it consistent throughout the loop.
Hiring Manager Screen
Next comes a live video conversation with the hiring manager about what you’ve built end-to-end and how you choose approaches under constraints. The interviewer will probe your problem framing, how you translate business goals into metrics, and how you partner with engineering/analytics. You may also discuss why you chose specific modeling or experimentation methods and how you handled rollout risks.
Technical Assessment
3 rounds
SQL & Data Modeling
Expect a SQL-heavy session where you’re asked to query realistic retail tables (orders, items, customers, inventory, stores). You’ll write joins, window functions, and aggregations, then interpret the results to answer business questions. Data modeling judgment is often tested via schema assumptions, grain definitions, and how you handle missing/duplicate data.
Tips for this round
- Practice window functions (ROW_NUMBER, LAG/LEAD, rolling sums) and explain why you chose them vs. subqueries.
- Always state table grain before you code (e.g., order_id-item_id-day) to avoid double-counting; call out potential fanouts in joins.
- Use explicit date filters, CTEs, and naming to keep queries readable under time pressure.
- Discuss data quality checks (null handling, deduping keys, late-arriving facts) and how you’d validate outputs.
- Be ready to model a simple star schema for a metric (fact_sales + dim_date/store/product) and justify keys and partitions.
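As a quick refresher on the window functions these tips name, here is a minimal sketch using Python's bundled sqlite3 module (window functions need SQLite 3.25+, which ships with any recent Python); the `daily_sales` table and its values are invented for illustration:

```python
import sqlite3

# Minimal demo of LAG and a rolling-window SUM on an in-memory database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE daily_sales (store_id INT, dt TEXT, revenue REAL)")
con.executemany(
    "INSERT INTO daily_sales VALUES (?, ?, ?)",
    [(1, "2024-01-01", 100.0), (1, "2024-01-02", 150.0),
     (1, "2024-01-03", 120.0), (2, "2024-01-01", 80.0)],
)

rows = con.execute("""
    SELECT store_id, dt, revenue,
           LAG(revenue) OVER (PARTITION BY store_id ORDER BY dt) AS prev_revenue,
           SUM(revenue) OVER (PARTITION BY store_id ORDER BY dt
                              ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS rolling_2d
    FROM daily_sales
    ORDER BY store_id, dt
""").fetchall()

for r in rows:
    print(r)
```

Note how `PARTITION BY store_id` keeps the lag and the rolling sum from bleeding across stores, which is exactly the grain discipline the tips above ask you to narrate out loud.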
Machine Learning & Modeling
Then you’ll go through a modeling-focused interview covering algorithm selection, feature engineering, and evaluation for a Target-style use case. You may be asked to sketch a solution for forecasting demand, predicting churn, ranking recommendations, or optimizing fulfillment. The discussion typically includes how you’d validate offline, how you’d run an experiment, and what could go wrong in production.
Product Sense & Metrics
You’ll be given a business problem and asked to translate it into metrics, hypotheses, and an analytics plan. The interviewer is looking for crisp definitions (north star, leading indicators, guardrails) and how you’d debug metric movement. Expect follow-ups on segmentation, confounding factors in retail, and how you’d decide whether to ship or roll back.
Onsite
1 round
Behavioral
Finally, a more people-focused interview (often part of a virtual onsite) assesses collaboration style, ownership, and how you operate in a cross-functional team. You should expect questions about conflict, prioritization, influencing without authority, and learning from failure. Some candidates also encounter a one-way recorded video segment earlier in the process, but this live round is typically the deeper values-and-impact discussion.
Tips for this round
- Prepare 6–8 STAR stories covering disagreement, ambiguity, failure, high-impact delivery, mentoring, and stakeholder alignment.
- Quantify outcomes (revenue, cost, time saved, lift, SLA improvements) and your specific role vs. the team’s role.
- Demonstrate customer-first thinking with an example where you balanced model performance with user experience or fairness concerns.
- Show how you document and communicate: writeups, dashboards, experiment readouts, and decision memos.
- Have thoughtful questions about success in the first 90 days, partner teams (engineering, product, analytics), and how impact is measured.
Tips to Stand Out
- Prepare for one-way video questions. Many Target candidates report asynchronous online video prompts; practice concise 1–2 minute answers and keep a consistent structure (context → action → impact).
- Anchor everything in retail realities. Talk explicitly about seasonality, promos, inventory constraints, and omnichannel effects to show you won’t misread signals in messy commerce data.
- Lead with metrics and decisioning. For every project, state the business metric, what you changed, how you validated causally (A/B or quasi-experiment), and what decision was made.
- SQL fluency is table-stakes. Be fast with joins and windows, and always communicate grain and double-counting risks—this is a common differentiator.
- Show end-to-end pragmatism. Emphasize monitoring, drift checks, retraining cadence, and failure handling even if the role is not labeled “MLOps.”
- Communicate like a partner, not a model builder. Practice explaining tradeoffs to non-technical stakeholders and proposing simpler baselines when they’re sufficient.
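One concrete way to back up the "drift checks" point above: a population stability index (PSI) over model scores, a common monitoring heuristic. This is a from-scratch sketch; the quantile binning and the "investigate above 0.25" rule of thumb are industry conventions, not anything Target-specific:

```python
import math

def psi(expected, actual, bins: int = 10) -> float:
    """Population Stability Index between two 1-D score samples.

    Bin edges come from the 'expected' (training-time) sample, so
    production scores are always measured against the training distribution.
    """
    expected = sorted(expected)
    n = len(expected)
    edges = [expected[int(n * i / bins)] for i in range(1, bins)]

    def fractions(sample):
        counts = [0] * bins
        for x in sample:
            counts[sum(x > e for e in edges)] += 1  # bin index for x
        # Small floor avoids log(0) on empty bins.
        return [max(c / len(sample), 1e-6) for c in counts]

    e_frac, a_frac = fractions(expected), fractions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_frac, a_frac))

# Identical distributions score ~0; a shifted one trips the 0.25 threshold.
train_scores = [i / 1000 for i in range(1000)]
live_scores = [min(s + 0.3, 1.0) for s in train_scores]
print(round(psi(train_scores, train_scores), 4), round(psi(train_scores, live_scores), 4))
```

Being able to sketch something like this on a whiteboard signals the end-to-end pragmatism interviewers are probing for, even when the role is not labeled MLOps.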
Common Reasons Candidates Don't Pass
- ✗ Unclear problem framing. Candidates get rejected when they jump into modeling without defining the business objective, constraints, and success metrics in plain language.
- ✗ Weak SQL and data intuition. Struggling with joins/window functions or missing grain/deduping issues signals risk when working with large retail datasets.
- ✗ Evaluation mismatched to the use case. Using the wrong metric, ignoring seasonality/time splits, or failing to address leakage and offline/online gaps is a frequent blocker.
- ✗ Shallow experimentation thinking. Not being able to define guardrails, randomization units, or practical significance makes it hard to trust recommendations for shipping changes.
- ✗ Behavioral gaps in collaboration/ownership. Vague stories, unclear personal contribution, or difficulty handling ambiguity and stakeholder conflict can outweigh technical strength.
Offer & Negotiation
For Target data science roles, offers commonly blend base salary with an annual bonus target and, depending on level, equity/RSUs; higher levels tend to have more equity weighting with multi-year vesting. The most negotiable levers are base (within band), sign-on bonus, level/title calibration, and (when offered) equity refresh—use competing offers and a clear impact narrative to justify the ask. Ask for the full comp breakdown, bonus target %, vesting schedule, and any relocation/remote stipends, then negotiate in one coordinated counter rather than piecemeal changes.
Target's loop moves quickly, so you'll want to be prep-ready before that first recruiter call lands on your calendar. Walk in already able to connect your past work to their specific domains: fulfillment route optimization, Target Circle personalization, or search ranking with LLMs. From what candidates report, the ML & Modeling round tends to be the steepest hurdle, with interviewers pushing past textbook classification into production concerns like cold-start in recommender systems, feedback loops in ranking, and retraining cadence for demand forecasting models that face weekly promo cycles.
The rejection reasons in the data tell a clear story: unclear problem framing and shallow experimentation thinking sink candidates just as often as technical gaps. Vague behavioral answers are a real risk too. Interviewers score independently, so a strong modeling round won't paper over a collaboration story where you can't name the metric you moved or explain how you navigated pushback from a merchant partner on a pricing model rollout.
Target Data Scientist Interview Questions
Machine Learning & Predictive Modeling
Expect questions that force you to choose and evaluate models for retail/product problems (propensity, demand/forecasting, churn, ranking) under real constraints like leakage, drift, and imbalanced labels. You’ll be pushed on metric selection, validation strategy, and how to explain tradeoffs to non-technical partners.
You are building a weekly demand forecast for a specific Target store and DPCI using sales, price, promotions, and inventory. What validation scheme and features would you use to avoid leakage from stockouts and future promotions, and what metric would you report to the replenishment team?
Sample Answer
Most candidates default to random K-fold and throw in contemporaneous inventory and promo flags, but that fails here because it leaks future availability and planned events into the past. Use a time-based split with rolling or expanding windows, and treat stockouts as censored demand (for example, exclude those labels or model a stockout indicator separately). Report a scale-aware metric like WAPE, and segment it by velocity tier and by in-stock rate so the business can see where the model is actually usable.
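A minimal sketch of the rolling-origin scheme and WAPE metric described above, using an invented weekly demand series and a naive last-value forecast as a stand-in for the real model:

```python
def rolling_origin_splits(n_periods: int, train_min: int, horizon: int = 1):
    """Yield (train_idx, test_idx) pairs that only ever test on the future."""
    for end in range(train_min, n_periods - horizon + 1):
        yield list(range(end)), list(range(end, end + horizon))

def wape(actual, forecast):
    """Weighted absolute percentage error: sum(|error|) / sum(|actual|)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / sum(abs(a) for a in actual)

# Invented weekly demand for one store/DPCI; a naive last-value forecast
# stands in for the real model.
demand = [120, 130, 90, 160, 150, 170, 110, 140]

for train_idx, test_idx in rolling_origin_splits(len(demand), train_min=4):
    forecast = [demand[train_idx[-1]]] * len(test_idx)
    actual = [demand[i] for i in test_idx]
    print(f"train through week {train_idx[-1]}: WAPE = {wape(actual, forecast):.3f}")
```

The point of the structure: each fold trains strictly on the past and scores strictly on the future, which is what kills the leakage that random K-fold invites.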
You built a model to predict whether a guest will use a Target Circle offer in the next 7 days, with a 1% positive rate. Which evaluation metric and thresholding method do you use if marketing can only send 200,000 messages per day, and why?
You own a learning-to-rank model for Target.com search that reorders products for a query, and online conversion dropped after a seemingly better offline model shipped. How do you diagnose whether it is offline metric misalignment, position bias, or feature drift, and what would you change in training or evaluation?
Experimentation & A/B Testing
Most candidates underestimate how much rigor is expected in experiment design and readouts—power/MDE, guardrails, segmentation, and interpreting messy results. You’ll need to show you can turn a business question into a testable hypothesis and avoid common pitfalls (seasonality, novelty effects, interference).
Target wants to A/B test a new checkout UI on the app to reduce cart abandonment. What primary metric, guardrail metric, and minimum detectable effect (MDE) inputs do you define before launch, and why?
Sample Answer
Define abandonment rate as the primary metric, add guardrails like payment error rate and average order value, and set MDE using baseline rate, desired power, and significance level. Abandonment directly matches the business goal and is sensitive to checkout friction. Guardrails stop you from shipping a UI that “wins” by breaking payments or depressing basket size. MDE inputs come from historical baselines and business value, otherwise you underpower the test and learn nothing.
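The MDE inputs translate directly into a sample-size estimate. A sketch using the standard normal-approximation formula for a two-proportion test; the 70% baseline abandonment rate and 1-point MDE below are hypothetical:

```python
from math import ceil
from statistics import NormalDist

def required_n_per_arm(p0: float, mde_abs: float,
                       alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-arm sample size for a two-sided two-proportion z-test.

    Normal approximation: n = (z_{1-a/2} + z_{power})^2 * 2*p0*(1-p0) / MDE^2
    """
    z = NormalDist().inv_cdf
    variance = 2 * p0 * (1 - p0)
    return ceil((z(1 - alpha / 2) + z(power)) ** 2 * variance / mde_abs ** 2)

# Hypothetical inputs: 70% of carts abandoned at baseline, and we care
# about detecting a 1-percentage-point absolute drop.
n = required_n_per_arm(p0=0.70, mde_abs=0.01)
print(n)  # sessions needed in EACH arm before the test is adequately powered
```

Running the numbers before launch is the whole game: if the traffic you can get in two weeks falls short of `n` per arm, you either widen the MDE or extend the test, because shipping on an underpowered readout teaches you nothing.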
You run a 2-week A/B test on a personalized deal badge in search results and see a lift in click-through rate but no lift in purchases. How do you decide whether to ship, and what follow-up analysis do you run to rule out false positives and funnel shifts?
Target tests a new order pickup notification timing and randomizes at the user level, but many users share a household device and some users place multiple orders. How do you handle interference and repeated measures in the analysis so your $p$-values and confidence intervals are not fake?
SQL & Analytical Querying
Your ability to pull correct metrics from large retail event/transaction tables is a frequent differentiator in DS loops. Interviews often probe joins, window functions, funnel/retention logic, and building clean cohorts while keeping performance and data quality in mind.
Given tables orders(order_id, guest_id, order_ts, channel, store_id) and order_items(order_id, product_id, qty, unit_price), write SQL to return daily revenue and a 7-day rolling revenue by channel for the last 60 days.
Sample Answer
You could compute the rolling 7-day revenue with a window function over daily aggregates, or with a self-join of each day to the prior 6 days. The window approach wins here because it is simpler, less error-prone, and usually more efficient on large retail fact tables once you pre-aggregate to day and channel.
WITH params AS (
  SELECT
    DATEADD(day, -60, CURRENT_DATE) AS start_dt,
    CURRENT_DATE AS end_dt
),
-- Pre-aggregate to the grain needed for the window.
daily_channel_revenue AS (
  SELECT
    CAST(o.order_ts AS DATE) AS order_dt,
    o.channel,
    SUM(oi.qty * oi.unit_price) AS revenue
  FROM orders o
  JOIN order_items oi
    ON oi.order_id = o.order_id
  JOIN params p
    ON CAST(o.order_ts AS DATE) >= p.start_dt
   AND CAST(o.order_ts AS DATE) < p.end_dt
  GROUP BY
    CAST(o.order_ts AS DATE),
    o.channel
)
SELECT
  order_dt,
  channel,
  revenue AS daily_revenue,
  SUM(revenue) OVER (
    PARTITION BY channel
    ORDER BY order_dt
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS revenue_7d_rolling
FROM daily_channel_revenue
ORDER BY order_dt, channel;

You own Circle offer performance. Using orders(order_id, guest_id, order_ts, channel), compute the 30-day repeat rate by a guest's first purchase month for guests whose first-ever Target order used channel = 'APP' (repeat means at least one additional order within 30 days).
Target wants a weekly dashboard of out-of-stock impact. For each product_id, compute the weekly out_of_stock_rate and lost_sales_units using inventory_snapshots(store_id, product_id, snapshot_ts, on_hand_qty) and sales(store_id, product_id, sale_ts, units). Define out_of_stock_rate as the fraction of hourly snapshots with on_hand_qty = 0, and lost_sales_units as the average hourly units sold when in stock times out-of-stock hours.
Statistics & Probability Foundations
The bar here isn’t whether you can recite formulas, it’s whether you can reason from first principles about uncertainty and variance in real analyses. Be ready to connect distributions, sampling, confidence intervals, hypothesis tests, and bias/variance to practical decision-making.
Target runs 50-50 traffic to two PDP layouts, and Variant B shows a $0.3\%$ higher conversion rate on $n=2{,}000{,}000$ sessions, with baseline $p=0.06$. Is this likely to be practically meaningful, and what quick uncertainty check do you do without running a full test suite?
Sample Answer
Reason through it step by step, as if thinking out loud. Translate the $0.3\%$ figure into an absolute lift in probability, $\Delta p=0.003$, then compare it to the standard error for a proportion, $\mathrm{SE}\approx\sqrt{p(1-p)/n}$. With $p=0.06$ and $n=2{,}000{,}000$, $\mathrm{SE}$ is tiny, so almost any nonzero lift will be statistically detectable; this is exactly where most candidates go wrong. The practical check is to convert the lift to incremental orders and margin impact (lift times sessions times AOV or contribution margin) and sanity-check whether it clears business thresholds, not whether the $p$-value is small.
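The quick check in this answer is a few lines of arithmetic:

```python
from math import sqrt

# Back-of-envelope version of the check described above.
p = 0.06                 # baseline conversion rate
n_arm = 1_000_000        # roughly half of the 2,000,000 sessions per variant
delta = 0.003            # absolute lift in conversion probability

se = sqrt(2 * p * (1 - p) / n_arm)  # SE of the difference between two proportions
z = delta / se                      # ~9: trivially "significant" at this scale

# The check that actually matters: translate the lift into business units.
extra_orders = delta * n_arm        # incremental orders in the treated arm
print(f"SE ~ {se:.6f}, z ~ {z:.1f}, ~{extra_orders:.0f} extra orders in-test")
```

The z-score lands near 9, so statistical significance is a foregone conclusion; the decision hinges on whether those incremental orders, priced at AOV or contribution margin, clear the business bar.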
You are monitoring daily store-level shrink rate, and you want a 95% interval for the overall chain shrink rate when stores have very different transaction counts. How do you construct the interval, and what goes wrong if you just average store shrink rates equally?
A guest-level churn model is evaluated on a holdout set drawn from the last 2 weeks, and AUC drops from 0.78 offline to 0.70 in production. Statistically, how do you separate sampling noise from true drift, and what test or interval do you use?
Product Sense & Stakeholder Insights
You’ll likely be asked to frame ambiguous retail/product problems into measurable goals, leading indicators, and an analysis plan that would actually influence a roadmap. Strong answers clarify assumptions, define north-star + guardrail metrics, and anticipate how teams will act on the result.
Target sees higher same-day delivery conversion after adding a new "Get it today" badge on PDP for eligible items, but order cancellations also rise. What north-star metric and 3 guardrail metrics do you choose, and what cuts would you show to decide whether to roll out nationally?
Sample Answer
This question is checking whether you can turn a messy product change into metrics that drive a decision, not just a dashboard. You should pick a north-star tied to profit or durable customer value (for example contribution margin per visitor, or net completed orders per visitor), then guardrails that catch harm (cancellation rate, on-time delivery rate, contact rate, return rate, substitution rate). Cuts should map to mechanisms and risk, like fulfillment method (Shipt vs store pickup), promised delivery window, item size and perishability, and new vs returning guests. If you cannot explain what action each metric would trigger, you are not done.
Target wants to personalize the homepage modules in the app to increase digital sales, but store teams worry it will shift demand to items with low in-stock. How do you align stakeholders on success metrics, offline evaluation, and an online test design that prevents shipping stockouts to guests?
Engineering Practices for Data Science (Python)
Rather than trick algorithms, coding prompts tend to emphasize readable Python for data work—feature creation, metric computation, and defensive handling of edge cases. You’ll be evaluated on maintainability signals like decomposition, testing mindset, and avoiding subtle bugs.
You receive online order events for Target.com as dicts with keys {order_id, item_id, event_ts, qty, unit_price, promo_id}, including duplicates and occasional negative qty for cancellations; write Python to compute daily net revenue per item_id in UTC and return the top 10 item_ids by 7-day trailing revenue ending on a given date. Handle missing unit_price by skipping those events, and treat duplicates as exact same dict values.
Sample Answer
The standard move is to normalize timestamps to UTC date, deduplicate exact events, then aggregate net revenue as $\sum (\text{qty} \cdot \text{unit\_price})$ per item per day. But here, skipping missing prices and honoring negative quantities matters because returns and cancellations must reduce revenue, and bad rows should not silently poison totals. Use a 7-day rolling window on daily totals, then rank item_ids on the window ending at the target date. Keep the code modular so you can unit test parsing, deduping, aggregation, and ranking separately.
1from __future__ import annotations
2
3from dataclasses import dataclass
4from datetime import datetime, date, timedelta, timezone
5from typing import Any, Dict, Iterable, List, Optional, Tuple
6
7
8Event = Dict[str, Any]
9
10
11def _parse_utc_date(ts: Any) -> Optional[date]:
12 """Parse event_ts into a UTC date.
13
14 Supports:
15 - ISO 8601 strings, including trailing 'Z'
16 - datetime objects (naive treated as UTC)
17 - POSIX seconds as int/float
18 Returns None if unparseable.
19 """
20 if ts is None:
21 return None
22
23 if isinstance(ts, datetime):
24 dt = ts
25 if dt.tzinfo is None:
26 dt = dt.replace(tzinfo=timezone.utc)
27 else:
28 dt = dt.astimezone(timezone.utc)
29 return dt.date()
30
31 if isinstance(ts, (int, float)):
32 dt = datetime.fromtimestamp(float(ts), tz=timezone.utc)
33 return dt.date()
34
35 if isinstance(ts, str):
36 s = ts.strip()
37 # Support Zulu time
38 if s.endswith("Z"):
39 s = s[:-1] + "+00:00"
40 try:
41 dt = datetime.fromisoformat(s)
42 except ValueError:
43 return None
44
45 if dt.tzinfo is None:
46 dt = dt.replace(tzinfo=timezone.utc)
47 else:
48 dt = dt.astimezone(timezone.utc)
49 return dt.date()
50
51 return None
52
53
54def _is_exact_duplicate(event: Event) -> Tuple[Tuple[str, Any], ...]:
55 """Create a hashable signature for exact-duplicate removal.
56
57 Exact duplicate means identical key-value pairs.
58 Sorting makes dict ordering irrelevant.
59 """
60 return tuple(sorted(event.items(), key=lambda kv: kv[0]))
61
62
63def compute_top_items_7d_trailing_revenue(
64 events: Iterable[Event],
65 end_date_utc: date,
66 top_n: int = 10,
67) -> List[Tuple[Any, float]]:
68 """Return top items by 7-day trailing net revenue ending on end_date_utc.
69
70 Rules:
71 - Skip events with missing unit_price.
72 - Negative qty is allowed and reduces revenue.
73 - Duplicates are exact same dict values.
74 - Timestamps are normalized to UTC date.
75
76 Returns list of (item_id, trailing_7d_revenue) sorted descending by revenue.
77 """
78 # 1) Deduplicate
79 seen = set()
80 deduped: List[Event] = []
81 for e in events:
82 sig = _is_exact_duplicate(e)
83 if sig in seen:
84 continue
85 seen.add(sig)
86 deduped.append(e)
87
    # 2) Aggregate daily revenue per item_id
    #    key: (item_id, day) -> revenue
    daily_rev: Dict[Tuple[Any, date], float] = {}
    for e in deduped:
        day = _parse_utc_date(e.get("event_ts"))
        if day is None:
            continue
        item_id = e.get("item_id")
        if item_id is None:
            continue

        unit_price = e.get("unit_price")
        if unit_price is None:
            continue
        try:
            price = float(unit_price)
        except (TypeError, ValueError):
            continue

        qty = e.get("qty", 0)
        try:
            q = float(qty)
        except (TypeError, ValueError):
            continue

        rev = q * price
        daily_rev[(item_id, day)] = daily_rev.get((item_id, day), 0.0) + rev

    # 3) Compute trailing 7-day sum ending on end_date_utc (a set gives O(1) membership checks)
    window_days = {end_date_utc - timedelta(days=i) for i in range(7)}
    trailing: Dict[Any, float] = {}
    for (item_id, day), rev in daily_rev.items():
        if day in window_days:
            trailing[item_id] = trailing.get(item_id, 0.0) + rev

    # 4) Top N by revenue, descending
    ranked = sorted(trailing.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


# Optional quick self-check
if __name__ == "__main__":
    sample = [
        {"order_id": 1, "item_id": "A", "event_ts": "2026-02-20T10:00:00Z", "qty": 1, "unit_price": 10.0, "promo_id": None},
        {"order_id": 1, "item_id": "A", "event_ts": "2026-02-20T10:00:00Z", "qty": 1, "unit_price": 10.0, "promo_id": None},  # duplicate
        {"order_id": 2, "item_id": "A", "event_ts": "2026-02-22T10:00:00+00:00", "qty": -1, "unit_price": 10.0, "promo_id": None},
        {"order_id": 3, "item_id": "B", "event_ts": "2026-02-24T12:00:00Z", "qty": 2, "unit_price": 7.5, "promo_id": "P1"},
        {"order_id": 4, "item_id": "C", "event_ts": "bad_ts", "qty": 1, "unit_price": 999.0, "promo_id": None},
        {"order_id": 5, "item_id": "B", "event_ts": "2026-02-25T12:00:00Z", "qty": 1, "unit_price": None, "promo_id": None},  # skipped
    ]

    out = compute_top_items_7d_trailing_revenue(sample, end_date_utc=date(2026, 2, 26), top_n=10)
    print(out)
Target wants a reusable feature generator for store-level demand forecasting: given a pandas DataFrame with columns [store_id, item_id, ds (date), units] with one row per day, write Python to add lag features for units at 1, 7, and 14 days and a 7-day rolling mean, with no data leakage across store_id-item_id series. Also write 3 unit tests that would catch common leakage and alignment bugs.
The distribution skews hard toward applied modeling and experiment rigor, which means your ability to reason about retail-specific constraints (demand forecasting with leakage and drift, evaluating a Target Circle offer model at a 1% positive rate, designing guardrails for a checkout UI test) matters more than textbook recall. Where this gets tricky is the overlap between ML and experimentation: a question about a personalized deal badge in search results quickly becomes a question about whether a click-through lift without a purchase lift is a real win, forcing you to move fluidly between model evaluation and experiment interpretation. Candidates who drill SQL window functions but skip past ranking model validation or MDE reasoning for retail scenarios end up underprepared for the questions that carry the most weight.
Practice Target-style questions covering demand forecasting, Circle offer experimentation, and retail schema querying at datainterview.com/questions.
How to Prepare for Target Data Scientist Interviews
Know the Business
Official mission
“To help all families discover the joy of everyday life.”
What it actually means
Target aims to be a leading multi-channel retailer, providing affordable, convenient, and enjoyable shopping experiences for families. It also focuses on fostering a positive environment for its team members and contributing to the communities it serves.
Key Business Metrics
- $107B revenue (-1% YoY)
- $52B
Current Strategic Priorities
- Strengthen leadership as the destination for trend-forward products and everyday wellbeing
- Make wellness accessible (fun, easy, affordable, personalized)
- Make trend-driven, expert-backed beauty more accessible
- Refresh in-store beauty experience and host beauty events
Competitive Moat
Target is planning to drive more than $15 billion in sales growth by 2030, and their open DS roles map directly to those bets. Job postings call out recommender systems and reinforcement learning for MarTech, search ranking with NLP and LLMs, and last-mile supply chain optimization. Their engineering blog posts on infrastructure showback and platform engineering tell you something else: DS teams here are expected to care about production cost and system reliability, not just model accuracy in a notebook.
Most candidates fumble "why Target" by making it a brand loyalty story. What tends to land better is connecting your ML skills to a specific initiative, like the largest-ever spring beauty assortment with 60+ new brands, and naming the technical challenge it creates (cold-start recommendations for dozens of new brands launching simultaneously, for instance). Read the 2024 annual report, pick two initiatives you'd want to work on, and sketch out what modeling approach you'd try for each.
Try a Real Interview Question
7-day retention by first purchase date
Compute 7-day retention by first purchase date: for each customer's first purchase date d, count how many customers had any subsequent purchase in the window [d+1, d+7] and divide by the number of customers whose first purchase date is d. Output columns: first_purchase_date, cohort_size, retained_7d, retention_7d_rate.
| customer_id | order_id | order_date | order_total |
|---|---|---|---|
| 101 | 5001 | 2024-01-02 | 35.20 |
| 101 | 5002 | 2024-01-06 | 12.00 |
| 102 | 5003 | 2024-01-02 | 58.10 |
| 103 | 5004 | 2024-01-03 | 20.00 |
| 103 | 5005 | 2024-01-20 | 9.99 |
| customer_id | signup_date |
|---|---|
| 101 | 2023-12-15 |
| 102 | 2023-12-20 |
| 103 | 2023-12-28 |
| 104 | 2024-01-01 |
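One way the query could be sketched, runnable here against an in-memory SQLite copy of the sample data. The `orders` table name is an assumption from the column layout, and the `date(..., '+7 days')` arithmetic is SQLite-specific; in Hive/HQL or a warehouse dialect you would use the equivalent date-add function. Note the signup table is not needed, since retention is anchored to the first purchase.

```python
import sqlite3

QUERY = """
WITH firsts AS (
    SELECT customer_id, MIN(order_date) AS first_purchase_date
    FROM orders
    GROUP BY customer_id
),
retained AS (
    -- customers with any purchase in (first_purchase_date, first_purchase_date + 7]
    SELECT DISTINCT f.customer_id
    FROM firsts f
    JOIN orders o
      ON o.customer_id = f.customer_id
     AND o.order_date > f.first_purchase_date
     AND o.order_date <= date(f.first_purchase_date, '+7 days')
)
SELECT f.first_purchase_date,
       COUNT(*) AS cohort_size,
       COUNT(r.customer_id) AS retained_7d,
       ROUND(1.0 * COUNT(r.customer_id) / COUNT(*), 4) AS retention_7d_rate
FROM firsts f
LEFT JOIN retained r ON r.customer_id = f.customer_id
GROUP BY f.first_purchase_date
ORDER BY f.first_purchase_date;
"""

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer_id INT, order_id INT, order_date TEXT, order_total REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (101, 5001, "2024-01-02", 35.20),
    (101, 5002, "2024-01-06", 12.00),
    (102, 5003, "2024-01-02", 58.10),
    (103, 5004, "2024-01-03", 20.00),
    (103, 5005, "2024-01-20", 9.99),
])
for row in con.execute(QUERY):
    print(row)
# → ('2024-01-02', 2, 1, 0.5) then ('2024-01-03', 1, 0, 0.0)
```

On the sample data, customer 101 (first purchase 2024-01-02, repeat on 2024-01-06) is retained, 102 is not, and 103's second order falls outside the 7-day window, which matches the printed rates.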
Retail data science interviews tend to reward candidates who reason about data modeling choices (why this join, why this grain) alongside writing correct queries. From what candidates report, Target's SQL round leans into that pattern. Practice more retail-flavored problems at datainterview.com/coding.
Test Your Readiness
How Ready Are You for Target Data Scientist?
Sample question 1 of 10: Can you choose an appropriate model (for example logistic regression, gradient boosting, random forest) for a business goal like predicting guest churn, and explain the tradeoffs in interpretability, performance, and operational constraints?
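To rehearse that tradeoff concretely, here is a minimal sketch on synthetic, imbalanced data (scikit-learn assumed; real churn features would come from guest engagement history, not `make_classification`):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a guest-churn table: ~10% positive class,
# mimicking the rare-event rates common in retail retention problems.
X, y = make_classification(n_samples=4000, n_features=20, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

candidates = [
    ("logreg (interpretable coefficients, cheap to score)", LogisticRegression(max_iter=1000)),
    ("gbdt (usually stronger fit, harder to explain)", GradientBoostingClassifier(random_state=0)),
]
for name, model in candidates:
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC={auc:.3f}")
```

In the interview, the AUC comparison is only half the answer; the other half is stating when you would ship the weaker-but-interpretable model anyway (regulatory review, partner trust, latency or retraining constraints).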
Use datainterview.com/questions to pressure-test the areas where Target's posted roles put the most weight, particularly recommender systems, experimentation design, and causal inference.
Frequently Asked Questions
How long does the Target Data Scientist interview process take?
Most candidates report the process taking about 3 to 5 weeks from initial recruiter screen to offer. You'll typically start with a recruiter call, move to a technical phone screen, and then an onsite (or virtual onsite) loop. Target's recruiting team in Minneapolis tends to move at a steady pace, but holiday seasons and Q4 can slow things down since retail gets busy.
What technical skills are tested in the Target Data Scientist interview?
SQL is non-negotiable at every level. Beyond that, expect questions on production ML model development, big data tools like PySpark and Hive, and applied ML topics like recommender systems, search/ranking, or NLP depending on the team. They also care about software engineering practices like unit testing, CI/CD basics, and modular code design. Python is the primary language, though Scala, R, and Java may come up for specific teams.
What is the salary and total compensation for Target Data Scientists?
At the P2 (Junior) level, total comp averages around $115K with a base of $105K. P3 (Mid-level) jumps to roughly $170K total comp on a $145K base. Staff-level P4 roles average $280K total comp with a $185K base, and the range can stretch up to $350K. Equity vests on a standard 4-year schedule at 25% per year. These numbers reflect Minneapolis cost of living, which is lower than Bay Area or NYC, so the purchasing power is solid.
How should I prepare my resume for a Target Data Scientist role?
Lead with production ML experience. Target cares about ML Ops orientation, so highlight models you've deployed at scale, not just trained in a notebook. Mention specific big data tools (Hadoop, Hive, PySpark) and quantify business impact in retail-relevant terms like revenue lift, conversion rates, or customer engagement. If you've worked on recommender systems or search/ranking, put that front and center. A BS in a quantitative field is required, and an MS or PhD is preferred for modeling-heavy teams.
How hard are the SQL questions in Target Data Scientist interviews?
For P2 (Junior) candidates, SQL questions focus on fundamentals like joins, aggregations, and window functions over large datasets. At P3 and above, expect more complex data wrangling scenarios where you're querying at scale using SQL or HQL. The questions aren't trick questions, but they test whether you can work with messy, large-scale retail data efficiently. I'd recommend practicing on datainterview.com/coding to get comfortable with the style.
What ML and statistics concepts should I know for a Target Data Scientist interview?
Applied statistics and experimentation design come up at every level. For P2, know your basics: hypothesis testing, A/B testing, confidence intervals. At P3 and P4, you need end-to-end ML problem solving, including feature engineering, model evaluation, and error analysis. Staff and principal levels get into causal inference, production ML system design, and deep learning (PyTorch or TensorFlow for GenAI roles). The higher the level, the more they want to see you frame ambiguous problems into measurable hypotheses.
How do I prepare for the behavioral interview at Target?
Target's core values are Care, Grow, Win, along with ethical business practices and community responsibility. Prepare stories that show you collaborating with product and business partners, not just doing solo technical work. They want to see that you can translate business needs into data science solutions. Use the STAR format (Situation, Task, Action, Result) and keep answers under 2 minutes. I've seen candidates stumble by being too technical in behavioral rounds. Show that you're a team player who communicates clearly.
What happens during the Target Data Scientist onsite interview?
The onsite loop typically includes a SQL/coding round, a machine learning or statistics deep dive, a case-style analytics question with retail or customer metrics, and at least one behavioral round. At senior levels (P4, P5), expect a system design or end-to-end ML design session where you frame an ambiguous problem and propose a production-ready solution. You'll also be evaluated on how well you collaborate with cross-functional partners, so expect questions about working with product teams in agile settings.
What retail metrics and business concepts should I know for Target interviews?
Think like a retailer. Know customer lifetime value, conversion rates, basket size, same-store sales, and inventory turnover. Target is a $106.6B revenue multi-channel retailer, so understanding both in-store and online customer behavior matters. Be ready for case questions about product recommendations, pricing optimization, or customer segmentation. At P2, you might interpret a dashboard of retail KPIs. At P3 and above, you'll need to propose how to measure and improve those metrics using experimentation or ML.
What are common mistakes candidates make in Target Data Scientist interviews?
The biggest one I see is treating it like a pure tech company interview. Target wants applied, production-oriented data scientists who understand retail business context. Don't just talk about model accuracy, talk about business impact. Another mistake is ignoring the ML Ops angle. They specifically look for experience deploying models at scale with proper engineering practices. Finally, junior candidates sometimes skip the basics. Nail your SQL and statistics fundamentals before worrying about deep learning.
What level of education do I need for a Target Data Scientist position?
A BS in a quantitative field like CS, Statistics, Math, Engineering, or Economics is the baseline requirement across all levels. For P2 roles, an MS is common and preferred by some teams but not always required. At P3 and above, especially for modeling-heavy positions, an MS or PhD is often preferred. That said, strong equivalent industry experience can substitute. If you have 8+ years of applied ML work with a strong track record, a BS won't hold you back at the P4 or P5 level.
How should I practice coding for a Target Data Scientist interview?
Focus on Python and SQL since those are the two primary languages you'll be tested on. For Python, practice writing clean, modular code with proper documentation. Target values software engineering best practices, so sloppy scripts won't cut it. For SQL, work through problems involving large-scale data manipulation, joins on multiple tables, and window functions. I recommend the practice sets at datainterview.com/questions, which cover the kind of applied analytics problems Target likes to ask.




