Amazon Data Scientist Interview Guide

This guide was written with the help of data science interviewers of senior-level DS candidates at Amazon. While this guide is written for senior (L6+) DS candidates, much of the interviewing information is still relevant to applicants for other DS roles.

tl;dr

Amazon processes nearly 500k product orders per hour, and as of 2025, Amazon Prime boasts over 200 million worldwide subscribers. Branching beyond books—Amazon’s only product, originally—into being an “everything store” has led to Amazon handling nearly 40% of all U.S. e-commerce, while Amazon’s cloud computing service, AWS, hosts about a third of all traffic on the internet.

Data science interview loops at Amazon are fairly unstandardized across different orgs. Regardless of team or org within Amazon, data science interviewers are looking for candidates with experience with:

ML solutions for forecasting
A/B Testing
SQL queries
Designing, developing, deploying, and improving data-driven models and analytics solutions
Developing data pipelines
Mentoring more junior data scientists

Prepare for your upcoming interviews with Exponent’s Data Science Interview Course, featuring a comprehensive breakdown of popular data science interview questions as well as in-depth interview rubrics and answer frameworks.

What does an Amazon Data Scientist do?

Data science at Amazon is the bridge between Amazon’s technical innovations and its customers. The models and pipelines you’ll build and evaluate will provide insights impacting over 200 million worldwide subscribers, and empower the entire company’s leadership to make more informed decisions to support customer needs. In keeping with Amazon’s global reach, you’ll analyze massive data sets of both technical and business data, define and implement metrics, and forecast future solutions.

Much of the work you do will be with internal teams, extracting information from existing systems and pipelines within Amazon to create new analyses, build analytical models, and drive business intelligence engineering (or BIE)-related observations, bridging the gap between business and technical domains. As a senior data scientist, you’ll be the person driving the decisions around tools, methodology, and goals, as well as mentoring more junior DS team members. In fact, a lack of mentorship experience is one of the most common reasons candidates get down-leveled in this loop.

Amazon’s 16 Leadership Principles are the foundation of Amazon’s work culture, and are how you’ll be evaluated as an applicant. Expect to be asked questions explicitly tied to the principles at every stage of the interview process. Interviewers often down-level candidates with strong technical competency who aren’t up to scratch on the Leadership Principles.

The average total compensation across levels for data scientists at Amazon is:

DS 1 (aka L4): $176K
DS 2 (aka L5): $297K
DS 3 (aka L6): $393K
Principal DS (aka L7): $618K

Before you apply

Research recently asked interview questions at Amazon for senior data scientists
Brush up on your data science fundamentals.
Read Amazon’s annual report to get a sense of their future priorities
Familiarize yourself with Amazon’s Leadership Principles since you’ll be explicitly asked about them throughout the process
Prepare 5+ impact statements from your past projects

Interview process

Most interview processes within Amazon’s data science teams take 4 to 6 weeks, though more senior candidates should factor in a few extra weeks.

The main factors your interviewers will look for are your understanding of the Amazon Leadership Principles and your technical competency, which, depending on the team, could include:

Problem-solving
Data analysis and manipulation
Machine learning / AI
Coding

Amazon has no “cooling-off” period, meaning that if you get rejected for a role, you can interview with another team immediately. Or, you can even interview with multiple teams at the same time.

You’ll have three main steps between applying and getting your offer:

A 30-minute recruiter screen to get a sense of work history
A 45-minute technical screen to see how you handle SQL and statistics questions
A 5-hour day of onsite interviews, including:
- A data manipulation and scripting interview to go into greater depth on your technical skills
- A data science breadth interview to test your data science knowledge
- A data science depth interview to see how you explain a past project
- A hiring manager round to get your responses to behavioral questions
- A Bar Raiser interview to ensure Amazon doesn't make a bad hire

1. Recruiter screen

You’ll have at least one 30-minute phone screen at this stage, although some teams have two screens. Although it’s called a recruiter screen, your interviewer might be a recruiter or a hiring manager. Questions will be pretty standard for a typical recruiter round. One unique point is that Amazon recruiters are more aggressive when digging for compensation expectations. Unless you already have other competitive offers in hand, don’t say a number first.

Even though interviewers don’t report asking Leadership Principles-related questions at this stage, candidates who’ve gotten offers recently recommend weaving the principles into your answers at this stage, since all your interviewers share notes.

Some topics include:

Your previous work experience
Your location and sponsorship

Recent questions include:

Tell me about your resume.
What is your experience with SQL?
Why Amazon?

2. Technical screen

You’ll have at least one 45-minute technical screen, generally set up over Amazon Chime. All teams across data science at Amazon are able to pull questions about SQL from the same question bank, which you’ll be expected to solve in a shared notepad.

Depending on the team, most data scientists at Amazon write little code outside of SQL day-to-day. So, unlike interviews at Meta, which may be more algorithm-heavy, you typically won’t get many algorithm questions at this stage. (That said, if your team works heavily with large-scale user data, you can expect Python and PySpark questions at later stages.)

Stellar candidates can show practical experience setting up ML workflows on AWS infrastructure. Start with sharpening your skills with an ML Concepts course to prepare for commonly asked questions.

As well as SQL questions, you can expect to be asked about ML evaluation metrics (e.g., precision and AUC-ROC), building large-scale pipelines and analyzing their data, and probability and statistics. Under the heading of ML, recent interviewees were asked about AWS Step Functions, CloudFormation templates, and how to containerize models with Docker for Amazon ECS/EKS. Your interviewer may alternate practical “write me a query” or “design a pipeline to…” questions with more theoretical questions to assess whether you have experience with DS topics like ETL.
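To make the evaluation-metric questions concrete, here is a minimal sketch in plain Python of how precision, recall, and AUC-ROC can be computed by hand. The function names and the toy labels and scores are illustrative assumptions, not taken from any Amazon question; in practice you would likely reach for a library instead.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def auc_roc(y_true, scores):
    """AUC as the chance a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and model scores.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

precision, recall = precision_recall(y_true, y_pred)
auc = auc_roc(y_true, scores)
```

Note that precision and recall depend on the 0.5 threshold, while AUC-ROC only depends on how the scores rank positives against negatives. That distinction is a common follow-up.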

Your Amazon interviewers will be assessing your results more than your approach, especially on SQL questions. Either your queries will work or they won’t, and optimization isn’t necessary at this stage.

A common disqualifier at this stage is a lack of experience with window functions in SQL, so brush up on those.

Some topics include:

Window functions
Lag
Ranking functions
CTEs
Subqueries
Aggregation functions
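As a quick refresher on the topics above, the sketch below runs a query that combines a CTE, a ranking function, and LAG. It uses Python’s built-in sqlite3 module purely so the SQL is runnable; the sales table, its columns, and its rows are made up for illustration, and window functions assume the bundled SQLite is version 3.25 or newer.

```python
import sqlite3

# In-memory database with a small, made-up sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month TEXT, item TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2024-01", "kettle", 120.0),
        ("2024-01", "lamp", 300.0),
        ("2024-02", "kettle", 200.0),
        ("2024-02", "lamp", 150.0),
    ],
)

query = """
WITH ranked AS (                      -- CTE holding the window-function output
    SELECT
        month,
        item,
        revenue,
        -- rank items within each month, highest revenue first
        RANK() OVER (PARTITION BY month ORDER BY revenue DESC) AS rnk,
        -- the same item's revenue in the previous month (NULL if none)
        LAG(revenue) OVER (PARTITION BY item ORDER BY month) AS prev_revenue
    FROM sales
)
SELECT month, item, revenue, prev_revenue
FROM ranked
WHERE rnk = 1                         -- keep only the top item per month
ORDER BY month
"""
top_per_month = conn.execute(query).fetchall()
```

Filtering on the rank has to happen outside the CTE because window functions are evaluated after WHERE, which is a detail interviewers sometimes probe.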

Some teams also assign a take-home project at this stage, though this is reportedly rare, and generally reserved for candidates for AWS government operations teams. If you’re given one of these, your recruiter will send you a choice of three ML questions, and you’ll pick one and submit a solution.

Take-home topics include:

Classification
NLP problems
Computer vision

Recent questions include:

Given a retail data set, find the best-selling item for each month (no need to separate months by year) where the biggest total invoice was paid. The best-selling item is calculated using the formula (unitprice * quantity). Output the month, the description of the item, along with the amount paid.
Find the cumulative sum of the top 10 most profitable products of the last 6 months for customers in Seattle.
When users are navigating through the Amazon website, they are performing several actions. What is the best way to model if their next action would be a purchase?
Estimate the disease probability in one city given the probability is very low nationwide. If you randomly asked 1000 people in this city, with all negative responses (i.e., no disease), then what is the probability of disease in this city?
Given a set of user data, write a SQL query to explain the month-to-month user retention rate.
Name some common window functions.
There are 4 red balls and 2 blue balls; what's the probability of them not being the same in 2 picks?
How do you inspect missing data, and when are they important?
You are asked to reduce delivery delays in a specific geography. How would you apply statistical analysis and machine learning to identify root causes?
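For the red and blue ball question above, one way to check your answer is to enumerate every possible pair. This sketch assumes the two picks are made without replacement, which the question doesn’t state; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# 4 red and 2 blue balls, drawn without replacement (an assumption).
balls = ["red"] * 4 + ["blue"] * 2

# Every unordered pair of distinct balls is equally likely.
pairs = list(combinations(range(len(balls)), 2))
mixed = sum(1 for i, j in pairs if balls[i] != balls[j])
p_different = Fraction(mixed, len(pairs))

# Closed form: red then blue, plus blue then red.
p_closed = Fraction(4, 6) * Fraction(2, 5) + Fraction(2, 6) * Fraction(4, 5)
```

Both approaches give 8/15. If the balls were replaced between picks, the answer would instead be 2 × (4/6) × (2/6) = 4/9, so it’s worth clarifying the setup with your interviewer before you start.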

3. Onsite interviews

Many recent interviewees not located near an office report that this 5-hour day was scheduled remotely, but if you live near one of Amazon’s office locations, you can expect to come in.

You’ll have five rounds of interviews, each lasting around an hour, which will be one-on-one interviews with a mix of people from the team you're applying to join, your hiring manager, and in the Bar Raiser, a senior interviewer not from your team.

Data manipulation and scripting

This round will feature three SQL questions, escalating in difficulty, all more difficult than the ones you faced in the tech screen. Depending on your team, you may also get an optional second set of domain-specific questions. For example, a team that works on delivery logistics is likely to ask you to design algorithms for graph traversal problems, like route optimization and dependency resolution for warehouse networks.

If your team is particularly data-heavy (for example, working on video or AWS) you can also expect to see Python, PySpark, and PyTorch questions.

You’ll also be offered two bonus questions if you finish your main questions, though it’s not expected that you’ll get to them, and interviewers say it’s extremely rare that candidates complete both bonuses.

As you work through your more practical code- and data-related problems, you’ll also be asked more abstract, theoretical questions about imbalanced-data scenarios (which are common for data scientists at Amazon) as well as about your experience with techniques like SMOTE, undersampling, and cost-sensitive learning.
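Two of the techniques named above can be sketched in a few lines of standard-library Python: random undersampling of the majority class, and class weights for cost-sensitive learning. The helper names and the 95/5 toy data are assumptions made for illustration, and the weighting rule shown is the common "balanced" heuristic (n_samples / (n_classes × class_count)), not an Amazon-specific method. SMOTE is not shown, since it synthesizes new minority samples rather than resampling existing ones.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows so every class is as small as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical data set: 95 negatives, 5 positives.
labels = [0] * 95 + [1] * 5
rows = list(range(100))

weights = balanced_class_weights(labels)
sampled_rows, sampled_labels = random_undersample(rows, labels)
```

Undersampling throws data away, while class weights keep every row but change the loss. Being able to explain that tradeoff matters more than the code itself.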

A recent data science hiring trend at Amazon is emphasizing explainability in answers, so be ready to discuss frameworks like SHAP, LIME, and what techniques you use to justify your model decisions at this stage in the interview.

Some topics include:

Aggregation
Self joins
Time intervals
CTE and multiple joins

Recent questions include:

How do you use PySpark on a large data set?
Given a table with three columns (id, category, value), where each id has 3 or fewer categories (price, size, color), how can you find the ids for which the values of 2 or more categories match one another?
Given a .csv file with id and quantity columns, 50 million records, and a total size of 2 GB, write a program to aggregate the quantity column.
Write a Python method for recognizing if entries to a list have the same characters or not. What is its computational complexity?
You have an array of integers and you want to find a certain element; what effective algorithm would you use and what is its efficiency?
For a long sorted list and a short (4-element) sorted list, what algorithm would you use to search the long list for the 4 elements?
Given an unfair coin with the probability of heads not equal to .5, what algorithm could you use to create a list of random 1s and 0s?
There's a need for A/B testing of green vs yellow colors for the buy button. How would you design this experiment?
Say you want to predict how many units of an item you want to stock in a grocery store each day for the next week. How would you go about doing that?
How do you handle categorical variables?
Design an experiment to evaluate the success of Prime Video recommendations, ensuring both user satisfaction and business KPIs.
How do you interpret OLS regression results?
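The unfair coin question above has a classic answer, often attributed to von Neumann: flip twice, keep the first flip if the two differ, and discard the pair otherwise. Here is a minimal sketch; the function names and the 0.8 bias are illustrative assumptions, and it requires the probability of heads to be strictly between 0 and 1.

```python
import random


def biased_flip(p_heads, rng):
    """Simulate one flip of an unfair coin: 1 for heads, 0 for tails."""
    return 1 if rng.random() < p_heads else 0


def fair_bit(p_heads, rng):
    """Flip in pairs until the two flips differ, then return the first flip.

    Heads-tails and tails-heads both occur with probability p * (1 - p),
    so the returned bit is 1 or 0 with equal probability.
    """
    while True:
        first = biased_flip(p_heads, rng)
        second = biased_flip(p_heads, rng)
        if first != second:
            return first


rng = random.Random(42)  # seeded so the run is repeatable
bits = [fair_bit(0.8, rng) for _ in range(10_000)]
share_of_ones = sum(bits) / len(bits)
```

The cost is efficiency: the more biased the coin, the more pairs get discarded before a usable bit comes out, which is a natural follow-up to raise yourself.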

DS breadth screen

This round assesses your general understanding of a wide variety of data science topics. Typically, this interview is split into two sections. The first is a handful of questions about machine learning. The second section is one large case study. Unlike companies like Meta, which base all their case studies around real-world problems they’ve faced, the case study scenario you’ll be given might not be relevant to Amazon at all.

Your interviewer will give you a prompt and then ask you to talk them through your process end-to-end, including generating features, selecting and reviewing models (and explaining your selection), defining metrics, and making recommendations for next steps. Throughout your answers, your interviewer will also weave in technical questions (“what framework would you use…”) and Amazon Leadership Principle questions (“tell me about a time you’ve done this work previously...”).

Sample topics include:

Regularization
L1 vs L2 regularization
Bagging vs gradient boosting models
Weight assignment
Clustering

Recent questions include:

Case study: predict housing prices end-to-end, including feature generation, model selection and review, and metrics definition, with technical questions woven in.
How would changing the cost of Prime membership fees affect the market?
What is L1 vs L2 regularization?
What is the difference between linear regression and a t-test?
What is bootstrapping?
How do you interpret logistic regression?
What is the difference between bagging and boosting?
If there is a defective/unsafe product on Amazon, how would you identify it?
What does it mean to implement k-means clustering?
What are common linear regression problems?
Is random weight assignment better than assigning same weights to the units in the hidden layer?
Given a bar plot, imagine you are pouring water from the top; how would you quantify how much water can be kept in the bar chart?
How would you improve a classification model that suffers from low precision?
We have two models, one with 85% accuracy, one 82%. Which one do you pick and why?
How do you inspect missing data, and when are they important?
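The bar plot question above is usually read as the classic "trapping rain water" problem: the water above each bar is limited by the shorter of the tallest bars to its left and right. Below is a sketch of the common two-pointer solution under that reading; the function name and sample heights are assumptions for illustration.

```python
def trapped_water(heights):
    """Return the units of water held between bars, in O(n) time and O(1) space."""
    left, right = 0, len(heights) - 1
    left_max = right_max = 0
    total = 0
    while left < right:
        if heights[left] < heights[right]:
            # The right side has a bar at least this tall, so left_max is the limit.
            left_max = max(left_max, heights[left])
            total += left_max - heights[left]
            left += 1
        else:
            # Symmetric case: the left side is at least as tall, so right_max limits.
            right_max = max(right_max, heights[right])
            total += right_max - heights[right]
            right -= 1
    return total


example = trapped_water([0, 1, 0, 2, 1, 0, 1, 3, 2, 1, 2, 1])  # 6 units
```

A simpler first answer is to precompute the running maximum from each side in two passes, then mention the two-pointer version as the space optimization.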

DS depth screen

This project deep-dive assesses your previous history and previous business impact, and your ability to recognize and speak about it. You’ll be asked to go through one project in detail, explaining your actions and contributions from end to end.

The most common disqualifying mistake at this stage is not talking clearly about the measurable impacts of your project on your organization, and one of the easiest metrics to prove impact is revenue. If your project has had an org-wide monetary impact, make sure to mention it here.

Candidates for senior-level roles can be down-leveled at this stage if they’re not able to show that they were the senior data scientist on the team (i.e., insufficient leadership) or if they don’t show that they were thinking about next steps. Good questions to think about as you prepare are “what did I do to make sure this issue didn’t happen again?” and “what guidelines did I put in place for future improvements?” Amazon interviewers want to see not only that you’re doing your job here and now, but that you’re also making repeatable mechanisms for the future.

While Amazon interviewers want to hear the dollar impact of your project, if your project’s organizational impact isn’t at a scale comparable to Amazon’s, you can still use it. Instead of sharing absolute numbers, use percentages, orders of magnitude improvement, or something like “my automation allowed us to do this process 2x or 3x faster.”

Another key factor in leveling at this stage is how you talk about your project’s scope and ambiguity. Senior data scientists can scope and solve ambiguous problems themselves, determine tasks, and lead end-to-end solutions, so make sure you find a project that showcases you taking the lead. Exceptional candidates can also speak to their timeline management, the drawbacks and tradeoffs of their potential solutions, and what methods of communication they used to earn the trust of leadership as they delivered results.

Interviewers stress that at this stage, great candidates also know what’s relevant, and stay on-topic. You don’t have to go too deeply into the fine technical details of your data set and tooling if you can clearly describe how you’re picking the correct model and how it improved performance.

Recent questions include:

Tell me about a recent project you executed on end-to-end.
What were some areas of ambiguity you encountered while working on this project? How did you address them?
How did you communicate goals, needs, and updates on this project to non-technical business stakeholders?

Hiring manager round

Don’t let this round fool you—just because your interviewer is asking traditionally “behavioral” interview questions does not mean they don’t want you to talk about previous projects. Your interviewer will want your answers to go into almost as much depth as you did during the DS depth screen. You’ll be expected to tie impactful projects and previous experience into behavioral answers, and candidates who get the highest recommendations at this stage can also weave in the Amazon Leadership Principles.

Since all your interviewers communicate and share notes throughout the loop, it’s highly strategic to differentiate which projects you use as answers to interview questions. If you share the same project the same way (i.e., emphasizing the same aspects or techniques) multiple times, your interviewers will assume you lack experience, and you may get down-leveled, or even rejected. Plan to talk about at least 3 projects throughout all the rounds, but more is better.

Recent questions include:

Tell me about a recent project you worked on that required you to simplify something abstract for stakeholders.
What have you done when the deadline given for a project was earlier than expected? How did you deal with the change in time frame, and what were the results?
How would you improve a previous project if you had more time?
Tell me about a time you used data to come up with data-driven statistics, and how you presented your findings.

Bar Raiser interview

The Bar Raiser round is often the toughest round in the loop because the interviewers are well-trained, fast-moving, and ruthless about deep-diving for signals on the LPs. They can be challenging personalities who won’t hesitate to interrupt you if you’re off track. And like all Amazon interviewers, they’re assigned 1 to 3 of Amazon’s Leadership Principles to assess you on.

You can identify the Bar Raiser because they are the one outsider in your loop–they’re the only interviewer who has nothing to do with the team you’re interviewing with.

Recent questions include:

Tell me about a project you worked on that was not successful. What would you do differently?
Tell me about a time you applied judgment to a decision when data was not available.
Why data science?
Tell me about your most significant accomplishment.
Describe the last time you figured out a way to keep an approach simple or to save on expenses.
Describe a situation when you disagreed with your manager, and how you handled that.
How have you balanced data insights with leadership instincts when driving critical team decisions?

Additional resources

Take Exponent’s Data Science course to level up your probability and statistics skills and practice for your SQL interview
Get coaching for actionable feedback from senior data scientists at Amazon
Prep with mock interviews on the most commonly asked engineering questions

FAQs about the senior DS interview at Amazon

How should I prepare for a data science interview at Amazon?

Research recently asked interview questions at Amazon for senior data scientists
Brush up on DS and SQL interview skills.
Read Amazon’s annual report to get a sense of their future priorities
Familiarize yourself with Amazon’s Leadership Principles since you’ll be explicitly asked about them throughout the process

How much do Amazon Senior Data Scientists make?

The average total compensation across levels for senior data scientists at Amazon is:

DS 1 (aka L4): $176K
DS 2 (aka L5): $297K
DS 3 (aka L6): $393K
Principal DS (aka L7): $618K

How long is the Amazon Senior Data Scientist interview process?

Typically, the interview process for data scientist roles at Amazon lasts between 4 and 6 weeks, but more senior roles are often closer to 8 weeks.

Does Amazon have an RTO policy?

Currently, Amazon does have a full return-to-office policy in place, though you may have some flexibility depending on the capacity of your nearest office location. Check in with your recruiter if you get an offer, and read job listings carefully.

If I get rejected, how long should I wait before re-applying?

Amazon has a “cooling-off” period of at least six months for candidates who do well on their onsite interviews, so if you get rejected, wait at least that long before applying to a similar role, and longer if you get any negative feedback on your interview performance.
