Skip to main content

Amazon Data Scientist Interview Questions

Review this list of 21 Amazon Data Scientist interview questions and answers verified by hiring managers and candidates.
  • Amazon logoAsked at Amazon 
    125 answers
    Video answer for 'Tell me about yourself.'
    +117

    "As you know, this is the most important question for any interview. Here is a structure I like to follow, Start with 'I'm currently a SDE/PM/TPM etc with XYZ company.... ' Mention how you got into PM/TPM/SDE field (explaining your journey) Mention 1 or 2 accomplishments Mention what you do outside work (blogging, volunteer etc) Share why are you looking for a new role Ask the interviewer if they have any questions or will like to dive deep into any of your experience"

    Bipin R. - "As you know, this is the most important question for any interview. Here is a structure I like to follow, Start with 'I'm currently a SDE/PM/TPM etc with XYZ company.... ' Mention how you got into PM/TPM/SDE field (explaining your journey) Mention 1 or 2 accomplishments Mention what you do outside work (blogging, volunteer etc) Share why are you looking for a new role Ask the interviewer if they have any questions or will like to dive deep into any of your experience"See full answer

    Data Scientist
    Behavioral
    +14 more
  • Amazon logoAsked at Amazon 
    28 answers
    Video answer for 'Tell me about a time when you solved a complex problem and how you went about it.'
    +12

    "I work at a startup that makes software for Law Enforcement and the FBI. Our product analyzes calls being made by prison inmates and "listens" for predictors of violence and criminal behavior. Our clients are some of the top state prisons in the country. Recently one of the largest states in the country decided to evaluate our product for their prison system. I demo'd the product to the officers and they seemed to like everything. During the presentation they asked us if the product was ADA com"

    Aabid S. - "I work at a startup that makes software for Law Enforcement and the FBI. Our product analyzes calls being made by prison inmates and "listens" for predictors of violence and criminal behavior. Our clients are some of the top state prisons in the country. Recently one of the largest states in the country decided to evaluate our product for their prison system. I demo'd the product to the officers and they seemed to like everything. During the presentation they asked us if the product was ADA com"See full answer

    Data Scientist
    Behavioral
    +6 more
  • Amazon logoAsked at Amazon 
    60 answers
    Video answer for 'What is the project you are most proud of?'
    +53

    "I was working for my friend building streams at venues across the Chicago land area for FGC (fighting game tournaments), I adjusted and engineered his equipment to be set up permanently that's until covid came around at least. I used OBS to give visual appearances to stream watchers. So we're talking about subscribe, follow, and donation notifications and things of that nature for viewers to know they contributed in one of those ways. I set up proper sign-up scheduling for participants to lock t"

    Ayinde B. - "I was working for my friend building streams at venues across the Chicago land area for FGC (fighting game tournaments), I adjusted and engineered his equipment to be set up permanently that's until covid came around at least. I used OBS to give visual appearances to stream watchers. So we're talking about subscribe, follow, and donation notifications and things of that nature for viewers to know they contributed in one of those ways. I set up proper sign-up scheduling for participants to lock t"See full answer

    Data Scientist
    Behavioral
    +13 more
  • Amazon logoAsked at Amazon 
    10 answers
    Video answer for 'What product that you led are you most proud of and why?'
    +6

    "I was working for an Insurance company and I was the lead PM responsible for delivering HIPAA 5010 835 (statement of bill claims received vs paid) transactions to healthcare providers/hospitals For this initiative as a PM, I was responsible to create and prioritize features around file creation. I collaborated with cross-functional teams, engineers, DevOps. First I conducted sessions with claims adjudication, SFTP file transfer, security, and privacy, clearinghouse, third-party vendors and ensu"

    S P. - "I was working for an Insurance company and I was the lead PM responsible for delivering HIPAA 5010 835 (statement of bill claims received vs paid) transactions to healthcare providers/hospitals For this initiative as a PM, I was responsible to create and prioritize features around file creation. I collaborated with cross-functional teams, engineers, DevOps. First I conducted sessions with claims adjudication, SFTP file transfer, security, and privacy, clearinghouse, third-party vendors and ensu"See full answer

    Data Scientist
    Behavioral
    +1 more
  • Amazon logoAsked at Amazon 
    13 answers
    Video answer for 'Tell me about a time when you proposed an idea that was not agreed on.'
    +10

    "Disagreement --> persistent ---> more data insights ---> positive relationship ---> mentor/trust"

    Sam - "Disagreement --> persistent ---> more data insights ---> positive relationship ---> mentor/trust"See full answer

    Data Scientist
    Behavioral
    +2 more
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • Amazon logoAsked at Amazon 
    54 answers
    Video answer for 'Tell me about a skill you recently learned.'
    +47

    "What are they looking for in the answer? "

    Astro S. - "What are they looking for in the answer? "See full answer

    Data Scientist
    Behavioral
    +1 more
  • Amazon logoAsked at Amazon 
    11 answers
    +8

    "In my time at Snapp! I was in charge of communicating the product backlog to our CEO. We had a shared Jira board that he had access to and I made specifically for him. One day he saw me in the office and said he doesn’t know anything about our backlog and that’s because I failed to communicate with him. I got upset at first because of the fact that I made the dashboard exclusively for him. But I tried to ask questions to understand his point of view in depth. He then mentioned he doesn't have t"

    Ra R. - "In my time at Snapp! I was in charge of communicating the product backlog to our CEO. We had a shared Jira board that he had access to and I made specifically for him. One day he saw me in the office and said he doesn’t know anything about our backlog and that’s because I failed to communicate with him. I got upset at first because of the fact that I made the dashboard exclusively for him. But I tried to ask questions to understand his point of view in depth. He then mentioned he doesn't have t"See full answer

    Data Scientist
    Behavioral
    +9 more
  • Amazon logoAsked at Amazon 
    3 answers

    "Context - I joined a large public-facing service as a PM midway through its development. Situation - Due to the SOPs of the company the team already had a metrics framework. That included your standard DAUs, Retention and Acquisition Metrics Concern - As SOP metrics were publicly accepted, the team did not internalise what success actually means to the product they are developing, as the actual value was not being encapsulated in the framework This was evident in the show and tells as the"

    Umang S. - "Context - I joined a large public-facing service as a PM midway through its development. Situation - Due to the SOPs of the company the team already had a metrics framework. That included your standard DAUs, Retention and Acquisition Metrics Concern - As SOP metrics were publicly accepted, the team did not internalise what success actually means to the product they are developing, as the actual value was not being encapsulated in the framework This was evident in the show and tells as the"See full answer

    Data Scientist
    Analytical
    +2 more
  • Amazon logoAsked at Amazon 
    66 answers
    Video answer for 'Move all zeros to the end of an array.'
    +61

    "Initialize left pointer: Set a left pointer left to 0. Iterate through the array: Iterate through the array from left to right. If the current element is not 0, swap it with the element at the left pointer and increment left. Time complexity: O(n). The loop iterates through the entire array once, making it linear time. Space complexity: O(1). The algorithm operates in-place, modifying the input array directly without using additional data structures. "

    Avon T. - "Initialize left pointer: Set a left pointer left to 0. Iterate through the array: Iterate through the array from left to right. If the current element is not 0, swap it with the element at the left pointer and increment left. Time complexity: O(n). The loop iterates through the entire array once, making it linear time. Space complexity: O(1). The algorithm operates in-place, modifying the input array directly without using additional data structures. "See full answer

    Data Scientist
    Data Structures & Algorithms
    +4 more
  • Amazon logoAsked at Amazon 
    9 answers
    +6

    "DFS with check of an already seen node in the graph would work from collections import deque, defaultdict from typing import List def iscourseloopdfs(idcourse: int, graph: defaultdict[list]) -> bool: stack = deque([(id_course)]) seen_courses = set() while stack: print(stack) curr_course = stack.pop() if currcourse in seencourses: return True seencourses.add(currcourse) for dependency in graph[curr_course]: "

    Gabriele G. - "DFS with check of an already seen node in the graph would work from collections import deque, defaultdict from typing import List def iscourseloopdfs(idcourse: int, graph: defaultdict[list]) -> bool: stack = deque([(id_course)]) seen_courses = set() while stack: print(stack) curr_course = stack.pop() if currcourse in seencourses: return True seencourses.add(currcourse) for dependency in graph[curr_course]: "See full answer

    Data Scientist
    Data Structures & Algorithms
    +4 more
  • Amazon logoAsked at Amazon 
    2 answers
    Video answer for 'Implement k-means clustering.'

    "at first I want to know number of cluster I will put random number if I don't know and I will use method called Elbow method or Silhouette Score ,Gap Statistic and Davies–Bouldin Index to know the best number of cluster and I will use scikit-learn library to import kmeans from sklearn.cluster import KMeans kmeans = KMeans(nclusters=2, randomstate=0) kmeans.fit(X) and X this my data "

    Taheia S. - "at first I want to know number of cluster I will put random number if I don't know and I will use method called Elbow method or Silhouette Score ,Gap Statistic and Davies–Bouldin Index to know the best number of cluster and I will use scikit-learn library to import kmeans from sklearn.cluster import KMeans kmeans = KMeans(nclusters=2, randomstate=0) kmeans.fit(X) and X this my data "See full answer

    Data Scientist
    Analytical
    +5 more
  • Amazon logoAsked at Amazon 
    52 answers
    +48

    " Brute Force Two Pointer Solution: from typing import List def two_sum(nums, target): for i in range(len(nums)): for j in range(i+1, len(nums)): if nums[i]+nums[j]==target: return [i,j] return [] debug your code below print(two_sum([2, 7, 11, 15], 9)) `"

    Ritaban M. - " Brute Force Two Pointer Solution: from typing import List def two_sum(nums, target): for i in range(len(nums)): for j in range(i+1, len(nums)): if nums[i]+nums[j]==target: return [i,j] return [] debug your code below print(two_sum([2, 7, 11, 15], 9)) `"See full answer

    Data Scientist
    Data Structures & Algorithms
    +5 more
  • Amazon logoAsked at Amazon 
    5 answers
    +2

    "Situation: COVID has impacted everyone's lives, especially small businesses. Earlier this year, during the second lockdown in Malaysia, it was estimated that 50%-70% of small businesses have closed. It got me thinking, beyond the existing training programmes, what can my company do to support small businesses? Task: So, I took the initiative to gather our Comms and Government Affairs team, to work together and explore how we can: 1) meaningfully demonstrate our company's commitment in"

    Judy W. - "Situation: COVID has impacted everyone's lives, especially small businesses. Earlier this year, during the second lockdown in Malaysia, it was estimated that 50%-70% of small businesses have closed. It got me thinking, beyond the existing training programmes, what can my company do to support small businesses? Task: So, I took the initiative to gather our Comms and Government Affairs team, to work together and explore how we can: 1) meaningfully demonstrate our company's commitment in"See full answer

    Data Scientist
    Behavioral
    +1 more
  • Amazon logoAsked at Amazon 
    6 answers

    "1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"

    Erjan G. - "1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"See full answer

    Data Scientist
    Coding
    +4 more
  • Amazon logoAsked at Amazon 
    18 answers
    Video answer for 'Given an nxn grid of 1s and 0s, return the number of islands in the input.'
    +15

    " from typing import List def getnumberof_islands(binaryMatrix: List[List[int]]) -> int: if not binaryMatrix: return 0 rows = len(binaryMatrix) cols = len(binaryMatrix[0]) islands = 0 for r in range(rows): for c in range(cols): if binaryMatrixr == 1: islands += 1 dfs(binaryMatrix, r, c) return islands def dfs(grid, r, c): if ( r = len(grid) "

    Rick E. - " from typing import List def getnumberof_islands(binaryMatrix: List[List[int]]) -> int: if not binaryMatrix: return 0 rows = len(binaryMatrix) cols = len(binaryMatrix[0]) islands = 0 for r in range(rows): for c in range(cols): if binaryMatrixr == 1: islands += 1 dfs(binaryMatrix, r, c) return islands def dfs(grid, r, c): if ( r = len(grid) "See full answer

    Data Scientist
    Data Structures & Algorithms
    +4 more
  • Amazon logoAsked at Amazon 
    3 answers

    "SQL databases are relational, NoSQL databases are non-relational. SQL databases use structured query language and have a predefined schema. NoSQL databases have dynamic schemas for unstructured data. SQL databases are vertically scalable, while NoSQL databases are horizontally scalable."

    Ali H. - "SQL databases are relational, NoSQL databases are non-relational. SQL databases use structured query language and have a predefined schema. NoSQL databases have dynamic schemas for unstructured data. SQL databases are vertically scalable, while NoSQL databases are horizontally scalable."See full answer

    Data Scientist
    Concept
    +7 more
  • Amazon logoAsked at Amazon 
    2 answers
    Video answer for 'What are common linear regression problems?'

    "I can try to summarize their discussion as I remembered. Linear regression is one of the method to predict target (Y) using features (X). Formula for linear regression is a linear function of features. The aim is to choose coefficients (Teta) of the prediction function in such a way that the difference between target and prediction is least in average. This difference between target and prediction is called loss function. The form of this loss function could be dependent from the particular real"

    Ilnur I. - "I can try to summarize their discussion as I remembered. Linear regression is one of the method to predict target (Y) using features (X). Formula for linear regression is a linear function of features. The aim is to choose coefficients (Teta) of the prediction function in such a way that the difference between target and prediction is least in average. This difference between target and prediction is called loss function. The form of this loss function could be dependent from the particular real"See full answer

    Data Scientist
    Analytical
    +2 more
  • Amazon logoAsked at Amazon 
    2 answers

    "Law is my passion. Traveling all over the world in 5 years"

    Moshe S. - "Law is my passion. Traveling all over the world in 5 years"See full answer

    Data Scientist
    Behavioral
    +4 more
  • Amazon logoAsked at Amazon 
    2 answers

    "1) Have a common goal 2) Have a clear and fair accountability between teams 3) Ensure conflicts are resolved in time on common issues 4) Promote common Brain-storming , problem solving sessions 5) Most important , Have clear and effective communication established and practised"

    Saurabh N. - "1) Have a common goal 2) Have a clear and fair accountability between teams 3) Ensure conflicts are resolved in time on common issues 4) Promote common Brain-storming , problem solving sessions 5) Most important , Have clear and effective communication established and practised"See full answer

    Data Scientist
    Behavioral
    +5 more
  • Amazon logoAsked at Amazon 
    12 answers
    +9

    "from typing import List def traprainwater(height: List[int]) -> int: if not height: return 0 l, r = 0, len(height) - 1 leftMax, rightMax = height[l], height[r] res = 0 while l < r: if leftMax < rightMax: l += 1 leftMax = max(leftMax, height[l]) res += leftMax - height[l] else: r -= 1 rightMax = max(rightMax, height[r]) "

    Anonymous Roadrunner - "from typing import List def traprainwater(height: List[int]) -> int: if not height: return 0 l, r = 0, len(height) - 1 leftMax, rightMax = height[l], height[r] res = 0 while l < r: if leftMax < rightMax: l += 1 leftMax = max(leftMax, height[l]) res += leftMax - height[l] else: r -= 1 rightMax = max(rightMax, height[r]) "See full answer

    Data Scientist
    Data Structures & Algorithms
    +4 more
Showing 1-20 of 21