Data Scientist Interview Questions

Review this list of 174 Data Scientist interview questions and answers verified by hiring managers and candidates.


Asked at Amazon • 5 years ago
What are common linear regression problems?
Data Scientist
Analytical
+2 more
2 answers
Ilnur I. - "I can try to summarize their discussion as I remember it. Linear regression is one of the methods for predicting a target (Y) from features (X). The linear regression formula is a linear function of the features. The aim is to choose the coefficients (theta) of the prediction function so that the difference between target and prediction is smallest on average. This difference between target and prediction is called the loss function. The form of this loss function can depend on the particular real..."
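To make "choose the coefficients so the average difference is smallest" concrete, here is a minimal sketch of ordinary least squares with a single feature. It assumes squared error as the loss (the answer above does not commit to one), and the names `fit_line` and `mean_squared_error` as well as the sample data are hypothetical.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Return (intercept, slope) that minimize the mean squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for one feature.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(xs: List[float], ys: List[float],
                       intercept: float, slope: float) -> float:
    """Average squared difference between target and prediction (the loss)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(xs, ys)
```

On this toy data the fit recovers an intercept of 1 and a slope of 2 with zero loss. Real data adds noise, outliers, and correlated features, which is where the "common problems" in the question tend to come from.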
Asked at Adobe, Intel, Nvidia + 1 more • a year ago
Sort Colors
Data Scientist
Data Structures & Algorithms
+4 more
Asked at Adobe, Apple, Capital One + 1 more • a year ago
Roman to Integer
Data Scientist
Data Structures & Algorithms
+4 more
Asked at Meta • 10 months ago
What are the KPIs for push notifications?
Data Scientist
Analytical
+2 more
3 answers
Vishal S. - "Product Understanding - Push notifications are pop-up notifications received on a device (phone, tablet, etc.), sent by various Meta apps whenever a new post has been made or a new message is received.
Clarifying Questions - Is it specific to one device? Is it specific to one product? Is it specific to one region? Is it specific to one OS? Is this a result of changes to the algorithm/UI? Is it an existing or a new feature?
Assumptions - KPI calculation will only be for users who h..."
Asked at Meta • 5 years ago
How would you enhance Facebook comments?
Data Scientist
Product Design
1 answer
rkk293 - "How would you increase the number of comments on groups?"

Find Customer Lifetime Value (LTV)
IDE
Medium
Data Scientist
Coding
+3 more
7 answers
Mohit C. - "-- LTV = Sum of all purchases made by that user
-- order the results by desc on LTV
select u.user_id, sum(a.purchase_value) as LTV
from user_sessions u
join attribution a on u.sessionid = a.sessionid
group by u.user_id
order by sum(a.purchase_value) desc"
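The query above can be tried out against an in-memory SQLite database. This is only a sketch: the table and column names come from the answer, but the column types, the sample rows, and the helper names `build_demo_db` and `compute_ltv` are assumptions made for illustration.

```python
import sqlite3

# Same logic as the answer's query: LTV = sum of purchases per user, highest first.
LTV_QUERY = """
SELECT u.user_id, SUM(a.purchase_value) AS ltv
FROM user_sessions u
JOIN attribution a ON u.sessionid = a.sessionid
GROUP BY u.user_id
ORDER BY ltv DESC
"""


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE user_sessions (sessionid INTEGER, user_id INTEGER);
        CREATE TABLE attribution (sessionid INTEGER, purchase_value REAL);
        INSERT INTO user_sessions VALUES (1, 10), (2, 10), (3, 20);
        INSERT INTO attribution VALUES (1, 5.0), (2, 7.5), (3, 20.0);
        """
    )
    return conn


def compute_ltv(conn: sqlite3.Connection):
    """Return (user_id, ltv) rows ordered by LTV descending."""
    return conn.execute(LTV_QUERY).fetchall()


rows = compute_ltv(build_demo_db())
# rows == [(20, 20.0), (10, 12.5)]
```

Because the join is an inner join, users whose sessions have no matching row in `attribution` do not appear in the result at all. Whether they should show up with an LTV of 0 is worth clarifying with the interviewer.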
Asked at Apple, Goldman Sachs, LinkedIn + 8 more • 11 days ago
Search in rotated sorted array
Data Scientist
Data Structures & Algorithms
+4 more
Asked at OpenAI • a year ago
Explain deep reinforcement learning.
Data Scientist
Concept
+1 more
2 answers
Constantin P. - "Reinforcement Learning is a type of machine learning where an agent learns to make decisions by trying out different actions and receiving rewards or penalties in return. The goal is to learn, over time, which actions yield the highest rewards.
There are three core components in RL:
The agent: the learner or decision-maker (e.g., an algorithm or robot).
The environment: everything the agent interacts with.
Actions and rewards: the agent takes actions, and the environmen..."
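The agent, environment, action, and reward loop described above can be sketched with tabular Q-learning on a toy corridor. This is not deep reinforcement learning itself: a deep RL method would replace the Q-table with a neural network that approximates the same values. The environment, the hyperparameters, and all names here are hypothetical.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the goal (terminal)
ACTIONS = (0, 1)    # 0 = move left, 1 = move right


def step(state: int, action: int):
    """Environment: deterministic corridor, reward 1.0 only on reaching the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes: int = 500, alpha: float = 0.5, gamma: float = 0.9, seed: int = 0):
    """Agent: learn a Q-table from randomly chosen actions (off-policy Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):                       # cap on episode length
            action = rng.choice(ACTIONS)           # pure exploration
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


policy = greedy_policy(train())
```

After training, the greedy policy moves right in every state, since that is the only way to reach the reward.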
Asked at Airbnb, Google • 11 days ago
What is the best way to connect SQL databases and why?
Data Scientist
Concept
+6 more
2 answers
Aaron W. - "Clarification questions
What is the purpose of connecting the DB?
Do we expect high volumes of traffic to hit the DB?
Do we have scalability or reliability concerns?
Format
Code -> DB
Code -> Cache -> DB
API -> Cache -> DB - APIs are built for a purpose and have a specified protocol (GET, POST, DELETE) to speak to the DB. APIs can also use a contract to retrieve information from a DB much faster than code.
Load balanced APIs -> Cache -> DB
**Aut..."
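As a rough illustration of the "Code -> Cache -> DB" option, the sketch below puts a read-through cache in front of a SQLite database. The `users` table, the `CachedUserStore` class, and the in-process dictionary standing in for a real cache service are all assumptions; the answer does not name a specific cache or schema.

```python
import sqlite3
from typing import Dict, Optional


class CachedUserStore:
    """Read path: check the cache first, fall back to the database on a miss."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.cache: Dict[int, Optional[str]] = {}  # stand-in for an external cache
        self.db_reads = 0                          # counts actual database hits

    def get_name(self, user_id: int) -> Optional[str]:
        if user_id in self.cache:
            return self.cache[user_id]
        self.db_reads += 1
        row = self.conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        name = row[0] if row else None
        self.cache[user_id] = name                 # populate the cache for next time
        return name


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ada')")

store = CachedUserStore(conn)
first = store.get_name(1)    # miss: goes to the database
second = store.get_name(1)   # hit: served from the cache
```

The sketch ignores cache invalidation and expiry, which become the hard part once writes are involved.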
Asked at Meta • a year ago
A PM at Meta asked you to describe the distribution of daily minutes spent on Facebook per user. How would you describe it?
Data Scientist
Statistics & Experimentation
1 answer
Vineet M. - "The distribution of daily minutes spent on Facebook per user is heavily right-skewed with a long tail. Most users spend a short amount of time while a smaller segment of heavy users push up the average with 2–3+ hours daily."
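A right-skewed, long-tailed shape shows up as a mean well above the median and as high percentiles far from the centre. The sketch below uses synthetic data to demonstrate this. The log-normal distribution and its parameters are assumptions chosen only to produce a long right tail, not real Facebook usage figures, and the function names are hypothetical.

```python
import random
import statistics


def simulate_daily_minutes(n: int = 10_000, seed: int = 0):
    """Synthetic per-user daily minutes drawn from an assumed log-normal distribution."""
    rng = random.Random(seed)
    return [rng.lognormvariate(3.0, 1.0) for _ in range(n)]


def summarize(sample):
    """Summary statistics that make the skew visible."""
    ordered = sorted(sample)
    return {
        "mean": statistics.fmean(sample),
        "median": statistics.median(sample),
        "p90": ordered[int(0.90 * len(ordered))],
        "p99": ordered[int(0.99 * len(ordered))],
    }


summary = summarize(simulate_daily_minutes())
```

For a PM, reporting the median and a few percentiles (for example p50, p90, p99) usually describes such a distribution better than the mean alone, because heavy users pull the mean upward.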
Asked at Adobe, Bytedance, LinkedIn + 3 more • a month ago
Merge two sorted lists
Data Scientist
Data Structures & Algorithms
+5 more
6 answers
Ramachandra N. - "def mergeTwoListsRecursive(l1, l2):
    if not l1 or not l2:
        return l1 or l2
    if l1.val < l2.val:
        l1.next = mergeTwoListsRecursive(l1.next, l2)
        return l1
    else:
        l2.next = mergeTwoListsRecursive(l1, l2.next)
        return l2"
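The answer above relies on a linked-list node type with `val` and `next` fields that the preview does not show. Below is a self-contained sketch that uses an iterative variant of the same merge, which avoids hitting Python's recursion limit on long lists. `ListNode`, `from_list`, `to_list`, and `merge_two_lists` are hypothetical names.

```python
from typing import List, Optional


class ListNode:
    def __init__(self, val: int, next: "Optional[ListNode]" = None) -> None:
        self.val = val
        self.next = next


def merge_two_lists(l1: Optional[ListNode], l2: Optional[ListNode]) -> Optional[ListNode]:
    """Merge two sorted linked lists by relinking the existing nodes."""
    dummy = ListNode(0)          # placeholder head to simplify edge cases
    tail = dummy
    while l1 and l2:
        if l1.val < l2.val:
            tail.next, l1 = l1, l1.next
        else:
            tail.next, l2 = l2, l2.next
        tail = tail.next
    tail.next = l1 or l2         # append whatever remains
    return dummy.next


def from_list(values: List[int]) -> Optional[ListNode]:
    """Build a linked list from a Python list."""
    head = None
    for value in reversed(values):
        head = ListNode(value, head)
    return head


def to_list(node: Optional[ListNode]) -> List[int]:
    """Collect linked-list values into a Python list."""
    out = []
    while node:
        out.append(node.val)
        node = node.next
    return out


merged = to_list(merge_two_lists(from_list([1, 3, 5]), from_list([2, 4, 6])))
# merged == [1, 2, 3, 4, 5, 6]
```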
Asked at OpenAI • a month ago
Walk me through a past data science project.
Data Scientist
Behavioral
Asked at Adobe, Meta, Oracle + 1 more • 14 days ago
Determine if a given binary tree is a binary search tree (BST).
IDE
Medium
Data Scientist
Data Structures & Algorithms
+4 more
10 answers
Alvaro R. - "#include <climits>
// Valid BST: every node's value lies strictly between the bounds inherited from its ancestors.
bool isValidBST(TreeNode* root, long min = LONG_MIN, long max = LONG_MAX) {
    if (root == NULL) return true;
    if (root->val <= min || root->val >= max) return false;
    return isValidBST(root->left, min, root->val) && isValidBST(root->right, root->val, max);
}"
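The same bounds-passing idea can be sketched in Python, where infinite floats can stand in for the integer limits. `TreeNode` and `is_valid_bst` are hypothetical names, and the sketch assumes duplicate values are not allowed, matching the strict comparisons in the answer above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    val: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def is_valid_bst(node: Optional[TreeNode],
                 low: float = float("-inf"),
                 high: float = float("inf")) -> bool:
    """Every node must lie strictly between the bounds inherited from its ancestors."""
    if node is None:
        return True
    if node.val <= low or node.val >= high:
        return False
    return (is_valid_bst(node.left, low, node.val)
            and is_valid_bst(node.right, node.val, high))


valid_tree = TreeNode(2, TreeNode(1), TreeNode(3))
# 4 sits in the right subtree of 5, so this tree is not a BST
# even though each node is locally ordered relative to its parent.
invalid_tree = TreeNode(5, TreeNode(1), TreeNode(6, TreeNode(4), TreeNode(7)))
```

The second tree shows why comparing each node only with its direct children is not enough: the bounds have to be carried down from all ancestors.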
Asked at Amazon, American Express, Apple + 4 more • 18 days ago
What are you passionate about?
Data Scientist
Behavioral
+4 more
2 answers
Moshe S. - "Law is my passion. Traveling all over the world in 5 years"
Asked at TikTok • 2 years ago
Describe an experience that you consider most valuable.
Data Scientist
Behavioral
2 answers
Saipranay M. - "During the early learning stages of my career as a data scientist, I struggled to fully grasp the concept of patterns in seemingly random events. Despite thorough research, I remained skeptical about the universality of this idea. However, my perspective changed during a surprising scuba diving experience while on vacation. The underwater world I encountered not only amazed me but also gave me a key insight into patterns in nature and data analysis. As I dove into the clear waters, I witnessed..."
Asked at Amazon, Anthropic, Discord + 1 more • 5 months ago
How do you encourage collaboration among cross-functional teams?
Data Scientist
Behavioral
+5 more
2 answers
Saurabh N. - "1) Have a common goal
2) Have clear and fair accountability between teams
3) Ensure conflicts on common issues are resolved in time
4) Promote joint brainstorming and problem-solving sessions
5) Most important, have clear and effective communication established and practised"
Asked at Microsoft • 2 years ago
Given a list of sentences, find the top n most frequent words.
Data Scientist
Coding
1 answer
Manu G. - "class Solution {
public:
    vector<string> topKFrequent(vector<string>& sentences, int k) {
        unordered_map<string, int> mp;
        // Max-heap on count; ties broken so the lexicographically smaller word comes out first.
        auto cmp = [](const pair<string, int>& a, const pair<string, int>& b) {
            if (a.second == b.second) return a.first > b.first;
            return a.second < b.second;
        };
        priority_queue<pair<string, int>, vector<pair<string, int>>, decltype(cmp)> pq(cmp);
        for (string s : sentences) {
            stringstream ss(s);
            string word;
            while (ss >> word) { mp[word]++; }
        }
        for (auto it = mp.begin(); it != mp.end(); ++it)
            pq.push({it->first, it->secon..."
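Since the preview above is cut off, here is a short, complete sketch of the same count-then-rank approach in Python. It assumes words are separated by whitespace and compared case-sensitively (as the `stringstream` loop suggests), and that ties are broken alphabetically. `top_n_words` is a hypothetical name.

```python
from collections import Counter
from typing import List


def top_n_words(sentences: List[str], n: int) -> List[str]:
    """Return the n most frequent words; ties are broken alphabetically."""
    counts = Counter(word for sentence in sentences for word in sentence.split())
    # Sort by descending count, then by the word itself for a stable tie-break.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:n]]


top_words = top_n_words(["the cat sat", "the dog sat", "the end"], 3)
# top_words == ["the", "sat", "cat"]
```

Sorting all distinct words costs O(m log m) for m distinct words. A heap, as in the C++ answer, is the usual choice when n is much smaller than m.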
Asked at Anthropic • 6 months ago
Identify success metrics for a marketing campaign to get new users, then design an experiment to determine if the campaign should continue.
Data Scientist
Statistics & Experimentation
1 answer
Sangeeta P. - "Marketing campaigns are run through different channels such as social media, emails, SEO, web advertising, events, etc. Let’s look at some of the overall success metrics at a broader level:
1. Total views for your campaign
2. Unique views for your campaign
3. Returning visitors for your campaign
4. Engagement for your campaign (If it’s a social media campaign, the marketer might be interested in knowing the number of users engaging with the campaign and the type of campaign positive/negative)
5..."
Asked at Adobe, Amazon, Apple + 10 more • a year ago
Calculate the trapped rainwater between bars in a given array.
IDE
Hard
Data Scientist
Data Structures & Algorithms
+4 more
13 answers
Anonymous Roadrunner - "from typing import List

def traprainwater(height: List[int]) -> int:
    if not height:
        return 0
    l, r = 0, len(height) - 1
    leftMax, rightMax = height[l], height[r]
    res = 0
    while l < r:
        if leftMax < rightMax:
            l += 1
            leftMax = max(leftMax, height[l])
            res += leftMax - height[l]
        else:
            r -= 1
            rightMax = max(rightMax, height[r])
            ..."
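The preview above stops partway through the right-hand branch. A complete sketch of the two-pointer technique it appears to use is given below. The final accumulation in the `else` branch and the `return` are an assumed completion that mirrors the left branch, not the author's confirmed code, and the name `trap_rain_water` is hypothetical.

```python
from typing import List


def trap_rain_water(height: List[int]) -> int:
    """Two-pointer scan: O(n) time, O(1) extra space."""
    if not height:
        return 0
    left, right = 0, len(height) - 1
    left_max, right_max = height[left], height[right]
    trapped = 0
    while left < right:
        if left_max < right_max:
            # The left wall is the limiting side, so water above the next
            # left bar is bounded by left_max.
            left += 1
            left_max = max(left_max, height[left])
            trapped += left_max - height[left]
        else:
            # Otherwise the right wall limits the water level.
            right -= 1
            right_max = max(right_max, height[right])
            trapped += right_max - height[right]
    return trapped


example_total = trap_rain_water([0, 1, 0, 2, 1, 0, 1, 3, 2, 1, 2, 1])
# example_total == 6
```

The key observation is that the water above a bar depends only on the smaller of the tallest bars to its left and right, so the pointer on the lower side can always be advanced safely.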
Asked at OpenAI • 5 months ago
When experimentation is constrained, how do you make decisions using imperfect or historical data?
Data Scientist
Statistics & Experimentation