Data Scientist Interview Questions

Review this list of 125 data scientist interview questions and answers verified by hiring managers and candidates.


Asked at Walmart Labs • 9 months ago
Why do you want to work at Walmart Labs?
Data Scientist
Behavioral
+5 more
Asked at Adobe, Meta (Facebook), Oracle + 1 more • 9 months ago
Determine if a given binary tree is a binary search tree (BST).
IDE
Medium
Data Scientist
Coding
+4 more
9 answers
Alvaro R. - "bool isValidBST(TreeNode* root, long min = LONG_MIN, long max = LONG_MAX) { if (root == NULL) return true; if (root->val <= min || root->val >= max) return false; return isValidBST(root->left, min, root->val) && isValidBST(root->right, root->val, max); }"
Asked at Figma, Traba • 4 months ago
What motivates you?
Data Scientist
Behavioral
+2 more
Asked at McKinsey • 6 months ago
In what cases should you use the median instead of the mean?
Data Scientist
Statistics & Experimentation
1 answer
Himani E. - "In cases where the data is heavily influenced by outliers. Since the mean is pulled toward an outlier, the median may be a better measure."
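A small numeric illustration of the answer above, using Python's standard `statistics` module. The session lengths are made-up values chosen for the example, not data from the original answer.

```python
from statistics import mean, median

# Hypothetical session lengths in minutes (illustrative values only).
typical = [30, 32, 35, 38, 40]
# The same data with one extreme value appended.
with_outlier = typical + [1000]

# Without the outlier, the mean and median agree.
print(mean(typical), median(typical))

# With the outlier, the mean is pulled far upward while the median barely moves.
print(mean(with_outlier), median(with_outlier))
```

The same reasoning is why skewed quantities such as income or latency are often summarized with a median.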
Asked at Adobe, Bytedance, Meta (Facebook) + 3 more • 2 months ago
Merge two sorted lists
Data Scientist
Data Structures & Algorithms
+4 more
5 answers
Nikhil R. - "public class sample { public int [] merge(int [] a, int [] b) { if(a == null || a.length == 0 || b == null || b.length == 0) return null; int i = 0, j = 0, index = -1; int [] merged = new int[a.length + b.length]; while (i < a.length && j < b.length) { if(a[i] < b[j]) merged[++index] = a[i++]; else merged[++index] = b[j++]; } while (i < a.length) { merged[++index] = a[i++]; } "
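The answer preview above is cut off, so here is a complete two-pointer merge as a rough sketch in Python. It assumes the inputs are plain ascending lists (the question may also be asked for linked lists), and `merge_sorted` is a name chosen for this example. Unlike the previewed answer, this sketch returns the non-empty input's elements when the other list is empty.

```python
from typing import List


def merge_sorted(a: List[int], b: List[int]) -> List[int]:
    """Merge two ascending lists into one ascending list using two pointers."""
    merged: List[int] = []
    i = j = 0
    # Repeatedly take the smaller head element; <= keeps equal items from `a` first.
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # At most one of these slices is non-empty: copy whatever is left over.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


print(merge_sorted([1, 3, 5], [2, 4, 6]))
```

This runs in O(len(a) + len(b)) time, since each element is appended exactly once.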


Asked at Tinder • 2 years ago
Tinder subscriptions renew monthly. Explain why different months may have different numbers of renewals.
Data Scientist
Technical
1 answer
Srijita P. - "Clarification question: How many subscription plans does Tinder offer? If there is more than one plan, we need to ask whether the fluctuation is happening across all plans or in a particular one. Assumption: Let's say the lower-priced plan shows the most fluctuation and there are only two types of plans. In this plan, which age group shows the most fluctuation (18-24, 25-30, 30+, etc.)? Is there any seasonality trend observed (eg: placemen"
Asked at American Express • 7 months ago
What is the important mechanism that happens within Random Forest and what does bootstrapping do in this algorithm?
Data Scientist
Technical
1 answer
Yuexiang Y. - "A Random Forest works by building an ensemble of decision trees, each trained on a slightly different version of the data. The key mechanism is bagging: for each tree, we sample the training data with replacement (bootstrapping), so every tree sees a different subset of examples. On top of that, at each split the algorithm randomly selects a subset of features, so trees explore different predictors. These two sources of randomness decorrelate the trees. When we aggregate them, by averag"
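The two sources of randomness described in the answer can be sketched with the standard library alone. This is only an illustration of the sampling steps, not a Random Forest implementation; the function names, the feature names, and the row count are assumptions made for the example.

```python
import random
from typing import List, Sequence


def bootstrap_indices(n_rows: int, rng: random.Random) -> List[int]:
    """Draw n_rows row indices with replacement (one bootstrap sample for one tree)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def feature_subset(features: Sequence[str], k: int, rng: random.Random) -> List[str]:
    """Pick k distinct candidate features to consider at a single split."""
    return rng.sample(list(features), k)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_rows = 10_000         # hypothetical training-set size
sample = bootstrap_indices(n_rows, rng)

# Because rows are drawn with replacement, some repeat and others are left out.
# The share of distinct rows is expected to be close to 1 - 1/e (about 0.632).
in_bag_fraction = len(set(sample)) / n_rows
print(f"distinct rows in one bootstrap sample: {in_bag_fraction:.3f}")

# Hypothetical feature names; each split would see a different random subset.
print(feature_subset(["age", "income", "tenure", "region"], 2, rng))
```

The rows a tree never sees (roughly a third of them) are its out-of-bag rows, which are commonly used for a built-in validation estimate.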
Asked at TikTok, Valve • 2 years ago
As the data scientist, interpreting a significant increase in revenue from a new feature in one of 20 countries, what would you recommend?
Data Scientist
Analytical
2 answers
Brook - "Too much discussion of p-values and theoretical things... countries are independent..."
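One reason this question invites discussion of p-values is the multiple-comparisons problem: with 20 countries tested separately, a single significant result can occur by chance. The sketch below assumes independent tests at a 0.05 significance level (both assumptions for illustration, not details from the question) and uses hypothetical function names.

```python
def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


# With 20 independent tests at alpha = 0.05, a false positive somewhere is likely.
print(round(family_wise_error_rate(0.05, 20), 3))  # 0.642

# A Bonferroni-corrected threshold each country would need to meet.
print(bonferroni_alpha(0.05, 20))
```

Under these assumptions, a reasonable recommendation is to treat the single-country lift as a hypothesis to re-test rather than as grounds for launch.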
Asked at Adobe, Apple, Booking.com + 10 more • 4 months ago
Find the maximum subarray sum.
IDE
Medium
Data Scientist
Data Structures & Algorithms
+4 more
25 answers
Rick E. - "# O(n) time, O(1) space
from typing import List

def max_subarray_sum(nums: List[int]) -> int:
    if len(nums) == 0:
        return 0
    max_sum = curr_sum = nums[0]
    for i in range(1, len(nums)):
        curr_sum = max(curr_sum + nums[i], nums[i])
        max_sum = max(curr_sum, max_sum)
    return max_sum

# debug your code below
print(max_subarray_sum([-1, 2, -3, 4]))"
Asked at Amazon, Discord, Slack • 5 months ago
How do you encourage collaboration among cross-functional teams?
Data Scientist
Behavioral
+4 more
2 answers
Saurabh N. - "1) Have a common goal 2) Have clear and fair accountability between teams 3) Ensure conflicts on common issues are resolved in time 4) Promote joint brainstorming and problem-solving sessions 5) Most important, have clear and effective communication established and practised"
Asked at SAP • 2 years ago
Design a system capable of identifying ships that deviate from their course using a dataset that tracks ship positions, recorded as tuples containing (ship_ID, x, y, z, timestamp), with irregular t...
Data Scientist
System Design
1 answer
Anonymous Hornet - "To handle the non-uniform sampling, I'd first clean the dataset and divide it into chunks of 'uniform' trajectory data at n-second intervals (e.g. 5s or 10s trajectories). This gives us cleaner trajectory data chunks, T, of format (ship_ID, x, y, z, timestamp). For the system itself, I'd use a generative model, e.g. a Variational AutoEncoder (VAE), and train the model's 'encoder' to produce a latent-space representation of input features (x, y, z, timestamp) from T, and its 'decoder' to pred"
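The answer does not say how the irregular samples become uniform ones. One possible approach, sketched here as an assumption, is linear interpolation of a single ship's positions onto a fixed time grid. The function name `resample_uniform`, the `Sample` type, and the example values are hypothetical, and the model the answer goes on to describe is not covered.

```python
from bisect import bisect_right
from typing import List, Tuple

Sample = Tuple[float, float, float, float]  # (timestamp, x, y, z) for one ship


def resample_uniform(samples: List[Sample], step: float) -> List[Sample]:
    """Linearly interpolate irregular samples onto a grid spaced `step` apart."""
    if len(samples) < 2 or step <= 0:
        return list(samples)
    samples = sorted(samples)  # order by timestamp
    times = [s[0] for s in samples]
    out: List[Sample] = []
    k = 0
    while True:
        t = times[0] + k * step
        if t > times[-1]:
            break
        # Last sample at or before t, clamped so that lo + 1 is a valid index.
        lo = min(bisect_right(times, t) - 1, len(samples) - 2)
        hi = lo + 1
        t0, t1 = times[lo], times[hi]
        # Interpolation weight; 0.0 guards against duplicate timestamps.
        w = (t - t0) / (t1 - t0) if t1 > t0 else 0.0
        point = tuple(a + w * (b - a) for a, b in zip(samples[lo][1:], samples[hi][1:]))
        out.append((t,) + point)
        k += 1
    return out


# Irregular readings at t = 0, 3, 10 resampled every 5 time units.
for row in resample_uniform([(0, 0, 0, 0), (3, 3, 6, 0), (10, 10, 20, 0)], 5):
    print(row)
```

Linear interpolation is a simplification: over long gaps it can hide real manoeuvres, so a production system would likely cap the gap length it is willing to fill.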
Asked at Amazon, Apple, Meta (Facebook) + 3 more • 4 months ago
What are you passionate about?
Data Scientist
Behavioral
+4 more
2 answers
Moshe S. - "Law is my passion. Traveling all over the world in 5 years"
Asked at AstraZeneca • 2 years ago
Tell me about your experience working with scientists.
Data Scientist
Behavioral
1 answer
Mark M. - "I don't have experience working with a lot of biological scientists. Most of my experience is with data scientists. I described how I used ideation techniques like brainstorming and other creative ways to get people to find common ground. I also mentioned how I like to run surveys before meetings to prompt people and to get unbiased opinions."
Asked at Meta (Facebook) • 2 months ago
A user advocacy group raises concerns about accessibility for individuals with hearing disabilities. What are some product improvements for Facebook Live and Videos, and how would you define succes...
Data Scientist
Execution
Asked at Meta (Facebook) • 4 years ago
How would you help the Instagram team decide whether to launch the Rooms feature after a successful launch on Facebook?
Data Scientist
Asked at Meta (Facebook) • 4 years ago
Would you port Facebook rooms to Instagram?
Data Scientist
Product Strategy
Asked at Lyft • a year ago
A discount coupon is given to N riders. The probability of using a coupon is P. What is the probability that one of the coupons will be used?
Data Scientist
Statistics & Experimentation
2 answers
Chetak C. - "Probability that one of the coupons is used = 1 - Probability that no coupon is used = 1 - C(n, 0) * p^0 * (1 - p)^n = 1 - (1 - p)^n"
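The closed form in the answer can be checked numerically. The sketch below assumes each rider uses the coupon independently with the same probability p, and it reads "one of the coupons" as "at least one", as the answer does. An "exactly one" variant is included in case the interviewer means that instead. All function names are hypothetical.

```python
from math import comb


def p_at_least_one(n: int, p: float) -> float:
    """P(at least one of n coupons is used) = 1 - (1 - p)^n."""
    return 1 - (1 - p) ** n


def p_at_least_one_by_sum(n: int, p: float) -> float:
    """Same probability, summing the binomial terms for k = 1..n as a cross-check."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(1, n + 1))


def p_exactly_one(n: int, p: float) -> float:
    """P(exactly one of n coupons is used) = n * p * (1 - p)^(n - 1)."""
    return n * p * (1 - p) ** (n - 1)


# Example values chosen for illustration: 10 riders, 10% usage probability.
print(p_at_least_one(10, 0.1))
print(p_at_least_one_by_sum(10, 0.1))
print(p_exactly_one(10, 0.1))
```

Stating which interpretation you are using before computing is usually worth doing in the interview itself.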
Asked at Adobe, Apple, Google + 1 more • 9 months ago
Permutations
IDE
Medium
Data Scientist
Data Structures & Algorithms
+3 more
3 answers
Tiago R. - "function permute(nums) { if (nums.length <= 1) { return [nums]; } const prevPermutations = permute(nums.slice(0, nums.length-1)); const currentNum = nums[nums.length-1]; const permutations = new Set(); for (let prev of prevPermutations) { for (let i=0; i < prev.length; i++) { permutations.add([...prev.slice(0, i), currentNum, ...prev.slice(i)]); } permutations.add([...prev, currentNum]); } return [...permutations]"
Asked at SAP • 2 years ago
You have a dataset comprising 1,000 avatar images and 100,000 user descriptions with associated avatar images. Create a model that recommends an image from a new set of 100,000 images for a user de...
Data Scientist
Machine Learning
2 answers
Nick S. - "[I'm not sure whether the answer below is the best, as I have not gotten results or feedback from my interview] Ans: I would solve this by first using a VAE-style model to create a latent-space embedding that translates a user description into a generated image. Training would be done on the 1,000 avatar images and 100,000 descriptions, following this scheme: VAE: description -> encoder -> latent space -> decoder -> image Q: "OK, but that means you're limiting the generated images to be only the 1000 imag"
Asked at Walmart Labs • 9 months ago
Tell me about your e-commerce experience.
Data Scientist
Behavioral
+2 more

Showing 81-100 of 125
