Data Scientist Interview Questions

Review this list of 179 Data Scientist interview questions and answers verified by hiring managers and candidates.
  • Lucas G. - "I would conduct a one-sample z-test because the sample is large enough and the population variance is known. H0: average monthly spending per user is $50. H1: average monthly spending per user is greater than $50. Given x_bar = $85, mu = $50, s = $20, n = 100: z = (x_bar - mu) / (s / sqrt(n)) = (85 - 50) / (20 / sqrt(100)) = 17.5. This z-score is then mapped to its corresponding p-value. Because the z-score is very high, the p-value will be very close to zero, which is much less than the standa"

    Data Scientist
    Statistics & Experimentation
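
The z-score arithmetic in the answer above can be checked with a short sketch. It assumes a one-sided test (mean greater than $50) and uses the answer's figures; the function name `one_sample_z_test` is illustrative, not something from the original answer.

```python
import math

def one_sample_z_test(x_bar, mu, sigma, n):
    """Return (z, one-sided p-value) for the alternative 'mean > mu'."""
    # Standard error of the mean, then the standardized distance from mu.
    z = (x_bar - mu) / (sigma / math.sqrt(n))
    # Upper-tail probability of a standard normal: P(Z > z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Figures quoted in the answer: x_bar = 85, mu = 50, s = 20, n = 100.
z, p = one_sample_z_test(x_bar=85, mu=50, sigma=20, n=100)
print(f"z = {z:.1f}, p = {p:.3g}")
```

With a z-score this large, the p-value is far below any conventional significance level, so the null hypothesis would be rejected.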
  • Asked at McKinsey
    1 answer

    Ruth A. - "Spoiled food In a process I improved, I streamlined how tasks were assigned to reduce delays and confusion."

    Data Scientist
    Analytical
    +1 more
  • Asked at Walmart Labs
    1 answer

    Rajeev K. - "I’ve spent over 6 years building and scaling e-commerce products across EMEA and APAC. At Jumia, I led product initiatives on the checkout and payments side. For example, I launched gamified promotions on PDP and checkout that improved engagement and delivered a 2.3x uplift in conversion. I also introduced automated installment payments and order cancellation flows, which not only improved user trust but also reduced complaints by 30% and lowered operational costs. Before that, at Lazada, I work"

    Data Scientist
    Behavioral
    +2 more
  • Asked at TikTok
    3 answers

    Anonymous Eagle - "I generate insights through stakeholder requirements and the data I have in hand"

    Data Scientist
    Analytical
    +1 more
  • Nick S. - "[I'm not sure whether the answer below is the best, as I have not received results or feedback from my interview] Ans: I would solve this by first using a VAE-style model to create a latent space embedding that translates user descriptions into generated images. Training would be done on the 1000 avatar images and 100000 descriptions, following this scheme: VAE: description -> encoder -> latent space -> decoder -> image Q: "OK, but that means you're limiting the generated images to be only the 1000 imag"

    Data Scientist
    Machine Learning
  • Asked at Slack
    2 answers

    Amparo L. - "The only time I felt unhappy at work was when I didn’t get to pick the job I wanted to do at the jobsite."

    Data Scientist
    Behavioral
  • Asked at DoorDash
    Add answer
    Data Scientist
    Data Analysis
  • Asked at McKinsey
    1 answer

    Himani E. - "Cases where the data is heavily influenced by outliers. Because the mean shifts when outliers are present, the median may be a better measure of central tendency."

    Data Scientist
    Statistics & Experimentation
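
A small sketch of the point above: one extreme value moves the mean a lot but barely moves the median. The numbers are made up for illustration and do not come from the original answer.

```python
import statistics

# Hypothetical values (e.g. salaries in $k); not from the original answer.
values = [48, 50, 52, 51, 49]
with_outlier = values + [1000]

mean_before = statistics.mean(values)
median_before = statistics.median(values)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(f"mean:   {mean_before} -> {mean_after:.1f}")
print(f"median: {median_before} -> {median_after}")
```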
  • Chetak C. - "Probability that at least one of the coupons is used = 1 - Probability that no coupon is used = 1 - nC0 * p^0 * (1-p)^n = 1 - (1-p)^n"

    Data Scientist
    Statistics & Experimentation
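
The complement formula above can be sanity-checked against the full binomial sum. This sketch assumes n coupons, each used independently with probability p; the question itself is not shown on this page, so that setup is an assumption, and both function names are illustrative.

```python
import math

def prob_at_least_one_used(p: float, n: int) -> float:
    """Complement rule: 1 - P(no coupon is used)."""
    return 1.0 - (1.0 - p) ** n

def prob_via_binomial_sum(p: float, n: int) -> float:
    """Same quantity, summing P(exactly k used) for k = 1..n."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(1, n + 1))

print(prob_at_least_one_used(0.1, 10))
print(prob_via_binomial_sum(0.1, 10))
```

Both functions should agree up to floating-point rounding; the complement form is simply the cheaper way to compute it.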
  • Yuexiang Y. - "A Random Forest works by building an ensemble of decision trees, each trained on a slightly different version of the data. The key mechanism is bagging: for each tree, we sample the training data with replacement (bootstrapping), so every tree sees a different subset of examples. On top of that, at each split the algorithm randomly selects a subset of features, so trees explore different predictors. These two sources of randomness decorrelate the trees. When we aggregate them, by averag"

    Data Scientist
    Technical
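
The two sources of randomness described above, plus the aggregation step, can be sketched in isolation. This is not a full Random Forest: no trees are trained, and the helper names (`bootstrap_sample`, `random_feature_subset`, `majority_vote`) are hypothetical. It only shows the sampling and voting mechanics under those assumptions.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Bagging: draw len(rows) examples with replacement."""
    return [rng.choice(rows) for _ in rows]

def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices to consider at one split."""
    return sorted(rng.sample(range(n_features), k))

def majority_vote(predictions):
    """Aggregate per-tree class predictions into a single label."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)                      # fixed seed for repeatability
rows = list(range(10))                      # stand-in for 10 training examples
sample = bootstrap_sample(rows, rng)        # may contain repeats, may omit rows
features = random_feature_subset(n_features=8, k=3, rng=rng)
vote = majority_vote(["spam", "ham", "spam"])  # pretend outputs of 3 trees

print(sample, features, vote)
```

For regression, the vote would be replaced by an average of the per-tree predictions.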
  • Asked at Apple
    9 answers

    Vaibhav D. - "1. Set current to root. 2. While current is not null: if p and q are both less than current, go left; if p and q are both greater than current, go right; otherwise return current. 3. Return null."

    Data Scientist
    Data Structures & Algorithms
    +4 more
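
The steps in the answer above translate fairly directly into code. A minimal sketch, assuming the question is the lowest common ancestor of two values in a binary search tree and that both values are present in the tree; the `Node` class is a hypothetical stand-in for whatever tree type the question provides.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    val: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def lowest_common_ancestor(root: Optional[Node], p: int, q: int) -> Optional[Node]:
    """Walk down from the root until p and q fall on different sides."""
    current = root
    while current is not None:
        if p < current.val and q < current.val:
            current = current.left       # both targets are in the left subtree
        elif p > current.val and q > current.val:
            current = current.right      # both targets are in the right subtree
        else:
            return current               # split point (or current is p or q)
    return None

tree = Node(6,
            Node(2, Node(0), Node(4, Node(3), Node(5))),
            Node(8, Node(7), Node(9)))
print(lowest_common_ancestor(tree, 2, 8).val)
print(lowest_common_ancestor(tree, 3, 5).val)
```

The loop visits at most one node per level, so the running time is proportional to the tree height and no extra space is needed.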
  • Asked at Apple

    Coin Change

    IDE
    Medium
    12 answers

    Anmol R. - "The example given is wrong. The 2nd test case should have answer 3, as we can get to 6 by using 3 coins of denomination 2."

    Data Scientist
    Coding
    +4 more
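
The problem statement is not shown on this page, so the sketch below assumes the usual formulation of Coin Change: return the fewest coins that sum to `amount`, or -1 if no combination works. It uses bottom-up dynamic programming; the test inputs are illustrative, not the question's actual test cases.

```python
def coin_change(coins, amount):
    """Fewest coins needed to make `amount`, or -1 if it cannot be made."""
    INF = float("inf")
    # dp[t] = fewest coins that sum to t; dp[0] = 0 by definition.
    dp = [0] + [INF] * amount
    for total in range(1, amount + 1):
        for coin in coins:
            if coin <= total and dp[total - coin] + 1 < dp[total]:
                dp[total] = dp[total - coin] + 1
    return dp[amount] if dp[amount] != INF else -1

# Matches the comment above: 6 can be made from three coins of denomination 2.
print(coin_change([2], 6))
print(coin_change([1, 2, 5], 11))
```

The table has `amount + 1` entries and each is filled by scanning every coin, so the cost is O(amount × number of coins).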
  • Asked at Salesforce
    Add answer
    Data Scientist
    Behavioral
    +4 more
  • Asked at Google
    Add answer
    Data Scientist
    Behavioral
  • 1 answer

    Saurabh J. - "The algorithm computes impurity metrics such as entropy or Gini impurity for each candidate split. The goal is to find the split with the lowest impurity (equivalently, the highest information gain). Once it finds the best split, it splits the node and grows the branches, repeating the process on each child."

    Data Scientist
    Concept
    +1 more
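
The impurity metrics named above, and the greedy search over candidate splits, can be sketched for a single numeric feature. This is an illustrative sketch rather than any library's implementation; `best_threshold` is a hypothetical helper name, and real implementations add stopping rules and handle many features.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels, impurity=gini):
    """Try each split 'x <= t' and keep the lowest weighted child impurity."""
    best = (None, float("inf"))
    # Skip the largest value so the right child is never empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

print(gini(["a", "a", "b", "b"]), entropy(["a", "a", "b", "b"]))
print(best_threshold([1, 2, 3, 4], ["a", "a", "b", "b"]))
```

A tree builder would apply `best_threshold` across all features at a node, split on the winner, and recurse into each child.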
  • Asked at Tinder
    Add answer
    Data Scientist
    Behavioral
  • Asked at Discord
    Add answer
    Data Scientist
    Behavioral
    +2 more
  • Asked at Adobe
    1 answer

    Chen J. - "LeetCode 347: heap + hash table. Follow-up question: build the heap with size K instead of N (more time complexity but less space)."

    Data Scientist
    Data Structures & Algorithms
    +3 more
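
The follow-up above (a heap bounded at size K) can be sketched as follows, assuming the task is LeetCode 347's "return the k most frequent elements". The function name and test input are illustrative.

```python
import heapq
from collections import Counter

def top_k_frequent(nums, k):
    """Return the k most frequent values, most frequent first."""
    counts = Counter(nums)                 # hash table: value -> frequency
    heap = []                              # min-heap of (count, value), size <= k
    for value, count in counts.items():
        heapq.heappush(heap, (count, value))
        if len(heap) > k:
            heapq.heappop(heap)            # evict the least frequent entry
    return [value for count, value in sorted(heap, reverse=True)]

print(top_k_frequent([1, 1, 1, 2, 2, 3], 2))
```

The heap never holds more than k + 1 entries, so extra space beyond the counts is O(k) and each push or pop costs O(log k).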
  • Asked at Oracle
    3 answers

    Sai R. -

        def count_unique_outfits(total_pants: int, unique_pants: int,
                                 total_shirts: int, unique_shirts: int,
                                 total_hats: int, unique_hats: int) -> int:
            """
            Number of unique outfits:
              (unique_pants choose 1) * (unique_shirts choose 1) * (unique_hats choose 1)
              + (unique_pants choose 1) * (unique_shirts choose 1)   # not wearing a hat
            n choose 1 is n.
            """
            # Outfits with a hat, plus outfits without one.
            res = (unique_pants * unique_shirts * unique_hats) + (unique_pants * unique_shirts)
            return res

        print(count_unique_outfits(2, 1, 1, 1, 3, 2))

    Data Scientist
    Coding
Showing 141-160 of 179
Exponent © 2026