Data Scientist Interview Questions

Review this list of 125 data scientist interview questions and answers verified by hiring managers and candidates.
  • Vishal S. - "Product understanding: Ads are what you see from companies as stories, posts, and reels. Posts are from users (connections). We have to design an experience that maximizes engagement while generating ad revenue. Clarifying questions: Is it specific to posts, stories, or reels? Is there an existing posts-to-ads ratio, or do we have to start from scratch? Is it specific to a device/OS? Is it specific to a region or user demographic? Assumption: Existing posts to ads ratio"

    Data Scientist
    Data Analysis
  • Asked at Adobe
    Video answer for 'Given an integer array nums and an integer k, return true if nums has a subarray of at least two elements whose sum is a multiple of k.'

    Anonymous Prawn - "Would be better to adjust resolution in the video player directly."

    Data Scientist
    Data Structures & Algorithms
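The comment quoted for this question does not address the problem itself, so here is a minimal sketch of one common approach: track prefix-sum remainders modulo k, and report success when the same remainder reappears at least two positions later. The function name `has_multiple_of_k_subarray` is hypothetical, and the sketch assumes k is a positive integer.

```python
from typing import Dict, List


def has_multiple_of_k_subarray(nums: List[int], k: int) -> bool:
    """Return True if a contiguous subarray of length >= 2 sums to a multiple of k.

    Assumption: k is a positive integer.
    """
    # Earliest index at which each prefix-sum remainder was seen.
    # Remainder 0 counts as seen just before the array starts (index -1).
    first_seen: Dict[int, int] = {0: -1}
    prefix = 0
    for i, value in enumerate(nums):
        prefix = (prefix + value) % k
        if prefix in first_seen:
            # Equal remainders mean the elements in between sum to a multiple of k.
            if i - first_seen[prefix] >= 2:
                return True
        else:
            first_seen[prefix] = i
    return False
```

Storing only the earliest index for each remainder keeps the candidate subarray as long as possible, which is what the length-two requirement needs.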
  • Asked at Adobe

    Gaston B. - "Use a representative of each group, e.g. sort the string and use the result as the key of a hashmap, adding the word to that key's value, where we put all the words that belong to the same anagram together."

    Data Scientist
    Data Structures & Algorithms
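The sorted-string representative described in the answer above can be sketched as follows. The question text is not shown on this page, so this assumes the task is to group words that are anagrams of each other; the name `group_anagrams` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def group_anagrams(words: List[str]) -> List[List[str]]:
    """Group words that are anagrams of each other, keeping first-seen order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for word in words:
        # All anagrams share the same sorted-letter key.
        key = "".join(sorted(word))
        groups[key].append(word)
    return list(groups.values())
```

For example, `group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"])` places "eat", "tea", and "ate" in one group because all three sort to "aet".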
  • Sam A. - "The distribution of daily search queries per user, as shown in the histogram, can be described as approximately normal (or bell-shaped) with a slight positive skew. Key Characteristics: Shape: The distribution is roughly symmetrical around its center, resembling a bell curve. This indicates that most users perform a moderate number of daily search queries. Central Tendency: The peak of the distribution, representing the highest density of users, appears to be around 8"

    Data Scientist
    Statistics & Experimentation
  • Asked at Apple

    Kishor J. - "We can use two pointers plus a set: maintain i and j, and insert the j-th character into the set. While the set size equals the window size j - i + 1, update the maximum answer and advance j until the last index."

    Data Scientist
    Data Structures & Algorithms
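The two-pointer-plus-set idea in the answer above can be sketched like this. The question text is not shown on this page, so the sketch assumes the classic task of finding the length of the longest substring without repeated characters; the name `longest_distinct_substring` is hypothetical.

```python
def longest_distinct_substring(s: str) -> int:
    """Length of the longest substring of s in which no character repeats."""
    window = set()  # characters currently inside s[i..j]
    best = 0
    i = 0
    for j, ch in enumerate(s):
        # Shrink from the left until ch is no longer duplicated in the window.
        while ch in window:
            window.remove(s[i])
            i += 1
        window.add(ch)
        # Invariant: len(window) == j - i + 1, so every character is distinct.
        best = max(best, j - i + 1)
    return best
```

Each character enters and leaves the set at most once, so the scan is linear in the length of the string.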
  • Asked at Meta (Facebook)

    theproductguy - "Clarifying questions and possible responses: Both audio and video. Goals: increase engagement time among groups/communities and not require another platform for group calls (be a one-stop shop for communication). Region: TBD. iOS/Android. Only available to users in a group, to call users within the group. Who can initiate these calls: only an admin, or anyone? Metrics: NSM: feature engagement (C), number of calls made in a week per user (C). PM: % of people joining the call in a group"

    Data Scientist
    Data Analysis
  • Asked at OpenAI
    Data Scientist
    Behavioral
  • Bhavna S. - "Step 1: Define Objectives and Key Metrics. Objectives: Understand the demand for group video calling. Assess the potential impact on user engagement. Identify technical and user experience considerations. Key Metrics: Call Frequency: Number of 1:1 calls per user. Call Duration: Average duration of 1:1 calls. Call Participants: Identify users who frequently call multiple individuals. Concurrent Calls: Instances where users are engaged in multiple 1:1 call"

    Data Scientist
  • Asked at Adobe

    Anonymous Possum - "# In-place reversal without built-in functions
    def reverseString(s):
        chars = list(s)
        l, r = 0, len(s) - 1
        # Swap characters from both ends, moving toward the middle.
        while l < r:
            chars[l], chars[r] = chars[r], chars[l]
            l += 1
            r -= 1
        return "".join(chars)"

    Data Scientist
    Data Structures & Algorithms
  • Aarav G. - "Is there a reason a confidence interval was used to solve this problem over just using the mean/expected value directly?"

    Data Scientist
    Statistics & Experimentation
  • Asked at TikTok
    Video answer for 'Define success for TikTok.'

    Nikita B. - "Mission: TikTok's mission is to inspire creativity and joy. Any business wants to make sure that it is delivering value to its customers. For TikTok, the customers are: 1. Viewers 2. Content creators 3. Advertisers. So a few metrics we could measure are: time spent per day; total number of videos created per day; engagement rate = users who took at least one meaningful action on TikTok (liked, shared, watched videos for at least 5 minutes, or created a video) / total users, at a daily level"

    Data Scientist
    Analytical
  • Yuexiang Y. - "Clarify: Was the 5% drop sudden, or did it build up over the week? Did DAU drop 5% across the board, or only in some regions or segments, such as new acquisitions or frequently active customers? Did a 5% drop also happen in the same period last year? DAU = acquisition x activation x retention. Segment: I will first quickly do some EDA to locate the problem, e.g. calculate the DAU drop for new customers, for tenured customers, and across regions, to find out whether there is any difference. Then I will also look"

    Data Scientist
    Analytical
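The segmentation step in the answer above can be illustrated with a minimal sketch that compares DAU per segment across two periods. The function name `dau_change_by_segment` and the example numbers in the usage note are hypothetical and chosen only to show the mechanics.

```python
from typing import Dict


def dau_change_by_segment(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, float]:
    """Percent change in DAU for each segment present in both periods."""
    changes: Dict[str, float] = {}
    for segment, old in before.items():
        if segment in after and old > 0:
            changes[segment] = 100 * (after[segment] - old) / old
    return changes
```

With made-up inputs `{"new": 1000, "tenured": 4000}` before and `{"new": 800, "tenured": 3950}` after, the overall DAU falls 5%, but the breakdown shows a 20% drop among new users against 1.25% among tenured users, which would point the investigation toward acquisition.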
  • Asked at Microsoft

    Ranj A. - "In the Transformer architecture, the decoder differs from the encoder primarily in its additional mechanisms designed to handle autoregressive sequence generation. Here's a breakdown of the key differences: Self-Attention Mechanism: Encoder: The encoder has a standard self-attention mechanism that allows each token to attend to all other tokens in the input sequence. Decoder: The decoder has two types of self-attention. The first is the same as in the encoder, but the second is mas"

    Data Scientist
    Statistics & Experimentation
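For reference, in the original Transformer each decoder layer applies masked self-attention followed by encoder-decoder (cross) attention. The masking step can be sketched in plain Python as below: position i may only attend to positions j <= i. The helper names `softmax` and `causal_attention_weights` are hypothetical, and the sketch works on a precomputed square matrix of raw attention scores rather than on queries and keys.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention_weights(scores: Matrix) -> Matrix:
    """Mask out future positions (j > i), then normalize each row."""
    n = len(scores)
    masked = [
        [scores[i][j] if j <= i else -math.inf for j in range(n)]
        for i in range(n)
    ]
    # exp(-inf) is 0.0, so masked positions receive zero attention weight.
    return [softmax(row) for row in masked]
```

With all-zero scores for three tokens, the first row puts all weight on token 0, the second splits evenly over tokens 0 and 1, and the third splits evenly over all three. An encoder would skip the mask and let every row cover all positions.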
  • Asked at Apple

    Tiago R. - "function isValid(s) {
      const stack = [];
      for (let i = 0; i < s.length; i++) {
        const char = s.charAt(i);
        if (['(', '{', '['].includes(char)) {
          stack.push(char);
        } else {
          const top = stack.pop();
          if ((char === ')' && top !== '(') ||
              (char === '}' && top !== '{') ||
              (char === ']' && top !== '[')) {
            return false;
          }
        }
      }
      return stack.length === 0"

    Data Scientist
    Data Structures & Algorithms
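The stack technique in the answer above can also be sketched in Python, the language most answers on this page use. This is a hedged rendition rather than a line-by-line port: the name `is_valid` is hypothetical, and it assumes any character that is not a bracket makes the input invalid.

```python
def is_valid(s: str) -> bool:
    """Return True if every bracket in s is closed by the matching type in order."""
    pairs = {")": "(", "}": "{", "]": "["}
    stack = []
    for ch in s:
        if ch in "({[":
            stack.append(ch)
        elif ch in pairs:
            # A closer needs a matching opener on top of the stack.
            if not stack or stack.pop() != pairs[ch]:
                return False
        else:
            # Assumption: non-bracket characters are treated as invalid input.
            return False
    # Leftover openers mean something was never closed.
    return not stack
```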
  • Asked at Amazon

    Gabriele G. - "DFS with a check for an already-seen node in the graph would work:
    from collections import deque, defaultdict
    from typing import List
    def is_course_loop_dfs(id_course: int, graph: defaultdict[list]) -> bool:
        stack = deque([(id_course)])
        seen_courses = set()
        while stack:
            print(stack)
            curr_course = stack.pop()
            if curr_course in seen_courses:
                return True
            seen_courses.add(curr_course)
            for dependency in graph[curr_course]:"

    Data Scientist
    Data Structures & Algorithms
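One caveat with a single seen set, as far as the visible part of the snippet above shows: a course reached by two different prerequisite paths is seen twice without any cycle existing. The preview is truncated, so the full answer may already handle this. As an alternative, here is a minimal sketch of DFS with three node states, which only reports a cycle when an edge leads back to a node still on the current path. The name `has_cycle` and the dict-of-lists graph shape are assumptions.

```python
from typing import Dict, List

UNVISITED, IN_PROGRESS, DONE = 0, 1, 2


def has_cycle(graph: Dict[int, List[int]]) -> bool:
    """Return True if the directed graph (course -> prerequisites) contains a cycle."""
    state: Dict[int, int] = {}

    def visit(node: int) -> bool:
        state[node] = IN_PROGRESS
        for dep in graph.get(node, []):
            dep_state = state.get(dep, UNVISITED)
            if dep_state == IN_PROGRESS:
                # Edge back to a node on the current DFS path: a real cycle.
                return True
            if dep_state == UNVISITED and visit(dep):
                return True
        state[node] = DONE
        return False

    return any(
        state.get(node, UNVISITED) == UNVISITED and visit(node)
        for node in list(graph)
    )
```

A diamond-shaped dependency such as `{0: [1, 2], 1: [3], 2: [3], 3: []}` reaches course 3 twice but is correctly reported as acyclic. For very deep graphs, an iterative version would avoid Python's recursion limit.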
  • Asked at Visa

    Nidhi S. - "There are a couple of reasons. Kind of role: It's a product manager role loaded with analytical work, so working with data under stringent regulatory guidelines makes it more exciting and thrilling. Location and industry are the cherry on the cake: Bangalore weather, and BFI is at its all-time peak as people's spending behavior is changing continuously. It will be interesting to see how big giants like Visa are managing it."

    Data Scientist
    Behavioral
  • Asked at Adobe
    Video answer for 'Move all zeros to the end of an array.'

    Devesh K. - "This solution is much faster than the Exponent reference solution. It is also far more concise and easy to understand.
    from typing import List
    def moveZerosToEnd(arr: List[int]) -> List[int]:
        left = 0  # next position for a non-zero value
        for right in range(len(arr)):
            if arr[right] == 0:
                pass
            else:
                if left != right:
                    temp = arr[left]
                    arr[left] = arr[right]
                    arr[right] = temp
                left += 1
        return arr"

    Data Scientist
    Data Structures & Algorithms
  • Asked at Meta (Facebook)
    Data Scientist
    Statistics & Experimentation
  • Asked at Discord

    Scott S. - "A couple of years ago, we were working on a project to integrate a new third-party data feed into our existing data processing pipeline. This data feed was critical for enhancing our trading algorithms with more comprehensive market data. Given the tight timeline and high stakes, I decided to push for a rapid implementation. In my eagerness to meet the deadline, I underestimated the complexity of integrating this new data feed. I did not allocate sufficient time for thorough testing and valida"

    Data Scientist
    Behavioral
Showing 21-40 of 125