Data Scientist Interview Questions

Review this list of 179 Data Scientist interview questions and answers verified by hiring managers and candidates.
  • Asked at Microsoft
    1 answer

    Manu G. - "class Solution { public: vector<string> topKFrequent(vector<string>& sentences, int k) { unordered_map<string,int> mp; auto cmp=[](const pair<string,int>& a, const pair<string,int>& b){ if(a.second==b.second) return a.first>b.first; return a.second<b.second; }; priority_queue<pair<string,int>,vector<pair<string,int>>,decltype(cmp)> pq(cmp); for(string s : sentences) { stringstream ss(s); string word; while(ss >> word) { mp[word]++; } } for(auto it=mp.begin();it!=mp.end();++it) pq.push({it->first,it->secon"

    Data Scientist
    Coding
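
The answer preview above is cut off, but it appears to count words across all sentences and then rank them by frequency with a priority queue. A minimal Python sketch of that idea follows. The function name `top_k_frequent` and the alphabetical tie-break are assumptions for illustration, not part of the original answer.

```python
from collections import Counter
import heapq
from typing import List


def top_k_frequent(sentences: List[str], k: int) -> List[str]:
    """Return the k most frequent words across all sentences."""
    counts = Counter()
    for sentence in sentences:
        # Split on whitespace, mirroring the stringstream loop in the answer.
        counts.update(sentence.split())
    # Rank by higher count first, then alphabetically (assumed tie-break).
    return heapq.nsmallest(k, counts, key=lambda word: (-counts[word], word))
```

Using `heapq.nsmallest` with a composite key avoids sorting the whole vocabulary when `k` is small relative to the number of distinct words.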
  • Asked at Adobe
    13 answers

    Anonymous Roadrunner - "from typing import List def traprainwater(height: List[int]) -> int: if not height: return 0 l, r = 0, len(height) - 1 leftMax, rightMax = height[l], height[r] res = 0 while l < r: if leftMax < rightMax: l += 1 leftMax = max(leftMax, height[l]) res += leftMax - height[l] else: r -= 1 rightMax = max(rightMax, height[r]) "

    Data Scientist
    Data Structures & Algorithms
    +4 more
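
The two-pointer answer above is truncated partway through the `else` branch. A complete sketch of the same approach might look like the following. The final accumulation in the right-hand branch and the `return` are assumptions that mirror the left-hand branch, and the name `trap_rain_water` is chosen here for readability.

```python
from typing import List


def trap_rain_water(height: List[int]) -> int:
    """Total water trapped between bars, using two pointers in O(n) time."""
    if not height:
        return 0
    left, right = 0, len(height) - 1
    left_max, right_max = height[left], height[right]
    water = 0
    while left < right:
        if left_max < right_max:
            # The left wall is the limiting side, so advance from the left.
            left += 1
            left_max = max(left_max, height[left])
            water += left_max - height[left]
        else:
            # Otherwise the right wall limits the water level.
            right -= 1
            right_max = max(right_max, height[right])
            water += right_max - height[right]
    return water
```

The side with the smaller running maximum bounds the water level at the pointer that moves, which is why only that side needs to be updated on each step.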
  • Asked at Adobe
    4 answers

    Divya R. - "static boolean sudokuSolve(char[][] board) { return sudokuSolve(board, 0, 0); } static boolean sudokuSolve(char[][] board, int r, int c) { if(c>=board[0].length) { r=r+1; c=0; } if(r>=board.length) return true; if(board[r][c]=='.') { for(int num=1; num<=9; num++) { board[r][c]=(char)('0' + num); if(isValidPosition(board, r, c)) { if(sudokuSolve(board, r, c+1)) return true; } board[r][c]='.'; } } else { return sudokuSolve(board, r, c+1); } return false; } static boolean isValidPosition(char b"

    Data Scientist
    Data Structures & Algorithms
    +4 more
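
The backtracking answer above is cut off before its validity check. The sketch below shows the same cell-by-cell recursion in Python. The helper `is_valid` is a hypothetical stand-in for the truncated `isValidPosition`, and it checks a digit before placing it rather than after, which is a small deviation from the original.

```python
from typing import List


def is_valid(board: List[List[str]], r: int, c: int, ch: str) -> bool:
    """True if digit ch does not yet appear in row r, column c, or their 3x3 box."""
    for i in range(9):
        if board[r][i] == ch or board[i][c] == ch:
            return False
    box_r, box_c = 3 * (r // 3), 3 * (c // 3)
    for i in range(box_r, box_r + 3):
        for j in range(box_c, box_c + 3):
            if board[i][j] == ch:
                return False
    return True


def sudoku_solve(board: List[List[str]], r: int = 0, c: int = 0) -> bool:
    """Fill '.' cells in place; return True once the whole grid is solved."""
    if c == 9:  # past the last column: wrap to the next row
        r, c = r + 1, 0
    if r == 9:  # past the last row: every cell is filled
        return True
    if board[r][c] != '.':
        return sudoku_solve(board, r, c + 1)
    for ch in "123456789":
        if is_valid(board, r, c, ch):
            board[r][c] = ch
            if sudoku_solve(board, r, c + 1):
                return True
            board[r][c] = '.'  # undo the guess and try the next digit
    return False
```

The sketch assumes a standard 9x9 grid of single-character strings and does not verify that the given clues are themselves consistent.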
  • Brook - "too much discussing on p-value… and theoretical things… country are independent…"

    Data Scientist
    Analytical
  • Asked at OpenAI
    Data Scientist
    Statistics & Experimentation
  • 6 answers

    Jayveer S. - "WITH suspicious_transactions AS ( SELECT c.first_name, c.last_name, t.receipt_number, COUNT(t.receipt_number) OVER (PARTITION BY c.customerid) AS noofoffences FROM customers c JOIN transactions t ON c.customerid = t.customerid WHERE t.receipt_number LIKE '%999%' OR t.receipt_number LIKE '%1234%' OR t.receipt_number LIKE '%XYZ%' ) SELECT first_name, last_name, receipt_number, noofoffences FROM suspicious_transactions WHERE noofoffences >= 2;"

    Data Scientist
    Coding
    +3 more
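
To see how the window-function query above behaves, it can be run against a small in-memory SQLite database from Python. This is only a sketch: the table layout, the sample rows, the underscore spellings `customer_id` and `no_of_offences`, and the added `ORDER BY` are all assumptions, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# The query from the answer, with assumed column spellings and a stable ordering.
QUERY = """
WITH suspicious_transactions AS (
    SELECT c.first_name, c.last_name, t.receipt_number,
           COUNT(t.receipt_number) OVER (PARTITION BY c.customer_id) AS no_of_offences
    FROM customers c
    JOIN transactions t ON c.customer_id = t.customer_id
    WHERE t.receipt_number LIKE '%999%'
       OR t.receipt_number LIKE '%1234%'
       OR t.receipt_number LIKE '%XYZ%'
)
SELECT first_name, last_name, receipt_number, no_of_offences
FROM suspicious_transactions
WHERE no_of_offences >= 2
ORDER BY last_name, receipt_number;
"""


def build_demo_db() -> sqlite3.Connection:
    """Create a hypothetical schema with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
        CREATE TABLE transactions (transaction_id INTEGER PRIMARY KEY, customer_id INTEGER, receipt_number TEXT);
        INSERT INTO customers VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
        INSERT INTO transactions VALUES
            (1, 1, 'A-999-01'), (2, 1, 'B-1234-7'), (3, 1, 'C-0001'),
            (4, 2, 'XYZ-42'), (5, 2, 'D-0002');
    """)
    return conn


def find_repeat_offenders(conn: sqlite3.Connection) -> list:
    """Return suspicious receipts for customers with at least two of them."""
    return conn.execute(QUERY).fetchall()
```

Because the `WHERE` clause inside the CTE runs before the window function, the count covers only suspicious receipts, so a customer with one flagged receipt and many clean ones is not reported.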
  • Asked at Adobe

    Permutations

    IDE
    Medium
    4 answers

    Tiago R. - "function permute(nums) { if (nums.length <= 1) { return [nums]; } const prevPermutations = permute(nums.slice(0, nums.length-1)); const currentNum = nums[nums.length-1]; const permutations = new Set(); for (let prev of prevPermutations) { for (let i=0; i < prev.length; i++) { permutations.add([...prev.slice(0, i), currentNum, ...prev.slice(i)]); } permutations.add([...prev, currentNum]); } return [...permutations]"

    Data Scientist
    Data Structures & Algorithms
    +3 more
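
The answer above builds permutations by inserting the last element into every position of each permutation of the remaining elements. A Python sketch of the same insertion approach follows, assuming the input values are distinct. It uses a plain list rather than a set: in JavaScript a `Set` of arrays compares by identity, so it would not remove duplicates anyway.

```python
from typing import List


def permute(nums: List[int]) -> List[List[int]]:
    """Return all permutations of nums (assumed distinct) by recursive insertion."""
    if len(nums) <= 1:
        return [list(nums)]
    *rest, last = nums
    result = []
    for prev in permute(rest):
        # Insert `last` at every index from 0 to len(prev), inclusive.
        for i in range(len(prev) + 1):
            result.append(prev[:i] + [last] + prev[i:])
    return result
```

Each of the (n-1)! shorter permutations yields n new ones, so the result has n! entries without any need for deduplication.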
  • Asked at Cognition AI
    Data Scientist
    Behavioral
    +3 more
  • Asked at Meta
    Data Scientist
    Analytical
  • Asked at OpenAI
    Data Scientist
    Statistics & Experimentation
  • Anonymous Hornet - "To handle the non-uniform sampling, I'd first clean the dataset and divide it into uniform n-second trajectory chunks (e.g. 5 s or 10 s trajectories). This gives us cleaner trajectory chunks, T, in the format (ship_ID, x, y, z, timestamp). For the system itself, I'd use a generative model, e.g. a Variational AutoEncoder (VAE), and train the model's 'encoder' to produce a latent-space representation of the input features (x, y, z, timestamp) from T, and its 'decoder' to pred"

    Data Scientist
    System Design
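
The preprocessing step in the answer above (turning non-uniformly sampled tracks into uniform fixed-length chunks) could be sketched as below. This covers only the resampling and chunking, not the VAE itself. Linear interpolation, the `(timestamp, x, y, z)` tuple layout, and the function names are assumptions for illustration.

```python
from bisect import bisect_right
from typing import List, Tuple

Point = Tuple[float, float, float, float]  # (timestamp, x, y, z); assumed layout


def resample_uniform(track: List[Point], step: float) -> List[Point]:
    """Linearly interpolate a track onto a uniform time grid.

    Assumes timestamps are strictly increasing and step > 0.
    """
    if len(track) < 2:
        return list(track)
    times = [p[0] for p in track]
    n_steps = int((times[-1] - times[0]) / step)
    out = []
    for k in range(n_steps + 1):
        t = times[0] + k * step
        # Pick the segment [times[i], times[i + 1]] that contains t.
        i = min(bisect_right(times, t) - 1, len(track) - 2)
        w = (t - times[i]) / (times[i + 1] - times[i])
        coords = tuple(a + w * (b - a) for a, b in zip(track[i][1:], track[i + 1][1:]))
        out.append((t, *coords))
    return out


def split_into_chunks(points: List[Point], size: int) -> List[List[Point]]:
    """Cut a resampled track into fixed-length chunks, dropping any short tail."""
    return [points[i:i + size] for i in range(0, len(points) - size + 1, size)]
```

Each chunk then has the same number of evenly spaced samples, which is what a fixed-size encoder input would need. Whether linear interpolation is adequate depends on how sparse the raw samples are.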
  • Asked at Discord
    Data Scientist
    Behavioral
    +4 more
  • Asked at Uber
    1 answer

    Naga M. - "Based on the required significance level (usually less than 5%) and based on the test power (usually 95%?), I will calculate the required sample size. Once I get the sample size, then I will do the A/B testing until I meet the sample size."

    Data Scientist
    Technical
    +1 more
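
The sample-size calculation mentioned in the answer above can be sketched with the usual normal-approximation formula for comparing two proportions with a two-sided test. The function name, the conversion-rate framing, and the default power of 0.80 (a common convention, whereas the answer tentatively suggests 95%) are assumptions.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_control: float, p_treatment: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per arm to detect p_control vs p_treatment."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)
```

Raising the power or shrinking the minimum detectable effect increases the required sample, which is why both should be fixed before the test starts rather than adjusted while it runs.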
  • Srijita P. - "Clarification question: How many subscription plans are offered by Tinder? If there is more than one subscription plan, then we need to ask whether the fluctuation is happening across all plans or in a particular one. Assumption: Let's say the lower-priced subscription plan is showing the most fluctuation and there are only two types of plans. In this subscription plan, which age group is showing the most fluctuation (18-24, 25-30, 30+ etc.)? Is there any seasonality trend observed (eg: placemen"

    Data Scientist
    Technical
  • Asked at OpenAI
    Data Scientist
    Statistics & Experimentation
  • Data Scientist
    Coding
  • Asked at Meta
    Data Scientist
    Product Strategy
  • Asked at Walmart Labs
    Data Scientist
    Behavioral
    +5 more
  • Asked at AstraZeneca
    1 answer

    Mark M. - "I don't have experience working with a lot of Biological Scientists. Most of my experience comes with Data Scientists. Described how I used ideation techniques like brainstorming and other creative ways to get people to find common ground. I also mentioned how I like to do surveys before meetings to prompt people and also get unbiased opinions"

    Data Scientist
    Behavioral
    +1 more
Showing 121-140 of 179
Exponent © 2026