Data Scientist Interview Questions

Review this list of 278 data scientist interview questions and answers verified by hiring managers and candidates.
  • Data Scientist
    Statistics & Experimentation
  • Adobe logoAsked at Adobe 
    +16

    " O(n) time, O(1) space from typing import List def maxsubarraysum(nums: List[int]) -> int: if len(nums) == 0: return 0 maxsum = currsum = nums[0] for i in range(1, len(nums)): currsum = max(currsum + nums[i], nums[i]) maxsum = max(currsum, max_sum) return max_sum debug your code below print(maxsubarraysum([-1, 2, -3, 4])) `"

    Rick E. - " O(n) time, O(1) space from typing import List def maxsubarraysum(nums: List[int]) -> int: if len(nums) == 0: return 0 maxsum = currsum = nums[0] for i in range(1, len(nums)): currsum = max(currsum + nums[i], nums[i]) maxsum = max(currsum, max_sum) return max_sum debug your code below print(maxsubarraysum([-1, 2, -3, 4])) `"See full answer

    Data Scientist
    Data Structures & Algorithms
    +4 more
  • +1

    "select DISTINCT p.product_id, p.product_name , CASE when sale_date is null then 'Not Sold' else 'Sold' END as sale_status from products p left join sales s on p.productid= s.productid `"

    Gowtami K. - "select DISTINCT p.product_id, p.product_name , CASE when sale_date is null then 'Not Sold' else 'Sold' END as sale_status from products p left join sales s on p.productid= s.productid `"See full answer

    Data Scientist
    Coding
    +1 more
  • "[I'm not sure whether the answer below is the best, as I have not gotten result and feedback from my interview] Ans: I would solve by first using a VAE-style model, to create a latent space embedding that translates user description to generate images. Training would be done on the 1000 avatar images and 100000 descriptions, following this scheme: VAE: description -> encoder -> latent space -> decoder -> image Q: "OK, but that means you're limiting the generated images to be only the 1000 imag"

    Nick S. - "[I'm not sure whether the answer below is the best, as I have not gotten result and feedback from my interview] Ans: I would solve by first using a VAE-style model, to create a latent space embedding that translates user description to generate images. Training would be done on the 1000 avatar images and 100000 descriptions, following this scheme: VAE: description -> encoder -> latent space -> decoder -> image Q: "OK, but that means you're limiting the generated images to be only the 1000 imag"See full answer

    Data Scientist
    Machine Learning
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • "SELECT DISTINCT title, ROUND(AVG(rating) over (partition by title),1) avg_rating, ROUND(AVG(rating) over (partition by genre),1) genre_rating FROM rating r JOIN movie m ON r.movieid=m.movieid ORDER by 1"

    Harshi B. - "SELECT DISTINCT title, ROUND(AVG(rating) over (partition by title),1) avg_rating, ROUND(AVG(rating) over (partition by genre),1) genre_rating FROM rating r JOIN movie m ON r.movieid=m.movieid ORDER by 1"See full answer

    Data Scientist
    Coding
    +1 more
  • Meta (Facebook) logoAsked at Meta (Facebook) 
    Data Scientist
    Product Strategy
  • "SELECT COUNT(*) unique_conversations FROM messenger_sends WHERE senderid < receiverid"

    Lucas G. - "SELECT COUNT(*) unique_conversations FROM messenger_sends WHERE senderid < receiverid"See full answer

    Data Scientist
    Coding
    +1 more
  • Walmart Labs logoAsked at Walmart Labs 
    Data Scientist
    Behavioral
    +2 more
  • Discord logoAsked at Discord 
    Data Scientist
    Behavioral
    +2 more
  • Adobe logoAsked at Adobe 

    Permutations

    IDE
    Medium

    "function permute(nums) { if (nums.length <= 1) { return [nums]; } const prevPermutations = permute(nums.slice(0, nums.length-1)); const currentNum = nums[nums.length-1]; const permutations = new Set(); for (let prev of prevPermutations) { for (let i=0; i < prev.length; i++) { permutations.add([...prev.slice(0, i), currentNum, ...prev.slice(i)]); } permutations.add([...prev, currentNum]); } return [...permutations]"

    Tiago R. - "function permute(nums) { if (nums.length <= 1) { return [nums]; } const prevPermutations = permute(nums.slice(0, nums.length-1)); const currentNum = nums[nums.length-1]; const permutations = new Set(); for (let prev of prevPermutations) { for (let i=0; i < prev.length; i++) { permutations.add([...prev.slice(0, i), currentNum, ...prev.slice(i)]); } permutations.add([...prev, currentNum]); } return [...permutations]"See full answer

    Data Scientist
    Data Structures & Algorithms
    +3 more
  • +1

    "WITH CTE AS ( SELECT *, ROWNUMBER()OVER(PARTITION BY utxoid ORDER BY transactionid) AS trxrk FROM transactions JOIN transaction_inputs USING (transaction_id) JOIN utxo USING (utxo_id) ) SELECT transaction_id AS InvalidTransactionId FROM CTE WHERE sender!=address OR trx_rk > 1 `"

    E L. - "WITH CTE AS ( SELECT *, ROWNUMBER()OVER(PARTITION BY utxoid ORDER BY transactionid) AS trxrk FROM transactions JOIN transaction_inputs USING (transaction_id) JOIN utxo USING (utxo_id) ) SELECT transaction_id AS InvalidTransactionId FROM CTE WHERE sender!=address OR trx_rk > 1 `"See full answer

    Data Scientist
    Coding
    +1 more
  • "The solution produces the same result as the 'prescribed solution' yet it does not get accepted In the test results section transcript['year'] = transcript['year'].astype(str) df = pd.pivottable(data = transcript, index = 'studentid', columns = 'year', values = 'yearlygpa', aggfunc = 'mean').resetindex() df = df[(df['2021'] < df['2022']) & (df['2022'] < df['2023'])] df['average_gpa'] = df[['2021', '2022', '2023']].mean(axis=1).round(2) return df "

    Prachi G. - "The solution produces the same result as the 'prescribed solution' yet it does not get accepted In the test results section transcript['year'] = transcript['year'].astype(str) df = pd.pivottable(data = transcript, index = 'studentid', columns = 'year', values = 'yearlygpa', aggfunc = 'mean').resetindex() df = df[(df['2021'] < df['2022']) & (df['2022'] < df['2023'])] df['average_gpa'] = df[['2021', '2022', '2023']].mean(axis=1).round(2) return df "See full answer

    Data Scientist
    Coding
    +2 more
  • +1

    "with jay as (select date,temperature,deff from( select *,temperature-lag(temperature)over(order by date )  as deff from city_temperatures) where deff >=-3 and deff >=5) select date,temperature from jay"

    Jayveer S. - "with jay as (select date,temperature,deff from( select *,temperature-lag(temperature)over(order by date )  as deff from city_temperatures) where deff >=-3 and deff >=5) select date,temperature from jay"See full answer

    Data Scientist
    Coding
    +1 more
  • "import random def coin_flip(): x=4*[0]+[1] res=[] for i in range(20): res.append(random.choice(x)) return res res=[0,0] # [head,tail] for j in range(1000): temp=coin_flip() res[0]+=sum(temp) #head res[1]+=(20-sum(temp)) #tail"

    Alireza K. - "import random def coin_flip(): x=4*[0]+[1] res=[] for i in range(20): res.append(random.choice(x)) return res res=[0,0] # [head,tail] for j in range(1000): temp=coin_flip() res[0]+=sum(temp) #head res[1]+=(20-sum(temp)) #tail"See full answer

    Data Scientist
    Coding
  • TikTok logoAsked at TikTok 

    "I generate insights through stakeholder requirements and the data I have in hand"

    Anonymous Eagle - "I generate insights through stakeholder requirements and the data I have in hand"See full answer

    Data Scientist
    Analytical
  • Adobe logoAsked at Adobe 

    "static boolean sudokuSolve(char board) { return sudokuSolve(board, 0, 0); } static boolean sudokuSolve(char board, int r, int c) { if(c>=board[0].length) { r=r+1; c=0; } if(r>=board.length) return true; if(boardr=='.') { for(int num=1; num<=9; num++) { boardr=(char)('0' + num); if(isValidPosition(board, r, c)) { if(sudokuSolve(board, r, c+1)) return true; } boardr='.'; } } else { return sudokuSolve(board, r, c+1); } return false; } static boolean isValidPosition(char b"

    Divya R. - "static boolean sudokuSolve(char board) { return sudokuSolve(board, 0, 0); } static boolean sudokuSolve(char board, int r, int c) { if(c>=board[0].length) { r=r+1; c=0; } if(r>=board.length) return true; if(boardr=='.') { for(int num=1; num<=9; num++) { boardr=(char)('0' + num); if(isValidPosition(board, r, c)) { if(sudokuSolve(board, r, c+1)) return true; } boardr='.'; } } else { return sudokuSolve(board, r, c+1); } return false; } static boolean isValidPosition(char b"See full answer

    Data Scientist
    Data Structures & Algorithms
    +4 more
  • "with cte as (select *, row_number() over(order by score desc) as rn from players) select player_name, score, rn as ranking from cte where rn= 4 or rn =6 or rn =11 `"

    Gowtami K. - "with cte as (select *, row_number() over(order by score desc) as rn from players) select player_name, score, rn as ranking from cte where rn= 4 or rn =6 or rn =11 `"See full answer

    Data Scientist
    Coding
    +1 more
  • +1

    "SELECT e1.empid AS manageremployee_id, e1.empname AS managername, COUNT(e2.empid) AS numberofdirectreports FROM employees AS e1 INNER JOIN employees AS e2 ON e2.managerid = e1.empid GROUP BY e1.emp_id HAVING COUNT(e2.emp_id) >= 2 ORDER BY numberofdirectreports DESC, managername ASC `"

    Alvin P. - "SELECT e1.empid AS manageremployee_id, e1.empname AS managername, COUNT(e2.empid) AS numberofdirectreports FROM employees AS e1 INNER JOIN employees AS e2 ON e2.managerid = e1.empid GROUP BY e1.emp_id HAVING COUNT(e2.emp_id) >= 2 ORDER BY numberofdirectreports DESC, managername ASC `"See full answer

    Data Scientist
    Coding
    +1 more
  • "SELECT COUNT(DISTINCT o.customerid) AS customers, d.departmentname FROM orders o INNER JOIN departments d ON d.departmentid = o.departmentid WHERE d.departmentname IN ('Electronics','Fashion') AND o.orderdate BETWEEN '2022-01-01' AND '2022-12-31' GROUP BY d.department_name; `"

    Derrick M. - "SELECT COUNT(DISTINCT o.customerid) AS customers, d.departmentname FROM orders o INNER JOIN departments d ON d.departmentid = o.departmentid WHERE d.departmentname IN ('Electronics','Fashion') AND o.orderdate BETWEEN '2022-01-01' AND '2022-12-31' GROUP BY d.department_name; `"See full answer

    Data Scientist
    Coding
    +1 more
Showing 201-220 of 278