Data Scientist Interview Questions

Review this list of 155 data scientist interview questions and answers verified by hiring managers and candidates.
  • +6

    "As a PM i received a feedback from my program manager on my style of verbal communication. It is about me speaking faster when i wanted to get away with a topic that i wasn't confident (may be not backed up with data, or still in process of getting detailed insight of a problem etc.). Whereas when I'm confident I tend to speak slowly or more assertively that made people to follow easily. I welcomed that feedback so from then on when I'm not confident in a topic I became more assertive to let pe"

    Rajesh V. - "As a PM i received a feedback from my program manager on my style of verbal communication. It is about me speaking faster when i wanted to get away with a topic that i wasn't confident (may be not backed up with data, or still in process of getting detailed insight of a problem etc.). Whereas when I'm confident I tend to speak slowly or more assertively that made people to follow easily. I welcomed that feedback so from then on when I'm not confident in a topic I became more assertive to let pe"See full answer

    Data Scientist
    Behavioral
    +7 more
  • +24

    "WITH filtered_posts AS ( SELECT p.user_id, p.issuccessfulpost FROM post p WHERE p.postdate >= '2023-11-01' AND p.postdate < '2023-12-01' ), post_summary AS ( SELECT pu.user_type, COUNT(*) AS post_attempt, SUM(CASE WHEN fp.issuccessfulpost = 1 THEN 1 ELSE 0 END) AS post_success FROM filtered_posts fp JOIN postuser pu ON fp.userid = pu.user_id GROUP BY pu.user_type ) SELECT user_type, post_success, post_attempt, CAST(postsuccess AS FLOAT) / postattempt AS postsuccessrate FROM po"

    David I. - "WITH filtered_posts AS ( SELECT p.user_id, p.issuccessfulpost FROM post p WHERE p.postdate >= '2023-11-01' AND p.postdate < '2023-12-01' ), post_summary AS ( SELECT pu.user_type, COUNT(*) AS post_attempt, SUM(CASE WHEN fp.issuccessfulpost = 1 THEN 1 ELSE 0 END) AS post_success FROM filtered_posts fp JOIN postuser pu ON fp.userid = pu.user_id GROUP BY pu.user_type ) SELECT user_type, post_success, post_attempt, CAST(postsuccess AS FLOAT) / postattempt AS postsuccessrate FROM po"See full answer

    Data Scientist
    Coding
    +3 more
  • +2

    "Always assume good intentions on the part of both parties when resolving conflicts. Then proceed with a STAR example."

    Abhinav M. - "Always assume good intentions on the part of both parties when resolving conflicts. Then proceed with a STAR example."See full answer

    Data Scientist
    Behavioral
    +2 more
  • "WITH ActiveUsersYesterday AS ( SELECT DISTINCT user_id FROM user_activity WHERE activity_date = CAST(GETDATE() - 1 AS DATE) ), VideoCallUsersYesterday AS ( SELECT DISTINCT user_id FROM video_calls WHERE call_date = CAST(GETDATE() - 1 AS DATE) ) SELECT (CAST(COUNT(DISTINCT v.userid) AS FLOAT) / NULLIF(COUNT(DISTINCT a.userid), 0)) * 100 AS percentagevideocall_users FROM ActiveUsersYesterday a LEFT JOIN VideoCallUsersYesterday v ON a.userid = v.userid;"

    Bala G. - "WITH ActiveUsersYesterday AS ( SELECT DISTINCT user_id FROM user_activity WHERE activity_date = CAST(GETDATE() - 1 AS DATE) ), VideoCallUsersYesterday AS ( SELECT DISTINCT user_id FROM video_calls WHERE call_date = CAST(GETDATE() - 1 AS DATE) ) SELECT (CAST(COUNT(DISTINCT v.userid) AS FLOAT) / NULLIF(COUNT(DISTINCT a.userid), 0)) * 100 AS percentagevideocall_users FROM ActiveUsersYesterday a LEFT JOIN VideoCallUsersYesterday v ON a.userid = v.userid;"See full answer

    Data Scientist
    Coding
    +2 more
  • +37

    "Here's a simpler solution: select u.username , count(p.postid) as countposts from posts as p join users as u on p.userid = u.userid where p.likes >= 100 group by 1 order by 2 desc, 1 asc limit 3 `"

    Bradley E. - "Here's a simpler solution: select u.username , count(p.postid) as countposts from posts as p join users as u on p.userid = u.userid where p.likes >= 100 group by 1 order by 2 desc, 1 asc limit 3 `"See full answer

    Data Scientist
    Coding
    +3 more
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • +52

    "Limit and rank() only works if there are no 2 employees with same salary ( which is okay for this use case) For the query to pass all the test results, we need to use dense_rank with ranked_employees as ( select id, firstname, lastname, salary, denserank() over(order by salary desc) as salaryrank from employees ) select id, firstname, lastname, salary from ranked_employees where salary_rank <= 3 `"

    Vysali K. - "Limit and rank() only works if there are no 2 employees with same salary ( which is okay for this use case) For the query to pass all the test results, we need to use dense_rank with ranked_employees as ( select id, firstname, lastname, salary, denserank() over(order by salary desc) as salaryrank from employees ) select id, firstname, lastname, salary from ranked_employees where salary_rank <= 3 `"See full answer

    Data Scientist
    Coding
    +3 more
  • +1

    "Before diving into the Solution, I would ask a few clarifying questions. What is the scope of the fake news What type of fake news are we focusing on - Political, Health-related, etc Are we looking at specific examples or a general category of fake news When you say impact, what do you mean by that? Is it time spent on posts, the nature of the engagement (e.g., likes, shares, comments), and the sentiment of the comments? User Demographics: what is the demographic pr"

    Bhavna S. - "Before diving into the Solution, I would ask a few clarifying questions. What is the scope of the fake news What type of fake news are we focusing on - Political, Health-related, etc Are we looking at specific examples or a general category of fake news When you say impact, what do you mean by that? Is it time spent on posts, the nature of the engagement (e.g., likes, shares, comments), and the sentiment of the comments? User Demographics: what is the demographic pr"See full answer

    Data Scientist
    Analytical
  • +10

    "Would be better to adjust resolution in the video player directly."

    Anonymous Prawn - "Would be better to adjust resolution in the video player directly."See full answer

    Data Scientist
    Data Structures & Algorithms
    +4 more
  • "Product Understanding - Ads are what you see from companies as stories, posts, reels. Post are from users (connections). We have to design an experience which produces maximum engagement while generating ad revenue. Clarifying Questions - Is it specific to posts/stories/reels ? Is there an existing post to ads ratio or do we have to start from scratch? Is it specific to a device/OS? Is it specific to a region/user demographic? Assumption - Existing posts to ads ratio"

    Vishal S. - "Product Understanding - Ads are what you see from companies as stories, posts, reels. Post are from users (connections). We have to design an experience which produces maximum engagement while generating ad revenue. Clarifying Questions - Is it specific to posts/stories/reels ? Is there an existing post to ads ratio or do we have to start from scratch? Is it specific to a device/OS? Is it specific to a region/user demographic? Assumption - Existing posts to ads ratio"See full answer

    Data Scientist
    Data Analysis
  • Adobe logoAsked at Adobe 

    "Use a representative of each, e.g. sort the string and add it to the value of a hashmap> where we put all the words that belong to the same anagram together."

    Gaston B. - "Use a representative of each, e.g. sort the string and add it to the value of a hashmap> where we put all the words that belong to the same anagram together."See full answer

    Data Scientist
    Data Structures & Algorithms
    +4 more
  • Apple logoAsked at Apple 
    +16

    "we can use two pointer + set like maintain i,j and also insert jth character to set like while set size is equal to our window j-i+1 then maximize our answer and increase jth pointer till last index"

    Kishor J. - "we can use two pointer + set like maintain i,j and also insert jth character to set like while set size is equal to our window j-i+1 then maximize our answer and increase jth pointer till last index"See full answer

    Data Scientist
    Data Structures & Algorithms
    +4 more
  • +25

    "SELECT d.name as departmentname,e.id as employeeid,e.firstname,e.lastname,MAX(e.salary) as salary FROM employees e LEFT JOIN departments d ON e.department_id=d.id GROUP BY department_name ORDER BY department_name;"

    Anisha S. - "SELECT d.name as departmentname,e.id as employeeid,e.firstname,e.lastname,MAX(e.salary) as salary FROM employees e LEFT JOIN departments d ON e.department_id=d.id GROUP BY department_name ORDER BY department_name;"See full answer

    Data Scientist
    Coding
    +3 more
  • Adobe logoAsked at Adobe 
    +22

    "#inplace reversal without inbuilt functions def reverseString(s): chars = list(s) l, r = 0, len(s)-1 while l < r: chars[l],chars[r] = chars[r],chars[l] l += 1 r -= 1 reversed = "".join(chars) return reversed "

    Anonymous Possum - "#inplace reversal without inbuilt functions def reverseString(s): chars = list(s) l, r = 0, len(s)-1 while l < r: chars[l],chars[r] = chars[r],chars[l] l += 1 r -= 1 reversed = "".join(chars) return reversed "See full answer

    Data Scientist
    Data Structures & Algorithms
    +4 more
  • +4

    "Step 1: Define Objectives and Key Metrics Objectives: Understand the demand for group video calling. Assess the potential impact on user engagement. Identify technical and user experience considerations. Key Metrics: Call Frequency: Number of 1:1 calls per user. Call Duration: Average duration of 1:1 calls. Call Participants: Identify users who frequently call multiple individuals. Concurrent Calls: Instances where users are engaged in multiple 1:1 call"

    Bhavna S. - "Step 1: Define Objectives and Key Metrics Objectives: Understand the demand for group video calling. Assess the potential impact on user engagement. Identify technical and user experience considerations. Key Metrics: Call Frequency: Number of 1:1 calls per user. Call Duration: Average duration of 1:1 calls. Call Participants: Identify users who frequently call multiple individuals. Concurrent Calls: Instances where users are engaged in multiple 1:1 call"See full answer

    Data Scientist
  • "I would want to start by asking a few clarifying questions? What business goal are we trying to achieve with this feature? What decisions will this feature help drive? Is calling feature (audio/video) currently available on 1:1 chats? Does the feature allow for cross-device switch? Once I get some clarity, assuming the goal is to increase user engagement and help drive repeat messenger usage with its user base. I would want to evaluate a few motivation for this feature with looking at"

    Aman M. - "I would want to start by asking a few clarifying questions? What business goal are we trying to achieve with this feature? What decisions will this feature help drive? Is calling feature (audio/video) currently available on 1:1 chats? Does the feature allow for cross-device switch? Once I get some clarity, assuming the goal is to increase user engagement and help drive repeat messenger usage with its user base. I would want to evaluate a few motivation for this feature with looking at"See full answer

    Data Scientist
    Data Analysis
    +2 more
  • "I'd hypothesize it's a highly right-skewed, long-tail distribution. This is because the user base is a mix of a huge number of 'casual' users who make very few queries and a small number of 'power' users who make a ton."

    Vineet M. - "I'd hypothesize it's a highly right-skewed, long-tail distribution. This is because the user base is a mix of a huge number of 'casual' users who make very few queries and a small number of 'power' users who make a ton."See full answer

    Data Scientist
    Statistics & Experimentation
  • OpenAI logoAsked at OpenAI 
    Data Scientist
    Behavioral
    +5 more
  • +17

    "SELECT u.user_id, u.user_name, u.email, ROUND(AVG(CASE WHEN b.status = 'Unmatched' THEN 1.0 ELSE 0 END), 2) AS avgunmatchedbookings FROM users u LEFT JOIN bookings b ON u.userid = b.userid GROUP BY u.user_id, u.user_name, u.email; `"

    Akshay D. - "SELECT u.user_id, u.user_name, u.email, ROUND(AVG(CASE WHEN b.status = 'Unmatched' THEN 1.0 ELSE 0 END), 2) AS avgunmatchedbookings FROM users u LEFT JOIN bookings b ON u.userid = b.userid GROUP BY u.user_id, u.user_name, u.email; `"See full answer

    Data Scientist
    Coding
    +3 more
  • +1

    "Is there a reason a confidence interval was used to solve this problem over just using the mean/expected value directly?"

    Aarav G. - "Is there a reason a confidence interval was used to solve this problem over just using the mean/expected value directly?"See full answer

    Data Scientist
    Statistics & Experimentation
  • TikTok logoAsked at TikTok 
    Video answer for 'Define success for TikTok.'
    +2

    "Mission: Tiktok's mission is to inspire creativity and Joy. Any business wants to make sure that they are serving the value to their customers: For TikTok customers are: Viewers 2. Content Creators 3. Advertisers So few metrics we could measure are: Time spent/day Total no of videos created/day engagement rate = users who interacted in one of the meaningful action on Tiktok / total users at a day level either likes, share, watched vide for at least 5 mins, created video "

    Nikita B. - "Mission: Tiktok's mission is to inspire creativity and Joy. Any business wants to make sure that they are serving the value to their customers: For TikTok customers are: Viewers 2. Content Creators 3. Advertisers So few metrics we could measure are: Time spent/day Total no of videos created/day engagement rate = users who interacted in one of the meaningful action on Tiktok / total users at a day level either likes, share, watched vide for at least 5 mins, created video "See full answer

    Data Scientist
    Analytical
    +1 more
Showing 21-40 of 155