Skip to main content

Data Scientist SQL Interview Questions

Review this list of 26 SQL Data Scientist interview questions and answers verified by hiring managers and candidates.
  • 76 answers
    Video answer for 'Employee Earnings.'
    +69

    "select e.firstname as firstname, m.salary as manager_salary from employees e join employees m on e.manager_id = m.id where e.salary > m.salary; `"

    Ravi K. - "select e.firstname as firstname, m.salary as manager_salary from employees e join employees m on e.manager_id = m.id where e.salary > m.salary; `"See full answer

    Data Scientist
    SQL
    +4 more
  • Meta logoAsked at Meta 
    13 answers
    +8

    "Answer: select fromcaller, count(DISTINCT tocallee) as num_calls from calls group by fromcaller having count(DISTINCT tocallee) >= 3 Setup: CREATE TABLE calls ( from_caller VARCHAR(20), to_callee VARCHAR(20) ); INSERT INTO calls (fromcaller, tocallee) VALUES ('Alice', 'Bob'), ('Charlie', 'Dave'), ('Alice', 'Frank'), ('Charlie', 'Heidi'), ('Charlie', 'Judy'); "

    KAI - "Answer: select fromcaller, count(DISTINCT tocallee) as num_calls from calls group by fromcaller having count(DISTINCT tocallee) >= 3 Setup: CREATE TABLE calls ( from_caller VARCHAR(20), to_callee VARCHAR(20) ); INSERT INTO calls (fromcaller, tocallee) VALUES ('Alice', 'Bob'), ('Charlie', 'Dave'), ('Alice', 'Frank'), ('Charlie', 'Heidi'), ('Charlie', 'Judy'); "See full answer

    Data Scientist
    SQL
    +3 more
  • IBM logoAsked at IBM 
    70 answers
    +64

    "SELECT MIN(id) AS id, TRIM(LOWER(email)) AS cleaned_email FROM users GROUP BY cleaned_email ORDER BY id `"

    Salome L. - "SELECT MIN(id) AS id, TRIM(LOWER(email)) AS cleaned_email FROM users GROUP BY cleaned_email ORDER BY id `"See full answer

    Data Scientist
    SQL
    +3 more
  • LinkedIn logoAsked at LinkedIn 
    36 answers
    +31

    "WITH filtered_posts AS ( SELECT p.user_id, p.issuccessfulpost FROM post p WHERE p.postdate >= '2023-11-01' AND p.postdate < '2023-12-01' ), post_summary AS ( SELECT pu.user_type, COUNT(*) AS post_attempt, SUM(CASE WHEN fp.issuccessfulpost = 1 THEN 1 ELSE 0 END) AS post_success FROM filtered_posts fp JOIN postuser pu ON fp.userid = pu.user_id GROUP BY pu.user_type ) SELECT user_type, post_success, post_attempt, CAST(postsuccess AS FLOAT) / postattempt AS postsuccessrate FROM po"

    David I. - "WITH filtered_posts AS ( SELECT p.user_id, p.issuccessfulpost FROM post p WHERE p.postdate >= '2023-11-01' AND p.postdate < '2023-12-01' ), post_summary AS ( SELECT pu.user_type, COUNT(*) AS post_attempt, SUM(CASE WHEN fp.issuccessfulpost = 1 THEN 1 ELSE 0 END) AS post_success FROM filtered_posts fp JOIN postuser pu ON fp.userid = pu.user_id GROUP BY pu.user_type ) SELECT user_type, post_success, post_attempt, CAST(postsuccess AS FLOAT) / postattempt AS postsuccessrate FROM po"See full answer

    Data Scientist
    SQL
    +4 more
  • 47 answers
    +43

    "Here's a simpler solution: select u.username , count(p.postid) as countposts from posts as p join users as u on p.userid = u.userid where p.likes >= 100 group by 1 order by 2 desc, 1 asc limit 3 `"

    Bradley E. - "Here's a simpler solution: select u.username , count(p.postid) as countposts from posts as p join users as u on p.userid = u.userid where p.likes >= 100 group by 1 order by 2 desc, 1 asc limit 3 `"See full answer

    Data Scientist
    SQL
    +3 more
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • 4 answers
    Video answer for 'SQL Stored Procedures'
    +1

    "Very Good Explanation Thanks For This Rely Good Explanation"

    Temesgen B. - "Very Good Explanation Thanks For This Rely Good Explanation"See full answer

    Data Scientist
    SQL
    +4 more
  • 69 answers
    +63

    "Limit and rank() only works if there are no 2 employees with same salary ( which is okay for this use case) For the query to pass all the test results, we need to use dense_rank with ranked_employees as ( select id, firstname, lastname, salary, denserank() over(order by salary desc) as salaryrank from employees ) select id, firstname, lastname, salary from ranked_employees where salary_rank <= 3 `"

    Vysali K. - "Limit and rank() only works if there are no 2 employees with same salary ( which is okay for this use case) For the query to pass all the test results, we need to use dense_rank with ranked_employees as ( select id, firstname, lastname, salary, denserank() over(order by salary desc) as salaryrank from employees ) select id, firstname, lastname, salary from ranked_employees where salary_rank <= 3 `"See full answer

    Data Scientist
    SQL
    +3 more
  • Meta logoAsked at Meta 
    4 answers
    +1

    "WITH ActiveUsersYesterday AS ( SELECT DISTINCT user_id FROM user_activity WHERE activity_date = CAST(GETDATE() - 1 AS DATE) ), VideoCallUsersYesterday AS ( SELECT DISTINCT user_id FROM video_calls WHERE call_date = CAST(GETDATE() - 1 AS DATE) ) SELECT (CAST(COUNT(DISTINCT v.userid) AS FLOAT) / NULLIF(COUNT(DISTINCT a.userid), 0)) * 100 AS percentagevideocall_users FROM ActiveUsersYesterday a LEFT JOIN VideoCallUsersYesterday v ON a.userid = v.userid;"

    Bala G. - "WITH ActiveUsersYesterday AS ( SELECT DISTINCT user_id FROM user_activity WHERE activity_date = CAST(GETDATE() - 1 AS DATE) ), VideoCallUsersYesterday AS ( SELECT DISTINCT user_id FROM video_calls WHERE call_date = CAST(GETDATE() - 1 AS DATE) ) SELECT (CAST(COUNT(DISTINCT v.userid) AS FLOAT) / NULLIF(COUNT(DISTINCT a.userid), 0)) * 100 AS percentagevideocall_users FROM ActiveUsersYesterday a LEFT JOIN VideoCallUsersYesterday v ON a.userid = v.userid;"See full answer

    Data Scientist
    SQL
    +2 more
  • Tesla logoAsked at Tesla 
    35 answers
    +32

    "with empbysalary as ( select id, firstname, lastname, salary, department_id, rank() over (partition by department_id order by salary desc) as rnk from employees ) select d.name as department_name, e.id as employee_id, e.firstname, e.lastname, e.salary from empbysalary e join departments d on e.department_id=d.id where e.rnk=1 order by 1; `"

    Rishabh L. - "with empbysalary as ( select id, firstname, lastname, salary, department_id, rank() over (partition by department_id order by salary desc) as rnk from employees ) select d.name as department_name, e.id as employee_id, e.firstname, e.lastname, e.salary from empbysalary e join departments d on e.department_id=d.id where e.rnk=1 order by 1; `"See full answer

    Data Scientist
    SQL
    +4 more
  • 25 answers
    +22

    "The user table no longer exists as expected - I get an error that user does not contain user_id. Note that querying the table results in only user:swuoevkivrjfta select * FROM user `"

    Evan R. - "The user table no longer exists as expected - I get an error that user does not contain user_id. Note that querying the table results in only user:swuoevkivrjfta select * FROM user `"See full answer

    Data Scientist
    SQL
    +3 more
  • 28 answers
    +21

    "SELECT u.user_id, u.user_name, u.email, ROUND(AVG(CASE WHEN b.status = 'Unmatched' THEN 1.0 ELSE 0 END), 2) AS avgunmatchedbookings FROM users u LEFT JOIN bookings b ON u.userid = b.userid GROUP BY u.user_id, u.user_name, u.email; `"

    Akshay D. - "SELECT u.user_id, u.user_name, u.email, ROUND(AVG(CASE WHEN b.status = 'Unmatched' THEN 1.0 ELSE 0 END), 2) AS avgunmatchedbookings FROM users u LEFT JOIN bookings b ON u.userid = b.userid GROUP BY u.user_id, u.user_name, u.email; `"See full answer

    Data Scientist
    SQL
    +3 more
  • 22 answers
    +17

    "--country names are UPPERCASE but the table in the in the question showing lowercase. That's why it took me a while to figure it out until I ran the country column WITH RECURSIVE Hierarchy AS ( SELECT e.Emp_ID, CONCAT(e.FirstName, ' ', e.MiddleName, ' ', e.LastName) AS FullName, e.Manager_ID, 0 AS Level, CASE WHEN e.Country = 'IRELAND' THEN s.Salary * 1.09 WHEN e.Country = 'INDIA' THEN s.Salary * 0.012 ELSE s.Salary "

    Victor N. - "--country names are UPPERCASE but the table in the in the question showing lowercase. That's why it took me a while to figure it out until I ran the country column WITH RECURSIVE Hierarchy AS ( SELECT e.Emp_ID, CONCAT(e.FirstName, ' ', e.MiddleName, ' ', e.LastName) AS FullName, e.Manager_ID, 0 AS Level, CASE WHEN e.Country = 'IRELAND' THEN s.Salary * 1.09 WHEN e.Country = 'INDIA' THEN s.Salary * 0.012 ELSE s.Salary "See full answer

    Data Scientist
    SQL
    +3 more
  • 28 answers
    +25

    "select name, stock from products p left join transactions t on p.id = t.product_id order by date desc limit 1"

    Daniel C. - "select name, stock from products p left join transactions t on p.id = t.product_id order by date desc limit 1"See full answer

    Data Scientist
    SQL
    +3 more
  • 14 answers
    +11

    "In the question it says: "above the overall average total posts", which to me implying a >, yet in the solution it uses >= Caused me 1 hr to find out. plz fix"

    Peter W. - "In the question it says: "above the overall average total posts", which to me implying a >, yet in the solution it uses >= Caused me 1 hr to find out. plz fix"See full answer

    Data Scientist
    SQL
    +3 more
  • Google logoAsked at Google 
    5 answers
    +2

    "WITH RECURSIVE fibonacci_series AS ( SELECT 1 AS n, 0 AS fib1, 1 AS fib2 UNION ALL SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2 FROM fibonacci_series WHERE n < 20 -- Limit the series to 20 numbers ) SELECT n, fib1 AS fib FROM fibonacci_series ORDER BY n; `"

    Yashasvi V. - "WITH RECURSIVE fibonacci_series AS ( SELECT 1 AS n, 0 AS fib1, 1 AS fib2 UNION ALL SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2 FROM fibonacci_series WHERE n < 20 -- Limit the series to 20 numbers ) SELECT n, fib1 AS fib FROM fibonacci_series ORDER BY n; `"See full answer

    Data Scientist
    SQL
    +4 more
  • Google logoAsked at Google 
    27 answers
    +24

    "with cte as (select ts.employee_id, e.name, t.id as test_id, max(distinct ts.score) as total from test_results as ts join tests as t on ts.test_id = t.id join employees as e on ts.employee_id = e.id group by ts.employee_id, e.name, t.id) select employee_id, name as employee_name, sum(total) as total_score from cte group by employee_id, employee_name order by total_score desc, employee_id asc ;"

    Christian B. - "with cte as (select ts.employee_id, e.name, t.id as test_id, max(distinct ts.score) as total from test_results as ts join tests as t on ts.test_id = t.id join employees as e on ts.employee_id = e.id group by ts.employee_id, e.name, t.id) select employee_id, name as employee_name, sum(total) as total_score from cte group by employee_id, employee_name order by total_score desc, employee_id asc ;"See full answer

    Data Scientist
    SQL
    +3 more
  • Amazon logoAsked at Amazon 
    6 answers

    "1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"

    Erjan G. - "1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"See full answer

    Data Scientist
    SQL
    +4 more
  • 14 answers
    +10

    " select user_id, b.marketing_channel from user_sessions a Left join attribution b on b.sessionid = a.sessionid group by 1,2 HAVING sum(purchasevalue)>100 and min(adclick_timestamp) `"

    G B. - " select user_id, b.marketing_channel from user_sessions a Left join attribution b on b.sessionid = a.sessionid group by 1,2 HAVING sum(purchasevalue)>100 and min(adclick_timestamp) `"See full answer

    Data Scientist
    SQL
    +3 more
  • 15 answers
    +12

    " with youngsuccrate as( select strftime('%m', postdate) AS postmonth, round(sum(issuccessfulpost)*1.0/count(issuccessfulpost),2)as yascrate from post where userid in (select userid from post_user where age between 0 and 18) group by post_month ), nonyoungsucc_rate as( select strftime('%m', postdate) AS postmonth, round(sum(issuccessfulpost)*1.0/count(issuccessfulpost),2)as nonyasc_rate from post where user_id in (select"

    Bhavna S. - " with youngsuccrate as( select strftime('%m', postdate) AS postmonth, round(sum(issuccessfulpost)*1.0/count(issuccessfulpost),2)as yascrate from post where userid in (select userid from post_user where age between 0 and 18) group by post_month ), nonyoungsucc_rate as( select strftime('%m', postdate) AS postmonth, round(sum(issuccessfulpost)*1.0/count(issuccessfulpost),2)as nonyasc_rate from post where user_id in (select"See full answer

    Data Scientist
    SQL
    +3 more
  • Amazon logoAsked at Amazon 
    3 answers

    "SQL databases are relational, NoSQL databases are non-relational. SQL databases use structured query language and have a predefined schema. NoSQL databases have dynamic schemas for unstructured data. SQL databases are vertically scalable, while NoSQL databases are horizontally scalable."

    Ali H. - "SQL databases are relational, NoSQL databases are non-relational. SQL databases use structured query language and have a predefined schema. NoSQL databases have dynamic schemas for unstructured data. SQL databases are vertically scalable, while NoSQL databases are horizontally scalable."See full answer

    Data Scientist
    SQL
    +7 more
Showing 1-20 of 26