Data Scientist SQL Interview Questions

Review this list of 26 sql data scientist interview questions and answers verified by hiring managers and candidates.

+ Add interview

Product

Engineering

Operations

Design

Marketing

Data

Sales

Finance

Consulting

Security

Add interview

Data Engineer Data Analyst Data Scientist Software Engineer Product Manager Product Analyst Backend Engineer Frontend Engineer Engineering Manager Solutions Architect

Asked at Meta (Facebook) • 4 months ago
Find the number of users who called three or more people in the last week.
Data Scientist
SQL
+3 more
13 answers I was asked this
+8
"Answer: select fromcaller, count(DISTINCT tocallee) as num_calls from calls group by fromcaller having count(DISTINCT tocallee) >= 3 Setup: CREATE TABLE calls ( from_caller VARCHAR(20), to_callee VARCHAR(20) ); INSERT INTO calls (fromcaller, tocallee) VALUES ('Alice', 'Bob'), ('Charlie', 'Dave'), ('Alice', 'Frank'), ('Charlie', 'Heidi'), ('Charlie', 'Judy'); "
KAI - "Answer: select fromcaller, count(DISTINCT tocallee) as num_calls from calls group by fromcaller having count(DISTINCT tocallee) >= 3 Setup: CREATE TABLE calls ( from_caller VARCHAR(20), to_callee VARCHAR(20) ); INSERT INTO calls (fromcaller, tocallee) VALUES ('Alice', 'Bob'), ('Charlie', 'Dave'), ('Alice', 'Frank'), ('Charlie', 'Heidi'), ('Charlie', 'Judy'); "See full answer
Data Scientist
SQL
+3 more
Employee Earnings.
IDE
Medium
Data Scientist
SQL
+4 more
43 answers I was asked this
+37
"SELECT employees.first_name, managers.salary AS manager_salary FROM employees LEFT JOIN employees AS managers ON employees.manager_id = managers.id WHERE employees.salary > managers.salary `"
Tiffany A. - "SELECT employees.first_name, managers.salary AS manager_salary FROM employees LEFT JOIN employees AS managers ON employees.manager_id = managers.id WHERE employees.salary > managers.salary `"See full answer
Data Scientist
SQL
+4 more
Asked at IBM • 2 months ago
Remove Duplicate Emails
IDE
Medium
Data Scientist
SQL
+3 more
53 answers I was asked this
+49
"SELECT MIN(id) AS id, TRIM(LOWER(email)) AS cleaned_email FROM users GROUP BY cleaned_email ORDER BY id `"
Salome L. - "SELECT MIN(id) AS id, TRIM(LOWER(email)) AS cleaned_email FROM users GROUP BY cleaned_email ORDER BY id `"See full answer
Data Scientist
SQL
+3 more
SQL Stored Procedures
Data Scientist
SQL
+4 more
2 answers I was asked this
"it is really good explanation thanks it is really good explanation thanks"
Amney M. - "it is really good explanation thanks it is really good explanation thanks"See full answer
Data Scientist
SQL
+4 more
Asked at Meta (Facebook) • a year ago
What percentage of users who were active on Messenger yesterday made a video call yesterday?
Data Scientist
SQL
+2 more
3 answers I was asked this
"WITH ActiveUsersYesterday AS ( SELECT DISTINCT user_id FROM user_activity WHERE activity_date = CAST(GETDATE() - 1 AS DATE) ), VideoCallUsersYesterday AS ( SELECT DISTINCT user_id FROM video_calls WHERE call_date = CAST(GETDATE() - 1 AS DATE) ) SELECT (CAST(COUNT(DISTINCT v.userid) AS FLOAT) / NULLIF(COUNT(DISTINCT a.userid), 0)) * 100 AS percentagevideocall_users FROM ActiveUsersYesterday a LEFT JOIN VideoCallUsersYesterday v ON a.userid = v.userid;"
Bala G. - "WITH ActiveUsersYesterday AS ( SELECT DISTINCT user_id FROM user_activity WHERE activity_date = CAST(GETDATE() - 1 AS DATE) ), VideoCallUsersYesterday AS ( SELECT DISTINCT user_id FROM video_calls WHERE call_date = CAST(GETDATE() - 1 AS DATE) ) SELECT (CAST(COUNT(DISTINCT v.userid) AS FLOAT) / NULLIF(COUNT(DISTINCT a.userid), 0)) * 100 AS percentagevideocall_users FROM ActiveUsersYesterday a LEFT JOIN VideoCallUsersYesterday v ON a.userid = v.userid;"See full answer
Data Scientist
SQL
+2 more

🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

Monthly Post Success Analysis.
IDE
Easy
Data Scientist
SQL
+3 more
24 answers I was asked this
+20
"WITH filtered_posts AS ( SELECT p.user_id, p.issuccessfulpost FROM post p WHERE p.postdate >= '2023-11-01' AND p.postdate < '2023-12-01' ), post_summary AS ( SELECT pu.user_type, COUNT(*) AS post_attempt, SUM(CASE WHEN fp.issuccessfulpost = 1 THEN 1 ELSE 0 END) AS post_success FROM filtered_posts fp JOIN postuser pu ON fp.userid = pu.user_id GROUP BY pu.user_type ) SELECT user_type, post_success, post_attempt, CAST(postsuccess AS FLOAT) / postattempt AS postsuccessrate FROM po"
David I. - "WITH filtered_posts AS ( SELECT p.user_id, p.issuccessfulpost FROM post p WHERE p.postdate >= '2023-11-01' AND p.postdate < '2023-12-01' ), post_summary AS ( SELECT pu.user_type, COUNT(*) AS post_attempt, SUM(CASE WHEN fp.issuccessfulpost = 1 THEN 1 ELSE 0 END) AS post_success FROM filtered_posts fp JOIN postuser pu ON fp.userid = pu.user_id GROUP BY pu.user_type ) SELECT user_type, post_success, post_attempt, CAST(postsuccess AS FLOAT) / postattempt AS postsuccessrate FROM po"See full answer
Data Scientist
SQL
+3 more
Instagram Likes
IDE
Medium
Data Scientist
SQL
+3 more
39 answers I was asked this
+35
"Here's a simpler solution: select u.username , count(p.postid) as countposts from posts as p join users as u on p.userid = u.userid where p.likes >= 100 group by 1 order by 2 desc, 1 asc limit 3 `"
Bradley E. - "Here's a simpler solution: select u.username , count(p.postid) as countposts from posts as p join users as u on p.userid = u.userid where p.likes >= 100 group by 1 order by 2 desc, 1 asc limit 3 `"See full answer
Data Scientist
SQL
+3 more
Top Earning Employees
IDE
Easy
Data Scientist
SQL
+3 more
53 answers I was asked this
+48
"Limit and rank() only works if there are no 2 employees with same salary ( which is okay for this use case) For the query to pass all the test results, we need to use dense_rank with ranked_employees as ( select id, firstname, lastname, salary, denserank() over(order by salary desc) as salaryrank from employees ) select id, firstname, lastname, salary from ranked_employees where salary_rank <= 3 `"
Vysali K. - "Limit and rank() only works if there are no 2 employees with same salary ( which is okay for this use case) For the query to pass all the test results, we need to use dense_rank with ranked_employees as ( select id, firstname, lastname, salary, denserank() over(order by salary desc) as salaryrank from employees ) select id, firstname, lastname, salary from ranked_employees where salary_rank <= 3 `"See full answer
Data Scientist
SQL
+3 more
Top Salaries by Department
IDE
Hard
Data Scientist
SQL
+3 more
27 answers I was asked this
+24
"SELECT d.name as departmentname,e.id as employeeid,e.firstname,e.lastname,MAX(e.salary) as salary FROM employees e LEFT JOIN departments d ON e.department_id=d.id GROUP BY department_name ORDER BY department_name;"
Anisha S. - "SELECT d.name as departmentname,e.id as employeeid,e.firstname,e.lastname,MAX(e.salary) as salary FROM employees e LEFT JOIN departments d ON e.department_id=d.id GROUP BY department_name ORDER BY department_name;"See full answer
Data Scientist
SQL
+3 more
Lyft Ride Requests
IDE
Easy
Data Scientist
SQL
+3 more
24 answers I was asked this
+17
"SELECT u.user_id, u.user_name, u.email, ROUND(AVG(CASE WHEN b.status = 'Unmatched' THEN 1.0 ELSE 0 END), 2) AS avgunmatchedbookings FROM users u LEFT JOIN bookings b ON u.userid = b.userid GROUP BY u.user_id, u.user_name, u.email; `"
Akshay D. - "SELECT u.user_id, u.user_name, u.email, ROUND(AVG(CASE WHEN b.status = 'Unmatched' THEN 1.0 ELSE 0 END), 2) AS avgunmatchedbookings FROM users u LEFT JOIN bookings b ON u.userid = b.userid GROUP BY u.user_id, u.user_name, u.email; `"See full answer
Data Scientist
SQL
+3 more
Employee Hierarchy
IDE
Hard
Data Scientist
SQL
+3 more
16 answers I was asked this
+11
"--country names are UPPERCASE but the table in the in the question showing lowercase. That's why it took me a while to figure it out until I ran the country column WITH RECURSIVE Hierarchy AS ( SELECT e.Emp_ID, CONCAT(e.FirstName, ' ', e.MiddleName, ' ', e.LastName) AS FullName, e.Manager_ID, 0 AS Level, CASE WHEN e.Country = 'IRELAND' THEN s.Salary * 1.09 WHEN e.Country = 'INDIA' THEN s.Salary * 0.012 ELSE s.Salary "
Victor N. - "--country names are UPPERCASE but the table in the in the question showing lowercase. That's why it took me a while to figure it out until I ran the country column WITH RECURSIVE Hierarchy AS ( SELECT e.Emp_ID, CONCAT(e.FirstName, ' ', e.MiddleName, ' ', e.LastName) AS FullName, e.Manager_ID, 0 AS Level, CASE WHEN e.Country = 'IRELAND' THEN s.Salary * 1.09 WHEN e.Country = 'INDIA' THEN s.Salary * 0.012 ELSE s.Salary "See full answer
Data Scientist
SQL
+3 more
Most Recent Transaction
IDE
Medium
Data Scientist
SQL
+3 more
23 answers I was asked this
+20
"select name, stock from products p left join transactions t on p.id = t.product_id order by date desc limit 1"
Daniel C. - "select name, stock from products p left join transactions t on p.id = t.product_id order by date desc limit 1"See full answer
Data Scientist
SQL
+3 more
Duolingo Leaderboards
IDE
Hard
Data Scientist
SQL
+3 more
16 answers I was asked this
+13
"The user table no longer exists as expected - I get an error that user does not contain user_id. Note that querying the table results in only user:swuoevkivrjfta select * FROM user `"
Evan R. - "The user table no longer exists as expected - I get an error that user does not contain user_id. Note that querying the table results in only user:swuoevkivrjfta select * FROM user `"See full answer
Data Scientist
SQL
+3 more
Asked at Google, Microsoft • 2 years ago
Write SQL code to publish the Fibonacci series.
Data Scientist
SQL
+4 more
5 answers I was asked this
+2
"WITH RECURSIVE fibonacci_series AS ( SELECT 1 AS n, 0 AS fib1, 1 AS fib2 UNION ALL SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2 FROM fibonacci_series WHERE n < 20 -- Limit the series to 20 numbers ) SELECT n, fib1 AS fib FROM fibonacci_series ORDER BY n; `"
Yashasvi V. - "WITH RECURSIVE fibonacci_series AS ( SELECT 1 AS n, 0 AS fib1, 1 AS fib2 UNION ALL SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2 FROM fibonacci_series WHERE n < 20 -- Limit the series to 20 numbers ) SELECT n, fib1 AS fib FROM fibonacci_series ORDER BY n; `"See full answer
Data Scientist
SQL
+4 more
High Volume Low Success.
IDE
Easy
Data Scientist
SQL
+3 more
11 answers I was asked this
+8
"In the question it says: "above the overall average total posts", which to me implying a >, yet in the solution it uses >= Caused me 1 hr to find out. plz fix"
Peter W. - "In the question it says: "above the overall average total posts", which to me implying a >, yet in the solution it uses >= Caused me 1 hr to find out. plz fix"See full answer
Data Scientist
SQL
+3 more
Asked at Amazon • 3 years ago
Session Data Analysis.
Hard
Data Scientist
SQL
+4 more
5 answers I was asked this
"1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"
Erjan G. - "1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"See full answer
Data Scientist
SQL
+4 more
Calculate Test Scores
IDE
Medium
Data Scientist
SQL
+3 more
19 answers I was asked this
+16
"with max_score as( select test_id, employee_id, max(score) as max_score from test_results as t1 join tests as t2 on t1.test_id=t2.id group by 2,1 ) select id as employee_id, name as employee_name, sum(maxscore) as totalscore from employees as e join max_score as m on e.id=m.employee_id group by 1,2 order by 3 desc, 1 `"
Victor N. - "with max_score as( select test_id, employee_id, max(score) as max_score from test_results as t1 join tests as t2 on t1.test_id=t2.id group by 2,1 ) select id as employee_id, name as employee_name, sum(maxscore) as totalscore from employees as e join max_score as m on e.id=m.employee_id group by 1,2 order by 3 desc, 1 `"See full answer
Data Scientist
SQL
+3 more
Marketing Channel Attribution
IDE
Medium
Data Scientist
SQL
+3 more
8 answers I was asked this
+4
"-- Write your query here WITH high_value AS( SELECT user_id FROM user_sessions JOIN attribution ON usersessions.sessionid = attribution.session_id GROUP BY user_id HAVING SUM(purchase_value) > 100 ORDER BY SUM(purchase_value) DESC ) SELECT usersessions.userid, marketing_channel FROM user_sessions JOIN high_value ON usersessions.userid = highvalue.userid JOIN attribution ON usersessions.sessionid = attribution.session_id GROUP BY usersessions.userid H"
Alina G. - "-- Write your query here WITH high_value AS( SELECT user_id FROM user_sessions JOIN attribution ON usersessions.sessionid = attribution.session_id GROUP BY user_id HAVING SUM(purchase_value) > 100 ORDER BY SUM(purchase_value) DESC ) SELECT usersessions.userid, marketing_channel FROM user_sessions JOIN high_value ON usersessions.userid = highvalue.userid JOIN attribution ON usersessions.sessionid = attribution.session_id GROUP BY usersessions.userid H"See full answer
Data Scientist
SQL
+3 more
Analyze Monthly Customer Transactions
Data Scientist
SQL
+3 more
Add answer I was asked this
Data Scientist
SQL
+3 more
Asked at Deloitte • 8 months ago
Explain the key differences between BETWEEN and HAVING clauses in SQL.
Data Scientist
SQL
+4 more
2 answers I was asked this
"BETWEEN and HAVING clauses in SQL serve different purposes: 1. BETWEEN Clause Used to filter rows based on a range of values. Works with numeric, date, or text values. Can be used with WHERE or HAVING clauses. The range includes both lower and upper bounds. Example: Filtering employees with salaries between 30,000 and 50,000 `SELECT * FROM Employees WHERE salary BETWEEN 30000 AND 50000;` 2. HAVING Clause Used to filter **groups"
Meenakshi D. - "BETWEEN and HAVING clauses in SQL serve different purposes: 1. BETWEEN Clause Used to filter rows based on a range of values. Works with numeric, date, or text values. Can be used with WHERE or HAVING clauses. The range includes both lower and upper bounds. Example: Filtering employees with salaries between 30,000 and 50,000 `SELECT * FROM Employees WHERE salary BETWEEN 30000 AND 50000;` 2. HAVING Clause Used to filter **groups"See full answer
Data Scientist
SQL
+4 more