Prepare for data analytics interviews in our brand new course. Start now →

Data Scientist SQL Interview Questions

Review this list of 65 sql data scientist interview questions and answers verified by hiring managers and candidates.
  • O
    O
    V
    +7

    "Answer: select fromcaller, count(DISTINCT tocallee) as num_calls from calls group by fromcaller having count(DISTINCT tocallee) >= 3 Setup: CREATE TABLE calls ( from_caller VARCHAR(20), to_callee VARCHAR(20) ); INSERT INTO calls (fromcaller, tocallee) VALUES ('Alice', 'Bob'), ('Charlie', 'Dave'), ('Alice', 'Frank'), ('Charlie', 'Heidi'), ('Charlie', 'Judy'); "

    KAI - "Answer: select fromcaller, count(DISTINCT tocallee) as num_calls from calls group by fromcaller having count(DISTINCT tocallee) >= 3 Setup: CREATE TABLE calls ( from_caller VARCHAR(20), to_callee VARCHAR(20) ); INSERT INTO calls (fromcaller, tocallee) VALUES ('Alice', 'Bob'), ('Charlie', 'Dave'), ('Alice', 'Frank'), ('Charlie', 'Heidi'), ('Charlie', 'Judy'); "See full answer

    Data Scientist
    SQL
    +1 more
  • Video answer for 'Employee Earnings.'
    B
    X
    T
    +31

    "SELECT employees.first_name, managers.salary AS manager_salary FROM employees LEFT JOIN employees AS managers ON employees.manager_id = managers.id WHERE employees.salary > managers.salary `"

    Tiffany A. - "SELECT employees.first_name, managers.salary AS manager_salary FROM employees LEFT JOIN employees AS managers ON employees.manager_id = managers.id WHERE employees.salary > managers.salary `"See full answer

    Data Scientist
    SQL
    +2 more
  • W
    Q
    D
    +1

    "select employeename, employeeid, salary, department, DR from ( select employeename, employeeid, salary, dense_rank() over (partition by department order by salary desc) DR, department from employee ) where DR <=3 order by department, DR"

    Sreeram reddy B. - "select employeename, employeeid, salary, department, DR from ( select employeename, employeeid, salary, dense_rank() over (partition by department order by salary desc) DR, department from employee ) where DR <=3 order by department, DR"See full answer

    Data Scientist
    SQL
    +2 more
  • F
    X
    X
    +42

    "SELECT MIN(id) AS id, TRIM(LOWER(email)) AS cleaned_email FROM users GROUP BY cleaned_email ORDER BY id `"

    Salome L. - "SELECT MIN(id) AS id, TRIM(LOWER(email)) AS cleaned_email FROM users GROUP BY cleaned_email ORDER BY id `"See full answer

    Data Scientist
    SQL
    +1 more
  • Z

    "How do you find consecutive days for login (MySQL, SQL, date, subquery, MySQL 5.7, development)? 1 Follow Request Answer More All related (34) Recommended 📷 Trausti Thor Johannsson · Follow Been using MySQL for more than 16 yearsDec 27 There are functions like DATEDIFF but there are also BETWE"

    Hayatu H. - "How do you find consecutive days for login (MySQL, SQL, date, subquery, MySQL 5.7, development)? 1 Follow Request Answer More All related (34) Recommended 📷 Trausti Thor Johannsson · Follow Been using MySQL for more than 16 yearsDec 27 There are functions like DATEDIFF but there are also BETWE"See full answer

    Data Scientist
    SQL
    +2 more
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • [

    "SELECT s.Sale_Date, SUM(si.Quantity * si.SalePrice) AS TotalRevenue FROM Sales s JOIN SaleItems si ON s.SaleID = si.Sale_ID GROUP BY s.Sale_Date ORDER BY s.Sale_Date; "

    Bala G. - "SELECT s.Sale_Date, SUM(si.Quantity * si.SalePrice) AS TotalRevenue FROM Sales s JOIN SaleItems si ON s.SaleID = si.Sale_ID GROUP BY s.Sale_Date ORDER BY s.Sale_Date; "See full answer

    Data Scientist
    SQL
    +2 more
  • L
    X

    "it is really good explanation thanks it is really good explanation thanks"

    Amney M. - "it is really good explanation thanks it is really good explanation thanks"See full answer

    Data Scientist
    SQL
    +2 more
  • M
    V
    I
    +24

    "SELECT customer_id, order_date, orderid AS secondearliestorderid FROM ( SELECT order_id, customer_id, order_date, ROWNUMBER() OVER (PARTITION BY customerid, orderdate ORDER BY orderid ASC) AS rank FROM orders ) WHERE rank = 2 ORDER BY orderdate, customerid `"

    Tiffany A. - "SELECT customer_id, order_date, orderid AS secondearliestorderid FROM ( SELECT order_id, customer_id, order_date, ROWNUMBER() OVER (PARTITION BY customerid, orderdate ORDER BY orderid ASC) AS rank FROM orders ) WHERE rank = 2 ORDER BY orderdate, customerid `"See full answer

    Data Scientist
    SQL
    +1 more
  • V
    A
    G

    "SELECT ROUND( COUNT(DISTINCT(CASE when eventname = 'callstarted' then user_id end)) / COUNT(DISTINCT(CASE when eventname = 'appopen' then user_id end)) * 100) as videocallpercentage from activity where DATE(date) = CURDATE() - INTERVAL 1 DAY"

    Anonymous Ladybug - "SELECT ROUND( COUNT(DISTINCT(CASE when eventname = 'callstarted' then user_id end)) / COUNT(DISTINCT(CASE when eventname = 'appopen' then user_id end)) * 100) as videocallpercentage from activity where DATE(date) = CURDATE() - INTERVAL 1 DAY"See full answer

    Data Scientist
    SQL
    +2 more
  • V
    T
    J
    +19

    "WITH filtered_posts AS ( SELECT p.user_id, p.issuccessfulpost FROM post p WHERE p.postdate >= '2023-11-01' AND p.postdate < '2023-12-01' ), post_summary AS ( SELECT pu.user_type, COUNT(*) AS post_attempt, SUM(CASE WHEN fp.issuccessfulpost = 1 THEN 1 ELSE 0 END) AS post_success FROM filtered_posts fp JOIN postuser pu ON fp.userid = pu.user_id GROUP BY pu.user_type ) SELECT user_type, post_success, post_attempt, CAST(postsuccess AS FLOAT) / postattempt AS postsuccessrate FROM po"

    David I. - "WITH filtered_posts AS ( SELECT p.user_id, p.issuccessfulpost FROM post p WHERE p.postdate >= '2023-11-01' AND p.postdate < '2023-12-01' ), post_summary AS ( SELECT pu.user_type, COUNT(*) AS post_attempt, SUM(CASE WHEN fp.issuccessfulpost = 1 THEN 1 ELSE 0 END) AS post_success FROM filtered_posts fp JOIN postuser pu ON fp.userid = pu.user_id GROUP BY pu.user_type ) SELECT user_type, post_success, post_attempt, CAST(postsuccess AS FLOAT) / postattempt AS postsuccessrate FROM po"See full answer

    Data Scientist
    SQL
    +1 more
  • Q
    G
    W
    +46

    "Limit and rank() only works if there are no 2 employees with same salary ( which is okay for this use case) For the query to pass all the test results, we need to use dense_rank with ranked_employees as ( select id, firstname, lastname, salary, denserank() over(order by salary desc) as salaryrank from employees ) select id, firstname, lastname, salary from ranked_employees where salary_rank <= 3 `"

    Vysali K. - "Limit and rank() only works if there are no 2 employees with same salary ( which is okay for this use case) For the query to pass all the test results, we need to use dense_rank with ranked_employees as ( select id, firstname, lastname, salary, denserank() over(order by salary desc) as salaryrank from employees ) select id, firstname, lastname, salary from ranked_employees where salary_rank <= 3 `"See full answer

    Data Scientist
    SQL
    +1 more
  • N
    Q
    C
    +31

    "Here's a simpler solution: select u.username , count(p.postid) as countposts from posts as p join users as u on p.userid = u.userid where p.likes >= 100 group by 1 order by 2 desc, 1 asc limit 3 `"

    Bradley E. - "Here's a simpler solution: select u.username , count(p.postid) as countposts from posts as p join users as u on p.userid = u.userid where p.likes >= 100 group by 1 order by 2 desc, 1 asc limit 3 `"See full answer

    Data Scientist
    SQL
    +1 more
  • V
    C
    P
    +21

    "SELECT d.name as departmentname,e.id as employeeid,e.firstname,e.lastname,MAX(e.salary) as salary FROM employees e LEFT JOIN departments d ON e.department_id=d.id GROUP BY department_name ORDER BY department_name;"

    Anisha S. - "SELECT d.name as departmentname,e.id as employeeid,e.firstname,e.lastname,MAX(e.salary) as salary FROM employees e LEFT JOIN departments d ON e.department_id=d.id GROUP BY department_name ORDER BY department_name;"See full answer

    Data Scientist
    SQL
    +1 more
  • V
    A
    E
    +15

    "SELECT u.user_id, u.user_name, u.email, ROUND(AVG(CASE WHEN b.status = 'Unmatched' THEN 1.0 ELSE 0 END), 2) AS avgunmatchedbookings FROM users u LEFT JOIN bookings b ON u.userid = b.userid GROUP BY u.user_id, u.user_name, u.email; `"

    Akshay D. - "SELECT u.user_id, u.user_name, u.email, ROUND(AVG(CASE WHEN b.status = 'Unmatched' THEN 1.0 ELSE 0 END), 2) AS avgunmatchedbookings FROM users u LEFT JOIN bookings b ON u.userid = b.userid GROUP BY u.user_id, u.user_name, u.email; `"See full answer

    Data Scientist
    SQL
    +1 more
  • S
    E
    Z
    +8

    "Required output in the solution not the one requested from the question. only customerid, firstname, last_name and years were required. Please this needs to be very clear. Otherwise my answer is with totalorderyear as ( SELECT o.customer_id, c.first_name, c.last_name, EXTRACT(YEAR FROM o.orderdate) AS orderyear, COUNT(o.orderid) AS totalorders FROM orders o LEFT JOIN customers c ON c.customerid = o.customerid GROUP BY o.customerid, c.firstname, c.last"

    Gloriose H. - "Required output in the solution not the one requested from the question. only customerid, firstname, last_name and years were required. Please this needs to be very clear. Otherwise my answer is with totalorderyear as ( SELECT o.customer_id, c.first_name, c.last_name, EXTRACT(YEAR FROM o.orderdate) AS orderyear, COUNT(o.orderid) AS totalorders FROM orders o LEFT JOIN customers c ON c.customerid = o.customerid GROUP BY o.customerid, c.firstname, c.last"See full answer

    Data Scientist
    SQL
    +1 more
  • Video answer for 'E-commerce (1 of 5)'
    H
    E
    Z
    +12

    "SELECT SUM(orderquantity) AS totalunitsorderedyesterday FROM orders WHERE order_date = DATE('now', '-1 DAY') `"

    Akshay D. - "SELECT SUM(orderquantity) AS totalunitsorderedyesterday FROM orders WHERE order_date = DATE('now', '-1 DAY') `"See full answer

    Data Scientist
    SQL
    +1 more
  • H
    Z
    N
    +11

    "--country names are UPPERCASE but the table in the in the question showing lowercase. That's why it took me a while to figure it out until I ran the country column WITH RECURSIVE Hierarchy AS ( SELECT e.Emp_ID, CONCAT(e.FirstName, ' ', e.MiddleName, ' ', e.LastName) AS FullName, e.Manager_ID, 0 AS Level, CASE WHEN e.Country = 'IRELAND' THEN s.Salary * 1.09 WHEN e.Country = 'INDIA' THEN s.Salary * 0.012 ELSE s.Salary "

    Victor N. - "--country names are UPPERCASE but the table in the in the question showing lowercase. That's why it took me a while to figure it out until I ran the country column WITH RECURSIVE Hierarchy AS ( SELECT e.Emp_ID, CONCAT(e.FirstName, ' ', e.MiddleName, ' ', e.LastName) AS FullName, e.Manager_ID, 0 AS Level, CASE WHEN e.Country = 'IRELAND' THEN s.Salary * 1.09 WHEN e.Country = 'INDIA' THEN s.Salary * 0.012 ELSE s.Salary "See full answer

    Data Scientist
    SQL
    +1 more
  • I
    C
    U
    +16

    "select name, stock from products p left join transactions t on p.id = t.product_id order by date desc limit 1"

    Daniel C. - "select name, stock from products p left join transactions t on p.id = t.product_id order by date desc limit 1"See full answer

    Data Scientist
    SQL
    +1 more
  • Google logoAsked at Google 
    L
    Z
    D
    +2

    "WITH RECURSIVE fibonacci_series AS ( SELECT 1 AS n, 0 AS fib1, 1 AS fib2 UNION ALL SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2 FROM fibonacci_series WHERE n < 20 -- Limit the series to 20 numbers ) SELECT n, fib1 AS fib FROM fibonacci_series ORDER BY n; `"

    Yashasvi V. - "WITH RECURSIVE fibonacci_series AS ( SELECT 1 AS n, 0 AS fib1, 1 AS fib2 UNION ALL SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2 FROM fibonacci_series WHERE n < 20 -- Limit the series to 20 numbers ) SELECT n, fib1 AS fib FROM fibonacci_series ORDER BY n; `"See full answer

    Data Scientist
    SQL
    +3 more
  • F
    O
    C
    +6

    "with cte as (select (count(postid)/count(userid)) as avgpost, avg(issuccessfulpost) as avgsuccess from post) select p.userid,sum(issuccessfulpost) as postsuccess,count(p.postid) as postattempt,ROUND(avg(issuccessfulpost),2) as postsuccessrate from post p,cte c on p.user_id group by p.user_id having postattempt>c.avgpost and postsuccessrate<c.avg_success order by postsuccessrate desc"

    Devanshu K. - "with cte as (select (count(postid)/count(userid)) as avgpost, avg(issuccessfulpost) as avgsuccess from post) select p.userid,sum(issuccessfulpost) as postsuccess,count(p.postid) as postattempt,ROUND(avg(issuccessfulpost),2) as postsuccessrate from post p,cte c on p.user_id group by p.user_id having postattempt>c.avgpost and postsuccessrate<c.avg_success order by postsuccessrate desc"See full answer

    Data Scientist
    SQL
    +1 more
Showing 1-20 of 65