Want to ace your product analytics interviews? Start now →

Data Analyst Coding Interview Questions

Review this list of 10 coding data analyst interview questions and answers verified by hiring managers and candidates.
  • Meta (Facebook) logoAsked at Meta (Facebook) 
    Video answer for 'Merge Intervals'
    N
    Y
    D
    +34

    "const mergeIntervals = (intervals) => { const compare = (a, b) => { if(a[0] b[0]) return 1 else if(a[0] === b[0]) { return a[1] - b[1] } } let current = [] const result = [] const sorted = intervals.sort(compare) for(let i = 0; i = b[0]) current[1] = b[1] els"

    Kofi N. - "const mergeIntervals = (intervals) => { const compare = (a, b) => { if(a[0] b[0]) return 1 else if(a[0] === b[0]) { return a[1] - b[1] } } let current = [] const result = [] const sorted = intervals.sort(compare) for(let i = 0; i = b[0]) current[1] = b[1] els"See full answer

    Data Analyst
    Coding
    +6 more
  • Google logoAsked at Google 
    J
    S
    [
    +2

    "WITH RECURSIVE fibonacci_series AS ( SELECT 1 AS n, 0 AS fib1, 1 AS fib2 UNION ALL SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2 FROM fibonacci_series WHERE n < 20 -- Limit the series to 20 numbers ) SELECT n, fib1 AS fib FROM fibonacci_series ORDER BY n; `"

    Yashasvi V. - "WITH RECURSIVE fibonacci_series AS ( SELECT 1 AS n, 0 AS fib1, 1 AS fib2 UNION ALL SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2 FROM fibonacci_series WHERE n < 20 -- Limit the series to 20 numbers ) SELECT n, fib1 AS fib FROM fibonacci_series ORDER BY n; `"See full answer

    Data Analyst
    Coding
    +2 more
  • Amazon logoAsked at Amazon 
    O

    "1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"

    Erjan G. - "1) select avg(session) from table where session> 180 2) select round(sessiontime/300)*300 as sessionbin, count() as sessioncount from table group by round(sessiontime/300)300 order by session_bin 3) SELECT t1.country AS country_a, t2.country AS country_b FROM ( SELECT country, COUNT(*) AS session_count FROM yourtablename GROUP BY country ) AS t1 JOIN ( SELECT country, COUNT(*) AS session_count FROM yourtablename `GROUP BY countr"See full answer

    Data Analyst
    Coding
    +2 more
  • I
    K
    Y
    +5

    "productssold = set(transactions['productid']) unsoldproducts = products.loc[~products['id'].isin(productssold)] return unsold_products[["id", "name", "stock"]] `"

    Laura U. - "productssold = set(transactions['productid']) unsoldproducts = products.loc[~products['id'].isin(productssold)] return unsold_products[["id", "name", "stock"]] `"See full answer

    Data Analyst
    Coding
    +1 more
  • H
    S
    Q
    +3

    "Hi, my solution gives the exact numerical values as the proposed solution, but it doesn't pass the tests. Am I missing something, or is this a bug? def findrevenueby_city(transactions: pd.DataFrame, users: pd.DataFrame, exchange_rate: pd.DataFrame) -> pd.DataFrame: gets user city for each user id userids = users[['id', 'usercity']] and merge on transactions transactions = transactions.merge(user_ids, how='left"

    Gabriel P. - "Hi, my solution gives the exact numerical values as the proposed solution, but it doesn't pass the tests. Am I missing something, or is this a bug? def findrevenueby_city(transactions: pd.DataFrame, users: pd.DataFrame, exchange_rate: pd.DataFrame) -> pd.DataFrame: gets user city for each user id userids = users[['id', 'usercity']] and merge on transactions transactions = transactions.merge(user_ids, how='left"See full answer

    Data Analyst
    Coding
    +1 more
  • 🧠 Want an expert answer to a question? Saving questions lets us know what content to make next.

  • V
    Q
    C
    +1

    "Schema is wrong - id from product is mapped to id from transactions, id from product should point to product_id in transcations table"

    Arshad P. - "Schema is wrong - id from product is mapped to id from transactions, id from product should point to product_id in transcations table"See full answer

    Data Analyst
    Coding
    +1 more
  • C
    D
    A

    "The solution produces the same result as the 'prescribed solution' yet it does not get accepted In the test results section transcript['year'] = transcript['year'].astype(str) df = pd.pivottable(data = transcript, index = 'studentid', columns = 'year', values = 'yearlygpa', aggfunc = 'mean').resetindex() df = df[(df['2021'] < df['2022']) & (df['2022'] < df['2023'])] df['average_gpa'] = df[['2021', '2022', '2023']].mean(axis=1).round(2) return df "

    Prachi G. - "The solution produces the same result as the 'prescribed solution' yet it does not get accepted In the test results section transcript['year'] = transcript['year'].astype(str) df = pd.pivottable(data = transcript, index = 'studentid', columns = 'year', values = 'yearlygpa', aggfunc = 'mean').resetindex() df = df[(df['2021'] < df['2022']) & (df['2022'] < df['2023'])] df['average_gpa'] = df[['2021', '2022', '2023']].mean(axis=1).round(2) return df "See full answer

    Data Analyst
    Coding
    +1 more
  • V
    T

    "The question is incomplete --- the code only passes if you return the data frame sorted by BOTH department name AND rank. While in the problem description, it mentions to only rank by department name: "The results should be ordered by department name." Not a big difference I know, but students shouldn't need to look into the solution to get the necessary knowledge to answer the question."

    Chao peter Y. - "The question is incomplete --- the code only passes if you return the data frame sorted by BOTH department name AND rank. While in the problem description, it mentions to only rank by department name: "The results should be ordered by department name." Not a big difference I know, but students shouldn't need to look into the solution to get the necessary knowledge to answer the question."See full answer

    Data Analyst
    Coding
    +1 more
  • [

    " import pandas as pd from datetime import datetime def findfastestlike(log: pd.DataFrame) -> pd.DataFrame: log=log.sortvalues(['userid','timestamp']) #get the prev event, time by user log['prevevent'] = log.groupby('userid')['event'].shift(1) log['prevtimestamp'] = log.groupby('userid')['timestamp'].shift(1) True only on rows where the previous event was a login and the current event is a like log['loginlike'] = (log['prevevent'] == 'log"

    Sean L. - " import pandas as pd from datetime import datetime def findfastestlike(log: pd.DataFrame) -> pd.DataFrame: log=log.sortvalues(['userid','timestamp']) #get the prev event, time by user log['prevevent'] = log.groupby('userid')['event'].shift(1) log['prevtimestamp'] = log.groupby('userid')['timestamp'].shift(1) True only on rows where the previous event was a login and the current event is a like log['loginlike'] = (log['prevevent'] == 'log"See full answer

    Data Analyst
    Coding
    +1 more
  • C

    " debug your code below departments = pd.DataFrame({ 'id': [1, 2, 3, 4, 5], 'name': ['Reporting', 'Engineering', 'Marketing', 'Biz Dev', 'Silly Walks'] }) employees = pd.DataFrame({ 'id': [1, 2, 3, 4, 5, 6], 'first_name': ['John', 'Ava', 'Cailin', 'Mike', 'Ian', 'John'], 'last_name': ['Smith', 'Muffinson', 'Ninson', 'Peterson', 'Peterson', 'Mills'], 'salary': [20000, 10000, 30000, 20000, 80000, 50000], 'department_id': [1, 5, 2, 2, 2, 3] }) projects = p"

    Sean L. - " debug your code below departments = pd.DataFrame({ 'id': [1, 2, 3, 4, 5], 'name': ['Reporting', 'Engineering', 'Marketing', 'Biz Dev', 'Silly Walks'] }) employees = pd.DataFrame({ 'id': [1, 2, 3, 4, 5, 6], 'first_name': ['John', 'Ava', 'Cailin', 'Mike', 'Ian', 'John'], 'last_name': ['Smith', 'Muffinson', 'Ninson', 'Peterson', 'Peterson', 'Mills'], 'salary': [20000, 10000, 30000, 20000, 80000, 50000], 'department_id': [1, 5, 2, 2, 2, 3] }) projects = p"See full answer

    Data Analyst
    Coding
    +1 more
Showing 1-10 of 10