Data Analysis Interview Questions

Review this list of 101 data analysis interview questions and answers verified by hiring managers and candidates.
  • Asked at Amazon

    Catherine T. - "We want sales to grow in order to grow revenue. Customer usage matters as well, since it shows whether our product leads to more engagement from our users. To see this overall evolution, I would make a line chart for each. Sales: month on the x-axis and sales revenue on the y-axis. Customer usage: month on the x-axis and, on the y-axis, a KPI measuring customer usage (nblogins, nbsessions, nbgamesplayed, ... depending on the industry). Moreover, after knowing th"

    Business Analyst
    Data Analysis
    +2 more
  • Himanshu G. - "First, I would start by defining what growth means in the context of this new feature, whether it's user acquisition, engagement, retention, or revenue. Next, I’d identify clear KPIs that directly align with that growth goal. For example, if the feature aims to improve engagement, I’d track metrics like daily active users, session duration, or feature adoption rate. Once the KPIs are in place, I’d run an A/B test comparing user behavior with and without the feature. This would be followed by de"

    Data Analyst
    Data Analysis
    +2 more
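The A/B comparison described in the answer above could be evaluated with a two-proportion z-test. This is a minimal sketch and not the answerer's method: it assumes the KPI is a binary per-user outcome (for example, whether a user adopted the feature), and the function name and the sample counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and users in the control group (hypothetical names).
    conv_b / n_b: conversions and users in the group that saw the feature.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative numbers only: 10% adoption in control, 13% with the feature.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A continuous KPI such as session duration would need a different test (for instance a t-test), so the choice here depends on which metric is picked as the growth definition.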
  • Akash D. - "Investigation clear understanding the true cause"

    Data Analyst
    Data Analysis
    +3 more
  • Asked at Stripe
    Video answer for 'How can Stripe use data to predict the optimal time to retry a transaction?'
    Data Scientist
    Data Analysis
    +1 more
  • Data Analyst
    Data Analysis
    +2 more
  • Gowtham B. - "import pandas as pd

        def find_unsold_products(transactions: pd.DataFrame, products: pd.DataFrame) -> pd.DataFrame:
            # Extract purchased product IDs
            purchased_product_ids = transactions['product_id'].unique()
            # Filter products that have never been purchased
            unsold_products = products[~products['id'].isin(purchased_product_ids)]
            # Select the desired columns
            result = unsold_products[['id', 'name', 'stock']]
            # Sort the result by product ID in ascending order"

    Data Scientist
    Data Analysis
    +2 more
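The snippet in the answer above is cut off before its final step. A runnable sketch of the same approach follows. The sort and return at the end are a completion inferred from the last comment in the answer, the `reset_index` call is an added assumption, and the sample data is made up for illustration.

```python
import pandas as pd


def find_unsold_products(transactions: pd.DataFrame, products: pd.DataFrame) -> pd.DataFrame:
    # Product IDs that appear in at least one transaction.
    purchased_product_ids = transactions["product_id"].unique()
    # Keep only products whose id never appears among the purchased IDs.
    unsold_products = products[~products["id"].isin(purchased_product_ids)]
    result = unsold_products[["id", "name", "stock"]]
    # Assumed final step: sort by product ID ascending and return.
    return result.sort_values("id").reset_index(drop=True)


# Hypothetical sample data: only product 2 has ever been sold.
products = pd.DataFrame({"id": [3, 1, 2], "name": ["c", "a", "b"], "stock": [5, 0, 7]})
transactions = pd.DataFrame({"id": [10, 11], "product_id": [2, 2]})
unsold = find_unsold_products(transactions, products)
print(unsold)
```

Note that the lookup matches `products['id']` against `transactions['product_id']`, not against the transaction's own `id` column.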
  • Himanshu G. - "First, I’d start by checking the alignment of each idea with our core business goals. If any idea doesn't directly contribute to those goals, I’d deprioritize or eliminate it upfront. Next, I’d use a scoring model like RICE (Reach, Impact, Confidence, Effort), especially because effort is a critical factor when resources are limited. This gives us a structured and quantifiable way to rank the ideas. Once we have a prioritized list based on scores, I’d take it a step further and evaluate key as"

    Data Analyst
    Data Analysis
    +2 more
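The RICE ranking mentioned in the answer above can be sketched as follows. The score uses the usual RICE formula (Reach × Impact × Confidence ÷ Effort). The `Idea` class, the idea names, the units, and all numbers are hypothetical and only serve to show how effort pulls a large idea down the list.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    reach: float       # e.g. users affected per quarter (assumed unit)
    impact: float      # relative impact per user (assumed scale)
    confidence: float  # 0.0 to 1.0
    effort: float      # e.g. person-months (assumed unit)


def rice_score(idea: Idea) -> float:
    # Reach x Impact x Confidence, divided by Effort.
    return idea.reach * idea.impact * idea.confidence / idea.effort


def rank_ideas(ideas: list[Idea]) -> list[Idea]:
    # Highest score first.
    return sorted(ideas, key=rice_score, reverse=True)


ideas = [
    Idea("A", reach=1000, impact=2.0, confidence=0.8, effort=4),
    Idea("B", reach=500, impact=1.0, confidence=1.0, effort=1),
    Idea("C", reach=2000, impact=0.5, confidence=0.5, effort=5),
]
for idea in rank_ideas(ideas):
    print(idea.name, round(rice_score(idea), 1))
```

In this made-up data, idea B ranks first despite the smallest reach because it costs the least effort.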
  • Asked at DoorDash
    BizOps & Strategy
    Data Analysis
    +1 more
  • Gabriel P. - "Hi, my solution gives the exact numerical values as the proposed solution, but it doesn't pass the tests. Am I missing something, or is this a bug?

        def find_revenue_by_city(transactions: pd.DataFrame, users: pd.DataFrame, exchange_rate: pd.DataFrame) -> pd.DataFrame:
            # gets user city for each user id
            user_ids = users[['id', 'user_city']]
            # and merge on transactions
            transactions = transactions.merge(user_ids, how='left"

    Data Scientist
    Data Analysis
    +2 more
  • Arshad P. - "Schema is wrong: id from product is mapped to id from transactions; id from product should point to product_id in the transactions table"

    Data Scientist
    Data Analysis
    +2 more
  • Data Analyst
    Data Analysis
    +2 more
  • Data Analyst
    Data Analysis
    +2 more
  • Asked at Snap
    Video answer for 'How would you use data to help Snap engineering improve phone camera speed?'
    Data Analysis
    Analytical
  • Data Analyst
    Data Analysis
    +2 more
  • Data Analyst
    Data Analysis
    +2 more
  • Caleb S. - "import pandas as pd

        def find_improving_students(transcript: pd.DataFrame) -> pd.DataFrame:
            summary = transcript.pivot_table(index='student_id', values='yearly_gpa', aggfunc='sum', columns='year').reset_index()
            summary['average_gpa'] = round((summary[2023] + summary[2022] + summary[2021]) / 3, 2)
            return summary[(summary[2023] > summary[2022]) & (summary[2022] > summary[2021])]  # yn > yn-1, yn-1 > yn-2, yn-3 debug your co"

    Data Analyst
    Data Analysis
    +2 more
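A runnable sketch of the pivot-and-filter approach from the answer above follows. It assumes the transcript has `student_id`, `year`, and `yearly_gpa` columns with exactly one row per student per year (which is why `aggfunc='sum'` is harmless here), and that the years of interest are 2021 to 2023. The sample data is made up for illustration.

```python
import pandas as pd


def find_improving_students(transcript: pd.DataFrame) -> pd.DataFrame:
    # One row per student, one column per year.
    summary = transcript.pivot_table(
        index="student_id", columns="year", values="yearly_gpa", aggfunc="sum"
    ).reset_index()
    summary["average_gpa"] = ((summary[2023] + summary[2022] + summary[2021]) / 3).round(2)
    # Strictly increasing GPA across the three years.
    improving = (summary[2023] > summary[2022]) & (summary[2022] > summary[2021])
    return summary[improving].reset_index(drop=True)


# Hypothetical data: only student 1 improves every year.
transcript = pd.DataFrame({
    "student_id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "year": [2021, 2022, 2023] * 3,
    "yearly_gpa": [3.0, 3.5, 3.8, 3.9, 3.2, 3.6, 2.0, 2.5, 2.5],
})
improving_students = find_improving_students(transcript)
print(improving_students)
```

The boolean mask has to be passed with square brackets, `summary[mask]`. Calling `summary(...)` would raise a `TypeError` because a DataFrame is not callable.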
  • Asked at Meta (Facebook)

    Rishabh R. - "Sum of continuous subarray and keep checking if arr[i] == arr[j]. If true, increase count."

    Data Analysis
    Data Structures & Algorithms
    +1 more