Data Analyst Interview Questions

Review this list of 116 data analyst interview questions and answers verified by hiring managers and candidates.
  • Madina A. - "I’d assess a new feature launch by first checking if it achieved the goal we set before launch, whether that’s driving engagement, monetization, or retention. I would look at adoption and usage to see if users are discovering and repeatedly using it, the impact on the main KPI we targeted, and guardrail metrics to ensure there’s no negative effect on core product health like retention, crashes, or satisfaction. Ideally I would measure this through an A/B test or phased rollout and complement the"

    Data Analyst
    Data Analysis
  • Asked at Microsoft

    Rafia M. - "SQL is structured query language."

    Data Analyst
    SQL
  • Shreya S. - "First, I would get in a room with all the stakeholders (tech and business) and the decision makers, and then start analyzing the situation. Some questions I would look to answer: Is this a new issue or an old one? What are the severity and priority of the feature in the release, in terms of business value? How long would it take the engineering team to fix the issue? Can we manage with a workaround while the issue gets fixed? What are the risks inv"

    Data Analyst
    Behavioral
  • Asked at Amazon

    Catherine T. - "We want sales to grow in order to grow revenue, and customer usage as well, since it lets us see whether our product leads to more engagement from our users. To see this overall evolution, I would make a line chart for each. Sales: month on the x-axis and sales revenue on the y-axis. Customer usage: month on the x-axis and a KPI measuring customer usage (nblogins or nbsessions or nbgamesplayed, ... depending on the industry) on the y-axis. Moreover, after knowing th"

    Data Analyst
    Data Analysis
  • Rohit K. - "with cte as (
        select user_id, sum(purchase_value) clv, min(ad_click_timestamp) min_time
        from user_sessions u
        join attribution a on a.session_id = u.session_id
        group by user_id
      )
      select cte.user_id, a.marketing_channel
      from cte
      join user_sessions u on u.user_id = cte.user_id
      join attribution a on u.session_id = a.session_id
      where cte.clv > 100"

    Data Analyst
    Coding
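
The query above computes `min_time` in the CTE but never uses it in the final select. The following is a minimal sketch, not the expected solution, of how that value could pick each user's first-touch channel. The question text is not shown here, so the first-touch reading is an assumption, as are the table layout (in particular, `purchase_value` and `ad_click_timestamp` living in `attribution`) and the sample rows. It runs the SQL through Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical schema and sample data; the real tables are not shown in the source.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_sessions (session_id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE attribution (
        session_id INTEGER,
        marketing_channel TEXT,
        purchase_value REAL,
        ad_click_timestamp TEXT
    );
    INSERT INTO user_sessions VALUES (1, 10), (2, 10), (3, 20), (4, 30);
    INSERT INTO attribution VALUES
        (1, 'email',  80.0,  '2024-01-01 09:00:00'),
        (2, 'search', 50.0,  '2024-01-05 10:00:00'),
        (3, 'social', 60.0,  '2024-01-02 08:00:00'),
        (4, 'search', 150.0, '2024-01-03 12:00:00');
""")

query = """
WITH user_totals AS (
    -- One row per user: total purchase value and earliest ad click.
    SELECT u.user_id,
           SUM(a.purchase_value)     AS clv,
           MIN(a.ad_click_timestamp) AS first_click
    FROM user_sessions AS u
    JOIN attribution AS a ON a.session_id = u.session_id
    GROUP BY u.user_id
)
SELECT t.user_id, a.marketing_channel, t.clv
FROM user_totals AS t
JOIN user_sessions AS u ON u.user_id = t.user_id
JOIN attribution AS a
  ON a.session_id = u.session_id
 AND a.ad_click_timestamp = t.first_click  -- keep only the first-touch row
WHERE t.clv > 100
ORDER BY t.user_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Without the extra join condition on the timestamp, a user with several sessions would appear once per channel they touched.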
  • Bhavna S. - " with youngsuccrate as( select strftime('%m', postdate) AS postmonth, round(sum(issuccessfulpost)*1.0/count(issuccessfulpost),2)as yascrate from post where userid in (select userid from post_user where age between 0 and 18) group by post_month ), nonyoungsucc_rate as( select strftime('%m', postdate) AS postmonth, round(sum(issuccessfulpost)*1.0/count(issuccessfulpost),2)as nonyasc_rate from post where user_id in (select"

    Data Analyst
    Coding
  • Asked at Deloitte

    Meenakshi D. - "BETWEEN and HAVING clauses in SQL serve different purposes:

      1. BETWEEN Clause
         Used to filter rows based on a range of values. Works with numeric, date, or text values. Can be used with WHERE or HAVING clauses. The range includes both lower and upper bounds.
         Example: Filtering employees with salaries between 30,000 and 50,000
         `SELECT * FROM Employees WHERE salary BETWEEN 30000 AND 50000;`
      2. HAVING Clause
         Used to filter **groups"

    Data Analyst
    Concept
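
To make the distinction in the answer above concrete, here is a small sketch using Python's built-in `sqlite3` module. The `Employees` table name and `salary` column come from the answer; the `name` and `department` columns and all sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO Employees VALUES
        ('Ana', 'Sales',       30000),
        ('Ben', 'Sales',       45000),
        ('Cal', 'Engineering', 50000),
        ('Dee', 'Engineering', 90000),
        ('Eli', 'Support',     25000);
""")

# BETWEEN filters individual rows; both bounds are inclusive.
in_range = conn.execute(
    "SELECT name FROM Employees "
    "WHERE salary BETWEEN 30000 AND 50000 ORDER BY name"
).fetchall()

# HAVING filters groups after GROUP BY has aggregated them.
high_avg = conn.execute(
    "SELECT department, AVG(salary) FROM Employees "
    "GROUP BY department HAVING AVG(salary) > 40000"
).fetchall()

# The two combine: BETWEEN used as the range test inside HAVING.
mid_avg = conn.execute(
    "SELECT department FROM Employees "
    "GROUP BY department HAVING AVG(salary) BETWEEN 30000 AND 40000"
).fetchall()

print(in_range, high_avg, mid_avg)
```

Ana (30,000) and Cal (50,000) both appear in `in_range`, which shows the inclusive bounds. Sales is the only department whose average falls in the 30,000 to 40,000 range.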
  • Maryam G. - "In one of my projects, we needed to collect data from different sections of a large factory and deliver it to a third-party company responsible for predictive analytics on product quality and production levels. The challenge was that each department had different data types and structures, and in many cases, direct connections were restricted due to strict security policies. My responsibility was to design and implement a solution that could gather all these heterogene"

    Data Analyst
    Data Analysis
  • Aneesha K. - "-- Write your query here
      select u.user_id as user_id, IFNULL(sum(purchase_value), 0) AS LTV
      FROM user_sessions u
      JOIN attribution a ON u.session_id = a.session_id
      group by user_id
      order by LTV desc;

      Needs a full join. Wondering why can't we do a left outer join here. All the sessions should have complete data."

    Data Analyst
    Coding
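
The join-type question raised in the answer above can be explored with a small experiment. This sketch, run through Python's built-in `sqlite3` module, compares an inner join with a left join on hypothetical data in which one session has no attribution row. The schema (with `purchase_value` assumed to live in `attribution`) and the sample rows are illustrative assumptions, and the sketch does not claim to show which join the original exercise expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_sessions (session_id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE attribution (session_id INTEGER, purchase_value REAL);
    INSERT INTO user_sessions VALUES (1, 10), (2, 10), (3, 20);
    -- Session 3 (user 20) deliberately has no attribution row.
    INSERT INTO attribution VALUES (1, 40.0), (2, 60.0);
""")

def ltv_by_user(join_type):
    """Return (user_id, ltv) rows using the given join keyword."""
    query = f"""
        SELECT u.user_id, IFNULL(SUM(a.purchase_value), 0) AS ltv
        FROM user_sessions AS u
        {join_type} attribution AS a ON u.session_id = a.session_id
        GROUP BY u.user_id
        ORDER BY ltv DESC
    """
    return conn.execute(query).fetchall()

inner_rows = ltv_by_user("JOIN")       # drops users with no matching rows
left_rows = ltv_by_user("LEFT JOIN")   # keeps them; IFNULL turns NULL into 0

print(inner_rows)
print(left_rows)
```

With the inner join, `IFNULL` never has a NULL to replace, because users without attribution rows are dropped before aggregation. If every session really has complete data, the two joins return the same rows.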
  • Asked at Amazon

    Ali H. - "SQL databases are relational, NoSQL databases are non-relational. SQL databases use structured query language and have a predefined schema. NoSQL databases have dynamic schemas for unstructured data. SQL databases are vertically scalable, while NoSQL databases are horizontally scalable."

    Data Analyst
    Concept
  • Akash D. - "Investigation clear understanding the true cause"

    Data Analyst
    Behavioral
  • Asked at Airbus

    Suhas P. - "Implementing machine-learning anomaly detection using algorithms such as the Isolation Forest algorithm, the DBSCAN algorithm, and the local data point find algorithm."

    Data Analyst
    Data Analysis
  • Asked at Snap
    Video answer for 'How would you use data to help Snap engineering improve phone camera speed?'
    Data Analyst
    Analytical
  • Gabriel P. - "Hi, my solution gives the exact numerical values as the proposed solution, but it doesn't pass the tests. Am I missing something, or is this a bug?

      def find_revenue_by_city(transactions: pd.DataFrame, users: pd.DataFrame, exchange_rate: pd.DataFrame) -> pd.DataFrame:
          # gets user city for each user id
          user_ids = users[['id', 'user_city']]
          # and merge on transactions
          transactions = transactions.merge(user_ids, how='left"

    Data Analyst
    Coding
  • Asked at Google

    Aaron W. - "Clarification questions
      - What is the purpose of connecting the DB?
      - Do we expect high volumes of traffic to hit the DB?
      - Do we have scalability or reliability concerns?

      Format
      - Code -> DB
      - Code -> Cache -> DB
      - API -> Cache -> DB - APIs are built for a purpose and have a specified protocol (GET, POST, DELETE) to speak to the DB. APIs can also use a contract to retrieve information from a DB much faster than code.
      - Load balanced APIs -> Cache -> DB

      **Aut"

    Data Analyst
    Concept
  • Arshad P. - "Schema is wrong: id from product is mapped to id from transactions; id from product should point to product_id in the transactions table"

    Data Analyst
    Coding
  • Alina G. - "SELECT upsell_campaign_id, COUNT(DISTINCT trans.user_id) AS eligible_users
      FROM campaign
      JOIN "transaction" AS trans ON transaction_date BETWEEN date_start AND date_end
      JOIN user ON trans.user_id = user.user_id
      WHERE is_eligible_for_upsell_campaign = 1
      GROUP BY upsell_campaign_id"

    Data Analyst
    Coding
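
The query above joins transactions to campaigns on a date range rather than on a key. The sketch below runs a query of the same shape through Python's built-in `sqlite3` module to show how that range join behaves. The column layout is inferred from the answer, and the sample rows are invented for illustration; the real schema is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaign (upsell_campaign_id INTEGER, date_start TEXT, date_end TEXT);
    CREATE TABLE "transaction" (transaction_id INTEGER, user_id INTEGER, transaction_date TEXT);
    CREATE TABLE "user" (user_id INTEGER, is_eligible_for_upsell_campaign INTEGER);

    INSERT INTO campaign VALUES
        (1, '2024-01-01', '2024-01-31'),
        (2, '2024-02-01', '2024-02-29');
    INSERT INTO "transaction" VALUES
        (100, 1, '2024-01-10'),
        (101, 1, '2024-01-20'),  -- same user twice in campaign 1
        (102, 2, '2024-01-15'),  -- user 2 is not eligible
        (103, 3, '2024-02-05'),
        (104, 1, '2024-02-10');
    INSERT INTO "user" VALUES (1, 1), (2, 0), (3, 1);
""")

query = """
SELECT c.upsell_campaign_id,
       COUNT(DISTINCT t.user_id) AS eligible_users
FROM campaign AS c
JOIN "transaction" AS t
  ON t.transaction_date BETWEEN c.date_start AND c.date_end  -- range join
JOIN "user" AS usr ON usr.user_id = t.user_id
WHERE usr.is_eligible_for_upsell_campaign = 1
GROUP BY c.upsell_campaign_id
ORDER BY c.upsell_campaign_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

`COUNT(DISTINCT ...)` matters here: user 1 has two transactions during campaign 1 but is counted once. The dates are stored as ISO-8601 strings, which compare correctly as text in SQLite; other date formats would need conversion first.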