Data Analyst Interview Questions

Review this list of 84 data analyst interview questions and answers verified by hiring managers and candidates.
  • Asked at Google

    Kartikeya N. - "Clarifying questions: Do we mean the Google Play Store, the Apple App Store, or some other app store? (Google Play Store.) Do we mean the number of apps as of 2024? (Yes.) Are we looking at any particular data slice, such as a particular country or geography? (No, I want the overall global number.) And we mean the total number of apps that are registered on the Play Store and can be searched for and accessed by a user, am I right? (Yes.) Okay, here's how..."

    Data Analyst
    Estimation
    +3 more
  • Asked at Google

    Yashasvi V. - "
    WITH RECURSIVE fibonacci_series AS (
        SELECT 1 AS n, 0 AS fib1, 1 AS fib2
        UNION ALL
        SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2
        FROM fibonacci_series
        WHERE n < 20  -- Limit the series to 20 numbers
    )
    SELECT n, fib1 AS fib
    FROM fibonacci_series
    ORDER BY n;
    "
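One way to check a recursive CTE like this is to run it against an in-memory SQLite database, which supports WITH RECURSIVE. The sketch below is only a test harness: the choice of SQLite and the variable names `query`, `rows`, and `fib_values` are assumptions, and other database engines may differ in syntax details.

```python
import sqlite3

# The recursive CTE from the answer, with line breaks restored so the
# trailing "--" comment does not swallow the rest of the statement.
query = """
WITH RECURSIVE fibonacci_series AS (
    SELECT 1 AS n, 0 AS fib1, 1 AS fib2
    UNION ALL
    SELECT n + 1 AS n, fib2 AS fib1, fib1 + fib2 AS fib2
    FROM fibonacci_series
    WHERE n < 20  -- Limit the series to 20 numbers
)
SELECT n, fib1 AS fib
FROM fibonacci_series
ORDER BY n;
"""

# An in-memory database is enough because the CTE reads no tables.
rows = sqlite3.connect(":memory:").execute(query).fetchall()
fib_values = [fib for _, fib in rows]

print(fib_values[:8])
```

The anchor row seeds the pair (0, 1), and each recursive step shifts the pair forward by one, so the series starts at 0 and the WHERE clause stops it after 20 rows.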

    Data Analyst
    Coding
    +3 more
  • Asked at Amazon

    Erjan G. - "
    1) select avg(session) from table where session > 180
    2) select round(session_time/300)*300 as session_bin, count(*) as session_count
       from table
       group by round(session_time/300)*300
       order by session_bin
    3) SELECT t1.country AS country_a, t2.country AS country_b
       FROM (
           SELECT country, COUNT(*) AS session_count
           FROM your_table_name
           GROUP BY country
       ) AS t1
       JOIN (
           SELECT country, COUNT(*) AS session_count
           FROM your_table_name
           GROUP BY countr..."
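The first two queries can be tried out on toy data with an in-memory SQLite database. This is a sketch under assumptions: the `sessions` table, its `session_time` column (in seconds), and the sample values are all hypothetical, since the listing does not show the real schema. Note one deliberate swap: the answer rounds to the nearest 300 seconds, whereas this sketch uses integer division, which floors each session into the 5-minute bucket it starts in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per session, duration in seconds.
conn.execute("CREATE TABLE sessions (session_time INTEGER)")
conn.executemany(
    "INSERT INTO sessions VALUES (?)",
    [(30,), (200,), (310,), (450,), (620,), (900,)],
)

# 1) Average duration of sessions longer than 180 seconds.
avg_long = conn.execute(
    "SELECT AVG(session_time) FROM sessions WHERE session_time > 180"
).fetchone()[0]

# 2) Count sessions per 300-second bucket. With INTEGER operands SQLite
#    performs integer division, so (session_time / 300) * 300 floors
#    each value to the lower edge of its bucket.
bins = conn.execute(
    """
    SELECT (session_time / 300) * 300 AS session_bin,
           COUNT(*) AS session_count
    FROM sessions
    GROUP BY session_bin
    ORDER BY session_bin
    """
).fetchall()

print(avg_long, bins)
```

Whether to round or floor is a design choice worth stating in an interview: rounding centers each bucket on a multiple of 300, while flooring gives half-open ranges such as 0 to 299 and 300 to 599.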

    Data Analyst
    Coding
    +3 more
  • Data Analyst
    Data Analysis
    +2 more
  • Asked at Amazon

    Catherine T. - "We want sales to grow in order to grow revenue, and we want customer usage to grow as well, since it shows whether our product drives more engagement from our users. To see this overall evolution, I would make a line chart for each. Sales: month on the x-axis and sales revenue on the y-axis. Customer usage: month on the x-axis and, on the y-axis, a KPI that measures customer usage (nb_logins, nb_sessions, nb_games_played, ... depending on the industry). Moreover, after knowing th..."

    Data Analyst
    Data Analysis
    +2 more
  • Himanshu G. - "First, I would start by defining what growth means in the context of this new feature, whether it's user acquisition, engagement, retention, or revenue. Next, I’d identify clear KPIs that directly align with that growth goal. For example, if the feature aims to improve engagement, I’d track metrics like daily active users, session duration, or feature adoption rate. Once the KPIs are in place, I’d run an A/B test comparing user behavior with and without the feature. This would be followed by de..."
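To make the A/B test step concrete, here is a minimal sketch of a two-proportion z-test on a conversion-style KPI such as feature adoption. The answer does not name a specific test, so the choice of test, the function name `two_proportion_z_test`, and the sample counts are all assumptions for illustration.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control (without the feature) vs. treatment (with it).
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between groups is unlikely to be noise, but the practical size of the lift and the test duration still need to be judged against the growth goal defined up front.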

    Data Analyst
    Data Analysis
    +2 more
  • Akash D. - "Investigation: a clear understanding of the true cause."

    Data Analyst
    Behavioral
    +3 more
  • Sean L. - "df.loc[ isin()] is the crucial part of the solution."
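The question this comment refers to is not shown in the listing, so the following is only a generic sketch of the pattern it mentions: building a boolean mask with `Series.isin` and passing it to `DataFrame.loc`. The `orders` frame, its columns, and the `target_countries` list are hypothetical.

```python
import pandas as pd

# Hypothetical data; the real question's schema is not shown.
orders = pd.DataFrame(
    {
        "order_id": [1, 2, 3, 4],
        "country": ["US", "DE", "FR", "US"],
        "amount": [20.0, 35.5, 12.0, 8.0],
    }
)

target_countries = ["US", "FR"]

# isin() returns a boolean Series; .loc keeps the rows where it is True.
selected = orders.loc[orders["country"].isin(target_countries)]

# Negating the mask with ~ keeps the rows that are NOT in the list.
excluded = orders.loc[~orders["country"].isin(target_countries)]

print(selected)
```

Compared with chaining several `==` comparisons with `|`, `isin` stays readable as the list of allowed values grows.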

    Data Analyst
    Coding
    +2 more
  • Data Analyst
    Data Analysis
    +2 more
  • Data Analyst
    Data Analysis
    +2 more
  • Data Analyst
    Data Analysis
    +2 more
  • Himanshu G. - "First, I’d start by checking the alignment of each idea with our core business goals. If any idea doesn't directly contribute to those goals, I’d deprioritize or eliminate it upfront. Next, I’d use a scoring model like RICE (Reach, Impact, Confidence, Effort), especially because effort is a critical factor when resources are limited. This gives us a structured and quantifiable way to rank the ideas. Once we have a prioritized list based on scores, I’d take it a step further and evaluate key as..."
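The RICE ranking step can be sketched in a few lines, using the commonly cited formula: reach times impact times confidence, divided by effort. The `Idea` class, the example ideas, and all of their numbers are hypothetical and only illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    reach: float       # users affected per period
    impact: float      # relative impact per user (e.g. 0.5, 1, 2)
    confidence: float  # 0.0 to 1.0
    effort: float      # e.g. person-months

    @property
    def rice(self) -> float:
        # Higher reach, impact, and confidence raise the score;
        # higher effort lowers it.
        return self.reach * self.impact * self.confidence / self.effort


# Hypothetical backlog for illustration only.
ideas = [
    Idea("onboarding checklist", reach=4000, impact=1.0, confidence=0.8, effort=2),
    Idea("dark mode", reach=10000, impact=0.5, confidence=0.5, effort=5),
    Idea("export to CSV", reach=1500, impact=2.0, confidence=0.9, effort=1),
]

ranked = sorted(ideas, key=lambda idea: idea.rice, reverse=True)
for idea in ranked:
    print(f"{idea.name}: {idea.rice:.0f}")
```

Because effort sits in the denominator, a small, cheap idea can outrank a high-reach one, which matches the answer's point about limited resources.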

    Data Analyst
    Data Analysis
    +2 more
  • Asked at Amazon

    Ali H. - "SQL databases are relational; NoSQL databases are non-relational. SQL databases use Structured Query Language and have a predefined schema, while NoSQL databases have dynamic schemas suited to unstructured data. SQL databases typically scale vertically, while NoSQL databases are typically designed to scale horizontally."
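The schema difference can be illustrated with a small sketch. SQLite stands in for a relational database, and a list of JSON-encoded dictionaries stands in for a document store; that stand-in is an assumption for illustration, not a real NoSQL client, and the table and field names are hypothetical.

```python
import json
import sqlite3

# Relational side: the schema is declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")

try:
    # 'nickname' is not part of the declared schema, so this fails.
    conn.execute(
        "INSERT INTO users (id, name, nickname) VALUES (2, 'Bob', 'bobby')"
    )
    schema_rejected = False
except sqlite3.OperationalError:
    schema_rejected = True

# Document side: each record carries its own fields, so two documents
# in the same collection can have different shapes.
documents = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Bob", "nickname": "bobby", "tags": ["beta"]},
]
stored = [json.dumps(doc) for doc in documents]
nicknames = [json.loads(raw).get("nickname") for raw in stored]

print(schema_rejected, nicknames)
```

The flexibility on the document side shifts validation to the application: readers must handle missing fields, as the `.get("nickname")` call does here.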

    Data Analyst
    Technical
    +4 more
  • Gabriel P. - "Hi, my solution gives the exact numerical values as the proposed solution, but it doesn't pass the tests. Am I missing something, or is this a bug?

    def find_revenue_by_city(transactions: pd.DataFrame, users: pd.DataFrame, exchange_rate: pd.DataFrame) -> pd.DataFrame:
        # gets user city for each user id
        user_ids = users[['id', 'user_city']]
        # and merge on transactions
        transactions = transactions.merge(user_ids, how='left..."
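The merge-then-aggregate pattern the comment starts to describe can be sketched as follows. The quoted code is cut off, so this is not a reconstruction of it: the `user_id` and `amount` columns, the sample data, and the omission of the exchange-rate conversion are all assumptions, and the sketch does not address why the original submission failed its tests.

```python
import pandas as pd

# Hypothetical inputs; only 'id' and 'user_city' appear in the quoted code.
transactions = pd.DataFrame(
    {"user_id": [1, 2, 1, 3], "amount": [10.0, 20.0, 5.0, 7.5]}
)
users = pd.DataFrame(
    {"id": [1, 2, 3], "user_city": ["Paris", "Berlin", "Paris"]}
)

# Keep only the columns needed for the lookup.
user_ids = users[["id", "user_city"]]

# A left merge keeps every transaction, even if its user is missing.
merged = transactions.merge(
    user_ids, how="left", left_on="user_id", right_on="id"
)

# Sum revenue per city and return a tidy, deterministically ordered frame.
revenue_by_city = (
    merged.groupby("user_city", as_index=False)["amount"]
    .sum()
    .sort_values("user_city")
    .reset_index(drop=True)
)

print(revenue_by_city)
```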

    Data Analyst
    Coding
    +2 more
  • Arshad P. - "The schema is wrong: id from product is mapped to id from transactions, but id from product should point to product_id in the transactions table."
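The effect of joining on the wrong key can be shown with a toy example in an in-memory SQLite database. The table layout and sample rows below are hypothetical; only the column names `id` and `product_id` come from the comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES product(id),
        amount REAL
    );
    INSERT INTO product VALUES (1, 'pen'), (2, 'notebook');
    INSERT INTO transactions VALUES (1, 2, 4.0), (2, 2, 4.0), (3, 1, 1.5);
    """
)

# Correct: the foreign key transactions.product_id points at product.id.
correct = conn.execute(
    """
    SELECT p.name, SUM(t.amount)
    FROM product AS p
    JOIN transactions AS t ON t.product_id = p.id
    GROUP BY p.name
    ORDER BY p.name
    """
).fetchall()

# Wrong: matching the two primary keys pairs unrelated rows.
wrong = conn.execute(
    """
    SELECT p.name, SUM(t.amount)
    FROM product AS p
    JOIN transactions AS t ON t.id = p.id
    GROUP BY p.name
    ORDER BY p.name
    """
).fetchall()

print(correct, wrong)
```

The wrong join still runs without error, which is why a mislabeled schema diagram is easy to miss: the query returns plausible-looking but incorrect totals.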

    Data Analyst
    Coding
    +2 more
  • Data Analyst
    Data Analysis
    +2 more
  • Data Analyst
    Data Analysis
    +2 more
  • Data Analyst
    Data Analysis
    +2 more
Showing 21-40 of 84