How to Answer Impact Sizing Questions
When you answer impact sizing questions as a data scientist, interviewers look for more detailed estimates than they would in BizOps or PM interviews. You’re expected to discuss the data sources and statistical techniques you would use to calculate impact.
Following a structured and logical framework helps showcase your problem-solving skills and analytical thinking.
Below, we’ll guide you through a step-by-step process that you can use to discuss impact sizing in an interview.
- Step 1. Define the problem. Ask questions to define the business problem, scope, and KPIs.
- Step 2. Identify baselines. Identify an existing comparable or baseline to measure performance against.
- Step 3. Estimate the equation variables. Write out the impact equation and estimate each of its variables. State your assumptions at each step.
- Step 4. Identify and interpret the impact estimate. Estimate the final impact as a range and discuss its implications.
- Step 5. Validate results. Identify biases and validate estimates by comparing them to other metrics.
- Step 6. Discuss second-order effects. Recommend next steps or future iterations of the analysis.
- Step 7. Summarize. Recap your analysis approach, results, and suggestions.

We’ll use this example interview question to walk through the 7-step answer framework: “How much incremental revenue can a food delivery company make if it expands to a new vertical, e.g. grocery delivery?”
Step 1: Define the problem
Understand the business problem and define the scope by asking clarifying questions about the primary goal, expected output, and market size. Identify the key performance indicators (KPIs) or metrics relevant to measuring the impact.
In the delivery revenue example, you could say,
- “What is the goal of expanding to grocery?
  - Possible short-term goals include new customer acquisition, increasing existing customer spend, and improving efficiency or utilization of drivers. Longer-term goals include improving the company's profitability and increasing customer retention, thus building a more resilient business.
- What is the expected output of this analysis?
  - Estimated annualized incremental revenue from the grocery expansion.
- Should this analysis be focused on a specific market?
  - Geographically, the region or state level makes more sense because the city level might have too much variation, and the results would not be generalizable.
  - In terms of store type, focusing on large chain retailers is more meaningful because local stores vary widely.
- Should the company also offer shopping and delivery, or just pickup and delivery?
  - This depends on factors such as competitors and driver supply, which we will discuss in more detail later.
For this analysis, we’ll focus on the short-term goals of new customer acquisition and increasing customer spending, primarily focused on large chain grocery retailers in a particular region of the United States.”
Step 2: Identify baselines
Identify an existing comparable or baseline to measure performance against.
In the delivery revenue example, you could say,
“Valuable sources and uses of data for this problem include:
- Market research analysis: use datasets such as Nielsen, which has grocery transaction data, user research, and user surveys, to estimate the number of grocery users.
- Grocery delivery market data: analyze the market size, growth trends, and key competitors to assess how many existing customers use grocery delivery.
- Competitor analysis: analyze data from competitors, if available, to understand costs.
- Cost estimates: work with the finance team to get estimates of one-time costs and identify recurring costs.”
Step 3: Estimate the equation variables
Estimate the variables from your equation and state assumptions at each step. Common statistical techniques for estimating variables include regression analysis, A/B testing, and causal inference methods using observational data. Walk your interviewer through what techniques you’ll use and how their specific inputs and outputs can help establish your estimations.
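As one illustration of the regression technique mentioned above, the minimal Python sketch below fits a one-variable least-squares line relating order frequency to monthly revenue per user. The data points, the variable names, and the choice of a single predictor are all assumptions made up for illustration; a real analysis would use the company's own data and likely several predictors.

```python
# Toy data, invented for illustration only (not real company figures).
orders_per_month = [2, 4, 5, 7, 8]
monthly_revenue_per_user = [50.0, 95.0, 120.0, 160.0, 190.0]


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(orders_per_month, monthly_revenue_per_user)

# Hypothetical scenario: grocery lifts a typical user from 5 to 6 orders per month.
predicted_before = intercept + slope * 5
predicted_after = intercept + slope * 6

print(f"Estimated revenue per extra monthly order: ${slope:.2f}")
print(f"Predicted lift per user per month: ${predicted_after - predicted_before:.2f}")
```

The slope here describes an association in historical data, not a causal effect, which is why it helps to pair a model like this with an experiment or pilot.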
In the delivery revenue example, you could say,
“The main equation we’re trying to solve is:
Total incremental revenue = number of users * revenue per user - cost
To further break down revenue per user:
Revenue per user = order volume * average order value
Now, we need to estimate each part of these equations.
To estimate new customer acquisition, we’ll:
- Identify target user segments within the market, such as urban households, busy professionals, or elderly populations. Then, we’ll use statistical techniques like clustering to see how much overlap there is with the existing customer base.
- Analyze customer preferences, behavior, and willingness to pay for grocery delivery services through surveys and user research.
To estimate revenue per user, we’ll:
- Conduct regression analysis to identify factors that drive current revenue in the food delivery business, such as order frequency, order size, customer location, and time of day.
- Estimate the potential impact of expanding into grocery delivery on these key revenue drivers, using the regression model.
- Consider changes to pricing/subscription fees, such as adding grocery delivery, and introducing pricing tiers for unlimited food/grocery delivery.
To estimate cost, we’ll:
- Analyze the cost of hiring additional drivers/shoppers based on modeled demand vs. supply.
- Compare different operating models, such as shopping and delivery vs. pickup and delivery only.
To incorporate assumptions about market penetration and adoption into the calculation, we’ll:
- Estimate the potential market share that the food delivery company could capture in the grocery delivery market, based on factors such as brand reputation, service quality, and pricing strategy.
- Consider factors that may influence market adoption, such as competition from existing grocery delivery services, consumer preferences for online shopping, and logistical challenges.”
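The two equations from this step can be written as a small Python sketch, which makes it easy to see how each estimated variable feeds the final number. The function names and every input value below are placeholder assumptions for illustration, not figures from any real company.

```python
def revenue_per_user(order_volume: float, average_order_value: float) -> float:
    """Revenue per user = order volume * average order value."""
    return order_volume * average_order_value


def total_incremental_revenue(
    num_users: float,
    order_volume: float,
    average_order_value: float,
    cost: float,
) -> float:
    """Total incremental revenue = number of users * revenue per user - cost."""
    return num_users * revenue_per_user(order_volume, average_order_value) - cost


# Placeholder inputs (assumptions): annual figures for one region.
estimate = total_incremental_revenue(
    num_users=100_000,         # hypothetical grocery users acquired or converted
    order_volume=20,           # hypothetical grocery orders per user per year
    average_order_value=60.0,  # hypothetical dollars per order
    cost=90_000_000,           # hypothetical annual cost of the expansion
)
print(f"Annualized incremental revenue: ${estimate:,.0f}")
```

Keeping each variable as a separate argument makes it straightforward to state the assumption behind it and to swap in a better estimate later.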
Step 4: Identify and interpret the impact estimate
Quantify the final impact estimate as a range (with lower and upper bounds) rather than a point estimate, and avoid falsely precise numbers: round to the nearest thousand or million, depending on the size of the business. Discuss the implications of the estimated impact in terms of practical significance and business value.
In the delivery revenue example, you could say,
“We’ll plug in the estimates of the variables from the previous step into the equations and estimate a range of potential annualized incremental impact.”
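One way to turn this into a range is to evaluate the same equation under a pessimistic and an optimistic scenario and then round the results. The sketch below assumes this two-scenario approach; the scenario values and helper names are hypothetical and chosen only to show the mechanics.

```python
def incremental_revenue(num_users, order_volume, average_order_value, cost):
    """Number of users * (order volume * average order value) - cost."""
    return num_users * order_volume * average_order_value - cost


def round_to_nearest(value, step):
    """Round to the nearest multiple of `step`, e.g. the nearest million."""
    return round(value / step) * step


# Hypothetical scenarios (assumptions, not real data).
low_scenario = dict(num_users=80_000, order_volume=15,
                    average_order_value=50.0, cost=52_400_000)
high_scenario = dict(num_users=140_000, order_volume=25,
                     average_order_value=70.0, cost=110_700_000)

low_estimate = round_to_nearest(incremental_revenue(**low_scenario), 1_000_000)
high_estimate = round_to_nearest(incremental_revenue(**high_scenario), 1_000_000)

print(f"Estimated range: ${low_estimate / 1e6:.0f}M to ${high_estimate / 1e6:.0f}M per year")
```

Reporting the rounded bounds rather than the raw outputs avoids implying more precision than the underlying assumptions support.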
Step 5: Validate results
First, address any sources of bias or confounding factors that may affect the interpretation of results. Be transparent about the limitations of the analysis, such as sample size constraints, data quality issues, or external factors.
Then validate your estimate by comparing it to metrics such as the company's market cap. Also consider other ways to validate your analysis, such as running an experiment.
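The market-cap comparison can be sketched as a simple ratio check. In the Python sketch below, the 0.8 cutoff, the function names, and the dollar figures are arbitrary assumptions for illustration; there is no standard threshold, and the point is only to flag estimates that are implausibly large relative to the size of the company.

```python
def share_of_market_cap(annual_estimate: float, market_cap: float) -> float:
    """Return the estimate as a fraction of the company's market cap."""
    return annual_estimate / market_cap


def looks_like_overestimate(annual_estimate: float, market_cap: float,
                            threshold: float = 0.8) -> bool:
    """Flag estimates at or above `threshold` of market cap (assumed cutoff)."""
    return share_of_market_cap(annual_estimate, market_cap) >= threshold


hypothetical_market_cap = 10_000_000_000  # assumption, not a real company

for estimate in (134_000_000, 9_500_000_000):
    share = share_of_market_cap(estimate, hypothetical_market_cap)
    flagged = looks_like_overestimate(estimate, hypothetical_market_cap)
    print(f"${estimate:,} is {share:.1%} of market cap; suspicious: {flagged}")
```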
In the delivery revenue example, you could say,
“This analysis is based on several assumptions and estimates, and it’s difficult to know how accurate they are. The results also might not generalize to other regions because of differences in user demand and driver supply.
We’ll compare the results with baselines like competitors’ revenue, and check how large the estimate is relative to the company’s current market cap.
If it’s close to 100% of the market cap, for example, it’s likely an overestimate.
The next step is to validate in a more robust way by running a pilot in a city with a few national chain grocery stores.
- The pilot city should be a typical, generalizable city with competitors, rather than a city with no significant competitors.
- Key metrics include revenue per user, number of users, supply-demand balance, and supply utilization. These metrics would be compared with a similar city that doesn’t have a grocery delivery option.”
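The comparison between the pilot city and a similar city without grocery delivery could be analyzed with a difference-in-differences calculation. This is one possible technique rather than the only way to read a pilot, and it assumes the two cities would have followed similar trends without the launch. The metric values and names below are hypothetical.

```python
def difference_in_differences(pilot_before: float, pilot_after: float,
                              control_before: float, control_after: float) -> float:
    """Change in the pilot city minus the change in the comparison city."""
    pilot_change = pilot_after - pilot_before
    control_change = control_after - control_before
    return pilot_change - control_change


# Hypothetical monthly revenue per user (dollars), before and after the pilot launch.
lift = difference_in_differences(
    pilot_before=40.0, pilot_after=52.0,      # city with grocery delivery
    control_before=41.0, control_after=44.0,  # similar city without it
)
print(f"Estimated incremental revenue per user per month: ${lift:.2f}")
```

Subtracting the comparison city's change removes shifts that affected both cities, such as seasonality, so the remainder is a better proxy for the effect of the grocery launch. The same calculation can be repeated for the other pilot metrics, such as number of users or supply utilization.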
Step 6: Discuss second-order effects
Provide recommendations for the next steps or future iterations of the analysis. Discussing second-order effects demonstrates your ability to think holistically, and provides an opportunity to distinguish yourself from other candidates who might skip this step.
In the delivery revenue example, you could say,
“While we chose to focus on new customer acquisition and increasing revenue for existing users, other effects could include:
- Operational complexity: Grocery delivery involves unique challenges compared to food delivery, such as managing perishable goods, handling larger order volumes, and coordinating with multiple suppliers. Expanding into this sector may require significant adjustments to the company's operations and logistics infrastructure.
- Increased competition and effect on the brand: Entering the grocery delivery market exposes the company to competition from established grocery retailers, as well as other food delivery companies that may already offer grocery services. Competing in this crowded market requires differentiation strategies and a focus on providing unique value to customers.
- Synergies and cross-selling opportunities: Expanding into groceries can create synergies with the company's existing food delivery operations. For example, customers who order meals may also be interested in purchasing groceries from the same platform. Leveraging cross-selling opportunities can increase average order value and customer engagement.”
Step 7: Summarize
Clearly communicate your analysis approach, results, and recommendations in a structured and concise manner.
In the delivery revenue example, you could say,
“We discussed how to estimate annualized incremental revenue in the short term, through new customer acquisition and increased existing customer spending, by expanding into the grocery vertical in a specific region. We assumed we would start by partnering with chain grocery retailers. However, this method has several assumptions around competition and customer demand baked into it. Running a pilot is a good way to validate these assumptions.”