In this mock interview, George Perantatos (Product Director, Redfin) answers a product-execution interview question.
The question is, "Increase customer retention for Disney+ or Netflix."
He's being interviewed by Chris Wilson (Senior+ Product Manager).
What separated George's approach wasn't technical knowledge about p-values or sample sizes. It was his judgment about when experiments matter and when they don't.
Here's what he did right, and what you can learn from it.
The interviewer's verdict: "I would love to have George on my team."
Most candidates treat experimentation as the default answer to every product decision.
George established a more nuanced framework:
"I would not use experiments all the time. As a product manager, you need to have conviction in what you're doing and where you're heading.
Some things just don't lend themselves well to experimentation. It is not a hammer and nail where everything has to be hit by this hammer called experiments."
Use experiments when: a specific change to an existing experience can be measured against a metric you expect it to move.
Don't use experiments when: the work is early-stage and exploratory, and talking to users will teach you more than statistics.
"A new zero-to-one exploratory AI-based chat experience? That matters less for A/B testing and measurement.
What matters more is what users are doing. Let's learn. It's very early stage, it's speculative. The value is in talking to people, not in looking at statistics."
"PMs are often not grounded on the problem they're trying to solve. They jump to solutions and say, 'Oh, I can use experiments to test things and see what happens.'
Without a good grounding on what problem we're really trying to solve, it just becomes a bit of a dart game."
This manifests in two ways:
"Teams generate ideas in planning meetings and test whatever ideas come up. They move elements around, change colors, try different copy, etc.
But they're not solving a specific user problem.
When an experiment shows mixed results, they don't know what to do because they never articulated what success would mean."
"Sometimes we have divergent results. Some metric is going up and some metric is going down.
PMs don't feel like they can use intuition and judgment in combination with data to make a call, and they feel frozen."
Most candidates would immediately start defining control groups and metrics.
George did something different.
"Let's talk about user problems. I would imagine your scenario includes some problem where people are not finding what they're looking for in this service. They don't feel like it's for them, and they're not coming back."
George re-framed the exercise before accepting the hypothesis.
There's a business problem (retention) and a user problem (not finding content). The recommendation section might address that, but only if you understand what's actually broken.
"I would expect we'd want to validate first. Is something up with Recommended For You? Are a lot of people seeing it and not using it?
Or is it under-exposed? Meaning, not a lot of people are seeing this feature but it's very popular when they do."
Before designing any test, understand the current state:
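Two numbers usually answer that question: how many people see the feature, and how often they use it when they do. Here's a minimal sketch of computing both, assuming a hypothetical event log whose column and event names are invented for illustration (not an actual Netflix or Disney+ schema):

```python
# Hypothetical diagnostic for "Recommended For You": is the row under-exposed,
# or widely seen but rarely used? The event log below is invented for illustration.
import pandas as pd

events = pd.DataFrame({
    "session_id": [1, 1, 2, 3, 3, 4],
    "event": ["row_shown", "row_clicked", "home_load",
              "row_shown", "row_clicked", "row_shown"],
})

total_sessions = events["session_id"].nunique()
exposed = events.loc[events["event"] == "row_shown", "session_id"].nunique()
clicked = events.loc[events["event"] == "row_clicked", "session_id"].nunique()

exposure_rate = exposed / total_sessions   # how many sessions ever see the row?
ctr_when_seen = clicked / exposed          # when it's seen, is it used?

print(f"Exposure rate: {exposure_rate:.0%}, CTR when seen: {ctr_when_seen:.0%}")
```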
George made his assumptions explicit:
"Let's assume Recommended For You is pretty visible in the app. It's on every first-time load, but we believe the recommendations just don't feel very up-to-date. They don't feel very relevant."
"The next question is why? Why is it not being used as much? The recommendations could be bad quality. Or it's not very discoverable. Or it's a UI problem."
George chose one: the quality hypothesis.
"If we had one additional piece of information to feed into the recommendation model, it would spit out better recommendations, which means more clicks, which means more people finding things to view."
George's specific example: adding watch duration data to the model. Not just which shows users watch, but how long they watch them. One input change. One model improvement.
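As a hedged illustration of what "one input change" might look like in practice, the sketch below derives a single new feature (how much of each title a user actually finishes) from raw watch data. The table and column names are assumptions made up for the example, not details from the interview:

```python
# Hypothetical feature engineering: turn raw watch durations into one new
# recommendation-model input. All names below are illustrative assumptions.
import pandas as pd

watch_log = pd.DataFrame({
    "user_id":         [1, 1, 2],
    "title_id":        ["a", "b", "a"],
    "minutes_watched": [58, 5, 20],
    "title_length":    [60, 45, 60],
})

# The single new signal: did the user finish what they started?
watch_log["completion_ratio"] = (
    watch_log["minutes_watched"] / watch_log["title_length"]
).clip(upper=1.0)

# One added column per (user, title) pair is the entire model change.
features = watch_log.groupby(["user_id", "title_id"], as_index=False)[
    "completion_ratio"
].mean()
print(features)
```

Keeping the change to one column is the point: if recommendation clicks move in the experiment, you know which input moved them.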
"Don't test too many things at once because it makes the experiment unclear. If you combine too many changes in one variant, what actually changed? What drove the change? You can stack learnings more quickly versus doing one giant experiment."
What not to do:
"People want to test two or three different signals in the ML model. I'm like 'no, let's test one. What's the most promising one?' Or 'what if we do a UI test and an ML model change?' No, no, let's keep those separate."
What to do:
"What is the smallest and most valuable learning we can have? What's the biggest ROI thing we can do that we think will drive movement in the metric? Let's test one thing. If that's a dud, we can take a different route."
The typical approach: "We're testing retention, so we'll measure weekly active users and subscription churn."
George's approach:
"I would love to know if retention is affected. But I would first want to say—that's an output metric. What's the input? The input is: are people clicking on Recommended For You? Are people scrolling? Are people viewing?"
His reasoning:
"I would look at the actual engagement metrics within the UI that we're changing and see if we are actually improving what we believe are signals that people are more engaged. That should correlate to more daily active users. It should correlate to lower churn."
Why this matters:
Engagement with the surface you're actually changing is the leading indicator. It moves within the experiment window, while output metrics like retention and churn are downstream and slow to read. If the input metrics don't move, the outputs won't either.
George also acknowledged when to slow down:
"If a problem is very novel or we're very uncertain, I would want to at least see for my own self—do these feel better? It's not worth running an experiment if you're way off. It's cheaper to test with a prototype with users than to have engineers build it and ship it as an experiment only to realize we were so far off the mark."
The interviewer, Chris, explained what made George stand out:
After the interview, George reflected on what he could have done better:
"It's hard to do in the moment, but maybe being a little bit more structured in the scenario would be good. My tendency is just to dive in and figure it out. I could have paused and said 'I'm going to first talk about the user problem, then I'm going to talk about potential solutions and how we'll measure them.'"
The benefit of signposting:
"It would have given Chris a sense of what I'm about to say for the next few minutes. In case I'm off with what he's looking for, he can redirect. He's also thinking about how he's going to grade me, maybe he has a follow-up question."
Chris agreed:
"Pausing, even writing stuff down, shows you're thinking through it. It also gives the interviewer a breather to think about next questions. You've led them down a path with your answers, so it gives them a second to breathe."
George's final insight about what separates good from great in these interviews:
"Don't forget about those higher-level skills. Even if it is a deep 'what's the hypothesis, show me a dashboard' type of question—take a few moments to express the higher level. Should we even run this experiment? How do I decide what's in my roadmap to experiment on? It shows you have higher-level thinking. It really rounds out a candidate. It could be a tiebreaker between two candidates equally strong on the technical side."
Anyone can learn statistical significance and sample size calculations. Those are table stakes.
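For reference, the table-stakes part really is mechanical. Here is a standard two-proportion sample-size calculation; the baseline click rate and the lift worth detecting are illustrative assumptions, not numbers from the interview:

```python
# Sample size per variant for detecting a lift in a proportion metric
# (two-sided alpha = 0.05, 80% power). Baseline and lift are illustrative.
from scipy.stats import norm

p1, p2 = 0.090, 0.095          # baseline CTR and the smallest lift worth detecting
alpha, power = 0.05, 0.80

z_alpha = norm.ppf(1 - alpha / 2)
z_beta = norm.ppf(power)

n_per_group = (
    (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
) / (p1 - p2) ** 2
print(f"~{n_per_group:,.0f} users per variant")
```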
The skills that distinguish senior leadership are knowing when an experiment is worth running at all, grounding every test in a specific user problem, isolating one change at a time and reading the right input metrics, and combining data with judgment to make a call when results are mixed.
Before the interview: practice stating your structure out loud, so signposting comes naturally under pressure.
During setup questions: clarify the business problem and the user problem before accepting any hypothesis.
During scenario questions: make your assumptions explicit and tell the interviewer what you'll cover next.
When discussing experiment design: pick one change, name the input metrics you'll read, and say when you wouldn't run the experiment at all.
The interviewer isn't evaluating whether you can execute experiments flawlessly. They're evaluating your judgment about when to use them and how they fit into building products.
Show that you can think strategically, not just tactically.
Create your free Exponent account and learn how to ace your interviews.