TV Show Watch Time
Disney+ is a streaming platform with multiple shows and millions of subscribers. The company wishes to identify their star customers, people who are using the platform more and more over time.
You are given the table watch_time
Write an SQL query to identify the viewers who have received a month-over-month increase in watch time of at least 3 months. In other words, you're looking for viewers who have consistently increased their watch time for a minimum of 3 consecutive months.
Desired output:
Step 1: Preparing the Data
We set up a temporary table named lagged_watch_time using WITH. This table helps us organize data, containing viewer IDs, year, month, and watch hours. We also use the LAG function twice to fetch watch hours from the previous two months for each viewer.
Step 2: Analyzing Changes With the helper table, we compare current watch hours with those from the last two months. This helps us see if viewers are increasing their watch time consistently.
Step 3: Identifying Consistent Increases We filter viewers who have increased their watch time over the last two months compared to the previous month. This indicates a consistent rise in watch time.
Step 4: Selecting Distinct Results
To avoid repetition, we select only unique viewer IDs using SELECT DISTINCT.
WITH lagged_watch_time AS (
SELECT
viewer_id,
year,
month,
watch_hours,
LAG(watch_hours, 1) OVER(PARTITION BY viewer_id ORDER BY year, month) AS prev_watch_hours,
LAG(watch_hours, 2) OVER(PARTITION BY viewer_id ORDER BY year, month) AS prev_prev_watch_hours
FROM watch_time
)
SELECT DISTINCT viewer_id
FROM lagged_watch_time
WHERE watch_hours > prev_watch_hours AND prev_watch_hours > prev_prev_watch_hours;
WITH previous AS(SELECT viewer_id, watch_hours, LAG(watch_hours) OVER(PARTITION BY viewer_id ORDER BY year, month) AS previous_hours, year, month FROM watch_time GROUP BY viewer_id, year, month ), streaks AS(SELECT viewer_id, SUM(CASE WHEN previous_hours IS NOT NULL AND previous_hours < watch_hours THEN 1 END) AS streak FROM previous GROUP BY viewer_id ) SELECT DISTINCT viewer_id FROM streaks WHERE streak >= 3