High Volume Low Success
You are given the following tables:
post table:
post_user table
Write a SQL query to isolate users who post above the overall average total posts but also have a successful post rate below the overall average.
Your output should include the following columns: user_id, post_success (no. of successful posts), post_attempt (no. of posts), post_success_rate. Order by decreasing success rate.
First, we utilize a Common Table Expression (CTE) called agg_metrics to compute two crucial metrics: the average post attempt and the average success rate across all users. Within this CTE, we nest a subquery that groups the posts by user and calculates the sum of successful posts and the total number of posts for each user. Then, we derive the overall average post attempt and success rate based on these aggregated values.
Moving on to the main query, we select each user's total successful posts, total post attempts, and success rate. We group the posts by user and compute these metrics. Subsequently, we filter out users whose post attempts exceed the average post attempt and whose success rate falls below the average success rate determined in the agg_metrics CTE. Finally, we order the results by decreasing success rate to identify users with the lowest success rates first.
WITH agg_metrics AS (
SELECT
AVG(post_attempt) AS avg_posting,
AVG(post_success * 1.0 / post_attempt) AS avg_success_rate
FROM (
SELECT
p.user_id,
SUM(p.is_successful_post) AS post_success,
COUNT(p.is_successful_post) AS post_attempt
FROM post AS p
GROUP BY 1
) t1
)
SELECT
p.user_id,
SUM(p.is_successful_post) AS post_success,
COUNT(p.is_successful_post) AS post_attempt,
SUM(p.is_successful_post) * 1.0 / COUNT(p.is_successful_post) AS post_success_rate
FROM post AS p
GROUP BY 1
HAVING (post_attempt >= (SELECT avg_posting FROM agg_metrics))
AND (post_success_rate <= (SELECT avg_success_rate FROM agg_metrics))
ORDER BY post_success_rate DESC
In the question it says: "above the overall average total posts", which to me implying a >, yet in the solution it uses >=
Caused me 1 hr to find out. plz fix