Join ML Engineer Interview MasterClass (April Cohort) led by FAANG Data Scientists | Just 6 seats remaining...
ML Engineer MasterClass (April) | 6 seats left
By the end of this lesson, you will:
Your manager at Meta, Jordan, is interested in analyzing post engagement trends over time.
Can you provide cumulative engagement statistics, like running totals of likes and the average engagement per post?
Your task is to use window functions (SUM() OVER(), AVG() OVER(), COUNT() OVER()) to generate engagement trends. Use the meta_posts table to solve the question.
| post_id | user_id | post_content | post_date | likes | shares | post_status |
|---|---|---|---|---|---|---|
| 1 | 1 | “Had a great day at the beach!” | “2024-01-10T14:30:00” | 120 | 45 | “Public” |
| 2 | 2 | “Launching our new product!” | “2024-01-12T09:20:00” | 250 | 90 | “Friends Only” |
| 3 | 3 | “Just finished a 10K run!” | “2024-01-15T18:10:00” | 80 | 30 | “Public” |
| 4 | 4 | “Hosting a webinar next week.” | “2024-01-18T20:45:00” | 50 | 25 | “Private” |
| 5 | 5 | “Traveling to Japan soon!” | “2024-01-20T10:05:00” | 180 | 75 | “Public” |
| 6 | 1 | “Excited for the weekend!” | “2024-01-25T16:15:00” | 95 | 40 | “Friends Only” |
| 7 | 2 | “Need recommendations for a new laptop.” | “2024-01-27T12:40:00” | 70 | 20 | “Public” |
| 8 | 3 | “Learning SQL window functions!” | “2024-01-29T08:55:00” | 110 | 50 | “Public” |
| 9 | 4 | “Working on a new AI model.” | “2024-02-01T14:00:00” | 200 | 100 | “Friends Only” |
| 10 | 5 | “Book recommendations?” | “2024-02-05T11:30:00” | 65 | 15 | “Public” |
Unlike GROUP BY, window functions perform aggregations while preserving individual rows in the result.
| Function | What It Does | Use Case |
|---|---|---|
| SUM() OVER() | Computes a running total over a partition | Cumulative likes per post |
| AVG() OVER() | Calculates a moving average | Average engagement over time |
| COUNT() OVER() | Counts the number of rows in a partition | Running count of posts |
SELECT column1, column2,
function() OVER (PARTITION BY column ORDER BY column) AS result
FROM table_name;PARTITION BY groups data into subsets (optional).ORDER BY defines the ranking within the window.Jordan wants a cumulative sum of likes to track engagement growth.
SELECT post_id, post_date, likes,
SUM(likes) OVER (ORDER BY post_date) AS running_total_likes
FROM meta_posts;| post_id | post_date | likes | running_total_likes |
|---|---|---|---|
| 1 | “2024-01-10T14:30:00” | 120 | 120 |
| 2 | “2024-01-12T09:20:00” | 250 | 370 |
| 3 | “2024-01-15T18:10:00” | 80 | 450 |
| 4 | “2024-01-18T20:45:00” | 50 | 500 |
| 5 | “2024-01-20T10:05:00” | 180 | 680 |
| 6 | “2024-01-25T16:15:00” | 95 | 775 |
| 7 | “2024-01-27T12:40:00” | 70 | 845 |
| 8 | “2024-01-29T08:55:00” | 110 | 955 |
| 9 | “2024-02-01T14:00:00” | 200 | 1155 |
| 10 | “2024-02-05T11:30:00” | 65 | 1220 |
What’s Happening?
SUM(likes) OVER (ORDER BY post_date) adds up the likes progressively, creating a running total.Jordan wants to find the average number of likes per post over time.
SELECT post_id, post_date, likes,
AVG(likes) OVER (ORDER BY post_date) AS avg_likes_over_time
FROM meta_posts;| post_id | post_date | likes | avg_likes_over_time |
|---|---|---|---|
| 1 | “2024-01-10T14:30:00” | 120 | 120 |
| 2 | “2024-01-12T09:20:00” | 250 | 185 |
| 3 | “2024-01-15T18:10:00” | 80 | 150 |
| 4 | “2024-01-18T20:45:00” | 50 | 125 |
| 5 | “2024-01-20T10:05:00” | 180 | 136 |
| 6 | “2024-01-25T16:15:00” | 95 | 129.16666666666666 |
| 7 | “2024-01-27T12:40:00” | 70 | 120.71428571428571 |
| 8 | “2024-01-29T08:55:00” | 110 | 119.375 |
| 9 | “2024-02-01T14:00:00” | 200 | 128.33333333333334 |
| 10 | “2024-02-05T11:30:00” | 65 | 122 |
What’s Happening?
AVG(likes) OVER (ORDER BY post_date) calculates a moving average of likes per post.Jordan wants to track how many posts have been published over time.
SELECT post_id, post_date,
COUNT(*) OVER (ORDER BY post_date) AS running_post_count
FROM meta_posts;| post_id | post_date | running_post_count |
|---|---|---|
| 1 | “2024-01-10T14:30:00” | 1 |
| 2 | “2024-01-12T09:20:00” | 2 |
| 3 | “2024-01-15T18:10:00” | 3 |
| 4 | “2024-01-18T20:45:00” | 4 |
| 5 | “2024-01-20T10:05:00” | 5 |
| 6 | “2024-01-25T16:15:00” | 6 |
| 7 | “2024-01-27T12:40:00” | 7 |
| 8 | “2024-01-29T08:55:00” | 8 |
| 9 | “2024-02-01T14:00:00” | 9 |
| 10 | “2024-02-05T11:30:00” | 10 |
What’s Happening?
COUNT(*) OVER (ORDER BY post_date) generates a running count of posts over time.Jordan wants to calculate a running total of likes to track engagement growth.
Filter condition:
| post_id | post_date | likes | running_total_likes |
|---|---|---|---|
| 1 | “2024-01-10T14:30:00” | 120 | 120 |
| 2 | “2024-01-12T09:20:00” | 250 | 370 |
| 3 | “2024-01-15T18:10:00” | 80 | 450 |
| 4 | “2024-01-18T20:45:00” | 50 | 500 |
| 5 | “2024-01-20T10:05:00” | 180 | 680 |
| 6 | “2024-01-25T16:15:00” | 95 | 775 |
| 7 | “2024-01-27T12:40:00” | 70 | 845 |
| 8 | “2024-01-29T08:55:00” | 110 | 955 |
| 9 | “2024-02-01T14:00:00” | 200 | 1155 |
| 10 | “2024-02-05T11:30:00” | 65 | 1220 |
Write an SQL query to return the requested data.
Jordan also wants to track how many posts have been published over time.
Filter condition:
| post_id | post_date | running_post_count |
|---|---|---|
| 1 | “2024-01-10T14:30:00” | 1 |
| 2 | “2024-01-12T09:20:00” | 2 |
| 3 | “2024-01-15T18:10:00” | 3 |
| 4 | “2024-01-18T20:45:00” | 4 |
| 5 | “2024-01-20T10:05:00” | 5 |
| 6 | “2024-01-25T16:15:00” | 6 |
| 7 | “2024-01-27T12:40:00” | 7 |
| 8 | “2024-01-29T08:55:00” | 8 |
| 9 | “2024-02-01T14:00:00” | 9 |
| 10 | “2024-02-05T11:30:00” | 10 |
Write an SQL query to return the requested data.