Last Chance to Join Data Science Interview MasterClass (this week) 🚀 | Just 2 spots remaining...

FREE

SQL BETWEEN and PRECEDING

Course Author Daniel Lee
Instructor

Dan Lee

Learn SQL for free with interactive exercises designed to help aspiring Data Engineers, Data Analysts, and Data Scientists. Learn the basics and advanced SQL concepts by solving real-world problems from MANGO (Meta, Amazon, Netflix, Google, OpenAI)

Lesson Objectives

By the end of this lesson, you will:

  • Learn how to use ROWS BETWEEN to define custom window ranges
  • Understand the difference between PRECEDING and FOLLOWING
  • Apply moving averages, running totals, and sliding windows using advanced window functions
  • Write SQL queries to analyze engagement trends on Meta posts

🎬 Scenario: Analyzing Engagement Patterns Over Time

Your manager at Meta, Jordan, wants to analyze engagement trends with rolling averages and cumulative sums.

Can you calculate engagement trends using different time windows? We need insights into how post likes evolve over time?

Your task is to use ROWS BETWEEN and PRECEDING to control window function behavior for rolling analytics.

Use the meta_posts table in this lesson:

post_iduser_idpost_contentpost_datelikessharespost_status
11“Had a great day at the beach!”“2024-01-10T14:30:00”12045“Public”
22“Launching our new product!”“2024-01-12T09:20:00”25090“Friends Only”
33“Just finished a 10K run!”“2024-01-15T18:10:00”8030“Public”
44“Hosting a webinar next week.”“2024-01-18T20:45:00”5025“Private”
55“Traveling to Japan soon!”“2024-01-20T10:05:00”18075“Public”
61“Excited for the weekend!”“2024-01-25T16:15:00”9540“Friends Only”
72“Need recommendations for a new laptop.”“2024-01-27T12:40:00”7020“Public”
83“Learning SQL window functions!”“2024-01-29T08:55:00”11050“Public”
94“Working on a new AI model.”“2024-02-01T14:00:00”200100“Friends Only”
105“Book recommendations?”“2024-02-05T11:30:00”6515“Public”

1. Understanding ROWS BETWEEN in SQL

The default window function behavior considers all previous rows up to the current row. However, using ROWS BETWEEN, we can:

  • Limit the number of previous rows used in calculations.
  • Include future rows for centered moving averages.

Common ROWS BETWEEN Configurations

ClauseDescriptionUse Case
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROWDefault cumulative sumRunning totals
ROWS BETWEEN X PRECEDING AND CURRENT ROWLimits to the last X rowsMoving averages
ROWS BETWEEN X PRECEDING AND X FOLLOWINGAverages over a balanced windowCentered moving averages
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWINGConsiders all rowsFull window analysis

2. Using ROWS BETWEEN for Running Totals

Jordan wants to track how engagement accumulates over time for each user’s posts.

Example Query: Running Total of Likes Per User

1SELECT user_id, post_id, post_date, likes,  
2       SUM(likes) OVER (PARTITION BY user_id ORDER BY post_date  
3                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total_likes  
4FROM meta_posts;
user_idpost_idpost_datelikesrunning_total_likes
33“2024-01-15T18:10:00”8080
38“2024-01-29T08:55:00”110190
44“2024-01-18T20:45:00”5050
49“2024-02-01T14:00:00”200250
11“2024-01-10T14:30:00”120120
16“2024-01-25T16:15:00”95215
22“2024-01-12T09:20:00”250250
27“2024-01-27T12:40:00”70320
55“2024-01-20T10:05:00”180180
510“2024-02-05T11:30:00”65245

What’s Happening?

  • ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW adds up likes progressively.
  • Each row carries forward the cumulative sum for each user.

3. Using ROWS BETWEEN X PRECEDING AND CURRENT ROW for Moving Averages

Jordan wants to calculate a 3-post moving average of likes, considering only the last 3 posts.

Example Query: 3-Post Moving Average of Likes

1SELECT user_id, post_id, post_date, likes,  
2       AVG(likes) OVER (PARTITION BY user_id ORDER BY post_date  
3                        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_likes  
4FROM meta_posts;
user_idpost_idpost_datelikesmoving_avg_likes
33“2024-01-15T18:10:00”8080
38“2024-01-29T08:55:00”11095
44“2024-01-18T20:45:00”5050
49“2024-02-01T14:00:00”200125
11“2024-01-10T14:30:00”120120
16“2024-01-25T16:15:00”95107.5
22“2024-01-12T09:20:00”250250
27“2024-01-27T12:40:00”70160
55“2024-01-20T10:05:00”180180
510“2024-02-05T11:30:00”65122.5

What’s Happening?

  • ROWS BETWEEN 2 PRECEDING AND CURRENT ROW considers only the last 2 posts + the current post.
  • This prevents distortion from much older posts and provides a rolling average.

4. Using ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING for Centered Moving Averages

Jordan wants a balanced moving average that considers the previous, current, and next post.

Example Query: Centered Moving Average of Likes

1SELECT user_id, post_id, post_date, likes,  
2       AVG(likes) OVER (PARTITION BY user_id ORDER BY post_date  
3                        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS centered_avg_likes  
4FROM meta_posts;
user_idpost_idpost_datelikescentered_avg_likes
33“2024-01-15T18:10:00”8095
38“2024-01-29T08:55:00”11095
44“2024-01-18T20:45:00”50125
49“2024-02-01T14:00:00”200125
11“2024-01-10T14:30:00”120107.5
16“2024-01-25T16:15:00”95107.5
22“2024-01-12T09:20:00”250160
27“2024-01-27T12:40:00”70160
55“2024-01-20T10:05:00”180122.5
510“2024-02-05T11:30:00”65122.5

What’s Happening?

  • ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING balances the window by including the next row.
  • This smooths out variations by considering surrounding values rather than past data alone.

✍️ SQL Exercises

Exercise 1: Running Total of Likes

Jordan wants to calculate a running total of likes for each user.

Filter condition:

  • Retrieve user ID, post ID, post date, likes, and running total of likes.
  • Sort by user_id and post_date in ascending order.
user_idpost_idpost_datelikesrunning_total_likes
33“2024-01-15T18:10:00”8080
38“2024-01-29T08:55:00”110190
44“2024-01-18T20:45:00”5050
49“2024-02-01T14:00:00”200250
11“2024-01-10T14:30:00”120120
16“2024-01-25T16:15:00”95215
22“2024-01-12T09:20:00”250250
27“2024-01-27T12:40:00”70320
55“2024-01-20T10:05:00”180180
510“2024-02-05T11:30:00”65245

Write an SQL query to return the requested data.

Upgrade to Pro

Exercise 2: Centered Moving Average of Likes

Jordan also wants to calculate a 3-post centered moving average of likes per user.

Filter condition:

  • Retrieve user ID, post ID, post date, likes, and centered moving average.
  • Sort by user_id and post_date in ascending order.
user_idpost_idpost_datelikescentered_avg_likes
33“2024-01-15T18:10:00”8095
38“2024-01-29T08:55:00”11095
44“2024-01-18T20:45:00”50125
49“2024-02-01T14:00:00”200125
11“2024-01-10T14:30:00”120107.5
16“2024-01-25T16:15:00”95107.5
22“2024-01-12T09:20:00”250160
27“2024-01-27T12:40:00”70160
55“2024-01-20T10:05:00”180122.5
510“2024-02-05T11:30:00”65122.5

Write an SQL query to return the requested data.

Upgrade to Pro
👉 Found this lesson helpful?