Last Chance to Join Data Science Interview MasterClass (this week) 🚀 | Just 2 spots remaining...

FREE

COALESCE

Course Author Daniel Lee
Instructor

Dan Lee

Learn SQL for free with interactive exercises designed to help aspiring Data Engineers, Data Analysts, and Data Scientists. Learn the basics and advanced SQL concepts by solving real-world problems from MANGO (Meta, Amazon, Netflix, Google, OpenAI)

Lesson Objectives

By the end of this lesson, you will:

  • Learn how to use COALESCE() to replace NULL values with defaults
  • Use NULLIF() to handle cases where two values are equal
  • Write SQL queries to manage missing data in Amazon orders and customer transactions

🎬 Scenario: Managing Missing Data in Amazon Orders

Your manager at Amazon, Sophia, noticed that some orders are missing delivery dates and some total spending values are NULL.

Can you ensure that missing delivery dates display ‘Not Delivered’ and that NULL spending amounts default to 0?

Your task is to use COALESCE() to replace missing values and NULLIF() to clean up inconsistent data. Use the amazon_orders and amazon_customers tables to address the questions.

amazon_orders table:

order_idcustomer_idproduct_idorder_datequantitytotal_pricestatusdelivery_date
112“2024-01-10T14:30:00”159.99“Delivered”“2024-01-15T00:00:00”
223“2024-01-12T09:20:00”239.98“Shipped”“NaT”
335“2024-01-15T18:10:00”139.99“Pending”“NaT”
441“2024-01-18T20:45:00”377.97“Delivered”“2024-01-22T00:00:00”
554“2024-01-20T10:05:00”1699.99“Canceled”“NaT”
611“2024-01-25T16:15:00”251.98“Shipped”“NaT”
724“2024-01-27T12:40:00”1699.99“Pending”“NaT”
832“2024-01-29T08:55:00”2119.98“Delivered”“2024-02-02T00:00:00”
945“2024-02-01T14:00:00”139.99“Shipped”“NaT”
1053“2024-02-05T11:30:00”359.97“Pending”“NaT”

amazon_customers table:

customer_idnameemailcountrysignup_datemembership_statustotal_spent
1“Alice Johnson”“alice@amazon.com”“USA”“2020-06-15T00:00:00”“Prime”5000
2“Bob Smith”“bob@amazon.com”“Canada”“2019-04-10T00:00:00”“Regular”1200.5
3“Charlie Lee”“charlie@amazon.com”“UK”“2021-02-22T00:00:00”“Business”8500.75
4“David Kim”“david@amazon.com”“Germany”“2022-09-30T00:00:00”“Prime”3000
5“Emma Garcia”“emma@amazon.com”“Mexico”“2023-01-12T00:00:00”“Regular”400.2

1. Using COALESCE() to Replace NULLs

The COALESCE() function returns the first non-NULL value from a list of arguments.

Why use COALESCE()?
Provide default values for NULLs
Prevent NULLs from breaking calculations
Ensure cleaner reports

Syntax

1SELECT COALESCE(column_name, default_value) FROM table_name;
  • If column_name is NULL, COALESCE() returns default_value.
  • If column_name is not NULL, the original value is returned.

Example Query: Handling Missing Delivery Dates

1SELECT order_id, customer_id,  
2       COALESCE(CAST(delivery_date AS VARCHAR), 'Not Delivered') AS formatted_delivery_date  
3FROM amazon_orders  
4ORDER BY order_id;
order_idcustomer_idformatted_delivery_date
112024-01-15
22Not Delivered
33Not Delivered

What’s Happening?

  • COALESCE(delivery_date, 'Not Delivered') replaces NULL delivery dates with "Not Delivered".

⚠️ Important Note:

  • All values passed to COALESCE() must be of the same data type.
  • If total_spent is a DECIMAL or FLOAT, ensure the replacement value (0) is also of the same type (e.g., 0.00 instead of 0 in some databases).
  • Mismatched types can cause errors in some SQL engines.

2. Using COALESCE() to Replace NULL Financial Values

Sophia wants to ensure that all missing total_spent values are displayed as $0 instead of NULL. This is important for financial reports where missing values should be treated as zero rather than unknown.

Example Query: Handling NULL Customer Spending

1SELECT customer_id, name,  
2       COALESCE(total_spent, 0) AS adjusted_total_spent  
3FROM amazon_customers  
4ORDER BY customer_id;
customer_idnameadjusted_total_spent
1Alice Johnson5000.00
2Bob Smith1200.50
3Charlie Lee8500.75
4David Kim3000.00
5Emma Garcia400.20

What’s Happening?

  • COALESCE(total_spent, 0) replaces any NULL values with 0.

3. Using NULLIF() to Handle Inconsistent Data

The NULLIF() function returns NULL if two values are equal, otherwise it returns the first value.

Why use NULLIF()?
Handle division errors (e.g., dividing by zero)
Convert “invalid” data into NULLs
Normalize inconsistent data

Syntax

1SELECT NULLIF(value1, value2) FROM table_name;
  • If value1 = value2, NULL is returned.
  • Otherwise, value1 is returned.

Example Query: Handling Zero Total Spending

1SELECT customer_id, name,  
2       NULLIF(total_spent, 0) AS valid_total_spent  
3FROM amazon_customers  
4ORDER BY customer_id;
customer_idnamevalid_total_spent
1Alice Johnson5000.00
2Bob Smith1200.50
3Charlie Lee8500.75
4David Kim3000.00
5Emma GarciaNULL

What’s Happening?

  • NULLIF(total_spent, 0) replaces 0 with NULL to avoid including customers who haven’t spent anything.

✍️ SQL Exercises

Exercise 1: Replacing NULL Delivery Dates

Sophia needs a report that ensures missing delivery dates display ‘Not Delivered’.

Filter condition:

  • Retrieve order ID, customer ID, and formatted delivery date.
  • Replace NULL delivery dates with "Not Delivered".
  • Sort by order_id in ascending order.
order_idcustomer_idformatted_delivery_date
112024-01-15
22Not Delivered
33Not Delivered

Write an SQL query to return the requested data.

Upgrade to Pro

Exercise 2: Handling Zero Total Spending

Sophia also wants to replace zero total spending values with NULL to differentiate customers who haven’t made purchases.

Filter condition:

  • Retrieve customer ID, name, and valid total spent.
  • Replace 0 values with NULL.
  • Sort by customer_id in ascending order.
customer_idnamevalid_total_spent
1“Alice Johnson”5000
2“Bob Smith”1200.5
3“Charlie Lee”8500.75
4“David Kim”3000
5“Emma Garcia”400.2

Write an SQL query to return the requested data.

Upgrade to Pro
👉 Found this lesson helpful?