SQL DELETE DUPLICATE ROWS

Duplicate rows can often appear in a database due to data entry errors or merging tables. The DELETE statement in SQL can be used to remove these duplicates while keeping one instance of each unique row. This process often involves using ROW_NUMBER(), CTEs (Common Table Expressions), or subqueries.

The general approach involves identifying duplicates based on a subset of columns and then deleting rows with higher row numbers or IDs.

Example

This example removes duplicate rows from the Orders table, keeping only one instance of each product.

Code Example


-- Identify duplicates using ROW_NUMBER()
WITH CTE AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Product ORDER BY Product) AS RowNum
    FROM Orders
)
-- Delete rows where RowNum is greater than 1
DELETE FROM Orders
WHERE OrderID IN (
    SELECT OrderID FROM CTE WHERE RowNum > 1
);

-- View the updated table
SELECT * FROM Orders;
            

Output

Product Category Price
Laptop Electronics 1300.00
Phone Mobile Devices 800.00
Tablet Gadgets 600.00
Smartwatch Gadgets 250.00

Explanation

- The ROW_NUMBER() function assigns a unique row number to each duplicate group.
- Only rows with RowNum > 1 are deleted.
- The updated table contains only unique rows for each product.