SQL
Identify and Remove Duplicate Rows Keeping One Instance
Learn to clean your database by identifying duplicate records based on specific columns and efficiently removing them, ensuring data integrity by retaining a single unique entry for each set.
DELETE FROM your_table
WHERE id IN (
SELECT id FROM (
SELECT
id,
ROW_NUMBER() OVER(PARTITION BY column1, column2 ORDER BY id) as rn
FROM
your_table
) AS subquery
WHERE rn > 1
);
How it works: This SQL query removes duplicate rows from `your_table` while keeping one instance of each unique combination of `column1` and `column2`. It uses a subquery with `ROW_NUMBER()` partitioned by `column1` and `column2`. `ROW_NUMBER()` assigns a sequential number to each row within these partitions. Any row with `rn > 1` is a duplicate. The outer `DELETE` statement then targets and removes these identified duplicate `id`s. Remember to replace `your_table`, `column1`, `column2`, and `id` with your actual table and column names.