SQL

Identify and Remove Duplicate Rows While Preserving One

Find and delete duplicate records from a table while keeping exactly one row per duplicate group, chosen by a criterion such as the minimum ID.
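Before deleting anything, it is usually worth checking which groups are actually duplicated. The query below is a minimal sketch of that identification step, using the same placeholder names as the rest of this snippet (`your_table`, `column1`, `column2`, `id`). The aliases `copies` and `keep_id` are illustrative only.

```sql
-- List each duplicated (column1, column2) pair, how many rows share it,
-- and the id of the row that would be kept.
SELECT column1,
       column2,
       COUNT(*) AS copies,
       MIN(id)  AS keep_id
FROM your_table
GROUP BY column1, column2
HAVING COUNT(*) > 1
ORDER BY column1, column2;
```

If this returns no rows, there is nothing to delete.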

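-- MySQL-style multi-table DELETE: join each row to its group's smallest id.
DELETE T1 FROM your_table T1
JOIN (
    -- One row per duplicated (column1, column2) pair, with the id to keep.
    SELECT column1, column2, MIN(id) AS min_id
    FROM your_table
    GROUP BY column1, column2
    HAVING COUNT(*) > 1
) AS T2
ON T1.column1 = T2.column1
AND T1.column2 = T2.column2
AND T1.id > T2.min_id;  -- matches every row in the group except the one kept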

-- PostgreSQL has no DELETE ... JOIN; there, or in SQL Server, you might use an equivalent subquery structure:
-- DELETE FROM your_table
-- WHERE id IN (
--     SELECT t_inner.id
--     FROM your_table t_inner
--     JOIN (
--         SELECT column1, column2, MIN(id) as min_id
--         FROM your_table
--         GROUP BY column1, column2
--         HAVING COUNT(*) > 1
--     ) AS t_min
--     ON t_inner.column1 = t_min.column1
--     AND t_inner.column2 = t_min.column2
--     AND t_inner.id > t_min.min_id
-- );

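How it works: This snippet removes duplicate rows from a table, where a duplicate is any row that shares its (`column1`, `column2`) values with another row, and keeps the row with the smallest `id` in each group. The subquery computes `MIN(id)` for every (`column1`, `column2`) group that contains more than one row. The `DELETE` then joins the table to that result and removes every row whose `id` is greater than its group's `min_id`. The surviving row is the one with the lowest `id`, which is the earliest-inserted row only if ids are assigned in increasing order. Because the join compares columns with `=`, rows where `column1` or `column2` is `NULL` never match and are left untouched, even if they are duplicates of each other. This approach avoids window functions such as `ROW_NUMBER()`, so it also runs on older engines that lack them (for example, MySQL before 8.0). The `DELETE ... JOIN` syntax itself is not portable, which is why the commented-out subquery variant is included.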
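To make the behavior concrete, here is a small worked example. The table definition and the five sample rows are hypothetical and exist only for illustration. The `SELECT` reuses the join from the `DELETE` as a dry run, so you can see which rows would be removed before removing them.

```sql
-- Hypothetical sample table and data (not part of the original snippet).
CREATE TABLE your_table (
    id      INTEGER PRIMARY KEY,
    column1 VARCHAR(20),
    column2 VARCHAR(20)
);

INSERT INTO your_table (id, column1, column2) VALUES
    (1, 'a', 'x'),
    (2, 'a', 'x'),   -- duplicate of id 1
    (3, 'b', 'y'),
    (4, 'a', 'x'),   -- duplicate of id 1
    (5, 'b', 'z');

-- Dry run: same join conditions as the DELETE, expressed as a SELECT.
SELECT T1.id, T1.column1, T1.column2
FROM your_table T1
JOIN (
    SELECT column1, column2, MIN(id) AS min_id
    FROM your_table
    GROUP BY column1, column2
    HAVING COUNT(*) > 1
) AS T2
ON T1.column1 = T2.column1
AND T1.column2 = T2.column2
AND T1.id > T2.min_id
ORDER BY T1.id;
-- Returns the rows with id 2 and id 4.
```

Only ('a', 'x') appears more than once, and its smallest `id` is 1, so running the `DELETE` on this data would remove ids 2 and 4 and leave ids 1, 3, and 5. Running the dry run inside a transaction, or against a copy of the table, is a reasonable precaution on real data.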
