SQL

Identify and Remove Duplicate Rows (Keeping One)

Learn how to find duplicate records based on specific columns and then delete all but one occurrence of each, restoring data integrity in your database.

-- 1. Identify duplicates:
SELECT
    column1, column2, COUNT(*)
FROM
    your_table
GROUP BY
    column1, column2
HAVING
    COUNT(*) > 1;

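-- 2. Delete duplicates, keeping the row with the minimum id in each group.
--    Assumes id is a unique, non-NULL key. The derived table (keepers) is needed
--    in MySQL, which rejects a subquery that reads directly from the table being
--    deleted from (error 1093); PostgreSQL and SQL Server accept this form too.
DELETE FROM
    your_table
WHERE
    id NOT IN (
        SELECT
            keep_id
        FROM (
            SELECT
                MIN(id) AS keep_id
            FROM
                your_table
            GROUP BY
                column1, column2
        ) AS keepers
    );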

How it works: This snippet takes a two-step approach to duplicate records. The first query lists each combination of `column1` and `column2` that appears more than once, along with its row count. The second query deletes every row whose `id` is not the minimum for its combination, so exactly one row per combination (the one with the smallest `id`) survives. The delete is permanent, so run it inside a transaction or against a copy of the table first. This is useful for restoring uniqueness after data imports or during clean-up operations.
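
Before running the delete, it can help to see exactly which rows it would remove. The query below is a minimal sketch of such a preview, not part of the original snippet. It assumes `id` is a unique, non-NULL key and that your database supports window functions (PostgreSQL, SQL Server, MySQL 8.0 or later). The alias `ranked` and the column `rn` are names chosen here for illustration.

```sql
-- Preview only: lists the rows the DELETE would remove, without changing any data.
-- Assumes id is a unique, non-NULL key and that window functions are available.
SELECT
    id, column1, column2
FROM (
    SELECT
        id, column1, column2,
        -- Number the rows in each (column1, column2) group, smallest id first.
        ROW_NUMBER() OVER (
            PARTITION BY column1, column2
            ORDER BY id
        ) AS rn
    FROM
        your_table
) AS ranked
WHERE
    rn > 1;  -- rn = 1 is the row with the minimum id in its group, which is kept
```

If the rows returned are the ones you expect to lose, the delete should remove that same set, since both queries keep the smallest `id` per combination.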
