Identify and Remove Duplicate Records in SQL

Learn essential SQL techniques to find duplicate rows based on specific columns and safely delete redundant entries while preserving a unique record.

-- 1. Find duplicate records (e.g., by email and name)
SELECT email, name, COUNT(*) AS duplicate_count
FROM customers
GROUP BY email, name
HAVING COUNT(*) > 1;

-- 2. Delete duplicate records, keeping the one with the minimum ID
DELETE FROM customers
WHERE id NOT IN (
    SELECT MIN_ID FROM (
        SELECT MIN(id) AS MIN_ID
        FROM customers
        GROUP BY email, name
    ) AS temp
);

How it works: This snippet manages duplicate records in two steps. The first query groups rows on the columns that define a duplicate (here, email and name) and keeps only the groups that contain more than one row. The second statement deletes every row whose id is not the smallest id in its group, so exactly one row per group survives. Swapping MIN(id) for MAX(id) keeps the row with the largest id instead; this corresponds to the oldest or newest entry only if ids are assigned in increasing order. The extra derived table (AS temp) is needed in some SQL dialects, MySQL for example, which otherwise reject a DELETE whose subquery selects directly from the table being modified.
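
Before running the DELETE, it can help to preview exactly which rows it would remove. The query below is a minimal sketch of that check, assuming the same customers table and that id is unique and never NULL; the alias keepers is a hypothetical name introduced here for illustration.

```sql
-- Preview the rows that the DELETE in step 2 would remove.
-- Assumption: customers.id is unique and non-null.
SELECT c.id, c.email, c.name
FROM customers AS c
WHERE c.id NOT IN (
    -- One id per (email, name) group: the row that would be kept.
    SELECT min_id FROM (
        SELECT MIN(id) AS min_id
        FROM customers
        GROUP BY email, name
    ) AS keepers
)
ORDER BY c.email, c.name, c.id;
```

If the result matches what you expect to lose, the DELETE can then be run, ideally inside a transaction where the database supports one so the change can still be rolled back. Note that GROUP BY places NULL values in the same group, so rows with a NULL email and the same name would be treated as duplicates of each other.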
