SQL
Identify Duplicate Records Based on Columns
Discover and list duplicate entries in your database table based on one or more specified columns using GROUP BY and HAVING COUNT(*).
SELECT
email, first_name, last_name,
COUNT(*) AS num_duplicates
FROM
users
GROUP BY
email, first_name, last_name
HAVING
COUNT(*) > 1;
How it works: This SQL query is highly useful for data cleaning and ensuring data integrity by identifying duplicate records. It groups rows by a combination of columns (e.g., `email`, `first_name`, `last_name`) that are expected to be unique. The `HAVING COUNT(*) > 1` clause then filters these groups to show only those combinations that appear more than once, effectively listing all the duplicate entries along with their count. This helps in pinpointing and resolving redundant data in your database.