SQL
Extract Data Using SQL Regular Expressions
Use regular expression functions to parse, validate, or extract specific patterns from string data, which is useful for data cleaning and transformation. Function names and syntax vary between database systems; this example targets PostgreSQL.
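SELECT
    product_code,
    -- Two uppercase letters at the start of the code
    REGEXP_SUBSTR(product_code, '^[A-Z]{2}') AS category_prefix,
    -- Four digits at the end of the code
    REGEXP_SUBSTR(product_code, '\d{4}$') AS product_number_suffix,
    -- Strip every non-digit character ('g' = replace all matches)
    REGEXP_REPLACE(product_code, '[^0-9]', '', 'g') AS numeric_only
FROM products
-- Keep only codes shaped like 'AB-1234'
WHERE product_code ~ '^[A-Z]{2}-\d{4}$';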
How it works: This query uses PostgreSQL's regular expression functions to split a product code into its parts. `REGEXP_SUBSTR` (available in PostgreSQL 15 and later) returns the first substring that matches the pattern: here, the two-letter prefix anchored at the start with `^`, and the four-digit suffix anchored at the end with `$`. `REGEXP_REPLACE` with the `'g'` (global) flag removes every non-numeric character rather than only the first one. The `WHERE product_code ~ '...'` clause uses the case-sensitive match operator `~` to keep only rows whose code fits the full pattern of two uppercase letters, a hyphen, and four digits, so the extraction runs only on codes in the expected format.
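To see how the anchored check differs from the extraction functions, the same expressions can be run against a few inline values instead of a real table. This is a minimal sketch assuming PostgreSQL 15 or later; the `sample` CTE, its codes, and the `is_valid` column are hypothetical and exist only for illustration.

```sql
-- Hypothetical sample codes; not taken from a real products table.
WITH sample(product_code) AS (
    VALUES ('AB-1234'), ('ABC-123'), ('XY-00423')
)
SELECT
    product_code,
    -- Full-string check: true only for two uppercase letters, a hyphen, four digits
    product_code ~ '^[A-Z]{2}-\d{4}$' AS is_valid,
    -- Partial matches: each returns NULL when its own pattern is not found
    REGEXP_SUBSTR(product_code, '^[A-Z]{2}') AS category_prefix,
    REGEXP_SUBSTR(product_code, '\d{4}$') AS product_number_suffix,
    REGEXP_REPLACE(product_code, '[^0-9]', '', 'g') AS numeric_only
FROM sample;
```

Only `AB-1234` passes the anchored check. `XY-00423` fails it because of the fifth digit, yet `REGEXP_SUBSTR` still returns `0423` as its suffix, and `ABC-123` yields a `NULL` suffix because it does not end in four digits. This is why the original query filters with `WHERE` before trusting the extracted columns.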