JAVASCRIPT
Extract Image URLs from HTML with Regex
Learn to use a regular expression in JavaScript to find and extract the 'src' attribute of every 'img' tag in an HTML string.
const htmlString = `<p>Some text.</p><img src="image1.jpg" alt="Alt 1"><p>More text.</p><img src='https://example.com/images/image2.png' alt="Alt 2">`;
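// Match an <img> tag and capture the quoted value of its src attribute (group 1).
const imgTagRegex = /<img[^>]+src=["']([^"']+)["']/g;
const imageUrls = [];
let match;
// With the g flag, each exec() call resumes after the previous match and returns null when none remain.
while ((match = imgTagRegex.exec(htmlString)) !== null) {
  imageUrls.push(match[1]); // match[1] is the captured URL
}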
console.log(imageUrls);
// Expected output: [ "image1.jpg", "https://example.com/images/image2.png" ]
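How it works: This snippet extracts every image source (`src`) URL from an HTML string using a regular expression. The `imgTagRegex` pattern `/<img[^>]+src=["']([^"']+)["']/g` matches the literal text `<img`, then one or more characters other than `>` (`[^>]+`), which keeps the match inside the tag, followed by `src=` and an opening single or double quote. The capturing group `([^"']+)` collects the URL and stops at the next quote character of either kind. The `g` flag makes the regex stateful: each `exec()` call resumes where the previous match ended, so the loop pushes every captured URL into the array until `exec()` returns `null`. The pattern has limits worth knowing: it is case-sensitive, it skips unquoted attribute values, it can pick up attributes that merely end in `src` (such as `data-src`), and it truncates a URL that contains the other kind of quote character.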
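A slightly stricter variant: the sketch below is one possible way to tighten the pattern, not part of the original snippet. The helper name `extractImageSources` and the sample markup are hypothetical. It assumes a runtime that supports `String.prototype.matchAll` (modern browsers and current Node.js releases). It requires whitespace before `src` so that `data-src` is not matched, uses a backreference so the closing quote must be the same character as the opening one, and adds the `i` flag for upper-case tags.

```javascript
// Hypothetical helper; a sketch of a stricter pattern, not a full HTML parser.
function extractImageSources(html) {
  // <img\b         : "<img" followed by a word boundary (rejects e.g. <imgx>)
  // [^>]*?\s       : lazily skip earlier attributes; src must be preceded by whitespace
  // src\s*=\s*     : allow optional spaces around "="
  // (["'])(.*?)\1  : opening quote, URL (group 2), then the same quote again
  const pattern = /<img\b[^>]*?\ssrc\s*=\s*(["'])(.*?)\1/gi;
  // matchAll needs the g flag and yields one match object per <img> tag found.
  return Array.from(html.matchAll(pattern), (m) => m[2]);
}

const sample = `<img data-src="lazy.jpg" src="real.jpg"><IMG SRC='photo.png'><img src="it's.jpg">`;
console.log(extractImageSources(sample));
// Expected output: [ "real.jpg", "photo.png", "it's.jpg" ]
```

This is still a regular expression, so it can be confused by unquoted values or by a `>` inside an earlier attribute value. For untrusted or complex markup, a real HTML parser (for example, the browser's `DOMParser`) is usually the safer choice.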