JAVASCRIPT
Extract Specific Attribute Values from HTML Tags
Learn how to extract attribute values like `href` from `<a>` tags or `src` from `<img>` using regular expressions in JavaScript for simple parsing tasks.
const extractAttribute = (htmlString, tagName, attributeName) => {
// Build the regex dynamically. It matches <tagName ... attributeName="VALUE" ...>.
// `\b` (written `\\b` in the template literal) keeps "a" from also matching <abbr> or <article>.
// `[^>]*?` lazily skips any other attributes without leaving the opening tag.
// `\s` before the attribute name keeps "href" from matching "data-href".
// The value may be wrapped in single or double quotes and is captured in group 1.
const patternString = `<${tagName}\\b[^>]*?\\s${attributeName}\\s*=\\s*["']([^"']*)["']`;
const pattern = new RegExp(patternString, 'gi');
const matches = [];
let match;
while ((match = pattern.exec(htmlString)) !== null) {
matches.push(match[1]); // The captured group (attribute value)
}
return matches;
};
const htmlContent = '<a href="https://example.com">Link 1</a> <img src="/image.jpg" alt="An Image"> <a data-id="123" href="/another-page">Link 2</a>';
console.log("HREFs:", extractAttribute(htmlContent, "a", "href"));
// Expected: ["https://example.com", "/another-page"]
console.log("SRCs:", extractAttribute(htmlContent, "img", "src"));
// Expected: ["/image.jpg"]
How it works: The function takes the full HTML string, a tag name, and an attribute name, then builds a regular expression dynamically to find each opening tag that carries the attribute and capture the text between the attribute's quotes. The `g` flag lets the `while` loop call `exec()` repeatedly, resuming after the previous match each time, so every occurrence is collected into the returned array; the `i` flag makes tag and attribute names case-insensitive. Single- and double-quoted values are both matched, but unquoted values are skipped and a value that contains a quote character is cut off at that character. Note: For complex HTML parsing, DOM manipulation or dedicated parsers are generally recommended over regex.
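A variation, sketched under a few assumptions: the same extraction can be written with `String.prototype.matchAll` instead of an `exec()` loop. The names `escapeRegExp` and `extractAttributeAll` are hypothetical helpers introduced only for this illustration. The sketch escapes the tag and attribute names before interpolating them, and uses a backreference so that a value may contain the other kind of quote. It is still a regex approach and shares the limits noted above for complex HTML.

```javascript
// Hypothetical helper: escape regex metacharacters so the names are matched literally.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical variant of extractAttribute built on String.prototype.matchAll.
const extractAttributeAll = (htmlString, tagName, attributeName) => {
  const tag = escapeRegExp(tagName);
  const attr = escapeRegExp(attributeName);
  // (["']) captures the opening quote; \1 requires the same quote to close the value.
  const pattern = new RegExp(`<${tag}\\b[^>]*?\\s${attr}\\s*=\\s*(["'])(.*?)\\1`, 'gi');
  // matchAll requires the global flag and yields one match array per occurrence.
  // Group 2 holds the attribute value (group 1 is the quote character).
  return Array.from(htmlString.matchAll(pattern), (match) => match[2]);
};

const sample = `<a href="/docs">Docs</a> <a title='Say "hi"' href='/hello'>Hi</a>`;
console.log(extractAttributeAll(sample, 'a', 'href'));  // ["/docs", "/hello"]
console.log(extractAttributeAll(sample, 'a', 'title')); // ['Say "hi"']
```

Because the value is now the second capture group, the callback reads `match[2]` rather than `match[1]`.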