JAVASCRIPT

Extract Specific Attribute Values from HTML Tags

Learn how to extract attribute values, such as `href` from `<a>` tags or `src` from `<img>` tags, using regular expressions in JavaScript for simple parsing tasks.

const extractAttribute = (htmlString, tagName, attributeName) => {
    // Build a regex that finds attributeName inside a <tagName ...> opening tag.
    // `(?:\s[^>]*?)?` lazily skips any earlier attributes without leaving the tag,
    // and the `\s` before the attribute name keeps `href` from matching `data-href`
    // (it also stops tagName "a" from matching tags such as <abbr>).
    // The value may be wrapped in single or double quotes; it is captured in group 1.
    const patternString = `<${tagName}(?:\\s[^>]*?)?\\s${attributeName}\\s*=\\s*["']([^"']*)["']`;
    const pattern = new RegExp(patternString, 'gi');
    const matches = [];
    let match;

    while ((match = pattern.exec(htmlString)) !== null) {
        matches.push(match[1]); // The captured group (attribute value)
    }
    return matches;
};

const htmlContent = '<a href="https://example.com">Link 1</a> <img src="/image.jpg" alt="An Image"> <a data-id="123" href="/another-page">Link 2</a>';

console.log("HREFs:", extractAttribute(htmlContent, "a", "href"));
// Expected: ["https://example.com", "/another-page"]

console.log("SRCs:", extractAttribute(htmlContent, "img", "src"));
// Expected: ["/image.jpg"]
How it works: The function takes the full HTML string, a tag name, and an attribute name, and builds a regular expression from them at runtime. The pattern locates each opening tag with that name, skips any attributes that come first, and captures the text between the quotes (single or double) that follow the attribute name. Because the regex carries the `g` flag, each call to `exec()` resumes where the previous match ended, so the `while` loop collects every occurrence until `exec()` returns `null`; the `i` flag makes tag and attribute names case-insensitive. Note: regular expressions cannot handle every HTML construct (unquoted values, `>` inside an attribute value, comments), so for anything beyond simple, well-formed input a DOM API or a dedicated HTML parser is the more reliable choice.
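Since the pattern is assembled from caller-supplied strings, it can help to escape those strings and to require that the closing quote match the opening one. The following is a minimal sketch of that idea, not part of the original snippet: the names `escapeRegExp` and `extractAttributeSafe` are hypothetical, and it uses `String.prototype.matchAll` in place of the `exec()` loop.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex,
// so a tag or attribute name is always matched literally.
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Illustrative variant of the extractor (the name is an assumption).
const extractAttributeSafe = (htmlString, tagName, attributeName) => {
    const tag = escapeRegExp(tagName);
    const attr = escapeRegExp(attributeName);
    // (["']) captures the opening quote and \1 requires the same closing quote,
    // so a double-quoted value may contain an apostrophe (and vice versa).
    const pattern = new RegExp(
        `<${tag}(?:\\s[^>]*?)?\\s${attr}\\s*=\\s*(["'])(.*?)\\1`,
        'gi'
    );
    // matchAll needs the `g` flag; group 2 of each match holds the value.
    return Array.from(htmlString.matchAll(pattern), (m) => m[2]);
};

const sample = `<a title="Don't click" data-href="/ignored" href='/real'>x</a> <abbr href="/skip">y</abbr>`;

console.log(extractAttributeSafe(sample, "a", "href"));
// Expected: ["/real"]

console.log(extractAttributeSafe(sample, "a", "title"));
// Expected: ["Don't click"]
```

This is still a regex-based approach and shares the limits noted above; it only narrows the cases where a simple pattern silently returns the wrong value.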
