JAVASCRIPT

Remove HTML Tags from User Input

Strip HTML tags from user-generated content with a regular expression in JavaScript, a common first step in sanitizing input and reducing XSS risk.

const stripHtmlTags = (htmlString) => {
    // Regex to match any HTML tag: <tag> or </tag>
    // /<[^>]*>/g - matches '<' followed by any character that is not '>' zero or more times, then '>'
    // g flag ensures all occurrences are replaced
    const htmlTagRegex = /<[^>]*>/g;
    return htmlString.replace(htmlTagRegex, '');
};

const dirtyHtml = "<h1>Hello, <b>world</b>!</h1><p>This is a test with <script>alert('XSS');</script> and more.</p>";
const cleanText = stripHtmlTags(dirtyHtml);
console.log(cleanText); // "Hello, world!This is a test with alert('XSS'); and more."

How it works: The `stripHtmlTags` function takes an HTML string and removes every HTML tag from it using a regular expression. In the pattern `/<[^>]*>/g`, `<` matches the opening angle bracket, `[^>]*` matches zero or more characters that are not a closing angle bracket (`>`), and `>` matches the closing angle bracket. The `g` flag ensures that *all* matches are replaced with an empty string, not just the first. Only the tags themselves are removed: text between tags is kept, including the body of a `<script>` element, so `alert('XSS');` remains in the output as plain text. Tag stripping is a common step in sanitizing user-generated content, but it is not a complete XSS defense on its own; the result should still be escaped or inserted as text when it is rendered.
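
The behavior described above can be checked with a short sketch. It reuses the same regular expression and adds two helpers that are not part of the original snippet and are purely illustrative: `stripTagsAndScripts`, which assumes you want to drop `<script>` and `<style>` elements together with their contents before stripping the remaining tags, and `escapeHtml`, which assumes the cleaned text will later be placed back into HTML markup. Treat both as a rough starting point rather than a vetted sanitizer.

```javascript
// Same regex as in the snippet above: removes tags, keeps the text between them.
const stripHtmlTags = (htmlString) => htmlString.replace(/<[^>]*>/g, '');

// Hypothetical helper (not in the original snippet): remove <script> and
// <style> elements together with their contents, then strip remaining tags.
const stripTagsAndScripts = (htmlString) => {
    // \1 refers back to the tag name matched by (script|style).
    const blockRegex = /<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi;
    return stripHtmlTags(htmlString.replace(blockRegex, ''));
};

// Hypothetical helper: escape characters that are special in HTML so the
// text can be placed in markup without being parsed as tags or entities.
const escapeHtml = (text) =>
    text
        .replace(/&/g, '&amp;') // must run first so later entities are not double-escaped
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');

const sample = "<p>Hi <script>alert('XSS');</script>there</p>";
console.log(stripHtmlTags(sample));       // "Hi alert('XSS');there"
console.log(stripTagsAndScripts(sample)); // "Hi there"
console.log(escapeHtml('1 < 2 & "ok"'));  // "1 &lt; 2 &amp; &quot;ok&quot;"
```

Regex-based stripping can still be fooled by malformed markup, for example a `>` inside an attribute value, so for untrusted input that must be rendered as HTML a dedicated sanitizing library or the browser's own parser is usually the safer choice.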
