JAVASCRIPT
Stripping HTML Tags from a String with Regex
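Remove HTML tags from user-provided text in JavaScript using a regular expression to produce plain text for display. This is a quick cleanup step, not a complete XSS defense: it deletes the tags but keeps the text between them, including the contents of `<script>` elements.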
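function stripHtmlTags(htmlString) {
  // Match '<', then any run of characters other than '>', then '>'.
  const htmlTagRegex = /<[^>]*>/g;
  // Replace every match with an empty string; text between tags is kept.
  return htmlString.replace(htmlTagRegex, '');
}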
const dirtyHtml = "<h1>Welcome</h1><p>This is <b>bold</b> text with <script>alert('XSS!');</script> malicious script.</p>";
const cleanText = stripHtmlTags(dirtyHtml);
console.log(cleanText); // "WelcomeThis is bold text with alert('XSS!'); malicious script."
const simpleText = "Just plain text.";
console.log(stripHtmlTags(simpleText)); // "Just plain text."
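The tag pattern removes only the tags themselves, so the text inside a `<script>` element stays in the output. If that text should go too, one possible variant removes `<script>` and `<style>` elements together with their contents before stripping the remaining tags. The helper name `stripTagsAndScriptBodies` and the sample variable names are hypothetical, and this is a minimal sketch that is still regex-based, not a full sanitizer:

```javascript
// Hypothetical helper for illustration; not part of the original snippet.
function stripTagsAndScriptBodies(htmlString) {
  // First drop <script> and <style> elements together with their contents.
  const blockRegex = /<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi;
  // Then drop any remaining tags, keeping the text between them.
  const htmlTagRegex = /<[^>]*>/g;
  return htmlString.replace(blockRegex, '').replace(htmlTagRegex, '');
}

const sampleHtml = "<h1>Welcome</h1><p>This is <b>bold</b> text with <script>alert('XSS!');</script> malicious script.</p>";
const strippedSample = stripTagsAndScriptBodies(sampleHtml);
console.log(strippedSample.includes('alert')); // false
```

Removing the script element leaves the spaces on both sides of it, so the result contains a double space where the element used to be.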
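How it works: The `stripHtmlTags` function removes HTML tags from a string using a regular expression. The pattern `<[^>]*>` matches a '<', followed by zero or more characters that are not '>', and then a '>'. With the global flag (`g`), `replace()` substitutes an empty string for every match, so all tags are removed while the text between them is kept. That includes the contents of `<script>` elements, so `alert('XSS!');` from the example input remains in the result. The pattern also ends a match at the first '>', even inside a quoted attribute value, and it leaves an unclosed '<' untouched. The function is useful for producing plain text, but it is not a substitute for a dedicated HTML sanitizer or output escaping when guarding against cross-site scripting (XSS).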
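To make the matching behavior of `<[^>]*>` concrete, the following sketch runs the same function on three edge-case inputs. The inputs and variable names are made up for illustration:

```javascript
// Same function as in the snippet, repeated so this block runs on its own.
function stripHtmlTags(htmlString) {
  return htmlString.replace(/<[^>]*>/g, '');
}

// A '>' inside a quoted attribute value ends the match early.
const quotedAttr = stripHtmlTags('<img alt="a > b" src="x.png">after');
console.log(quotedAttr); // ' b" src="x.png">after'

// An unclosed '<' never matches, so the fragment is left as is.
const unclosed = stripHtmlTags('<img src=x onerror=alert(1)');
console.log(unclosed); // '<img src=x onerror=alert(1)'

// Plain text containing '<' and '>' is treated as a tag.
const comparison = stripHtmlTags('if a < b and c > d');
console.log(comparison); // 'if a  d'
```

These cases suggest why the output of this function is best treated as plain text and escaped again before being inserted into a page.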