JAVASCRIPT
Remove Basic HTML Tags from User Input Using Regex
A JavaScript regex snippet that strips HTML tags from user-provided text, cleaning content for display and reducing (though not eliminating) basic XSS risk.
const userInput = '<h1>Hello</h1> <p>This is <b>user</b> generated content.</p> <script>alert("XSS!");</script>';
function stripHtmlTags(text) {
  // Match any tag-like span: "<", then any characters except ">", then ">"
  return text.replace(/<[^>]*>/g, '');
}
const cleanText = stripHtmlTags(userInput);
console.log(cleanText); // Hello This is user generated content. alert("XSS!");
How it works: The regex `/<[^>]*>/g` matches a `<`, then zero or more characters that are not `>` (the negated character class `[^>]`), then the closing `>`. The `*` quantifier is greedy, but because `[^>]` cannot match `>`, each match ends at the first `>` after the opening bracket. The `g` flag replaces every match in the string rather than only the first. Only the tags themselves are removed: text between them, such as the body of the `<script>` element in the example, stays in the output. This is useful for basic cleanup, but it is not a reliable defense against XSS; input with unclosed tags or with `>` inside attribute values is not handled correctly. For robust security, use a dedicated HTML sanitization library.
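The limits of the regex approach can be seen in a short sketch. The inputs below are illustrative, and `escapeHtml` is a hypothetical helper that is not part of the original snippet. It shows one common alternative, escaping special characters so a browser renders them as text instead of trying to remove tags. Treat it as a minimal sketch under those assumptions, not a replacement for a sanitization library.

```javascript
// Same function as in the snippet above.
function stripHtmlTags(text) {
  return text.replace(/<[^>]*>/g, '');
}

// Hypothetical helper (an assumption for illustration): escape the
// characters that have special meaning in HTML instead of removing tags.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// 1. An unclosed tag contains no ">", so the regex never matches it.
const unclosed = '<img src=x onerror=alert(1)';
console.log(stripHtmlTags(unclosed)); // <img src=x onerror=alert(1)

// 2. A ">" inside an attribute value ends the match too early.
const attr = '<a title="a>b">link</a>';
console.log(stripHtmlTags(attr)); // b">link

// 3. Ordinary text that uses "<" and ">" is removed by mistake.
const math = 'if a < b and c > d';
console.log(stripHtmlTags(math)); // if a  d

// Escaping keeps all of the text and neutralizes the angle brackets.
console.log(escapeHtml(unclosed)); // &lt;img src=x onerror=alert(1)
console.log(escapeHtml(math)); // if a &lt; b and c &gt; d
```

Escaping suits cases where the input should be shown as plain text. When some HTML must be allowed through, a dedicated sanitization library remains the safer choice.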