JAVASCRIPT
Remove HTML Tags from a String for Sanitization
Strip HTML tags from user-provided strings with a simple JavaScript regular expression. This is useful for turning simple markup into plain text, but it is not a complete defense against XSS on its own.
function stripHtmlTags(htmlString) {
  // Regex to find any HTML tag like <tag> or </tag>
  const tagRegex = /<[^>]*>/g;
  return htmlString.replace(tagRegex, '');
}
// Examples
const unsafeInput = "<h1>Hello, World!</h1><p>This is a <b>test</b>.</p><script>alert('XSS');</script>";
console.log(stripHtmlTags(unsafeInput));
// "Hello, World!This is a test.alert('XSS');"
// Only the tags are removed; the text inside <script> remains.
const anotherInput = "Just plain text without tags.";
console.log(stripHtmlTags(anotherInput));
// "Just plain text without tags."
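How it works: The `stripHtmlTags` function uses the regular expression `/<[^>]*>/g` to remove HTML tags from a string. The pattern matches a `<`, then any run of characters other than `>`, then the closing `>`. The `g` flag replaces every match rather than only the first. Only the tags themselves are removed: the text between them stays, so for input containing `<script>alert('XSS');</script>` the text `alert('XSS');` remains in the result. The pattern also misses an unterminated tag such as `<img src=x onerror=alert(1)` and ends a match early when an attribute value contains `>`. The function is a straightforward way to get plain text from simple markup, but it should not be the only defense against cross-site scripting (XSS). For untrusted input, escape the output or use a dedicated sanitizer.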
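Because the tag regex leaves script bodies in place and skips unterminated tags, two small additions are often paired with it. The sketch below is illustrative only: `stripTagsAndScripts` and `escapeHtml` are hypothetical helper names that are not part of the original snippet, and neither is a full HTML sanitizer.

```javascript
// Hypothetical helpers for illustration; not part of the original snippet.

// Remove <script> and <style> elements together with their contents,
// then strip the remaining tags with the same regex as stripHtmlTags.
function stripTagsAndScripts(htmlString) {
  const blockRegex = /<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi;
  const tagRegex = /<[^>]*>/g;
  return htmlString.replace(blockRegex, '').replace(tagRegex, '');
}

// Escape the characters that are significant in HTML so the string
// is rendered as literal text instead of being parsed as markup.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

const input = "<p>Hi <b>there</b></p><script>alert('XSS');</script>";
console.log(stripTagsAndScripts(input));
// "Hi there"

console.log(escapeHtml("<img src=x onerror=alert(1)"));
// "&lt;img src=x onerror=alert(1)"
```

Escaping is usually the safer choice when the string is inserted into a page, since it does not depend on recognizing every tag shape. Stripping is better suited to producing plain text for previews or search indexing.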