JAVASCRIPT
Sanitize HTML by Removing Tags with Regex
Remove HTML tags from a string using a simple regular expression in JavaScript. Useful for turning trusted or low-risk markup into rough plain text; on its own it is not a reliable defense against XSS.
function stripHtmlTags(htmlString) {
  // Basic regex to remove HTML tags. Note: For robust security, use a DOM parser or library.
  const htmlTagRegex = /<[^>]*>/g;
  return htmlString.replace(htmlTagRegex, '');
}
// Examples:
// const unsafeHtml = "<h1>Hello</h1><p>This is <i>some</i> text with <script>alert('XSS');</script> tags.</p>";
// const safeText = stripHtmlTags(unsafeHtml);
// console.log(safeText); // "HelloThis is some text with alert('XSS'); tags."
// console.log(stripHtmlTags("No tags here.")); // "No tags here."
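How it works: The `stripHtmlTags` function uses the regular expression `/<[^>]*>/g` to remove HTML tags from a string. The pattern matches a `<`, then any run of characters that are not `>`, then a closing `>`, and the `g` flag replaces every such match with an empty string. Only the tags themselves are removed: text between them is kept, including the body of a `<script>` element, and neighboring blocks are joined without whitespace (`<h1>Hello</h1><p>This` becomes `HelloThis`). That makes the function adequate for producing plain text from simple markup, but not for security. The regex does not parse HTML, so it leaves unclosed tags and encoded entities untouched and can miscut on a `>` inside an attribute value. To prevent XSS, use a dedicated HTML sanitization library or a DOM parser, and escape the output for the context where it is inserted.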
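The limits of the regex are easier to see with a few inputs. The sketch below repeats `stripHtmlTags` so it runs on its own and feeds it a handful of hypothetical strings (the `cases` object and its keys are illustrative, not part of the original snippet). It is meant only to show where tag stripping falls short, not to enumerate every attack vector.

```javascript
// Same function as in the snippet, repeated so this block is self-contained.
function stripHtmlTags(htmlString) {
  return htmlString.replace(/<[^>]*>/g, '');
}

// Hypothetical inputs chosen to illustrate gaps in regex-based stripping.
const cases = {
  // Tags are removed, but the script body survives as text.
  scriptBody: "<script>alert('XSS');</script>",
  // No closing '>' anywhere, so the regex never matches.
  unclosedTag: '<img src="x" onerror="alert(1)"',
  // The '>' inside the attribute value ends the match too early.
  quotedGt: '<a title="a>b">link</a>',
  // Ordinary text containing '<' and '>' is treated as a tag.
  plainMath: '1 < 2 and 3 > 2',
  // Entity-encoded markup contains no literal '<', so it is left untouched.
  encoded: '&lt;b&gt;bold&lt;/b&gt;',
};

const results = {};
for (const [name, input] of Object.entries(cases)) {
  results[name] = stripHtmlTags(input);
  console.log(name, '->', JSON.stringify(results[name]));
}
// scriptBody  -> "alert('XSS');"
// unclosedTag -> "<img src=\"x\" onerror=\"alert(1)\""
// quotedGt    -> "b\">link"
// plainMath   -> "1  2"
// encoded     -> "&lt;b&gt;bold&lt;/b&gt;"
```

The `unclosedTag` and `encoded` results are the ones to watch if the output is later written into a page as HTML: both pass through unchanged, so the stripped string should still be treated as untrusted.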