JAVASCRIPT
Strip HTML Tags from a String for Plain Text Output
Learn to remove HTML tags from a string using a regular expression in JavaScript, useful for generating plain text summaries or previews from HTML content. For sanitizing untrusted user input, use a dedicated sanitizer instead.
const stripHtmlTags = (htmlString) => {
  // WARNING: For full security (e.g., XSS prevention), use a proper HTML parser/sanitizer library
  // (like DOMPurify). This regex is for basic tag removal, not robust security.
  const tagRegex = /<[^>]*>/g;
  return htmlString.replace(tagRegex, "");
};
const messyHtml = "<h1>Hello, <strong>World</strong>!</h1><p>This is a test.</p><img src='x.jpg' onerror='alert(1)'>";
console.log(stripHtmlTags(messyHtml)); // "Hello, World!This is a test."
How it works: The `stripHtmlTags` function uses the regular expression `/<[^>]*>/g` to match every substring that starts with `<` and runs to the next `>`, and replaces each match with an empty string. The text between tags is left untouched, so HTML entities such as `&amp;` are not decoded and text from adjacent elements is joined with no separator, as the sample output shows ("World!This"). This is handy for converting rich text to plain text or building short summaries. It is not a sanitizer: a regex cannot account for the full complexity of HTML parsing, so for security-critical work such as XSS prevention, use a dedicated HTML parser and sanitization library like DOMPurify.
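
For summaries, the raw output often needs tidying. The sketch below is one possible extension, not part of the original snippet: the name `htmlToPlainText`, the list of block-level tags, and the small entity table are all illustrative assumptions and deliberately incomplete. It puts a space where block-level tags were, strips the remaining tags, decodes a few common entities, and collapses whitespace. Like the original, it is not a sanitizer; treat the result as text (for example via `textContent`), never as HTML.

```javascript
// Hypothetical helper: tag stripping plus light cleanup for plain text summaries.
const htmlToPlainText = (htmlString) => {
  // Assumed, incomplete entity table; "&amp;" is handled separately below.
  const entities = { "&lt;": "<", "&gt;": ">", "&quot;": '"', "&#39;": "'", "&nbsp;": " " };
  return String(htmlString)
    // Turn a few common block-level tags into spaces so adjacent blocks don't run together.
    .replace(/<\/?(?:p|div|br|h[1-6]|li|ul|ol|tr|table|blockquote)\b[^>]*>/gi, " ")
    // Remove every remaining tag, as in stripHtmlTags.
    .replace(/<[^>]*>/g, "")
    // Decode the entities listed in the table above.
    .replace(/&(?:lt|gt|quot|#39|nbsp);/g, (match) => entities[match])
    // Decode "&amp;" last so "&amp;lt;" becomes "&lt;" rather than "<".
    .replace(/&amp;/g, "&")
    // Collapse whitespace runs and trim both ends.
    .replace(/\s+/g, " ")
    .trim();
};

console.log(htmlToPlainText("<h1>Hello, <strong>World</strong>!</h1><p>Fish &amp; chips</p>"));
// "Hello, World! Fish & chips"
```

Inline tags such as `<strong>` are removed without adding a space, which keeps "World!" intact, while block boundaries become a single space.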