JAVASCRIPT
Sanitize HTML Input by Removing Script Tags
Reduce the risk of Cross-Site Scripting (XSS) attacks by using a regular expression in JavaScript to remove '<script>' blocks, including their content, from user-provided HTML input.
const maliciousHtml = "<div>Hello!</div><script>alert('XSS Attack!');</script><p>Clean content.</p><SCRIPT type=\"text/javascript\">console.log('Another attack');</SCRIPT>";
const scriptTagRegex = /<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi;
const cleanHtml = maliciousHtml.replace(scriptTagRegex, "");
console.log(cleanHtml);
// Expected output: "<div>Hello!</div><p>Clean content.</p>"
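How it works: This JavaScript snippet uses a regular expression to remove `<script>` blocks and their content from user-provided HTML, which closes off one common route for Cross-Site Scripting (XSS) attacks. The `scriptTagRegex` pattern starts with `<script\b`, where `\b` requires the tag name to end there, so a tag such as `<scripted>` is not matched. `[^<]*` then consumes everything up to the next `<`. The group `(?:(?!<\/script>)<[^<]*)*` is not a non-greedy quantifier; it is a loop guarded by a negative lookahead. Each iteration consumes one `<` that does not begin `</script>`, followed by any run of non-`<` characters, so the match advances through the script body and stops at the first closing tag. The final `<\/script>` matches that closing tag. The `gi` flags make the search global and case-insensitive, which is why the uppercase `<SCRIPT>` block is removed as well. The `replace()` method substitutes every match with an empty string. Script blocks are only one XSS vector; the pattern leaves others, such as inline event-handler attributes, untouched.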
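Because the pattern requires the closing tag to be spelled exactly `</script>` and `replace()` makes a single pass, some inputs survive it. The following is a minimal sketch of one way to tighten this, not a definitive sanitizer. It assumes a hypothetical helper named `stripScriptBlocks` and a variant pattern named `scriptBlockRegex`, neither of which is part of the original snippet. The variant also accepts closing tags such as `</script >`, and the loop repeats the replacement until the string stops changing, so fragments that join into a new script block after one pass are removed too.

```javascript
// Hypothetical variant of the pattern (an assumption, not the original regex):
// the closing tag may contain whitespace or other characters before ">".
const scriptBlockRegex =
  /<script\b[^<]*(?:(?!<\/script\b)<[^<]*)*<\/script\b[^>]*>/gi;

// Hypothetical helper: re-run the replacement until the string is stable.
function stripScriptBlocks(html) {
  let previous;
  let current = html;
  do {
    previous = current;
    current = current.replace(scriptBlockRegex, "");
  } while (current !== previous);
  return current;
}

// Closing tag with trailing whitespace.
console.log(stripScriptBlocks("<p>a</p><script>x</script >"));
// "<p>a</p>"

// Fragments that would rejoin into a <script> block after a single pass.
console.log(
  stripScriptBlocks("<scr<script></script>ipt>alert(1)</scr<script></script>ipt>")
);
// ""
```

Looping to a fixed point is a design choice for this sketch: a single pass over the second input leaves `<script>alert(1)</script>` behind, and the second pass removes it.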