PYTHON

Remove Script Tags from Input String

A Python regex that strips `<script>` tags and their contents from user input, removing one common vector for XSS attacks.

import re

# Compiled once at import time. The closing tag allows whitespace or stray
# attributes before ">" (e.g. "</script >"), which browsers also accept.
SCRIPT_TAG_RE = re.compile(
    r'<script\b[^>]*>.*?</script\b[^>]*>',
    re.IGNORECASE | re.DOTALL,
)

def remove_script_tags(text):
    """Return text with every <script>...</script> element removed."""
    return SCRIPT_TAG_RE.sub('', text)

# Example Usage:
user_input = "Hello, this is safe.<script>alert('XSS!');</script>And this is too.<script type='text/javascript'>console.log('Malicious');</script>"
cleaned_input = remove_script_tags(user_input)
print(cleaned_input)
# Expected output: Hello, this is safe.And this is too.

How it works: This function uses a regular expression to find and remove every `<script>` element, including its contents, from an input string. `re.IGNORECASE` makes the match case-insensitive, so `<SCRIPT>` is caught as well, and `re.DOTALL` lets `.` match newlines, so script bodies that span several lines are removed in full. Stripping script elements closes only one XSS vector: inline event handlers such as `onerror` and `javascript:` URLs pass through unchanged, so treat this as one step in sanitizing user-generated content rather than a complete defense.
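
To see what the two flags contribute, here is a minimal sketch that compiles a pattern of the same shape with and without them and runs both against a mixed-case, multi-line script block. The names `PATTERN`, `with_flags`, `no_flags`, and `sample` are illustrative and not part of the original snippet.

```python
import re

# A pattern of the same shape as the one in the snippet (illustrative copy).
PATTERN = r'<script\b[^>]*>.*?</script>'

with_flags = re.compile(PATTERN, re.IGNORECASE | re.DOTALL)
no_flags = re.compile(PATTERN)

# Mixed-case tags whose body spans several lines.
sample = "before<SCRIPT>\nalert(1);\n</Script>after"

stripped = with_flags.sub('', sample)
unchanged = no_flags.sub('', sample)

print(stripped)             # beforeafter
print(unchanged == sample)  # True: without the flags nothing matches
```

Without `re.IGNORECASE` the uppercase `<SCRIPT>` never matches, and without `re.DOTALL` the `.*?` cannot cross the newlines inside the script body, so either flag alone would leave this input untouched.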
