PYTHON

Stripping HTML Tags from User Input Using Regex in Python

Strip HTML tags from user-submitted text with a Python regular expression to produce plain text. The regex removes only the tags themselves, not the text between them, so on its own it is not a reliable defense against XSS.

import re

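# Compile once at import time. The non-greedy `.*?` stops at the first `>`,
# so each tag is matched separately.
TAG_PATTERN = re.compile(r'<.*?>')

def remove_html_tags(text):
    """Return text with every <...> sequence removed; text between tags is kept."""
    return re.sub(TAG_PATTERN, '', text)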

html_input = "Hello <b>world</b>! This is a <script>alert('dangerous!');</script> example."
clean_output = remove_html_tags(html_input)
print(clean_output)  # Output: Hello world! This is a alert('dangerous!'); example.
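How it works: This Python snippet defines `remove_html_tags` to strip HTML tags from a string. It compiles the pattern `<.*?>` with `re.compile` and uses `re.sub` to replace every sequence enclosed in angle brackets with an empty string. The non-greedy `.*?` stops at the first `>`, so each tag is matched separately. Only the tags are removed: text between them, such as the body of a `<script>` element, stays in the output. The result should therefore still be escaped before it is rendered as HTML, and a regex like this is not a substitute for a dedicated HTML sanitizer.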
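
Because the regex leaves the text inside elements such as `<script>` in place, a parser-based variant can be a useful complement. The sketch below uses only the standard library: `html.parser.HTMLParser` collects text while skipping the contents of `<script>` and `<style>` elements, and `html.escape` neutralizes any remaining special characters. The names `TextExtractor` and `strip_tags_safely` are hypothetical and not part of the original snippet. Treat this as one possible approach under those assumptions, not a complete sanitizer.

```python
from html import escape
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping <script>/<style> contents (hypothetical helper)."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # text fragments kept so far
        self.skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside <script> or <style>.
        if self.skip_depth == 0:
            self.parts.append(data)


def strip_tags_safely(text):
    """Drop tags and script/style bodies, then escape what is left (assumed name)."""
    parser = TextExtractor()
    parser.feed(text)
    parser.close()  # flush any buffered trailing text
    return escape("".join(parser.parts))


html_input = "Hello <b>world</b>! This is a <script>alert('dangerous!');</script> example."
print(strip_tags_safely(html_input))  # Hello world! This is a  example.
```

Escaping the final string means that any stray `<`, `>`, `&`, or quote character that survives parsing is rendered as text rather than interpreted as markup.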
