PHP
Server-Side Input Sanitization for HTML Content (PHP)
Escape user-supplied text on the server with `htmlspecialchars` before rendering it as HTML to prevent XSS. HTML escaping is an output-time step; it does not replace parameterized queries for safe database storage.
<?php
function sanitize_input_for_html_output(string $input): string {
    // Use htmlspecialchars to convert special characters to HTML entities.
    // ENT_QUOTES converts both double and single quotes.
    // ENT_HTML5 selects HTML5 entity handling (a single quote becomes &apos;).
    // 'UTF-8' specifies the character encoding of the input.
    return htmlspecialchars($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}
// Simulate raw user input (e.g., from a form submission)
$raw_user_comment = "Hello, <script>alert('XSS!');</script> user!";
$raw_user_name = "O'Malley & Sons";
// Printed unescaped for comparison only; never echo raw input into a real HTML page.
echo "Raw Comment: " . $raw_user_comment . "\n";
echo "Raw Name: " . $raw_user_name . "\n";
// Escape the values for HTML output (do this at display time, not before storage)
$sanitized_comment = sanitize_input_for_html_output($raw_user_comment);
$sanitized_name = sanitize_input_for_html_output($raw_user_name);
echo "Sanitized Comment (for HTML output): " . $sanitized_comment . "\n";
echo "Sanitized Name (for HTML output): " . $sanitized_name . "\n";
// Example of how it prevents XSS upon display
echo "<h3>Displaying Sanitized Content:</h3>";
echo "<p>" . $sanitized_comment . "</p>";
echo "<p>Posted by: " . $sanitized_name . "</p>";
// IMPORTANT: This escapes data *for HTML output* only. For database storage,
// store the raw value using parameterized queries (for example, PDO prepared
// statements) to prevent SQL injection, and call htmlspecialchars when
// *retrieving* and *displaying* the data.
How it works: This PHP snippet demonstrates server-side escaping for content that will be rendered as HTML. The `sanitize_input_for_html_output` function uses `htmlspecialchars()` to convert the characters `<`, `>`, `&`, `"`, and `'` into HTML entities. This prevents Cross-Site Scripting (XSS) in HTML text and quoted-attribute contexts, because any markup or script embedded in user input is displayed as literal text rather than parsed by the browser. Apply the function at output time: storing already-escaped text does nothing against SQL injection, and it causes double-encoding when the value is escaped again on display or reused outside HTML. For database storage, parameterized queries are the primary defense against SQL injection.
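The split between storage and display can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the `pdo_sqlite` extension is available, and the `comments` table, its columns, and the `escape_html` helper are hypothetical names introduced for this example.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: escapes a value for HTML text or quoted-attribute contexts,
// using the same flags as the snippet above.
function escape_html(string $value): string {
    return htmlspecialchars($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// In-memory SQLite database (assumes pdo_sqlite); the schema is illustrative only.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, author TEXT, body TEXT)');

// Store the raw input. Bound parameters keep it from being interpreted as SQL.
$insert = $pdo->prepare('INSERT INTO comments (author, body) VALUES (:author, :body)');
$insert->execute([
    ':author' => "O'Malley & Sons",
    ':body'   => "Hello, <script>alert('XSS!');</script> user!",
]);

// Read the raw value back and escape it only when building HTML.
$row = $pdo->query('SELECT author, body FROM comments')->fetch(PDO::FETCH_ASSOC);
$html = '<p>' . escape_html($row['body']) . '</p>'
      . '<p>Posted by: ' . escape_html($row['author']) . '</p>';

echo $html, "\n";
```

The database holds the original text, apostrophe and ampersand included, so the same row can later be rendered into other formats (JSON, plain-text email) without first undoing HTML entities.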