BASH

Filter and Analyze Web Server Access Logs

Filter and analyze web server access logs (e.g., Apache, Nginx) for patterns such as error codes, IP addresses, or URLs, using `grep`, `awk`, `sort`, and `uniq`.

#!/bin/bash

# Configuration variables
LOG_FILE="/var/log/nginx/access.log" # Path to your web server access log
SEARCH_TERM="404"                   # Example: Filter for '404' errors

# Check that the log file exists and is readable
if [ ! -f "$LOG_FILE" ] || [ ! -r "$LOG_FILE" ]; then
    echo "Error: Log file '$LOG_FILE' not found or is not readable." >&2
    exit 1
fi

echo "--- Filtering '$LOG_FILE' for lines containing '$SEARCH_TERM' (first 10 matches) ---"
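# -F treats the term as a literal string; note that it matches anywhere in the line, not only in the status field
grep -F -- "$SEARCH_TERM" "$LOG_FILE" | head -n 10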

printf '\n--- Top 5 IP addresses accessing the server ---\n'
# Extracts the first field (IP), sorts, counts unique occurrences, sorts numerically (descending), and takes top 5
awk '{print $1}' "$LOG_FILE" | sort | uniq -c | sort -nr | head -n 5

printf '\n--- Top 5 Most Accessed URLs (excluding common static assets) ---\n'
# Extracts the 7th field (request path) and skips paths whose extension is a common static asset type.
# The extension must end the path or be followed by a query string, so '.json' is not mistaken for '.js'.
awk '$7 !~ /\.(jpg|jpeg|png|gif|bmp|svg|css|js|ico|woff|woff2|ttf|otf|eot)(\?|$)/ {print $7}' "$LOG_FILE" \
| sort | uniq -c | sort -nr | head -n 5

printf '\n--- Total requests by HTTP status code ---\n'
# Extracts the 9th field (status code) and counts occurrences
awk '{print $9}' "$LOG_FILE" | sort | uniq -c | sort -nr

How it works: The script first uses `grep` to print the first 10 log lines that contain the search term, such as an HTTP status code. It then uses `awk`, `sort`, and `uniq -c` to count requests per client IP address, list the most requested URLs (excluding static assets), and total the requests per HTTP status code. The field positions (`$1` for the IP address, `$7` for the request path, `$9` for the status code) assume the default combined log format used by Apache and Nginx; a custom log format needs different field numbers.
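
Because `grep` matches the search term anywhere in the line, a search for `404` also picks up URLs or byte counts that happen to contain those digits. A minimal sketch of one way around this, assuming the combined log format (status code in field 9), is to compare the status field directly with `awk`. The helper name `count_status` and the sample log lines are made up for illustration.

```shell
#!/bin/bash
# Sketch: count requests by comparing the status field, not the whole line.
# Assumes the combined log format, where field 9 holds the status code.

# Hypothetical helper: print how many lines in a file have the given status code.
count_status() {
    local code="$1" file="$2"
    awk -v code="$code" '$9 == code { n++ } END { print n + 0 }' "$file"
}

# Illustrative sample log (made-up entries) written to a temporary file.
sample_log="$(mktemp)"
cat > "$sample_log" <<'EOF'
203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /item/404 HTTP/1.1" 200 512 "-" "curl/8.0"
198.51.100.7 - - [10/Oct/2023:13:55:38 +0000] "GET /index.html HTTP/1.1" 200 4040 "-" "curl/8.0"
EOF

echo "Lines containing '404' anywhere: $(grep -c '404' "$sample_log")"
echo "Lines with status code 404:      $(count_status 404 "$sample_log")"

rm -f "$sample_log"
```

On this sample, `grep` counts three lines (the real 404, the `/item/404` path, and the `4040` byte count), while the field comparison counts only the one request that actually returned 404.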
