BASH
Parse and Extract Data from Log Files using Regex
Use a Bash script to parse log files and extract specific patterns, such as IP addresses, timestamps, or error codes, with regular expressions. The example below uses `awk`; `grep -oE` can handle simpler single-pattern extractions.
#!/bin/bash
LOG_FILE="/var/log/nginx/access.log"
# Example regex to capture IP address and request path from Nginx access logs
# Matches: 192.168.1.1 - - [21/Jun/2023:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0"
if [ ! -f "$LOG_FILE" ]; then
    echo "Error: Log file '$LOG_FILE' not found." >&2
    exit 1
fi
echo "Extracting IP addresses and requested paths from $LOG_FILE:"
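# The three-argument form of match(), which fills arr with the capture groups,
# is a GNU awk (gawk) extension, so gawk is called explicitly here.
gawk '{
    # arr[1] is the leading IPv4 address, arr[2] is the path of a GET request
    match($0, /^([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*"GET ([^ ]*)/, arr);
    if (arr[1] && arr[2]) {
        print "IP: " arr[1] ", Path: " arr[2];
    }
}' "$LOG_FILE"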
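How it works: The script first checks that the log file exists and exits with an error if it does not. It then runs `awk` over every line and calls `match()` with a regular expression containing two capture groups: the IPv4 address at the start of the line and the path that follows `"GET `. When both groups are captured, the line is printed as `IP: <address>, Path: <path>`; other lines, including non-GET requests, are skipped. Storing capture groups in an array through a third argument to `match()` is a GNU awk extension, so the script needs `gawk` rather than a minimal `awk` implementation. The regex targets Nginx access logs but can be adapted to pull other patterns from other log formats for analysis and debugging.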
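As a rough sketch of how the same idea can be adapted to other patterns, the helpers below pull IP addresses, timestamps, and error status codes from log lines on standard input using only `grep -oE`, `tr`, and plain `awk`. The function names and the two sample lines are hypothetical and exist only for illustration. The status-code helper assumes the default combined log format, where the status is the ninth whitespace-separated field.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; they are not part of the script above.

# Print the IPv4 address at the start of each log line read from stdin.
extract_ips() {
    grep -oE '^[0-9]{1,3}(\.[0-9]{1,3}){3}'
}

# Print the bracketed timestamp of each log line, without the brackets.
extract_timestamps() {
    grep -oE '\[[^]]+\]' | tr -d '[]'
}

# Print the status code of every 4xx or 5xx response.
# Assumption: combined log format, so the status is field 9.
extract_error_codes() {
    awk '$9 ~ /^[45][0-9][0-9]$/ { print $9 }'
}

# Two made-up sample lines in the same format as the example in the script.
sample='192.168.1.1 - - [21/Jun/2023:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0"
10.0.0.7 - - [21/Jun/2023:10:00:05 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"'

printf '%s\n' "$sample" | extract_ips
printf '%s\n' "$sample" | extract_timestamps
printf '%s\n' "$sample" | extract_error_codes
```

To run a helper against a real file, redirect the file into it, for example `extract_error_codes < "$LOG_FILE" | sort | uniq -c` to count error responses by status code.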