BASH
Extract Specific Data from Log Files using Awk
Efficiently parse and extract specific fields or patterns from large log files using 'awk', making it easier to analyze and debug web application issues.
#!/bin/bash
LOG_FILE="app_errors.log"
# Create a dummy log file for demonstration purposes
cat << EOF > "$LOG_FILE"
[2023-10-26 10:00:01] INFO User logged in: user_123
[2023-10-26 10:00:05] ERROR Failed to connect to DB for user: user_456
[2023-10-26 10:00:10] INFO Data processed for order: ORD-789
[2023-10-26 10:00:15] ERROR Invalid input for user: user_789 (IP: 192.168.1.100)
[2023-10-26 10:00:20] DEBUG Cache refresh initiated
[2023-10-26 10:00:25] ERROR API timeout for service: external_service (Request ID: ABC-123)
EOF
echo "--- All ERROR messages ---"
awk '/ERROR/ { print }' "$LOG_FILE"
echo "
--- Timestamp, Type, and Message for ERRORs ---"
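# With whitespace splitting, $1 is "[date", $2 is "time]", and $3 is the log level.
# index($0, $4) finds the first occurrence of the $4 text in the line, so this assumes
# that text does not also appear earlier in the line.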
awk '/ERROR/ { print $1, $2, $3, substr($0, index($0,$4)) }' "$LOG_FILE"
echo "
--- Extracting User IDs from 'Failed to connect to DB' errors ---"
# Using a regex match and then printing the last field ($NF) assuming it's the user_ID
awk '/Failed to connect to DB for user:/ { print $NF }' "$LOG_FILE"
echo "
--- Extracting IPs from 'Invalid input' errors ---"
# This assumes the address always follows the literal text 'IP: '.
# match() sets RSTART and RLENGTH; skipping the 4-character 'IP: ' prefix leaves only the address.
awk '/Invalid input for user:/ { match($0, /IP: ([0-9.]+)/); if (RSTART) print substr($0, RSTART + 4, RLENGTH - 4) }' "$LOG_FILE"
# Clean up the dummy log file
rm "$LOG_FILE"
How it works: This Bash script creates a sample log file and then runs four `awk` commands against it. The first prints every line containing 'ERROR'. The second prints the timestamp and log level as separate fields, then uses `index` and `substr` to print the rest of the line from the fourth field onward. The third prints the last field (`$NF`) of the 'Failed to connect to DB' line, which is the user ID. The fourth calls `match`, which sets `RSTART` and `RLENGTH`, and uses `substr` to skip the four-character 'IP: ' prefix and print only the IP address. The script removes the sample file when it finishes.
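A pattern such as `/ERROR/` matches the word anywhere in a line, including inside the message text. The sketch below shows one possible variation that tests the log-level field instead, and also counts entries per level with an `awk` array. The sample lines and the variable names (`sample_log`, `errors_by_field`, `level_counts`, `ips`) are hypothetical and chosen only for illustration; the sketch assumes the same `[date time] LEVEL message` layout as the demo log.

```bash
#!/bin/bash
# Hypothetical sample data in the same layout as the demo log.
sample_log=$(mktemp)
cat << 'EOF' > "$sample_log"
[2023-10-26 10:00:01] INFO User logged in: user_123
[2023-10-26 10:00:05] ERROR Failed to connect to DB for user: user_456
[2023-10-26 10:00:15] ERROR Invalid input for user: user_789 (IP: 192.168.1.100)
[2023-10-26 10:00:20] DEBUG Note: the word ERROR appears in this message text
EOF

# Compare the third field exactly, so the DEBUG line above is not selected
# even though its message contains the word ERROR.
errors_by_field=$(awk '$3 == "ERROR"' "$sample_log")

# Count entries per log level; awk arrays are keyed by string.
# The for-in order is unspecified, so the result is piped through sort.
level_counts=$(awk '{ count[$3]++ } END { for (lvl in count) print lvl, count[lvl] }' "$sample_log" | sort)

# Use match() as the pattern itself: the action runs only on lines where it succeeds.
ips=$(awk 'match($0, /IP: [0-9.]+/) { print substr($0, RSTART + 4, RLENGTH - 4) }' "$sample_log")

echo "$errors_by_field"
echo "$level_counts"
echo "$ips"

rm -f "$sample_log"
```

Comparing `$3` is stricter than a whole-line regex, but it depends on the level always being the third whitespace-separated field, so it may need adjusting for logs with a different layout.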