PYTHON
Parsing Custom Log Entries with Regex Groups
Learn to parse structured data from custom log strings by extracting specific fields like timestamp, level, and message using Python regular expressions with named capturing groups.
import re


def parse_log_entry(log_line):
    # Expected log format: [YYYY-MM-DD HH:MM:SS] [LEVEL] Message goes here.
    # Named capturing groups use the syntax (?P<name>...).
    log_pattern = re.compile(
        r'^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] '
        r'\[(?P<level>[A-Z]+)\] '
        r'(?P<message>.*)$'
    )
    match = log_pattern.match(log_line)
    if match:
        # groupdict() maps each group name to the text it captured.
        return match.groupdict()
    return None


log1 = "[2023-10-27 14:30:05] [INFO] User 'john.doe' logged in."
log2 = "[2023-10-27 14:31:10] [ERROR] Database connection failed."
log3 = "This is not a log entry."
print(parse_log_entry(log1))
# Output: {'timestamp': '2023-10-27 14:30:05', 'level': 'INFO', 'message': "User 'john.doe' logged in."}
print(parse_log_entry(log2))
# Output: {'timestamp': '2023-10-27 14:31:10', 'level': 'ERROR', 'message': 'Database connection failed.'}
print(parse_log_entry(log3))
# Output: None
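How it works: The `parse_log_entry` function extracts fields from log lines that follow a fixed format, `[YYYY-MM-DD HH:MM:SS] [LEVEL] message`. The pattern uses named capturing groups (`(?P<name>...)`) to capture the `timestamp`, `level`, and `message` components, and the `^` and `$` anchors require the whole line to fit the format. `re.compile` turns the pattern into a reusable pattern object; because the call sits inside the function here, it runs on every invocation (the `re` module caches recently compiled patterns, so the repeated cost is small), and defining the pattern once at module level would make the reuse explicit. When the line matches, `match.groupdict()` returns a dictionary whose keys are the group names and whose values are the matched strings; when it does not match, the function returns `None`.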
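When many lines are parsed, one possible variation is to compile the pattern once at module level and reuse it for every line. The sketch below assumes the same log format as above; the names `LOG_PATTERN`, `parse_log_lines`, and `sample_lines` are hypothetical and not part of the original snippet. Converting the timestamp to a `datetime` is an extra step added here for illustration, and it would raise `ValueError` for a string that fits the regex but is not a real date (for example, month 13).

```python
import re
from datetime import datetime

# Hypothetical module-level constant: compiled once, reused by every call.
LOG_PATTERN = re.compile(
    r'^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] '
    r'\[(?P<level>[A-Z]+)\] '
    r'(?P<message>.*)$'
)


def parse_log_lines(lines):
    """Yield one dict per matching line; skip lines that do not match."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue
        entry = match.groupdict()
        # Illustrative extra step: turn the timestamp string into a datetime.
        entry['timestamp'] = datetime.strptime(
            entry['timestamp'], '%Y-%m-%d %H:%M:%S'
        )
        yield entry


sample_lines = [
    "[2023-10-27 14:30:05] [INFO] User 'john.doe' logged in.",
    "This is not a log entry.",
    "[2023-10-27 14:31:10] [ERROR] Database connection failed.",
]

entries = list(parse_log_lines(sample_lines))
# Collect the messages of ERROR-level entries only.
errors = [e['message'] for e in entries if e['level'] == 'ERROR']
print(len(entries))
print(errors)
```

Because `parse_log_lines` is a generator, it can also be fed an open file object and will process the file one line at a time rather than loading it all into memory.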