PYTHON
Group Data Efficiently Using Python's defaultdict
Simplify data grouping tasks in Python by leveraging `collections.defaultdict`, automatically handling missing keys and appending items to lists or other structures.
from collections import defaultdict
users_data = [
    {'name': 'Alice', 'city': 'New York', 'role': 'admin'},
    {'name': 'Bob', 'city': 'London', 'role': 'editor'},
    {'name': 'Charlie', 'city': 'New York', 'role': 'viewer'},
    {'name': 'David', 'city': 'London', 'role': 'admin'},
    {'name': 'Eve', 'city': 'Paris', 'role': 'editor'},
]
# Group users by city
users_by_city = defaultdict(list)
for user in users_data:
    users_by_city[user['city']].append(user['name'])
print(f"Users by City: {dict(users_by_city)}")
# Group users by role, storing full user dicts
users_by_role = defaultdict(list)
for user in users_data:
    users_by_role[user['role']].append(user)
print(f"Users by Role: {dict(users_by_role)}")
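The same pattern extends to the "other structures" mentioned in the description. The sketch below is illustrative only: the `records` list is hypothetical sample data shaped like `users_data`, and it uses `int` and `set` as default factories to count users per role and collect the distinct roles seen in each city.

```python
from collections import defaultdict

# Hypothetical sample records, shaped like users_data above
records = [
    {'name': 'Alice', 'city': 'New York', 'role': 'admin'},
    {'name': 'Bob', 'city': 'London', 'role': 'editor'},
    {'name': 'Charlie', 'city': 'New York', 'role': 'viewer'},
    {'name': 'David', 'city': 'London', 'role': 'admin'},
]

# defaultdict(int): a missing key starts at 0, so counters need no setup
role_counts = defaultdict(int)
# defaultdict(set): a missing key starts as an empty set, so duplicates collapse
roles_by_city = defaultdict(set)

for record in records:
    role_counts[record['role']] += 1
    roles_by_city[record['city']].add(record['role'])

print(f"Users per role: {dict(role_counts)}")
print(f"Roles by city: {dict(roles_by_city)}")
```

For plain counting, `collections.Counter` is often a better fit; `defaultdict(int)` is shown here only to keep the example parallel with the list-based grouping.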
How it works: `collections.defaultdict` is a subclass of `dict` that implements the `__missing__` hook. When you look up a key that doesn't exist with `d[key]`, it calls the factory passed to the constructor (here `list`), stores the result under that key, and returns it instead of raising `KeyError`. This makes grouping straightforward: rather than checking `if key in d:` before each append, you append directly to `d[key]`, and the empty list is created on first access. Only subscript access triggers the default; `.get()` and `in` do not, and reading a missing key with `d[key]` inserts it as a side effect. The result is cleaner, more concise code for data aggregation tasks common in web development, such as organizing query results or log data.
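To make the mechanics concrete, here is a minimal sketch that compares the manual key check with `defaultdict(list)` and then probes which operations actually trigger the default. The `pairs` data and variable names are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict

# Hypothetical (city, name) pairs used only for this illustration
pairs = [('New York', 'Alice'), ('London', 'Bob'), ('New York', 'Charlie')]

# Plain dict: the key must exist before you can append to its list
manual = {}
for city, name in pairs:
    if city not in manual:
        manual[city] = []
    manual[city].append(name)

# defaultdict: the empty list is created on first subscript access
grouped = defaultdict(list)
for city, name in pairs:
    grouped[city].append(name)

print(manual == grouped)        # True: both hold the same groups

# Membership tests and .get() do not trigger the default factory
print('Paris' in grouped)       # False
print(grouped.get('Paris'))     # None

# Subscripting a missing key inserts it as a side effect
grouped['Paris']
print('Paris' in grouped)       # True

# Clearing default_factory restores ordinary KeyError behavior
grouped.default_factory = None
try:
    grouped['Tokyo']
except KeyError:
    print("KeyError raised for 'Tokyo'")
```

Because a stray read can add empty groups, it can be worth converting the result with `dict(grouped)` (or setting `default_factory = None`) once grouping is finished.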