PYTHON
Aggregate Data with collections.defaultdict
Group and aggregate data in Python with `collections.defaultdict`, a `dict` subclass that creates a default value the first time a missing key is accessed. It is well suited to grouping items into lists or summing values per key.
from collections import defaultdict
# Example 1: Grouping items into lists
log_entries = [
    {'user_id': 101, 'action': 'login', 'timestamp': '2023-01-01'},
    {'user_id': 102, 'action': 'view_product', 'timestamp': '2023-01-01'},
    {'user_id': 101, 'action': 'add_to_cart', 'timestamp': '2023-01-01'},
    {'user_id': 103, 'action': 'login', 'timestamp': '2023-01-02'},
    {'user_id': 102, 'action': 'purchase', 'timestamp': '2023-01-02'},
]
user_actions = defaultdict(list)
for entry in log_entries:
    user_actions[entry['user_id']].append(entry['action'])
print(f"User Actions: {dict(user_actions)}")
# Expected: {101: ['login', 'add_to_cart'], 102: ['view_product', 'purchase'], 103: ['login']}
# Example 2: Summing values by category
sales_data = [
    {'category': 'Electronics', 'amount': 1200},
    {'category': 'Books', 'amount': 50},
    {'category': 'Electronics', 'amount': 300},
    {'category': 'Apparel', 'amount': 150},
    {'category': 'Books', 'amount': 75},
]
total_sales_by_category = defaultdict(int)
for sale in sales_data:
    total_sales_by_category[sale['category']] += sale['amount']
print(f"Total Sales by Category: {dict(total_sales_by_category)}")
# Expected: {'Electronics': 1500, 'Books': 125, 'Apparel': 150}
How it works: This snippet shows two common uses of `collections.defaultdict` for aggregating data. Unlike a regular dictionary, a `defaultdict` calls its factory function (here `list` or `int`) when a missing key is looked up with `d[key]`, stores the result (an empty list or `0`) under that key, and returns it. Grouping items into lists or summing values therefore needs no explicit `if key not in d:` check before appending or adding, which keeps each loop body to a single line. Only indexing triggers the factory: `key in d` and `d.get(key)` leave the dictionary unchanged, whereas reading `d[key]` for a missing key inserts it.
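To make the comparison with a regular dictionary concrete, here is a minimal sketch that builds the same grouping both ways and then checks when a missing key actually gets inserted. The `pairs` data and all variable names are hypothetical and are not part of the snippet above.

```python
from collections import defaultdict

# Hypothetical sample data, used only for this illustration.
pairs = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "pear")]

# Plain dict: the missing-key case has to be handled explicitly.
plain = {}
for kind, name in pairs:
    if kind not in plain:
        plain[kind] = []
    plain[kind].append(name)

# defaultdict: list() is called the first time each key is indexed.
grouped = defaultdict(list)
for kind, name in pairs:
    grouped[kind].append(name)

# Both approaches produce the same mapping.
same_result = plain == dict(grouped)

# Membership tests and .get() do not insert a missing key...
checked_in = "dairy" in grouped
checked_get = grouped.get("dairy")
size_before = len(grouped)

# ...but indexing a missing key does, even when only reading.
_ = grouped["dairy"]
size_after = len(grouped)

print(same_result, size_before, size_after)
```

If that insertion on read is unwanted, for example when probing for keys that may not exist, converting with `dict(grouped)` once aggregation is finished gives an ordinary dictionary that raises `KeyError` for missing keys.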