PYTHON
Grouping Data with Python `defaultdict`
Use `collections.defaultdict` to group items by a shared key, for example to organize a list of objects or database query results into categories.
from collections import defaultdict
# Example data: a list of dictionaries representing orders
orders = [
    {'order_id': 101, 'customer_id': 'A', 'amount': 150},
    {'order_id': 102, 'customer_id': 'B', 'amount': 200},
    {'order_id': 103, 'customer_id': 'A', 'amount': 50},
    {'order_id': 104, 'customer_id': 'C', 'amount': 300},
    {'order_id': 105, 'customer_id': 'B', 'amount': 75},
]
# Group orders by customer_id using defaultdict
orders_by_customer = defaultdict(list)
for order in orders:
    orders_by_customer[order['customer_id']].append(order)
print("Orders grouped by customer_id:")
for customer_id, customer_orders in orders_by_customer.items():
    print(f" Customer {customer_id}: {customer_orders}")
# Another example: counting occurrences (collections.Counter is the more direct tool for this)
fruits = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
fruit_counts = defaultdict(int)
for fruit in fruits:
    fruit_counts[fruit] += 1
print(f"\nFruit counts: {dict(fruit_counts)}")
How it works: `collections.defaultdict` is a `dict` subclass that takes a factory function and calls it to supply a value whenever a missing key is accessed. In this snippet, `defaultdict(list)` creates an empty list the first time a customer ID is seen, so each order can be appended without first checking whether the key exists. The second example uses `defaultdict(int)`, whose default of 0 lets each fruit's count be incremented directly. Note that merely looking up a missing key inserts it, so use `in` or `.get()` when you only want to test for membership.
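To make the comparison with explicit key checks concrete, here is a minimal sketch that builds the same grouping twice, once with a plain `dict` and once with `defaultdict(list)`. It uses a trimmed copy of the orders data; the names `grouped_plain`, `grouped_default`, and `totals` are illustrative and not part of the original snippet.

```python
from collections import defaultdict

# Trimmed copy of the orders data, for illustration only
orders = [
    {'order_id': 101, 'customer_id': 'A', 'amount': 150},
    {'order_id': 102, 'customer_id': 'B', 'amount': 200},
    {'order_id': 103, 'customer_id': 'A', 'amount': 50},
]

# Plain dict: the key has to be created before the first append
grouped_plain = {}
for order in orders:
    key = order['customer_id']
    if key not in grouped_plain:
        grouped_plain[key] = []
    grouped_plain[key].append(order['order_id'])

# defaultdict(list): the empty list is created on first access
grouped_default = defaultdict(list)
for order in orders:
    grouped_default[order['customer_id']].append(order['order_id'])

# defaultdict(int): the same pattern, used to total amounts per customer
totals = defaultdict(int)
for order in orders:
    totals[order['customer_id']] += order['amount']

print(dict(grouped_default))  # {'A': [101, 103], 'B': [102]}
print(dict(totals))           # {'A': 200, 'B': 200}

# Caveat: indexing a missing key inserts it with the default value
print('Z' in grouped_default)  # False
grouped_default['Z']
print('Z' in grouped_default)  # True
```

Both loops produce the same grouping; the `defaultdict` version simply drops the membership check. The last three lines show why `in` or `.get()` is the safer choice when a lookup should not modify the dictionary.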