PYTHON
Grouping Items Efficiently with collections.defaultdict
Learn how to group items from a list or any other iterable by a common key into lists, sets, counts, or sums using Python's collections.defaultdict.
from collections import defaultdict
data = [
    {'category': 'fruit', 'item': 'apple'},
    {'category': 'vegetable', 'item': 'carrot'},
    {'category': 'fruit', 'item': 'banana'},
    {'category': 'dairy', 'item': 'milk'},
    {'category': 'vegetable', 'item': 'spinach'},
]
# Example 1: Grouping into lists
grouped_by_category_list = defaultdict(list)
for entry in data:
    grouped_by_category_list[entry['category']].append(entry['item'])
print(f"Grouped into lists: {grouped_by_category_list}")
# Example 2: Counting occurrences
item_counts = defaultdict(int)
items = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
for item in items:
    item_counts[item] += 1
print(f"Item counts: {item_counts}")
# Example 3: Grouping unique items into sets
grouped_unique_items = defaultdict(set)
pairs = [('A', 1), ('B', 2), ('A', 3), ('C', 1), ('B', 4)]
for key, value in pairs:
    grouped_unique_items[key].add(value)
print(f"Grouped unique items: {grouped_unique_items}")
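The counting pattern extends naturally to sums. The sketch below is a minimal illustration; the `sales` records and the `totals` name are hypothetical and not part of the original examples.

```python
from collections import defaultdict

# Hypothetical (category, amount) records, for illustration only.
sales = [('fruit', 3), ('dairy', 2), ('fruit', 4)]

# int() returns 0, so every new key starts its running total at zero.
totals = defaultdict(int)
for category, amount in sales:
    totals[category] += amount

print(dict(totals))  # {'fruit': 7, 'dairy': 2}
```

Wrapping the result in `dict(...)` before printing is optional; it only hides the `defaultdict(<class 'int'>, ...)` wrapper in the output.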
How it works: `collections.defaultdict` is a `dict` subclass that handles missing keys for you. When you look up a missing key with `d[key]`, it calls the `default_factory` supplied at construction (e.g., `list`, `int`, `set`) with no arguments, stores the result under that key, and returns it. Other reads, such as `key in d` and `d.get(key)`, do not trigger the factory and do not create the key. This removes the need for explicit `if key not in d` checks when grouping, counting, or accumulating values.
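To make the missing-key behavior concrete, here is a small sketch that contrasts indexing with `in` and `.get()`, and shows the plain-`dict` boilerplate that `defaultdict` replaces. The variable names (`groups`, `plain`, and so on) are illustrative only.

```python
from collections import defaultdict

groups = defaultdict(list)

# Indexing a missing key calls list(), stores the new list, and returns it.
groups['fruit'].append('apple')

# Membership tests and .get() do not call the factory, so no key is created.
has_dairy = 'dairy' in groups
maybe_dairy = groups.get('dairy')
keys_after_reads = list(groups)

# The same grouping with a plain dict needs an explicit check.
plain = {}
for category, item in [('fruit', 'apple'), ('fruit', 'banana')]:
    if category not in plain:
        plain[category] = []
    plain[category].append(item)

print(dict(groups))                               # {'fruit': ['apple']}
print(has_dairy, maybe_dairy, keys_after_reads)   # False None ['fruit']
print(plain)                                      # {'fruit': ['apple', 'banana']}
```

One practical consequence: reading `groups[key]` just to check a value will insert an empty entry as a side effect, so prefer `in` or `.get()` when you only want to look.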