PYTHON
Grouping Data by Key using collections.defaultdict
Simplify grouping data or counting frequencies in Python with `collections.defaultdict`. It handles missing keys automatically, which keeps the code cleaner and more concise.
from collections import defaultdict
data = [
    {'category': 'fruit', 'item': 'apple'},
    {'category': 'vegetable', 'item': 'carrot'},
    {'category': 'fruit', 'item': 'banana'},
    {'category': 'vegetable', 'item': 'spinach'},
    {'category': 'fruit', 'item': 'orange'},
]
# Group items by category
grouped_data = defaultdict(list)
for entry in data:
    grouped_data[entry['category']].append(entry['item'])
print(f"Grouped data: {dict(grouped_data)}")
# Count occurrences
word_list = ["apple", "banana", "apple", "orange", "banana", "apple"]
word_counts = defaultdict(int)
for word in word_list:
    word_counts[word] += 1
print(f"Word counts: {dict(word_counts)}")
How it works: `collections.defaultdict` is a subclass of `dict` that calls a factory function (like `list` or `int`) to supply missing values. When you index a key that isn't present (`d[key]`), `defaultdict` calls the factory, stores the returned value under that key, and returns it. Only indexing triggers this: `key in d` and `d.get(key)` behave as they do on a plain `dict` and do not create an entry. This eliminates the need for explicit `if key not in dict:` checks, making code for grouping, counting, or building dictionaries with complex values much cleaner.
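To make the behavior described above concrete, here is a minimal sketch that groups the same kind of data with a plain `dict` and with a `defaultdict`, then probes which operations actually create an entry. The names `pairs`, `plain`, `grouped`, and `probe` are illustrative only and do not come from the snippet above.

```python
from collections import defaultdict

# Illustrative sample data (hypothetical, not from the snippet above).
pairs = [("fruit", "apple"), ("vegetable", "carrot"), ("fruit", "banana")]

# Plain dict: the missing-key check has to be written out.
plain = {}
for category, item in pairs:
    if category not in plain:
        plain[category] = []
    plain[category].append(item)

# defaultdict: indexing a missing key calls list() and stores the result.
grouped = defaultdict(list)
for category, item in pairs:
    grouped[category].append(item)

same_result = dict(grouped) == plain

# Only indexing (d[key]) triggers the factory; `in` and .get() do not.
probe = defaultdict(int)
has_key = "x" in probe            # membership test, no entry created
got = probe.get("x")              # .get() returns None, no entry created
keys_after_checks = list(probe)
value = probe["x"]                # indexing creates the entry with int() == 0
keys_after_index = list(probe)

print(same_result, has_key, got, keys_after_checks, value, keys_after_index)
# prints: True False None [] 0 ['x']
```

One consequence worth keeping in mind: reading with `d[key]` adds the key as a side effect, so once the grouping is done it can be safer to convert with `dict(grouped)` or to read with `.get()`.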