PYTHON
Simplify Data Grouping with `collections.defaultdict`
Group and aggregate data in Python with `collections.defaultdict`, which creates a default value (such as an empty list, an empty set, or `0`) the first time a missing key is accessed, so you avoid `KeyError` and manual key-existence checks.
from collections import defaultdict
transactions = [
    {'item': 'Laptop', 'category': 'Electronics', 'amount': 1200},
    {'item': 'Keyboard', 'category': 'Electronics', 'amount': 75},
    {'item': 'Book A', 'category': 'Books', 'amount': 25},
    {'item': 'Mouse', 'category': 'Electronics', 'amount': 30},
    {'item': 'Book B', 'category': 'Books', 'amount': 40},
]
# Group transactions by category
grouped_by_category = defaultdict(list)
for transaction in transactions:
    grouped_by_category[transaction['category']].append(transaction)
print("Grouped Transactions:")
for category, items in grouped_by_category.items():
    print(f"- {category}: {[item['item'] for item in items]}")
# Count items using defaultdict (alternative to Counter for simple counts)
word_counts = defaultdict(int)
sentence = "the quick brown fox jumps over the lazy dog the quick brown fox"
words = sentence.split()
for word in words:
    word_counts[word] += 1
print(f"\nWord Counts: {dict(word_counts)}")
How it works: `collections.defaultdict` is a subclass of `dict` that calls a factory function to supply a value for a missing key. This lets you group data without checking whether a key exists before appending to or updating its value. For example, `defaultdict(list)` creates a new empty list the first time a missing key is looked up with `d[key]`, so you can append to a category right away. Similarly, `defaultdict(int)` starts each count at `0`, because calling `int()` with no arguments returns `0`.
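The same pattern works with `set` as the factory when each group should hold unique values. The sketch below uses made-up `(category, tag)` pairs (the data and variable names are illustrative assumptions, not part of the example above). It also shows a detail worth knowing: only indexing with `d[key]` calls the factory, while `in` and `.get()` leave the dictionary unchanged.

```python
from collections import defaultdict

# Hypothetical sample data: (category, tag) pairs, including a duplicate
pairs = [
    ("Electronics", "sale"),
    ("Books", "new"),
    ("Electronics", "sale"),
    ("Electronics", "new"),
]

# Each missing category starts as an empty set, so duplicates collapse
tags_by_category = defaultdict(set)
for category, tag in pairs:
    tags_by_category[category].add(tag)

# Membership tests and .get() do not call the factory or add keys
has_toys = "Toys" in tags_by_category
toys_via_get = tags_by_category.get("Toys")
keys_before = sorted(tags_by_category)

# Indexing a missing key calls the factory and stores the new value
toys_via_index = tags_by_category["Toys"]
keys_after = sorted(tags_by_category)

print(keys_before)
print(keys_after)
```

Because a plain lookup such as `tags_by_category["Toys"]` inserts the key, it is usually safer to use `in` or `.get()` when you only want to check for a key without changing the dictionary.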