PYTHON
Group Items by Key Using collections.defaultdict
Discover how to simplify data grouping tasks in Python using collections.defaultdict, eliminating verbose key existence checks and streamlining code.
from collections import defaultdict
data = [
    {'name': 'Alice', 'city': 'New York'},
    {'name': 'Bob', 'city': 'London'},
    {'name': 'Charlie', 'city': 'New York'},
    {'name': 'David', 'city': 'Paris'},
    {'name': 'Eve', 'city': 'London'},
]
# Group people by city
people_by_city = defaultdict(list)
for item in data:
    people_by_city[item['city']].append(item['name'])
print(f"People grouped by city: {dict(people_by_city)}")
# Another example: counting words by first letter
words = ['apple', 'apricot', 'banana', 'berry', 'cat', 'dog']
words_by_initial = defaultdict(int)
for word in words:
    words_by_initial[word[0]] += 1
print(f"Word counts by initial letter: {dict(words_by_initial)}")
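How it works: `collections.defaultdict` is a subclass of the built-in `dict` that overrides one method, `__missing__`, and adds a `default_factory` attribute. The factory is a callable that takes no arguments and returns the default value for a new key: `list` yields an empty list and `int` yields `0`. When a missing key is looked up with `d[key]`, the `defaultdict` calls the factory, stores the result under that key, and returns it, so the lookup does not raise `KeyError` and the grouping loop needs no key-existence check. Only subscript access triggers this; `d.get(key)` and `key in d` leave the dictionary unchanged. The examples wrap the results in `dict(...)` only so the printed output shows a plain dictionary.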
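For comparison, here is a minimal sketch of the same grouping written with a plain `dict`, followed by a check of which operations create a missing key. The shortened `data` list and the names `plain`, `grouped`, and `probe` are illustrative and not part of the original snippet.

```python
from collections import defaultdict

# Shortened, illustrative version of the sample data
data = [
    {'name': 'Alice', 'city': 'New York'},
    {'name': 'Bob', 'city': 'London'},
    {'name': 'Charlie', 'city': 'New York'},
]

# Plain dict: every new key needs an explicit existence check
plain = {}
for item in data:
    if item['city'] not in plain:
        plain[item['city']] = []
    plain[item['city']].append(item['name'])

# defaultdict: the list factory supplies the empty list on first access
grouped = defaultdict(list)
for item in data:
    grouped[item['city']].append(item['name'])

same_result = plain == dict(grouped)

# Only subscript access calls the factory; get() and `in` do not
probe = defaultdict(list)
via_get = probe.get('Tokyo')   # no key is created here
has_key = 'Tokyo' in probe     # no key is created here either
size_before = len(probe)
_ = probe['Tokyo']             # this lookup inserts 'Tokyo' with an empty list
size_after = len(probe)

print(same_result, via_get, has_key, size_before, size_after)
```

One practical consequence: reading `d[key]` just to test for a key will silently add it, so membership tests are better written with `in` or `get()`.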