PYTHON

Grouping Data by Key with collections.defaultdict

Discover how collections.defaultdict simplifies grouping items by a common key without needing to check if the key already exists, making dictionary construction cleaner.

from collections import defaultdict

data = [
    {'name': 'Alice', 'city': 'New York'},
    {'name': 'Bob', 'city': 'London'},
    {'name': 'Charlie', 'city': 'New York'},
    {'name': 'David', 'city': 'London'},
    {'name': 'Eve', 'city': 'Paris'}
]

# Grouping people by city
grouped_by_city = defaultdict(list)
for person in data:
    grouped_by_city[person['city']].append(person['name'])

print(f"People grouped by city: {dict(grouped_by_city)}")

# defaultdict with int for counting
word_counts = defaultdict(int)
sentence = "the quick brown fox jumps over the lazy dog"
for word in sentence.split():
    word_counts[word] += 1
print(f"Word counts: {dict(word_counts)}")

How it works: collections.defaultdict is a subclass of dict that takes a factory function (e.g., list, int, set). When a missing key is looked up with square-bracket indexing, the factory is called with no arguments, and its result is stored under that key and returned. This removes the need for an if key in d: check before appending to a list or incrementing a count, which keeps grouping code shorter and less error-prone. Note that lookups through .get() or the in operator do not trigger the factory.
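
To make the comparison concrete, here is a minimal sketch that builds the same grouping twice, once with a plain dict and an explicit membership check, and once with defaultdict(list). The pairs list and the variable names are illustrative assumptions, not part of the snippet above. The last few lines show a side effect worth knowing: indexing a missing key inserts it.

```python
from collections import defaultdict

# Hypothetical sample data: (city, name) pairs.
pairs = [("New York", "Alice"), ("London", "Bob"), ("New York", "Charlie")]

# Plain dict: the explicit membership check that defaultdict removes.
plain = {}
for city, name in pairs:
    if city not in plain:
        plain[city] = []
    plain[city].append(name)

# defaultdict(list): the empty list is created on first indexed access.
grouped = defaultdict(list)
for city, name in pairs:
    grouped[city].append(name)

print(dict(grouped) == plain)   # True

# Caveat: indexing a missing key inserts it; `in` and .get() do not.
print("Paris" in grouped)       # False
print(grouped.get("Paris"))     # None
print(grouped["Paris"])         # []
print("Paris" in grouped)       # True
```

If reads must never add keys, one option is to convert the result with dict(grouped) once grouping is finished, as the print calls in the snippet above already do.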

