PYTHON

Grouping Iterables by Key with `itertools.groupby`

Learn to group consecutive elements that share a key using `itertools.groupby` in Python, a useful tool for data aggregation and structured processing.

from itertools import groupby

# Grouping by a simple attribute (requires sorted data)
data = [
    {"city": "New York", "name": "Alice"},
    {"city": "London", "name": "Bob"},
    {"city": "New York", "name": "Charlie"},
    {"city": "London", "name": "David"},
    {"city": "New York", "name": "Eve"},
]

# For groupby to work correctly, the iterable must be sorted by the key
data.sort(key=lambda x: x["city"])
print(f"Sorted data: {data}\n")

grouped_data = {}
for key, group in groupby(data, key=lambda x: x["city"]):
    grouped_data[key] = list(group)

print(f"Grouped by city (using groupby): {grouped_data}")

# Example: Grouping file paths by extension
file_paths = ["doc1.txt", "image.png", "report.pdf", "doc2.txt", "logo.png"]
file_paths.sort(key=lambda x: x.split(".")[-1])  # Sort by extension
print(f"Sorted file paths: {file_paths}\n")

grouped_by_extension = {}
for extension, files in groupby(file_paths, key=lambda x: x.split(".")[-1]):
    grouped_by_extension[extension] = list(files)

print(f"Grouped by extension: {grouped_by_extension}")

How it works: `itertools.groupby` groups consecutive elements of an iterable that share the same key. It returns an iterator that yields pairs: `(key, group_iterator)`. Because `groupby` starts a new group every time the key changes, items with the same key that are not adjacent end up in separate groups. The input must therefore be sorted by the same key function used for grouping, so that all items with the same key sit next to each other. This snippet demonstrates grouping a list of dictionaries by a shared field and grouping file paths by extension, sorting each one first.
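
To make the "consecutive only" behavior concrete, here is a minimal sketch that runs `groupby` on the same values before and after sorting. The `cities` list and the names `unsorted_runs` and `sorted_runs` are illustrative assumptions, not part of the snippet above.

```python
from itertools import groupby

# Illustrative input: the same city values as above, in unsorted order
cities = ["New York", "London", "New York", "London", "New York"]

# Without sorting, every run of equal adjacent values becomes its own group,
# so the same key can show up several times.
unsorted_runs = [(key, list(group)) for key, group in groupby(cities)]
print(len(unsorted_runs))  # 5 groups, one per element

# After sorting, equal values are adjacent and each key appears once.
sorted_runs = [(key, list(group)) for key, group in groupby(sorted(cities))]
print([key for key, _ in sorted_runs])  # ['London', 'New York']
```

Each group is materialized with `list(group)` inside the loop because the group iterator is only valid until `groupby` advances to the next key.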
