PYTHON

Grouping Iterables by Key with `itertools.groupby`

Learn to group consecutive elements that share a key using `itertools.groupby` in Python, a handy tool for data aggregation and structured processing once the data is sorted by that key.

from itertools import groupby

# Grouping by a simple attribute (requires sorted data)
data = [
    {"city": "New York", "name": "Alice"},
    {"city": "London", "name": "Bob"},
    {"city": "New York", "name": "Charlie"},
    {"city": "London", "name": "David"},
    {"city": "New York", "name": "Eve"},
]

# For groupby to work correctly, the iterable must be sorted by the key
data.sort(key=lambda x: x["city"])
print(f"Sorted data: {data}\n")

grouped_data = {}
for key, group in groupby(data, key=lambda x: x["city"]):
    grouped_data[key] = list(group)

print(f"Grouped by city (using groupby): {grouped_data}")

# Example: Grouping file paths by extension
file_paths = ["doc1.txt", "image.png", "report.pdf", "doc2.txt", "logo.png"]
file_paths.sort(key=lambda x: x.split(".")[-1])  # Sort by extension
print(f"Sorted file paths: {file_paths}\n")

grouped_by_extension = {}
for extension, files in groupby(file_paths, key=lambda x: x.split(".")[-1]):
    grouped_by_extension[extension] = list(files)

print(f"Grouped by extension: {grouped_by_extension}")
How it works: `itertools.groupby` groups consecutive elements of an iterable that share the same key. It returns an iterator that yields `(key, group_iterator)` pairs. A crucial point is that `groupby` only groups *consecutive* elements: it starts a new group every time the key changes. Therefore, the input iterable must first be sorted by the grouping key (using the same key function) so that all items with the same key are adjacent. Each group is itself a lazy iterator over the underlying data, which is why the snippet calls `list(...)` on it before moving on to the next group. This snippet demonstrates grouping a list of dictionaries by a common key and grouping file paths by their extensions after sorting.
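To make the "consecutive only" behaviour concrete, here is a minimal sketch that runs `groupby` on the same list twice, once unsorted and once sorted. The `words` list and the `first_letter` helper are hypothetical names introduced only for this illustration.

```python
from itertools import groupby

# Hypothetical sample data, chosen only for illustration
words = ["apple", "avocado", "banana", "apricot", "blueberry"]


def first_letter(word):
    """Key function: group words by their first character."""
    return word[0]


# Without sorting, a new group starts every time the key changes,
# so "a" and "b" each appear more than once.
unsorted_groups = [
    (key, list(group)) for key, group in groupby(words, key=first_letter)
]
print(unsorted_groups)
# [('a', ['apple', 'avocado']), ('b', ['banana']), ('a', ['apricot']), ('b', ['blueberry'])]

# Sorting by the same key first makes equal keys adjacent,
# so each key yields exactly one group.
sorted_groups = [
    (key, list(group))
    for key, group in groupby(sorted(words, key=first_letter), key=first_letter)
]
print(sorted_groups)
# [('a', ['apple', 'avocado', 'apricot']), ('b', ['banana', 'blueberry'])]
```

Because `sorted` is stable, items within each group keep their original relative order.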
