PYTHON
Efficiently Find N Smallest or Largest Items with heapq
Master Python's `heapq` module to quickly retrieve the N smallest or largest elements from a collection without fully sorting, ideal for leaderboards or top-N analysis.
import heapq
data = [10, 3, 7, 1, 9, 2, 8, 4, 6, 5]
salaries = [50000, 60000, 30000, 120000, 80000, 40000, 95000]
# Find the 3 smallest items
smallest_3 = heapq.nsmallest(3, data)
print(f"3 smallest items from data: {smallest_3}")
# Find the 2 largest salaries
largest_2_salaries = heapq.nlargest(2, salaries)
print(f"2 largest salaries: {largest_2_salaries}")
# Find items based on a key function (e.g., objects by attribute)
class Task:
    def __init__(self, name, priority):
        self.name = name
        self.priority = priority

    def __repr__(self):
        return f"Task('{self.name}', {self.priority})"

tasks = [
    Task('Deploy Backend', 5),
    Task('Fix UI Bug', 2),
    Task('Write Docs', 8),
    Task('Refactor Auth', 3),
    Task('Optimize DB', 1),
]
# Find the 3 highest priority tasks (smallest priority value)
highest_priority_tasks = heapq.nsmallest(3, tasks, key=lambda t: t.priority)
print(f"3 highest priority tasks: {highest_priority_tasks}")
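How it works: `heapq.nsmallest(n, iterable, key=None)` and `heapq.nlargest(n, iterable, key=None)` return the N smallest or largest items of any iterable as a new list that is already ordered (ascending for `nsmallest`, descending for `nlargest`). Instead of sorting the whole collection, they only keep track of the N best candidates seen so far, which costs roughly O(M log N) comparisons for M items, compared with O(M log M) for a full sort. The optional `key` function works as it does in `sorted()`, so objects can be ranked by an attribute such as `priority`. The advantage is greatest when N is small relative to the collection: for N = 1 the built-in `min()` and `max()` are faster, and when N approaches the size of the collection, `sorted()` with a slice is usually the better choice. Typical uses include leaderboards, finding the slowest or fastest requests, and picking the highest-priority tasks.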
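As a quick sanity check, the sketch below compares the `heapq` helpers against the equivalent full sort and shows the N = 1 and generator cases. The `scores` and `words` data are hypothetical and exist only for illustration.

```python
import heapq
import random

# Hypothetical data, seeded so the run is repeatable.
random.seed(0)
scores = [random.randint(0, 1000) for _ in range(10_000)]

n = 5

# nlargest/nsmallest give the same result as sorting and slicing.
top_heap = heapq.nlargest(n, scores)
assert top_heap == sorted(scores, reverse=True)[:n]

low_heap = heapq.nsmallest(n, scores)
assert low_heap == sorted(scores)[:n]

# The key argument behaves like the key argument of sorted().
words = ["kiwi", "fig", "banana", "apple", "date"]
longest_2 = heapq.nlargest(2, words, key=len)
assert longest_2 == sorted(words, key=len, reverse=True)[:2]

# For a single item, min() and max() are the simpler choice.
assert heapq.nlargest(1, scores)[0] == max(scores)

# Any iterable works, including a generator that is consumed only once.
smallest_squares = heapq.nsmallest(3, (x * x for x in range(-5, 6)))
print(smallest_squares)  # [0, 1, 1]
```

If N turns out to be a large fraction of the collection, swapping the `heapq` call for the `sorted(...)[:n]` expression used in the assertions is a drop-in change, since both produce the same list.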