PYTHON
Manage Priority Queues and Find K-Smallest/Largest with `heapq`
Master Python's `heapq` module to efficiently implement min-heaps, create priority queues, and quickly find the smallest or largest N elements in a collection.
import heapq
# Example 1: Basic Min-Heap (Priority Queue)
# heapq functions operate on regular Python lists to maintain heap property
min_heap = []
heapq.heappush(min_heap, 3)
heapq.heappush(min_heap, 1)
heapq.heappush(min_heap, 4)
heapq.heappush(min_heap, 1)
heapq.heappush(min_heap, 5)
print(f"Min-heap after pushes: {min_heap}") # Note: list itself isn't sorted, only heap property holds
# Pop smallest element
smallest = heapq.heappop(min_heap)
print(f"Popped smallest element: {smallest}")
print(f"Min-heap after pop: {min_heap}")
# Pop next smallest
next_smallest = heapq.heappop(min_heap)
print(f"Popped next smallest element: {next_smallest}")
# Example 2: Finding the K largest elements
data = [1, 8, 2, 23, 7, -4, 18, 5, 23, 42, 37, 2]
k = 3
# Using nlargest to find the 3 largest elements
largest_k = heapq.nlargest(k, data)
print(f"Original data: {data}")
print(f"The {k} largest elements: {largest_k}")
# Using nsmallest to find the 3 smallest elements
smallest_k = heapq.nsmallest(k, data)
print(f"The {k} smallest elements: {smallest_k}")
# Example 3: Using a custom key for comparison (e.g., tuples)
items = [(1, 'apple'), (3, 'cherry'), (2, 'banana'), (0, 'date')]
# Find the 2 items with the largest 'priority' (first element of tuple)
largest_priority_items = heapq.nlargest(2, items)
print(f"Items with 2 largest priorities: {largest_priority_items}")
How it works: The `heapq` module in Python provides an implementation of the heap queue algorithm, also known as the priority queue algorithm. It uses a regular Python list to maintain the heap invariant: `heap[k] <= heap[2*k + 1]` and `heap[k] <= heap[2*k + 2]` for all `k`. This ensures that the smallest element is always at index 0. Functions like `heappush` and `heappop` maintain this invariant efficiently. `nlargest` and `nsmallest` are convenient for finding the k-largest or k-smallest elements without fully sorting the entire collection, which is highly efficient for large datasets.