← Back to all snippets
PYTHON

Efficiently Find N Smallest/Largest Elements with heapq

Learn to use Python's heapq module to efficiently manage priority queues, making it easy to find the N smallest or largest elements in a collection without full sorting.

import heapq

def find_n_smallest(data, n):
    """
    Finds the N smallest elements in a list using heapq.nsmallest.
    For largest, use heapq.nlargest.
    """
    if not isinstance(data, list):
        return []
    if not isinstance(n, int) or n < 0:
        raise ValueError("N must be a non-negative integer.")
    
    # heapq.nsmallest returns a list of the n smallest elements
    # It's more efficient than sorting the entire list if n is much smaller than len(data)
    return heapq.nsmallest(n, data)

def find_n_largest(data, n):
    """
    Finds the N largest elements in a list using heapq.nlargest.
    """
    if not isinstance(data, list):
        return []
    if not isinstance(n, int) or n < 0:
        raise ValueError("N must be a non-negative integer.")
        
    return heapq.nlargest(n, data)

# Example usage:
numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
n_val = 3

smallest_three = find_n_smallest(numbers, n_val)
largest_three = find_n_largest(numbers, n_val)

# print(f"Numbers: {numbers}")
# print(f"{n_val} Smallest: {smallest_three}") # Expected: [1, 1, 2]
# print(f"{n_val} Largest: {largest_three}")   # Expected: [9, 6, 5]

# For a custom priority, use a key function (like sorted or min/max)
people = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': 24},
    {'name': 'Charlie', 'age': 35},
    {'name': 'David', 'age': 28}
]

youngest_two = heapq.nsmallest(2, people, key=lambda p: p['age'])
# print(f"Youngest two people: {youngest_two}")
How it works: This snippet demonstrates the use of the `heapq` module for efficient retrieval of the N smallest or largest elements from a collection. Unlike sorting the entire list (which takes O(N log N) time), `heapq.nsmallest` and `heapq.nlargest` perform these operations in O(M log N) time, where M is the total number of elements and N is the number of elements to retrieve, making them significantly faster when N is much smaller than M. This is achieved by internally managing a min-heap or max-heap. It's ideal for priority queue implementations or when you only need a subset of ranked data.

Need help integrating this into your project?

Our team of expert developers can help you build your custom application from scratch.

Hire DigitalCodeLabs