PYTHON

Fetch All Data from a Paginated API Endpoint

Learn how to systematically retrieve all available data from a paginated API by iterating through multiple pages until no more results are returned, ensuring complete data collection.

import requests
import time

def fetch_all_paginated_data(base_url, api_key=None, page_param='page', per_page_param='per_page', initial_page=1, page_size=100):
    all_results = []
    current_page = initial_page
    headers = {}
    if api_key:
        headers['Authorization'] = f'Bearer {api_key}'

    while True:
        params = {page_param: current_page, per_page_param: page_size}
        print(f"Fetching page {current_page} from {base_url} with params {params}")
        try:
            response = requests.get(base_url, headers=headers, params=params, timeout=10)
            response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)
            data = response.json()

            # Adjust this logic based on your API's pagination structure
            # Common patterns: 'results' key, 'data' key, or direct array
            # Also, check for 'next' link, 'total_pages', 'has_more', etc.
            # A bare JSON array has no .get(), so handle lists before dicts
            if isinstance(data, list):
                items = data
            else:
                items = data.get('results') or data.get('data') or []

            if not items:
                break  # No more items, exit loop

            all_results.extend(items)
            
            # If API provides a 'next' link, use that directly
            # next_link = data.get('next_page_url')
            # if next_link:
            #     base_url = next_link # Update URL for next request
            # else:
            #     break # No more pages

            # If pagination is purely by page number:
            # If fewer items than page_size are returned, it's likely the last page
            if len(items) < page_size:
                break 

            current_page += 1
            time.sleep(0.1) # Be kind to the API, add a small delay

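        except requests.exceptions.RequestException as e:
            # Stop on the first failure; results collected so far are still returned
            print(f"Error fetching page {current_page}: {e}. Returning partial results.")
            break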
    
    return all_results

# Example usage:
API_PAGINATED_URL = 'https://api.example.com/products'
MY_API_KEY = 'your_secret_api_key'

print("Starting to fetch all products...")
all_products = fetch_all_paginated_data(
    API_PAGINATED_URL,
    api_key=MY_API_KEY,
    page_param='page',  # Must be a page-number parameter: the function advances it by 1 per request
    per_page_param='limit',  # Some APIs use 'limit', others 'per_page' or 'page_size'
    page_size=50
)

print(f"Total products fetched: {len(all_products)}")
# print("First 5 products:", all_products[:5])

How it works: This Python snippet demonstrates a common pattern for fetching all data from an API that paginates by page number. A `while` loop requests one page at a time and increments the page number after each request. The loop stops when the API returns no items, or fewer items than `page_size`, either of which indicates the last page. The names of the page and per-page query parameters are configurable, and an optional API key is sent as a Bearer token in the `Authorization` header. If a request fails (network error, timeout, or a 4xx/5xx status), the function prints the error and returns whatever it has collected so far, so callers that need complete data should treat a printed error as an incomplete result. A 0.1-second `time.sleep` between requests keeps the loop from overwhelming the API.
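
Some APIs paginate by `offset` and `limit` instead of a page number. In that scheme the offset has to advance by the number of items received, not by 1, so the page-number loop does not carry over unchanged. The following is a minimal sketch of that variant, not part of the original snippet. It assumes the endpoint accepts `offset` and `limit` query parameters and returns either a bare JSON list or an object with the items under `results`. The names `fetch_all_by_offset`, `get_json`, and `max_requests` are hypothetical and chosen for illustration.

```python
def fetch_all_by_offset(url, limit=100, get_json=None, max_requests=1000):
    """Collect every item from an offset/limit paginated endpoint.

    get_json is a hypothetical hook: a callable (url, params) -> parsed JSON.
    It defaults to a plain requests.get call and exists so the loop can be
    exercised without a network connection.
    """
    if get_json is None:
        # Imported here so the loop itself has no hard dependency on requests
        import requests

        def get_json(u, params):
            response = requests.get(u, params=params, timeout=10)
            response.raise_for_status()
            return response.json()

    items = []
    offset = 0
    for _ in range(max_requests):  # hard cap guards against an endless loop
        page = get_json(url, {'offset': offset, 'limit': limit})
        # Assumption: the payload is a bare list, or a dict with a 'results' key
        batch = page if isinstance(page, list) else page.get('results', [])
        if not batch:
            break  # empty page, nothing left to fetch
        items.extend(batch)
        if len(batch) < limit:
            break  # short page, so this was the last one
        offset += len(batch)  # advance by items received, not by 1
    return items
```

Passing a stub for `get_json` makes it possible to check the stopping conditions locally before pointing the function at a real endpoint. Unlike the page-number version, this sketch lets request errors propagate to the caller instead of returning partial data.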
