← Back to all snippets
BASH

Implement Retry with Exponential Backoff in Bash

Build a robust retry mechanism with exponential backoff in Bash scripts, ideal for handling transient network failures or unavailable services when executing critical commands or API calls.

#!/bin/bash

MAX_RETRIES=5
BASE_DELAY=1 # seconds

# Function to execute a command with retry and exponential backoff
function run_with_backoff {
  local cmd="$@"
  local attempt=1
  local delay="$BASE_DELAY"

  while [ $attempt -le $MAX_RETRIES ]; do
    echo "Attempt $attempt: Running command \"$cmd\"..."
    if eval "$cmd"; then
      echo "Command succeeded on attempt $attempt."
      return 0
    else
      echo "Command failed on attempt $attempt."
      if [ $attempt -lt $MAX_RETRIES ]; then
        echo "Retrying in $delay seconds..."
        sleep "$delay"
        delay=$((delay * 2)) # Exponential backoff
        attempt=$((attempt + 1))
      else
        echo "Max retries reached. Command failed permanently." >&2
        return 1
      fi
    fi
  done
}

# Example usage:
# Replace 'false' with your actual command that might fail temporarily
# For example: run_with_backoff curl -sSf https://api.example.com/health

echo "
--- Testing successful command ---"
run_with_backoff echo "This command will succeed."

echo "
--- Testing command that fails ---"
run_with_backoff bash -c 'exit $(( RANDOM % 2 ))' # Fails randomly

echo "
--- Testing command that always fails ---"
run_with_backoff false

if run_with_backoff ping -c 1 -W 1 non-existent-host.example.com; then
  echo "Ping succeeded."
else
  echo "Ping ultimately failed."
fi
How it works: This script implements a generic retry mechanism with exponential backoff. The `run_with_backoff` function attempts to execute a given command up to `MAX_RETRIES` times. If the command fails (returns a non-zero exit status), the script waits for a `delay` period before retrying. This `delay` doubles with each subsequent attempt, providing an exponential backoff strategy that reduces load on potentially struggling services and gives them time to recover. This is vital for robust automation when dealing with transient errors.

Need help integrating this into your project?

Our team of expert developers can help you build your custom application from scratch.

Hire DigitalCodeLabs