BASH
Efficiently Find Large Files in a Directory and Subdirectories
Locate and list the largest files within a specified directory and its subdirectories, sorted by size. Useful for disk space management and web asset optimization.
#!/bin/bash
# Default directory to search
SEARCH_DIR="." # Current directory
# Default number of top files to display
TOP_N=10
# Check for custom directory argument
if [ -n "$1" ]; then
    SEARCH_DIR="$1"
fi
# Check for custom number of files argument
if [ -n "$2" ]; then
    TOP_N="$2"
fi
# Validate if SEARCH_DIR exists
if [ ! -d "$SEARCH_DIR" ]; then
    echo "Error: Directory '$SEARCH_DIR' not found."
    exit 1
fi
echo "Searching for top $TOP_N largest files in '$SEARCH_DIR'..."
echo "----------------------------------------------------"
# Find files, sort by size, and display the top N
# find "$SEARCH_DIR" -type f: find files in the specified directory
# -print0: print full file names on the standard output, followed by a null character
# xargs -0 du -h: read null-separated items and execute du -h for each
# du -h: display human-readable disk usage
# sort -rh: sort by human-readable numbers, reverse order (largest first)
# head -n "$TOP_N": display only the top N lines
find "$SEARCH_DIR" -type f -print0 | xargs -0 du -h | sort -rh | head -n "$TOP_N"

# Check find's own exit status via PIPESTATUS; a plain $? here would only
# reflect head, the last command in the pipeline
if [ "${PIPESTATUS[0]}" -ne 0 ]; then
    echo "An error occurred during file search. Check permissions or path."
    exit 1
fi
echo "----------------------------------------------------"
How it works: This script identifies and lists the largest files within a specified directory and its subdirectories. It uses `find` to locate all files, pipes their null-separated paths to `xargs`, which runs `du -h` (disk usage in human-readable format) on them, then sorts the results in reverse human-readable order (`sort -rh`) so the largest files come first. Finally, `head -n "$TOP_N"` displays only the top N results, aiding in disk space management. Note that `du` reports allocated disk usage in whole filesystem blocks, which can differ slightly from a file's exact byte count.
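The pipeline at the heart of the script can be exercised on its own. A minimal sketch using a throwaway directory (the file names and sizes here are purely illustrative):

```shell
# Build a scratch directory with files of known sizes (illustrative names)
tmpdir=$(mktemp -d)
head -c 50000 /dev/zero > "$tmpdir/big.bin"     # ~50 KB
head -c 100   /dev/zero > "$tmpdir/small.bin"   # 100 bytes
mkdir "$tmpdir/sub"
head -c 2000  /dev/zero > "$tmpdir/sub/mid.bin" # ~2 KB

# Same pipeline as the script: largest files first, top 2 shown
# big.bin should appear on the first line
find "$tmpdir" -type f -print0 | xargs -0 du -h | sort -rh | head -n 2

rm -rf "$tmpdir"
```

Saved under a name of your choosing, say `largest-files.sh` (hypothetical), and made executable with `chmod +x`, the script itself would then be invoked as `./largest-files.sh /var/www 5` to list the five largest files under `/var/www`.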