BASH
Efficiently Find Large Files in a Directory and Subdirectories
Locate and list the largest files within a specified directory and its subdirectories, sorted by size. Useful for disk space management and web asset optimization.
#!/bin/bash
# Default directory to search
SEARCH_DIR="." # Current directory
# Default number of top files to display
TOP_N=10
# Check for custom directory argument
if [ -n "$1" ]; then
    SEARCH_DIR="$1"
fi
# Check for custom number of files argument
if [ -n "$2" ]; then
    TOP_N="$2"
fi
# Validate if SEARCH_DIR exists
if [ ! -d "$SEARCH_DIR" ]; then
    echo "Error: Directory '$SEARCH_DIR' not found."
    exit 1
fi
echo "Searching for top $TOP_N largest files in '$SEARCH_DIR'..."
echo "----------------------------------------------------"
# Find files, sort by size, and display the top N
# find "$SEARCH_DIR" -type f: find files in the specified directory
# -print0: print full file names on the standard output, followed by a null character
# xargs -0 du -h: read null-separated items and execute du -h for each
# du -h: display human-readable disk usage
# sort -rh: sort by human-readable numbers, reverse order (largest first)
# head -n "$TOP_N": display only the top N lines
find "$SEARCH_DIR" -type f -print0 | xargs -0 du -h | sort -rh | head -n "$TOP_N"

# Check find's own exit status via PIPESTATUS; a plain $? here would only
# reflect head, the last command in the pipeline
if [ "${PIPESTATUS[0]}" -ne 0 ]; then
    echo "An error occurred during file search. Check permissions or path."
    exit 1
fi
echo "----------------------------------------------------"
How it works: This script identifies and lists the largest files within a specified directory and its subdirectories. It uses `find` to locate all files, pipes their null-separated paths to `xargs`, which runs `du -h` (disk usage in human-readable format) on them, then sorts the results in reverse human-readable order (`sort -rh`) so the largest files come first. Finally, `head -n "$TOP_N"` displays only the top N results, aiding in disk space management. Note that `du` reports allocated disk usage in whole filesystem blocks, which can differ slightly from a file's exact byte count.
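The pipeline at the heart of the script can be exercised on its own. A minimal sketch using a throwaway directory (the file names and sizes here are purely illustrative):

```shell
# Build a scratch directory with files of known sizes (illustrative names)
tmpdir=$(mktemp -d)
head -c 50000 /dev/zero > "$tmpdir/big.bin"     # ~50 KB
head -c 100   /dev/zero > "$tmpdir/small.bin"   # 100 bytes
mkdir "$tmpdir/sub"
head -c 2000  /dev/zero > "$tmpdir/sub/mid.bin" # ~2 KB

# Same pipeline as the script: largest files first, top 2 shown
# big.bin should appear on the first line
find "$tmpdir" -type f -print0 | xargs -0 du -h | sort -rh | head -n 2

rm -rf "$tmpdir"
```

Saved under a name of your choosing, say `largest-files.sh` (hypothetical), and made executable with `chmod +x`, the script itself would then be invoked as `./largest-files.sh /var/www 5` to list the five largest files under `/var/www`.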