How to Find Large Files in Linux with the find and du Commands

In this guide we are going to look at how to find the largest files in Linux. If you have a Linux server hosting applications and it is running out of disk space, you will probably be asking yourself what is eating up all that space.

The Linux commands find and du will come to your rescue.

# Using the du command

The du command estimates file space usage on a Linux system. It reports how much disk space files and directories occupy.

Let's use du to check the contents of /boot. Run without arguments from inside /boot, it reports on the current directory:

# du
0   ./efi/EFI/centos
0   ./efi/EFI
0   ./efi
2400    ./grub2/i386-pc
3176    ./grub2/locale
2504    ./grub2/fonts
8096    ./grub2
4   ./grub
250556  .

The value on the left of each row is the disk usage (GNU du reports it in 1 KiB blocks by default), followed by the directory responsible for that usage. The bottom row is the total for the entire /boot directory.

Here is a list of important du options:

  • -h, --human-readable prints sizes in a human-readable format (for example 1K, 234M, 2G).
  • -s, --summarize prints only a total for each argument. It can be combined with -h to get a directory's total usage in a human-readable format.
  • -a, --all lists the sizes of all files, not just directories, under the given path. It can also be combined with -h.
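To see how these options change the output, here is a minimal sketch that runs them against a throwaway directory instead of a real system path. The directory layout and file names (logs/big.log, small.dat) are made up for illustration, and the exact sizes printed will depend on your filesystem.

```shell
# Build a small scratch tree so the options can be compared safely.
# The file names below are illustrative only.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/logs"

# Create a 2 MiB file and a 1 MiB file filled with zeros.
dd if=/dev/zero of="$demo_dir/logs/big.log" bs=1024 count=2048 2>/dev/null
dd if=/dev/zero of="$demo_dir/small.dat" bs=1024 count=1024 2>/dev/null

# -s: a single summary line for the whole tree; -h: human-readable units.
du -sh "$demo_dir"

# -a: list every file as well as every directory.
du -ah "$demo_dir"

# Count how many entries -a reports, then remove the scratch tree.
entries=$(du -a "$demo_dir" | wc -l)
rm -rf "$demo_dir"
```

With -s you get one line for the whole tree, while -a adds a line for each file on top of the per-directory lines.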

In our case, we want the file space usage with the largest entries first, so we sort the output with the sort command. To limit the number of results, the head command comes in handy.

Getting the largest files with du:

du -a / | sort -n -r | head -n 20

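The above command uses du to get the disk usage of every file and directory under /. The output is piped to sort, which orders it numerically (-n) in reverse (-r) so the largest entries come first, and head then prints only the first 20 lines.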


# du -a / | sort -n -r | head -n 20

du: cannot access ‘/proc/20633/task/20633/fd/3’: No such file or directory
du: cannot access ‘/proc/20633/task/20633/fdinfo/3’: No such file or directory
du: cannot access ‘/proc/20633/fd/4’: No such file or directory
du: cannot access ‘/proc/20633/fdinfo/4’: No such file or directory
43855216    /
38679004    /var
38558036    /var/log
38486524    /var/log/asterisk
18136900    /var/log/asterisk/
15786756    /var/log/asterisk/
2408012 /var/log/asterisk/cdr-custom
2105416 /usr
2101960 /var/log/asterisk/cdr-csv/Master.csv
2101960 /var/log/asterisk/cdr-csv
2074504 /var/log/asterisk/cdr-custom/Master.csv
1014364 /opt
1002432 /usr/lib
635300  /home/centos
635300  /home
634528  /home/centos/
490704  /tmp
480260  /tmp/pip.log
403512  /opt/instana/agent
403512  /opt/instana

Often you will get some errors before you get your list of large files. These are written to stderr and usually come from files that du is not allowed to read, or from short-lived entries under /proc that disappear while du is running. Append 2>/dev/null to the du command to discard them:

du -a / 2>/dev/null | sort -n -r | head -n 20
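The raw block counts are not easy to read at a glance. One possible variant, sketched here on the assumption that your sort supports -h (GNU coreutils does), prints human-readable sizes and still orders them correctly. The -x option keeps du on a single filesystem, so it does not wander into /proc or other mounts. The function name largest_entries is hypothetical and only there to make the pipeline reusable.

```shell
# largest_entries DIR [COUNT]
# Hypothetical helper: prints the COUNT largest files and directories
# under DIR (default: current directory, 20 entries).
largest_entries() {
    # -x: stay on one filesystem, -a: include files, -h: human-readable sizes.
    # sort -h understands suffixes such as K, M and G; -r puts the largest first.
    du -xah "${1:-.}" 2>/dev/null | sort -rh | head -n "${2:-20}"
}
```

For example, largest_entries /var 10 would list the ten largest entries under /var. Run it as root if you need to include directories your user cannot read.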

# Using the find command

You can use the find command to target only regular files and print the size of each, then use a combination of sort and head to narrow down the results.


find / -type f -printf '%s %p\n' | sort -nr | head -10

The above command searches for all regular files on the system, then prints the size in bytes and the path of each using the %s and %p directives of -printf. The result is piped to sort, where -n selects a numeric sort and -r reverses the order so the largest files come first, and head then limits the output to 10 results.


# find / -type f -printf '%s %p\n' | sort -nr | head -10
140737486266368 /proc/kcore
18595594900 /var/log/asterisk/
16179399327 /var/log/asterisk/
2146986543 /var/log/asterisk/cdr-csv/Master.csv
2114553859 /var/log/asterisk/cdr-custom/Master.csv
649754355 /home/centos/
484122304 /tmp/pip.log
309011589 /var/log/asterisk/cdr-custom/Simple.csv
141488931 /usr/lib/jvm/java-11-openjdk-
106075056 /usr/lib/locale/locale-archive
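Note that /proc/kcore tops the listing above even though it is a virtual file that does not occupy any disk space. A possible refinement, assuming GNU find, is to add -xdev so the search stays on one filesystem (which skips /proc) and -size to ignore files below a threshold. The function name big_files and its argument order are hypothetical, chosen only for this sketch.

```shell
# big_files DIR MIN_SIZE [COUNT]
# Hypothetical helper: lists the COUNT largest regular files under DIR
# that are bigger than MIN_SIZE (a find size such as 100M or 1G).
big_files() {
    # -xdev: do not descend into other filesystems such as /proc.
    # -size +N: only match files larger than N.
    # %s is the size in bytes and %p is the path.
    find "$1" -xdev -type f -size "+$2" -printf '%s %p\n' 2>/dev/null |
        sort -nr | head -n "${3:-10}"
}
```

For example, big_files / 100M would print up to ten files larger than 100 MiB on the root filesystem. Filtering by size first also means sort has far fewer lines to process.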

# Conclusion

In this guide, you learnt how to find the largest files and directories in Linux using du and find. We also learnt how to use the sort command to order the output and the head command to limit the results to the number we specified.

To check more on the commands we used, don’t hesitate to use the man pages. Use these commands:

man du  
man find  
man sort  
man head  
man tail
Last updated on Mar 20, 2024 17:19 +0300
Citizix Ltd