Last week we took a look at the df command, which is for getting information on filesystems. A related command is the du command, which tells you how much disk space your files and directories use. I like the du -- disk usage -- command because it is faster and more powerful than clanking around with clumsy graphical file managers. It's familiar to Linux oldtimers, but perhaps not so familiar to newer Linux users. It tells you the size of a single file:
$ du -h grundfos-321.pdf 2.6M grundfos-321.pdf
The -h option tells it to use nice human-readable numbers like 2.6M, instead of 2648. This works for directories, too. This example adds the -s option to display the total size for the directory and everything in it:
$ du -sh z-audacity 307M z-audacity
If you omit -s then it prints the sizes of all subdirectories, and a grand total. To see the sizes of all files and subdirectories, use -a:
$ du -ah z-audacity 4.0K z-audacity/chapter-schedule 900K z-audacity/futurans_1.3-1_all.deb 72K z-audacity/jack-howto_files/patchbay1socket.png 4.0K z-audacity/jack-howto_files/book.css
Our data grow as fast as our storage media
This particular du incantation is an old favorite for finding the top ten biggest files in a directory. This example finds the ten largest files in my data directory:
$ du -a storage | sort -nr | head -n 10 105367464 . 61717620 ./archives 36011036 ./archives/Music 31444216 ./unsorted-pics 31384836 ./unsorted-pics/Pictures 23290472 ./archives/Pictures 20403728 ./archives/Music/Terry-albums 13239772 ./archives/Pictures/2009-4-7 11878452 ./camcorder 7805196 ./unsorted-pics/Pictures/yarrow-june-201
You can query multiple directories at once:
$ du -sh unsorted-pics receipt_files archives 30G unsorted-pics 396K receipt_files 59G archives
Use wildcards to select subdirectories. The first example only queries the second level directories, the second example queries the second and third level directories, and the third example skips over the second level:
$ du -sh unsorted-pics/* 3.4M unsorted-pics/albums 34M unsorted-pics/bratgrrl-pics 1.8M unsorted-pics/coast-album
$ du -sh unsorted-pics/*/* 8.0K unsorted-pics/albums/index.html 1.8M unsorted-pics/albums/outdoors 264K unsorted-pics/bratgrrl-pics/2eaglessmall.jpg
$ du -sh unsorted-pics/../* 8.0K unsorted-pics/../5-foss-groupware-servers.html 8.0K unsorted-pics/../ardour1.html 2.5M unsorted-pics/../Bow_Violin.pdf
You can query the current directory for particular filetypes:
$ du -h *.jpeg *.png 32K figure1.jpeg 128K figure2.jpeg 8.0K figure3.png
Use the -S option to get the size of a directory, without including its subdirectories:
$ du -Ssh ~ 17M /home/carla
One of my favorite options is -x, which keeps du in the local filesystem. My home directory includes remote mounts and links to directories on other filesystems. Compare the results of the two examples-- the first example tallies up everything in my homedir, including remote mounts and linked directories, while the second example only adds up files on the local filesystem:
$ du -sh ~ 1.9T /home/carla
$ du -sxh ~ 6.8G /home/carla
So, when you have a conglomeration like this, how do you know where your files are actually stored? With the df command, of course. As is the case with so many of these little old Linux commands, there are many more things du can do, and you will find these in man du.



Exclusive 


Comments
Subscribe to Comments Feeddjf Said:
Very good - and so was the df
woo Said:
very useful post, here i found out about the "-x" option of "du". :)
Ken Jennings Said:
Finding big things is only part of the battle. After several backups from different computers to the file server there can be a a lot of duplication of small things. It would be nice if there was a common command that easily identifies identical files wherever they occur (or even if named differently), and be able to delete all the redundant copies. Merging multiple similar directory trees with variable collections of files would be cool too -- retain only the final complete version that has all the files while deleting all the rest. I read about something called fslint which may do something along these lines, but it doesn't appear to be built for my distro.
saf1 Said:
thanks for the very useful information
Amit Said:
TYVM ,very useful information.
Pieter Said:
I use FSlint and it's great at finding duplicates (even when their names are differrent) and then creates hard-links betwwen them. According to its website it's written in python, so you should be able to download and run the source code on any distro with python installed. Check its website here http://www.pixelbeat.org/fslint/.
Zilvis Said:
get ncdu - U wont regret!
Alex Borrell Said:
old command, new tricks. Thanks