June 16, 2011

Things You Can't Do With a GUI: Finding Stuff on Linux

What's better, a graphical interface or the Linux command line? Both of them. They blend seamlessly on Linux so you don't have to choose. A good graphical user interface (GUI) has a logical, orderly flow, helps guide you to making the right command choices, and is reasonably fast and efficient. Since this describes a minority of all GUIs, I still live on the command line a lot. The CLI has three advantages: it's faster for many operations, it's scriptable, and it is many times more flexible. Linux's Unix heritage means you can string together commands in endless ways so they do exactly what you want.

Here is a collection of some of my favorite finding-things command line incantations.

File Operations

In graphical file managers like Dolphin and Nautilus you can right-click on a folder and click Properties to see how big it is. But even on my quad-core super-duper system it takes time, and for me it's faster to type the du or df commands than to open a file manager, navigate to a directory, and then pointy-clicky. How big is my home directory?

$ du -hs ~
748G    /home/carla

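How much space is left on my hard drive or drives? This particular incantation is one of my favorites because it uses egrep to keep only the header line and the lines that start with a device path, which filters out tmpfs and other virtual filesystems, and the -T option shows the filesystem types: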

$ df -hT | egrep -i "file|^/"
Filesystem    Type    Size  Used Avail Use% Mounted on
/dev/sda2     ext4     51G  3.6G   32G  11% /
/dev/sda3     ext4    136G  2.3G  127G   2% /home
/dev/sda1     ext3    244G  114G   70G  63% /home/carla/photoshare
/dev/sdb2     ext3     54G  5.8G   45G  12% /home/carla/music
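
If your df is the GNU coreutils version, a similar result is possible without grep by excluding filesystem types with -x. This is only a sketch: the function name real_disks is made up for illustration, and the two excluded types are an assumption that you may need to extend for your own system.

```shell
# Hypothetical helper: list mounted filesystems with their types,
# skipping RAM-backed pseudo-filesystems.
# Assumes GNU df, which supports -x to exclude a filesystem type.
real_disks() {
    df -hT -x tmpfs -x devtmpfs
}
```

Unlike the egrep version, this also keeps network mounts whose device name does not start with a slash.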

What files in the current directory were changed on a particular day, say May 22?

$ ls -lrt | awk '{print $6" "$7" "$9 }' | grep 'May 22' 

May 22 file_a.txt
May 22 file_b.txt

Using a simple grep search displays complete file information:

$ ls -lrt | grep 'May 22' 
-rw-r--r-- 1 carla carla 383244 May 22 20:21 file_a.txt
-rw-r--r-- 1 carla carla 395709 May 22 20:23 file_b.txt

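Or all files from a past year. This works because ls prints the year instead of the time for files older than about six months, though grep will also match 2006 anywhere else on the line, such as in a filename: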

ls -lR | grep 2006

Run complex commands one section at a time to see how they work; for example, start with ls -lrt, then ls -lrt | awk '{print $6" "$7" "$9 }', and so on. To avoid hassles with upper- and lower-case filenames, use grep -i for a case-insensitive search.
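
Grepping ls output depends on how ls formats its dates. As an alternative sketch, find can compare modification times directly. The function name modified_on is hypothetical, and the example assumes GNU find (for -newermt) and GNU date (for -d).

```shell
# Hypothetical helper: list regular files in a directory (not its
# subdirectories) that were last modified on the given calendar day.
# Assumes GNU find (-newermt) and GNU date (-d).
modified_on() {
    local dir=$1
    local day=$2    # for example 2011-05-22
    local next
    next=$(date -d "$day +1 day" +%Y-%m-%d)
    # Newer than midnight at the start of the day, not newer than the next midnight.
    find "$dir" -maxdepth 1 -type f -newermt "$day" ! -newermt "$next"
}
```

For example, modified_on . 2011-05-22 prints one path per line for the files changed on that date.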

Want to sort files by creation date? You can't in Linux, but you can in FreeBSD. Want to specify a different directory? Use ls -lrt directoryname.

Which files were changed in the last three minutes? This is a quick, slick way to see what changed after making changes to your system:

find / -mmin -3

You can also specify a time range. For example, what changed in the current directory between three and six minutes ago?

find . -mmin +3 -mmin -6

The dot means current directory.
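
If you use time ranges a lot, the two -mmin tests can be wrapped in a small shell function. This is a minimal sketch: the name changed_between and its argument order are my own invention, not anything standard.

```shell
# Hypothetical helper: files under a directory that were modified
# more than $2 minutes ago but less than $3 minutes ago.
changed_between() {
    local dir=$1 newest=$2 oldest=$3
    find "$dir" -type f -mmin +"$newest" -mmin -"$oldest"
}
```

Running changed_between . 3 6 is then equivalent to the find command above, restricted to regular files.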

Need to track down disk space hogs? This is probably one of the top ten tasks even in this era of terabyte hard drives. This lists the top five largest directories or files in the named directory, including the top level directory:

$ du -a directoryname | sort -nr | head -n 5
119216208	.
55389884	./photos
40650788	./Photos
37020884	./photos/2007
20188284	./carla

Omit the -a option to list only directories.
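
The sizes above are in du's default block units, which are not easy on the eyes. Assuming GNU du and a GNU sort recent enough to have the -h option, a sketch like this prints human-readable sizes and still sorts them correctly. The function name biggest is hypothetical.

```shell
# Hypothetical helper: the N largest files and directories under a
# directory, with human-readable sizes.
# Assumes GNU du (-h) and GNU sort (-h understands suffixes like K, M, G).
biggest() {
    local dir=${1:-.} count=${2:-5}
    du -ah "$dir" | sort -rh | head -n "$count"
}
```

A plain sort -nr would not work here, because it would rank 900K above 2G.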

Biggest Files

It is well worth getting acquainted with the find command because it can do everything except make good beer. This nifty incantation finds the five biggest files on your system, and sorts them from largest to smallest, in bytes:

# find / -type f -printf '%s %p\n' |sort -nr| head -5

1351655936 /home/carla/sda1/carla/.VirtualBox/Machines/ubuntu-hoary/Snapshots/{671041dd-700c-4506-68a8-7edfcd0e3c58}.vdi
1332959240 /home/carla/sda1/carla/51mix.wav
1061154816 /proc/kcore
962682880 /home/carla/sda1/Photos/2007-sept-montana/video_ts/vts_01_4.vob
962682880 /home/carla/sda1/photos/2007/2007-sept-montana/video_ts/vts_01_4.vob

You really don't need to include the /proc pseudo-filesystem, since it occupies no disk space. Use the wholename and prune options to exclude it:

find / -wholename '/proc' -prune -o -type f -printf '%s %p\n' |sort -nr| head -5

There is a potential gotcha, and that is that find will recurse into all mounted filesystems, including remote filesystems. If you don't want it to do this then add the -xdev option:

find / -xdev -wholename '/proc' -prune -o -type f -printf '%s %p\n' |sort -nr| head -5

Another potential gotcha with -xdev is that find will only search the filesystem that contains the starting directory, and no other filesystem mounts, not even local ones. So if your files are spread over multiple partitions or hard drives on one computer, and you want to search all of them, don't use -xdev. I'm sure there is a clever way to distinguish between local and remote filesystems, and when I figure it out I'll share it.
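
One candidate for telling local and remote filesystems apart is GNU find's -fstype test, which matches on the filesystem type and can be combined with -prune. The following is a sketch under assumptions, not a tested recipe for every system: the function name biggest_local_files is hypothetical, and the list of types to skip (nfs, nfs4, cifs, proc, sysfs) is only an example that you would adjust to whatever your own mounts use.

```shell
# Hypothetical helper: the N largest files under a starting directory,
# skipping network and virtual filesystems but still descending into
# other local mounts. Assumes GNU find (-fstype, -printf).
# The filesystem type list is an example, not a complete one.
biggest_local_files() {
    local start=${1:-/} count=${2:-5}
    find "$start" \( -fstype nfs -o -fstype nfs4 -o -fstype cifs \
                     -o -fstype proc -o -fstype sysfs \) -prune \
         -o -type f -printf '%s %p\n' 2>/dev/null |
        sort -nr | head -n "$count"
}
```

Run as root, biggest_local_files / 5 should behave like the /proc-pruning command above while also staying off network shares.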

Now let's string together a splendid find incantation to convert those large indigestible blobs of bytes into a nice readable format:

# find / -type f -print0| xargs -0 ls -s | sort -rn | awk '{size=$1/1024; printf("%dMb %s\n", size,$2);}' | head -5

1290Mb /home/carla/sda1/carla/.VirtualBox/Machines/ubuntu-hoary/Snapshots/{671041dd-700c-4506-68a8-7edfcd0e3c58}.vdi
1272Mb /home/carla/sda1/carla/51mix.wav
918Mb /home/carla/sda1/Photos/2007-sept-montana/video_ts/vts_01_4.vob
918Mb /home/carla/sda1/photos/2007/2007-sept-montana/video_ts/vts_01_4.vob
918Mb /home/carla/sda1/Photos/2007-sept-montana/video_ts/vts_01_1.vob
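
Two caveats about that pipeline: ls -s reports allocated disk blocks rather than the file's length, and printing only $2 cuts off any filename at its first space. Here is a hedged alternative sketch that takes the byte count from find itself and separates size and path with a tab. It assumes GNU find for -printf, the function name biggest_files_mib is made up, and filenames containing newlines would still confuse it.

```shell
# Hypothetical helper: the N largest files under a directory, with sizes
# shown in MiB. Assumes GNU find for -printf; %s is the size in bytes.
biggest_files_mib() {
    local start=${1:-.} count=${2:-5}
    find "$start" -type f -printf '%s\t%p\n' 2>/dev/null |
        sort -nr | head -n "$count" |
        awk -F'\t' '{ printf "%.1f MiB\t%s\n", $1 / 1048576, $2 }'
}
```

Because awk splits on tabs only, a path such as "My Videos/holiday 2007.vob" comes through intact.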

Yes, I know, you can do many of these things in graphical search applications. To me they are slow and clunky, and it's a lot faster to replay searches from my Bash history, or copy them from my cheat sheet. I even have some aliased in Bash. For example, I use that last long find incantation a lot, so I have it aliased to find5 in my .bashrc:

alias find5="find / -wholename '/proc' -prune -o -wholename '/sys' -prune -o -type f -print0 | xargs -0 ls -s | sort -rn | awk '{size=\$1/1024; printf(\"%dMb %s\\n\", size, \$2);}' | head -5"

In this example I have excluded both the /proc and the /sys directories.

Fast Locate Command

The locate command is very fast because it searches a prebuilt database of all of your filenames instead of walking the disk. The database needs updating periodically, and many distros do this automatically. To update it manually simply run the updatedb command as root. locate and grep are powerful together. For example, find all .jpg files with 1024 in their names, which is handy if you record image widths in your filenames:

locate '*.jpg' | grep 1024

Search for image files in three different formats for an application:

locate claws-mail | grep -iE "\.(jpg|gif|ico)$"

Well here we are at the end already! Thanks for reading, and please consult the fine man pages for these commands to learn what the different options mean.
