July 18, 2008

Use xfs_fsr to keep your XFS filesystem optimal

Author: Ben Martin

The XFS filesystem is known to give good performance when storing and accessing large files. The design of XFS is extent-based, meaning that the bytes that comprise a file's contents are stored in one or more contiguous regions called extents. Depending on your usage patterns, some of the files contained in an XFS filesystem can become fragmented. You can use the xfs_fsr utility to defragment these files, thus improving system performance when it accesses them.

When you copy a file onto an XFS filesystem, you usually end up with a file that has one extent that contains the entire contents of the file. If you later extend the file or overwrite its contents with new data, the disk area immediately after the file might already be in use, so the file might be split into two extents at different locations on disk. Of course, the applications accessing files do not need to worry about this; they can just read the contents from start to end and lseek(2) around in the file as though it were a linear range of bytes. There is, however, a performance penalty for storing a file's contents scattered over the disk in many extents.

You can use the xfs_bmap utility to see the extent map for a file that is stored on an XFS filesystem. If you run it with the -v (verbose) option, you can see the mapping of file offsets to blocks in the filesystem. In the case of the file shown below, I was unlucky; the filesystem split the 300MB tarball over two extents.

# xfs_bmap -v sarubackup-june2008.tar.bz2
sarubackup-june2008.tar.bz2:
 EXT: FILE-OFFSET         BLOCK-RANGE           AG  AG-OFFSET             TOTAL
   0: [0..350175]:        264463064..264813239  10  (2319064..2669239)    350176
   1: [350176..615327]:   265280272..265545423  10  (3136272..3401423)    265152
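If you want to check many files, reading extent maps by eye gets tedious. The following is a minimal sketch of a helper that counts extents in xfs_bmap -v output; the function name count_extents is my own, and it assumes that extent rows begin with an index and a colon, and that holes in sparse files appear with the word "hole" in the block-range column. The sample input is the listing above.

```shell
# Count the extent rows in `xfs_bmap -v` output read from stdin.
# Extent rows start with an index followed by a colon, e.g. "0: [0..350175]: ...".
# Assumption: rows describing holes carry "hole" in the third column and are
# skipped, because a hole occupies no space on disk.
count_extents() {
    awk '$1 ~ /^[0-9]+:$/ && $3 != "hole" { n++ } END { print n + 0 }'
}

# Sample input copied from the listing above. On a real system you would run:
#   xfs_bmap -v sarubackup-june2008.tar.bz2 | count_extents
count_extents <<'EOF'
sarubackup-june2008.tar.bz2:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..350175]: 264463064..264813239 10 (2319064..2669239) 350176
1: [350176..615327]: 265280272..265545423 10 (3136272..3401423) 265152
EOF
```

For the tarball above this prints 2; a file stored in one piece would give 1.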

If you want to see what the fragmentation is like for the whole filesystem, use the xfs_db utility. The -r option tells xfs_db to operate in read-only mode, which lets you use it on a mounted and in-use filesystem, and is probably a good idea anyway unless you really want to modify the filesystem. The utility's frag command causes disk activity for a number of seconds and then reports the fragmentation of the filesystem, as shown below.

# xfs_db -r /dev/mapper/raid2008-largepartition2008
xfs_db> frag
actual 117578, ideal 116929, fragmentation factor 0.55%
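The figures in this report are consistent with the fragmentation factor being the share of extents that exceed the ideal count, that is (actual - ideal) / actual. That formula is inferred from the numbers shown here rather than quoted from the xfs_db documentation, so treat the sketch below as an illustration; frag_factor is a name I made up for it.

```shell
# Fragmentation factor as a percentage: (actual - ideal) / actual * 100.
# Arguments: actual extent count, ideal extent count.
# Assumption: this matches what `xfs_db ... frag` reports; it reproduces
# the 0.55% figure from the output above.
frag_factor() {
    awk -v actual="$1" -v ideal="$2" 'BEGIN {
        if (actual <= 0) { print "0.00"; exit }
        printf "%.2f\n", (actual - ideal) * 100 / actual
    }'
}

frag_factor 117578 116929
```

Here 649 of the 117578 extents are surplus, which works out to 0.55 percent. A low figure like this suggests most files are already stored contiguously.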

The xfs_fsr(8) program is contained in the xfsdump package in Fedora 9 and in Debian-based distributions. This is a real shame, as xfs_fsr is an extremely useful tool, and placing it in xfsdump makes it a great deal less likely to be installed and used than it would be if it were placed in the xfsprogs package along with mkfs.xfs. xfs_fsr is a filesystem reorganizer, designed to be run regularly from a cron job to defragment XFS filesystems while they are mounted.

You can run xfs_fsr in two ways: either pass it a duration, in which case it loops through all your XFS filesystems and attempts to optimize the most fragmented files on each one until that duration has passed, or explicitly name a specific XFS filesystem, or a file on an XFS filesystem, to defragment. When you run xfs_fsr with a duration and it runs out of time, it stores information about what it was doing in a file in /var/tmp so that it can continue from the same point the next time it is executed with a duration. This way you can have a cron job perform a little bit of optimization every day when your machine is experiencing a period of low activity.

To optimize a file, xfs_fsr creates a new copy of an existing fragmented file with fewer extents (fragments) than the original one had. Once the file contents are copied to the new file, the filesystem metadata is updated so that the new file replaces the old one. This implies that you need to have enough free space on the filesystem to store another copy of anything that you want to defragment. The free space issue extends to disk quotas as well; you cannot defragment a file if storing another complete copy of that file would exceed the disk quota of the user that owns that file.
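Before defragmenting a large file by hand, you might want to confirm that the filesystem can hold a second copy of it. The following is a rough sketch of such a check; the function name has_room_for_copy is hypothetical, and the check looks only at free space reported by df, not at disk quotas.

```shell
# Succeed if the filesystem holding "$1" has at least as much free space
# as the file's own size, which is what a second full copy would need.
# Note: quotas are NOT checked here; this only compares against df output.
has_room_for_copy() {
    file=$1
    # File size in KB, rounded up.
    size_kb=$(( ($(wc -c < "$file") + 1023) / 1024 ))
    # Column 4 of the second line of `df -Pk` is the available space in KB.
    avail_kb=$(df -Pk "$file" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$size_kb" ]
}

# Example use:
#   has_room_for_copy sarubackup-june2008.tar.bz2 && xfs_fsr sarubackup-june2008.tar.bz2
```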

Because xfs_fsr by default defragments all your XFS filesystems when you give it a duration, a few subtle issues can, very rarely, arise. If you are using a boot loader like LILO that relies on its configuration file being at a fixed location on disk, xfs_fsr might break it by moving the file to defragment it. For such cases you can use the xfs_io command to flag specific files or directories with a special no-defrag flag so that xfs_fsr will never attempt to defragment them. If you mark a directory as no-defrag, files and directories created in that directory will inherit the no-defrag flag. See the xfs_fsr manual page for information about the no-defrag flag and how to set it.
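As a rough illustration of setting the flag, the sketch below wraps the xfs_io invocation in a small helper. It assumes that the no-defrag flag is set with xfs_io's chattr +f command; check xfs_io(8) and xfs_fsr(8) on your system before relying on it. The helper name mark_no_defrag, the DRY_RUN variable, and the /boot path are my own choices for the example, and by default the helper only prints the command it would run.

```shell
# Print (dry run, the default) or execute the xfs_io command that sets the
# no-defrag flag on a file or directory.
# ASSUMPTION: "chattr +f" is the xfs_io spelling of the no-defrag flag;
# confirm against the manual pages on your system.
mark_no_defrag() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf 'xfs_io -c "chattr +f" %s\n' "$1"
    else
        xfs_io -c 'chattr +f' "$1"
    fi
}

# Hypothetical target, shown as a dry run only.
mark_no_defrag /boot
```

Set DRY_RUN=0 in the environment to actually run the command once you are happy with what it prints.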

Because the sarubackup-june2008.tar.bz2 file shown in the xfs_bmap output above contains two extents, we can use it to show the invocation of xfs_fsr explicitly on a file on an active XFS filesystem. Note that after running xfs_fsr below there is only a single extent used to store this file.

# xfs_bmap -v sarubackup-june2008.tar.bz2
sarubackup-june2008.tar.bz2:
 EXT: FILE-OFFSET         BLOCK-RANGE           AG  AG-OFFSET             TOTAL
   0: [0..350175]:        264463064..264813239  10  (2319064..2669239)    350176
   1: [350176..615327]:   265280272..265545423  10  (3136272..3401423)    265152

# md5sum sarubackup-june2008.tar.bz2
123b9db92b31bea5f60835920dee88d5

# xfs_fsr sarubackup-june2008.tar.bz2

... 300MB file, takes a few seconds of grunting on the RAID ...

# xfs_bmap -v sarubackup-june2008.tar.bz2
sarubackup-june2008.tar.bz2:
 EXT: FILE-OFFSET         BLOCK-RANGE           AG  AG-OFFSET             TOTAL
   0: [0..615327]:        267173832..267789159  10  (5029832..5645159)    615328

# md5sum sarubackup-june2008.tar.bz2
123b9db92b31bea5f60835920dee88d5
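The before-and-after md5sum comparison above can be folded into a small wrapper if you plan to defragment files by hand regularly. This is only a sketch: the name defrag_and_verify and the DEFRAG_CMD override are hypothetical, the latter added so that the wrapper can be tried on a machine without the XFS tools installed.

```shell
# Run a defragmenter on a file and confirm that its MD5 checksum is unchanged.
# The command defaults to xfs_fsr; DEFRAG_CMD is a hypothetical override.
# Returns 0 if the checksums match, 1 if they differ, 2 if a step failed.
defrag_and_verify() {
    file=$1
    before=$(md5sum < "$file") || return 2
    ${DEFRAG_CMD:-xfs_fsr} "$file" || return 2
    after=$(md5sum < "$file") || return 2
    [ "$before" = "$after" ]
}

# Example use:
#   defrag_and_verify sarubackup-june2008.tar.bz2 && echo "contents unchanged"
```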

To run xfs_fsr regularly from cron, you can simply invoke it without any arguments, perhaps redirecting its output so that you do not get regular email from it. The only parameter that you are likely to want to use is -t, which specifies how long (in seconds) you would like xfs_fsr to run. The default is 7200 (two hours); for a desktop machine you might like to make it six hours and schedule it during your regular sleep time, as shown below:

# cd /root
# mkdir -p mycron
# cd mycron
# vi xfs-fsr.cron
30 0 * * * /root/mycron/xfs-fsr.sh
# vi xfs-fsr.sh
#!/bin/sh
/usr/sbin/xfs_fsr -t 21600 >/dev/null 2>&1
# chmod +x xfs-fsr.sh
# cat *.cron >|newtab
# crontab newtab

It is a shame that Linux distributions tuck this utility away with filesystem dump and restore tools rather than installing it as prominently as mkfs.xfs, perhaps in the same xfsprogs package. If you have been running an XFS filesystem for a few years and do not know about the xfs_fsr utility, you could get improved filesystem performance by running it over your system a few times.
