August 3, 2012

Stresslinux Torture-Tests Your Hardware

Stresslinux is a lean, mean torture machine with 750MB of hardware-pummeling goodness for probing and load-testing your computer's hardware. Why, you ask, would anyone want to torture their nice hardware? Perhaps "torture" isn't the best word; think load-testing to expose defects, "burning in" a new machine, or to figure out some limits for overclocking.

Stresslinux runs from external bootable media: CD, USB stick, PXE boot, or you can run the VMWare image. My favorite is a USB stick because it is fast. There are good instructions for creating your chosen boot medium.

When you boot up, you have the option to allow sl-wizard to probe your system for sensors and then load the appropriate drivers (see the screen shot, below).

Probing system sensors at boot.

You can re-run this anytime after boot by deleting /tmp/sensors and then running the script as root, like this:

stress@stresslinux:~>;sudo rm /tmp/sensors

Exploring Stresslinux

This is not your ordinary stripped-down Linux. It was built with SUSE Studio, and is based on OpenSUSE. If you like playing with test builds there are a ton of 'em.

Stresslinux uses Busybox in place of the usual coreutils, fileutils, and other standard Linux commands. Busybox is a single stripped-down binary containing several dozen commands, and it uses the ash shell, so you may find that some of your favorite options are missing. The Busybox command reference should help you.

Stresslinux uses the Fn keys in an interesting way. There are 6 ordinary ttys on F1-F6, and it boots to tty1 on F1. The rest are normal login ttys. On US keyboards you can switch between these with ALT+Fn. (STRG+Fn on German keyboards, which is the same as CTRL+Fn.) F10 displays eth0 throughput (see above), F11 shows hard disk temperatures, and F12 displays lm-sensors readings.

Stresslinux screen shot

That Was Fun, Now What?

Now that Stresslinux is booted up and you have gazed upon your eth0 and sensor outputs, what's next? Let's spend some time with the stress command, because that is a good general-purpose workload generator. It operates by siccing a bunch of hogs on your system. This simple invocation puts a light load on the CPU, I/O, memory, and hard drive:

> stress --cpu 8 --io 4 --vm 2 --hdd 4 --timeout 30s --verbose
stress: info: [16275] dispatching hogs: 8 cpu, 4 io, 2 vm, 4 hdd
stress: dbug: [16275] using backoff sleep of 54000us                                             
stress: dbug: [16275] setting timeout to 10s
stress: dbug: [16275] --> hogcpu worker 8 [16276] forked
stress: dbug: [16275] --> hogio worker 4 [16277] forked                                          
stress: dbug: [16275] --> hogvm worker 2 [16278] forked
stress: dbug: [16275] --> hoghdd worker 4 [16279] forked
stress: dbug: [16275] using backoff sleep of 42000us
stress: dbug: [16275] setting timeout to 10s
stress: info: [16275] successful run completed in 13s

That snippet shows I wasn't kidding about the hogs. When it finishes with "successful run completed" that means there were no errors. If it did detect errors, it would either try to tell you what they were, or tell you to examine the syslog. Let's walk through this so we know what it's doing.

--cpu 8 tells it to fork 8 processes. Each process calculates the square root of a random number (by calling the sqrt() and rand functions) in a loop that stops at the end of your timeout, or when you stop it with CTRL+c. Are there any geezers out there who remember who said "Computer. Compute to the last digit the value of pi?" And why? This is similar, a way to keep the CPU constantly busy.

--io 4 forks 4 processes that call the sync() function in a loop. sync() flushes any data buffered in memory to disk. Most Linux filesystems use delayed allocation; that is, data are held in memory for a period of time before being written to disk. This speeds up performance because disk I/O is slower than RAM. Running this one by itself and trying out different values will give you an idea of your I/O performance.

--vm 2 thrashes your RAM by forking 2 processes to allocate and release memory. (Looping malloc() and free().)

--hdd 4 pummels your hard drive with writes, by calling the write function in a loop.

--timeout 30s tells stress to stop after 30 seconds. Or whatever time you want, of course, using s,m,h,d,y (seconds, minutes, hours, days, years). Always set a timeout, because this is your protection from the system locking up and becoming inaccessible. stress runs in userspace and can't cause any damage, but it would be sad to have to reboot to stop it.

--verbose makes it spit out many lines telling what it is doing.

The example above is pretty wimpy and won't stress out anything made after 1990. You can run one test at a time, for example. This puts a much larger load on your CPU:

> stress --cpu 2000  --timeout 30s --verbose

Just for fun, follow along with top, and press 1 to see all cores individually:

top - 18:58:17 up  8:56,  9 users,  load average: 419.86, 226.40, 88.02
Tasks: 2207 total, 1449 running, 756 sleeping,   0 stopped,   2 zombie
Cpu0  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 98.1%us,  1.6%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu2  : 97.7%us,  2.3%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st

When your stress run is finished, notice the difference in the Tasks statistics and load averages.

You can control the load on your memory with the --vm-bytes option. Suppose you have 8GB memory, which is not unusual in these here modern times. Try this:

> stress --vm 2 --vm-bytes 6G --timeout 30s --verbose

Be careful with this because thrashing your memory can make your system hang. The vm option allocates and releases memory; the vm-hang simulates low memory conditions by having each hog process go to sleep, rather than releasing memory. This example hogs 3 gigabytes of memory:

> stress --vm 2 --vm-bytes 3G --vm-hang --timeout 60s

What happens when your system is I/O bound? Try this:

> stress --io 8 --timeout 2m

Test your hard drive by writing a large file to disk:

> stress --hdd 1 --timeout 5m

The default file size is 1GB, and you can specify any size with the --hdd-bytes option, for example --hdd-bytes 5G writes a 5 gigabyte file.

stress is tidy and cleans up after itself, so you shouldn't have to worry about leftover hogs running amok. The documentation on stress isn't exactly lavish, and the most complete help is in info stress.

Other Commands

Visit the Stresslinux software page to see all of its system-testing commands. It includes reliable standbys like CPU burnmemtestbonnie++lshw (list system hardware, my fave), tiobench, and smartmontools. There are a lot of nice Linux bootable system rescue distros that do this and that and everything, but I think Stresslinux is tops for a good specialized distro.