July 6, 2005

An introduction to Delta Debugging

Author: Alessandro Giusti

Every developer knows that debugging -- the process of finding and fixing defects in program code -- is an important process. Often, the effort devoted to it outweighs the cost of all the other software development phases. Debugging is unpredictable, since a single bug can hold clueless developers hostage for long time. And unfortunately, debugging has always been a mostly manual affair -- but that's about to change, with the advent of Delta Debugging.

Today, the only step in the debugging process that typically benefits from even partial automation is automated testing, which allows developers to detect the existence of bugs. Debugger tools like gdb also help developers during debugging, but they are just passive tools for inspecting and manipulating program execution.

Delta Debugging, developed by Professor Andreas Zeller's group at Saarland University in Germany, is an automatic and systematic technique that deals with isolation of the source of the failure -- often the developer's longest, most demotivating, and most repetitive activity when carried out manually.

Think of a program that crashes every time you feed it the specific input file foo.txt. The programmer needs to know what's wrong with foo.txt, and usually starts the boring mechanical process of leaving out different portions of it and blindly retrying the execution until the program runs successfully. Eventually this trial-and-error procedure allows the programmer to isolate the source of the failure.

Delta Debugging deals with this automatically. Given the failure-causing input file, it iteratively executes new tests with original, carefully crafted input files, ultimately returning two similar files, of which only one induces the failure.

This innovative technique has proved itself in a number of real-world settings. For example, it helped to simplify a Mozilla bug where printing a very big Web page caused a crash; it turned out that a single select tag was a direct cause of the problem.

This automated failure induction alone is an exciting new opportunity, and, if implemented and well-integrated in IDEs, could save bug hunters a lot of time. However, Delta Debugging goes beyond that.

When integrated in a testing framework for interactive software, Delta Debugging can deal with user actions such as mouse clicks and keystrokes just as easily as it handles input files. This has great potential to simplify bloated bug reports, which open source bug tracking systems are full of. Developers are much happier if their bug-testing test cases are minimized. Stating that a failure occurs when confirming a dialog with a specific set of inserted values and selected checkboxes is not as helpful as a Delta Debugging algorithm that automatically pinpoints that just one check box is the source of the failure, while other options are irrelevant.

Delta Debugging can also be applied directly to the code. If a specific release or revision works flawlessly but a more recent one shows a specific misbehavior, the source code for the two revisions can be the starting point for an application of the Delta Debugging algorithm. With a procedure similar to the one presented before, Delta Debugging iteratively runs different combinations of the code modifications, until a single difference is highlighted as failure-inducing. Since 1999, when Delta Debugging was born, this approach has helped to fix an integration problem between ddd and two successive releases of gdb, comprising tens of thousands of changed lines.

A third application of Delta Debugging deals with multithreaded software -- programs that run in multiple concurrent instances, possibly performing different tasks, in a shared environment. In this complex setting, bugs often show up only when specific scheduling solutions are adopted -- for example, when some statement of a thread is executed after some other statement of the other thread; and since scheduling is not managed by the program itself, and is usually non-deterministic, failures happen in an intermittent, unrepeatable way -- a nightmare for developers! The problem has already been dealt with, and tools like DejaVU allow Java programs to be executed with a repeatable scheduling. Delta Debugging can take a scheduling that causes the bug to manifest, and one where it's latent. Following the exact same algorithm presented before, but now dealing not with input or code, but with scheduling solutions, Delta Debugging teams up with DejaVU and eventually pinpoints where the danger in the schedule lurks.

All these different applications of Delta Debugging are helpful for developers, but implementations published until now have been more proofs-of-concept than actual usable tools, except in some specific situations. For instance, an Eclipse plugin is available for simplifying failure-inducing input, but its installation on a recent release is not straightforward at all. Shiny new Eclipse plugins are promised for the next months, implementing Delta Debugging on code changes.

The most visible and important incarnation of Delta Debugging so far is AskIgor, a "public debugging server," which Delta Debugging applied to program states.

When a program is running, it is characterized by an evolving state, comprising the values of its variables and data structures. At any given moment of execution, the state (or, at least, a relevant part of it) can be captured and inspected using a debugger. If we have two executions, one of which fails, at any given moment we can take snapshots of their states; no doubt they will have differences. Applying Delta Debugging to these differences, we are able to determine the subset of the state (usually a small set of variables) that is relevant for the failure. This is precious information for developers, especially if the process is repeated at different moments of the two parallel executions.

To use AskIgor, you need a Linux i386 binary compiled with debugging symbols. You recreate the environment where the program will run by uploading an archive of files. Then, you just need to specify a passing and a failing invocation, and start the process. The server runs your software in a sandbox, guesses what the failure symptoms are (you can correct it later if the guess is incorrect), and applies the Delta Debugging procedure to your program's state. Some minutes later, it displays the results in an elegant HTML report. It lists the failure-inducing variables at different moments (usually, important functions' entry points), which represents a cause-effect chain that ultimately leads to the failure. Checkboxes allow you to ask for a more detailed explanation of a specific cause-effect step. Eventually, you will be able to zoom right to the instructions where unintended things start to happen.

The developers of AskIgor report a number of case studies that convey the power of this approach. One of them is about a specific GCC revision that crashed while compiling a simple, correct C program. Applying the procedure to GCC led to a fast diagnosis.

Also on the AskIgor site is a guided example with a simple flawed program and a detailed explanation of how AskIgor can help in the debugging process.

The software that powers AskIgor, which implements the Delta Debugging technique applied to program state, is available for download as open source. In coming months, expect to see more tools powered by this technology. Project leader Andreas Zeller is also writing a book on debugging, which also deals with Delta Debugging, planned for fall 2005.

Overall, Delta Debugging is an extremely interesting approach, with solid theoretical foundations. Real-world applications are still in their infancy, but they should reach a wide audience in the coming months.

Click Here!