Researchers speed, optimize code with new open source tools

Author: Jay Lyman

High-performance software developers may be getting a leg up on the latest hardware advances with a new set of open source software tools for developing scientific libraries, created by U.S. university researchers. They claim the “new breed” of software they’ve created, dubbed “SPIRAL,” could revolutionize how computer code is written, particularly in light of the latest advances in high-performance hardware that is often, as in the case of IBM’s Blue Gene/L supercomputer, running Linux. The automatic code generator, which provides a broad range of solutions to identify optimal implementations of signal processing and math functions, produces high-quality code that is less buggy, saving testing time, Carnegie Mellon University professor and researcher Jose Moura told NewsForge/ITMJ in a recent interview.

He said the open source SPIRAL software, released under the GPL, addresses the time it takes software developers to fine-tune and update digital signal processing (DSP) algorithms and libraries to keep pace with hardware advancements from Intel, IBM, and other hardware companies.

“There are studies that show Moore’s Law and the increase in hardware capability is occurring,” he said. “Basically, the capability to develop software also is growing with time, but there’s a gap. The hardware’s on a higher slope than the software. As time goes on, the gap between them is getting bigger and bigger.”

Pumping out the platforms

Moura explained that hardware manufacturers such as Intel and IBM typically develop libraries of algorithms to be used in programs. These modules are used by programmers to process data and solve equations, but the process of producing the libraries — and the alternative of developing them separate from the big vendors — can be cumbersome.

“The vendors themselves want to provide high-performance libraries,” he said. “They have large teams of programmers. What happens is, as the new platforms come along, they need to re-develop their libraries.”

Citing the cost of employing several hundred programmers to produce libraries for an increasing number of platforms, Moura said it also takes time to shake the bugs out of the code, a process that the SPIRAL tools accelerate.

“The problem is, Intel and many others are putting out too many platforms,” he said. “It’s useful to shorten the cycles to produce good implementations of these libraries. That’s where these automatic tools come in. The code is high quality. And once the tool is updated for a new platform, you can, practically by pressing a button, optimize the libraries for the newer platform.”

Moura added that even when development starts from scratch, the process is still faster and better because SPIRAL automatically verifies the software it generates.

He explained that the SPIRAL tools express a broad set of algorithms using very small constructs, which are then assembled into the algorithms according to rules. Moura, who worked with fellow Carnegie Mellon professors Markus Pueschel and Maria Manuela Veloso with the school’s electrical and computer science department, said the software helps end users better utilize the potential of their computers. Although it resembles other code generators and libraries, such as FFTW, the SPIRAL software is differentiated by its wider applicability, according to Moura.
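The idea of building an algorithm from small constructs and rules can be illustrated with a minimal Python sketch. This is not SPIRAL's code or notation. The function names (`dft_base`, `dft_by_rule`), the "plan" tuples, and the choice of the well-known Cooley-Tukey factorization as the example rule are all assumptions made for illustration. A single rule splits a transform of size n = r * s into smaller transforms, and different choices of r yield different but mathematically equivalent algorithms, each of which is checked against the textbook definition.

```python
import cmath

def dft_base(x):
    """Direct O(n^2) DFT from the definition; the base case and the reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def dft_by_rule(x, plan):
    """Compute a DFT by recursively applying one breakdown rule.

    plan is None (use the base case) or a tuple (r, plan_r, plan_s),
    meaning: split n = r * s, use plan_s for the size-s transforms
    and plan_r for the size-r transforms. (Hypothetical plan format.)
    """
    n = len(x)
    if plan is None:
        return dft_base(x)
    r, plan_r, plan_s = plan
    if n % r != 0:
        raise ValueError("r must divide the transform size")
    s = n // r
    # Step 1: size-s transforms of the r strided subsequences.
    rows = [dft_by_rule(x[j1::r], plan_s) for j1 in range(r)]
    # Step 2: multiply by the twiddle factors w_n^(j1 * k2).
    for j1 in range(r):
        for k2 in range(s):
            rows[j1][k2] *= cmath.exp(-2j * cmath.pi * j1 * k2 / n)
    # Step 3: size-r transforms across the rows, written to the output.
    out = [0j] * n
    for k2 in range(s):
        col = dft_by_rule([rows[j1][k2] for j1 in range(r)], plan_r)
        for k1 in range(r):
            out[k2 + s * k1] = col[k1]
    return out

def max_error(a, b):
    """Largest element-wise difference between two result lists."""
    return max(abs(p - q) for p, q in zip(a, b))

# Three different ways to assemble a size-8 transform from the same rule.
PLANS = [
    (2, None, None),                 # 8 = 2 * 4
    (4, None, None),                 # 8 = 4 * 2
    (2, None, (2, None, None)),      # 8 = 2 * (2 * 2)
]

signal = [1, 2, 3, 4, 5, 6, 7, 8]
reference = dft_base(signal)
for plan in PLANS:
    # Each generated variant is verified against the definition.
    assert max_error(dft_by_rule(signal, plan), reference) < 1e-9
```

Every plan in `PLANS` produces the same transform, so a generator is free to choose among them on performance grounds alone.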

“Others like FFTW generate fast code for their particular algorithm,” he said. “Our tools cover a much broader set.”

Super software for the most super computer

The SPIRAL software team — rounded out by University of Illinois, Urbana-Champaign Professor David Padua and Drexel University Professor Jeremy Johnson and funded by DARPA and the National Science Foundation (NSF) — had a connection with IBM, which was looking to generate applications to show off its top-ranked supercomputer Blue Gene/L.

“They wanted high-performance software for the machine, and they thought rather than human programmers, why not use our tool,” Moura said. “We tailored it to the machine and produced the software. It was really a nice demonstration of application of the tool and IBM’s machine.”

Blue Gene Systems Architect Jose Moreira told us in an email that the fastest FFT library for Blue Gene/L, the world’s fastest computer, was generated using SPIRAL.

“Although SPIRAL is not the only alternative for FFT for Blue Gene/L, FFT is an important numerical kernel. A very fast implementation, as the one generated by SPIRAL, can significantly help Blue Gene/L applications.”

Moreira said SPIRAL does, in fact, represent a new generation of self-optimizing scientific libraries, and he emphasized the importance of its being open source.

“The fact that SPIRAL uses an automated approach to code optimization results in scientific libraries that can be highly optimized to each specific architecture, including Blue Gene/L,” he said. “It is very important to us that all potential IBM customers can have access to SPIRAL and the generated scientific libraries.”

Moreira called the automatic optimization of the SPIRAL tools “an important emerging software technology.”

“It can significantly reduce the time to produce optimized libraries for a new architecture, such as Blue Gene/L,” he said. “In that sense, it helps software development stay in sync with hardware development.”
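The general idea behind per-architecture automatic optimization can be sketched in a few lines of Python: run several equivalent implementations of a kernel on the target machine, discard any that give the wrong answer, and keep the fastest. This is only an illustration of that idea under stated assumptions, not a description of how SPIRAL works internally. The dot-product kernel and every name here (`dot_loop`, `dot_zip`, `dot_unrolled`, `pick_fastest`) are hypothetical.

```python
import timeit

def dot_loop(a, b):
    """Index-based loop."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total

def dot_zip(a, b):
    """Generator expression over paired elements."""
    return sum(x * y for x, y in zip(a, b))

def dot_unrolled(a, b):
    """Loop unrolled by two, with a cleanup step for odd lengths."""
    total = 0.0
    n = len(a)
    for i in range(0, n - 1, 2):
        total += a[i] * b[i] + a[i + 1] * b[i + 1]
    if n % 2:
        total += a[n - 1] * b[n - 1]
    return total

def pick_fastest(candidates, a, b, repeats=5, number=20):
    """Return the fastest candidate that agrees with the first one.

    The first candidate serves as the correctness reference; timings
    are measured on whatever machine runs this function.
    """
    reference = candidates[0](a, b)
    tolerance = 1e-9 * max(1.0, abs(reference))
    best, best_time = None, float("inf")
    for func in candidates:
        if abs(func(a, b) - reference) > tolerance:
            continue  # reject a variant that computes a different result
        elapsed = min(timeit.repeat(lambda: func(a, b),
                                    repeat=repeats, number=number))
        if elapsed < best_time:
            best, best_time = func, elapsed
    return best

CANDIDATES = [dot_loop, dot_zip, dot_unrolled]
a = [float(i) for i in range(1000)]
b = [float(1000 - i) for i in range(1000)]
winner = pick_fastest(CANDIDATES, a, b)
print("fastest on this machine:", winner.__name__)
```

Which variant wins depends on the machine and interpreter, which is the point: the selection is rerun for each new platform rather than decided once by hand.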

Automatic generation trend

Gartner analyst Thomas Murphy said SPIRAL comes at a time when many groups are working on code-generation schemes, widely viewed as the possible key to the next big breakthrough. This has not yet materialized, however. Murphy’s biggest question on SPIRAL was how broadly usable the open source software would be.

While Moura indicated the team is working to make SPIRAL more robust and more user friendly, and to extend it to more applications, Murphy also questioned the assumption that hardware is ahead of software. “Often, hardware has a hard time keeping up, just because the software applications are so complex,” he said.

The analyst added that there is only a narrow set of business applications that need such an optimized solution, but he also indicated the code-generation efforts were paving a path to the future.

“As new hardware becomes more capable, we’ll look at what kinds of things we can use it for and what kinds of problems can we solve,” he said. “We expect generating code and model-oriented development will push productivity and, more importantly, improve software quality.”