Linux cluster goes Orbital

Author: Tina Gasperson

Orbital Sciences senior engineer Heather Holst and her
colleagues are smart. That’s to be expected, since
they work for a company that designs and tests
rockets. This highly intelligent group of people
created makeshift clusters in order to perform a
demanding simulation technique called Computational
Fluid Dynamics (CFD). There was only one problem.

CFD looks at fluid flow, which happens continuously
all around us as we breathe, drink, move, and work.
It also happens whenever machines and systems do
what they were built to do. If a machine’s design is
faulty, fluid flow can negatively affect its
performance.

CFD applications use enormous amounts of computing
power to translate fluid volumes into algebraic
equations which can be solved in order to predict,
for example, maximum rocket performance or fuel
efficiency.

The more computing power available, the more
accurate the solutions to those equations. The
reverse is also true: you can’t get a very clear
picture of fluid flow effects without lots of
memory. And if you try to push the numbers through
inadequate hardware, you’ll bog down very quickly.

This is what was happening at Orbital. Each engineer
was running his own Sun box, according to Holst.
When they started doing CFD, they’d create their
own mini-clusters by logging in to each other’s
machines and “paralleling it out.”

“The parallel solver goes in, pulls up your grid,
and parcels it out into four sections. It solves the
equations, then goes back to the main computers and
reports the results,” said Holst.
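
To make Holst's description concrete, here is a minimal Python sketch of the same idea: split a grid into four sections, solve each one separately, then gather the results in the original order. It is an illustration only, not Fluent's solver or Orbital's setup. The names `split_grid`, `solve_section`, and `parallel_solve` are hypothetical, and the "solve" step is a placeholder formula rather than real fluid equations.

```python
from concurrent.futures import ThreadPoolExecutor


def split_grid(cells, parts=4):
    """Split a list of cell values into `parts` contiguous, near-equal sections."""
    base, extra = divmod(len(cells), parts)
    sections, start = [], 0
    for i in range(parts):
        # The first `extra` sections take one additional cell each.
        size = base + (1 if i < extra else 0)
        sections.append(cells[start:start + size])
        start += size
    return sections


def solve_section(section):
    """Placeholder for the per-section solve (assumption: not real CFD math)."""
    return [2.0 * value + 1.0 for value in section]


def parallel_solve(cells, parts=4):
    """Parcel the grid out to workers, then report results back in order."""
    sections = split_grid(cells, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # pool.map preserves the order of the input sections.
        partial_results = list(pool.map(solve_section, sections))
    return [value for part in partial_results for value in part]


grid = [float(i) for i in range(10)]
result = parallel_solve(grid)
```

In this toy version the sections never talk to each other. A real parallel solver typically has to exchange boundary values between neighboring sections as it iterates, and when the workers sit on separate machines that traffic crosses the network, which is where the transmission rate starts to matter.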

It seemed like the perfect solution, but there was a
huge snag. “You are slowed down by the transmission
rate,” she said. “We were getting some very
difficult problems [to solve] and our parceling was
very inefficient.”

Things went from bad to worse. Several projects came
up all at once, and Holst and colleagues realized
that the make-do Sun cluster was not going to make
do much longer. They had to go to management and ask
for extra money to subcontract the CFD processing
out to another company.

The higher-ups weren’t thrilled about handing over
the bucks, but at that point there wasn’t much
choice. They contracted with Fluent, which
solved the equations with greater computational
power and handed back the results.

Even though outsourcing was much better than a
bogged-down homemade cluster, it didn’t take long
for Orbital to decide that subbing out was
inefficient, problematic, and more expensive than
the company wanted.

“Sometimes it’s very difficult to sub out because
with certain CFD problems there are restrictions on
who can handle it,” Holst said. The confidential
nature of some defense-related projects, for
instance, can require that only U.S. citizens do
the work, or that computations run only on computers
that are not and never will be connected to the
Internet. So “we
decided to cut out the middleman,” Holst said.

Orbital began to consider investing in its own
high-power cluster. Fluent was using big Linux Networx
clusters to do computations for Orbital and its
other clients. A Linux cluster would be horizontally
scalable, able to expand as Orbital’s business grew.

“We looked at some Sun clusters, but finally decided to go with Linux Networx because of the lower costs involved,” Holst said. So in October 2003, Orbital bought and installed one of Linux Networx’ “Evolocity” clusters, equipped with 24 Intel Xeon processors and Fluent 6.1 CFD software.

Vince Allen, aerodynamics manager at Orbital, said the jump in performance since the migration to Linux is phenomenal: they’ve been able to run problems 30 times faster than with the old Sun system, and the need to outsource projects has dropped to almost nothing.

“[It] allows us to run bigger problems than ever before and more numerous design variations on smaller cases, allowing us to refine our analytical predictions to levels that were not attainable at Orbital before,” Allen said.

Orbital engineers knew that switching to a Linux cluster would let them sail through huge, complex CFD problems. One surprise benefit, however, has been increased efficiency on small projects. Engineers have found that results are more refined and accuracy levels have soared, even on everyday calculations.

Holst agreed that the new system has been running smoothly. “When we first started using the cluster, there were some issues with heat. We just didn’t have the HVAC capacity to keep the processors cool enough. Once we got that fixed, really there have been no problems that I know of,” she said.