By Grant Gross
Researchers at the University at Buffalo, the State University of New York, have fired up a 2,000-node, 4,000-processor Linux cluster and hope the cluster's help in studying the human genome will lead to breakthroughs in the treatment of cancer, Alzheimer's and AIDS.
The cluster at the Buffalo Center of Excellence in Bioinformatics at SUNY Buffalo went online in mid-August and is often running at full capacity even though the set-up team is still doing some minor tweaking, says Jeffrey Skolnick, the soon-to-be director of the bioinformatics center.
Most of the computers in the cluster are 1.26 GHz Dell PowerEdge servers, with a few higher-speed Xeons thrown in. Subcontractor Sistina Software is providing cluster file system technology to manage the data traffic among the nodes.
Skolnick can rattle off all kinds of interesting statistics about the cluster. It will enable researchers to predict protein structure and run large-scale computer simulations, and work that would've taken 1,000 years on one processor will be done in three to six months on the cluster. "We're not just trying to collect computers, which is a nice little hobby," he says. "It enables us to do science we couldn't do elsewhere."
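For readers who want to check that arithmetic, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (1,000 years on one processor, three to six months on the cluster, 4,000 processors). The function names are made up for illustration, and the assumption of ideal linear scaling is ours, not Skolnick's; real workloads rarely scale perfectly.

```python
# Figures quoted in the article. Ideal linear scaling is an assumption
# made here for illustration, not a claim from the researchers.
SERIAL_YEARS = 1_000   # estimated runtime on a single processor
PROCESSORS = 4_000     # processors in the cluster


def implied_speedup(serial_years, cluster_months):
    """Speedup implied by finishing the serial workload in cluster_months."""
    return serial_years * 12 / cluster_months


def ideal_runtime_months(serial_years, processors):
    """Runtime in months if the work split perfectly across all processors."""
    return serial_years * 12 / processors


if __name__ == "__main__":
    for months in (3, 6):
        speedup = implied_speedup(SERIAL_YEARS, months)
        efficiency = speedup / PROCESSORS
        print(f"{months} months on the cluster: about {speedup:,.0f}x speedup, "
              f"{efficiency:.0%} of ideal for {PROCESSORS:,} processors")
```

Under those assumptions, the three-month figure corresponds to every one of the 4,000 processors being used perfectly, and the six-month figure to about half that efficiency, so the quoted range is at least consistent with the size of the cluster.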
And managing these 2,000 machines will be two sysadmins. That's right, two of them. That's the beauty of running a cluster instead of a bunch of individual machines, of course, and Linux will help keep the maintenance costs down, Skolnick says. "I've got to get the most bang for my research dollar," he adds.
Why Linux instead of, say, Windows? Skolnick talks about Linux's scalability and stability. "We found it to be a very stable production environment," he says, gearing up for a not-so-subtle dig at Windows. "Stability is kind of nice. You don't want to be running around rebooting 1,000 machines all the time."
Another Linux advantage is that it's free. With Windows, Skolnick would have to shell out money for 4,000 licenses. "All of a sudden, you're talking about real money," he says.
The Sistina people also talk up the way their product makes managing the cluster easier. Sistina's Global File System provides enhanced data-sharing and application-management functionality; otherwise, "it'd be like sucking the ocean through a straw," says Joaquin Ruiz, vice president of marketing and product management.
In providing that "glue" for the cluster, the Global File System contributes to the low maintenance costs, Ruiz notes. "You reduce the headaches of the poor system administrator who has to manage all those applications."
Kevin Noreen, senior marketing manager for clustering at the Dell Enterprise Systems Group, says this is one of the largest clusters his company has been involved with. And Skolnick expects to add another 40 to 50% capacity this spring.
Noreen says Dell was excited to be a part of a project that has such a large potential for doing good. "There's a lot of work going on that's going to further the betterment of society," he says.
While clusters first became popular in academic circles for doing heavy-lifting research, Noreen and the Sistina crowd expect them to spread in private industry as well. Clusters are gaining ground in the petroleum industry, and Noreen says clustering is catching on in the financial services industry, to crunch numbers for portfolio management, risk analysis and financial simulations. Many of those companies aren't trumpeting their use of clusters, he says, because they consider those services a competitive advantage.
"Clustering is probably more pervasive than people think," Noreen says. "Instead of a supercomputer, where you have to spend millions of dollars and have to forklift out later, companies can continue to grow that cluster by just adding more servers to it."
Skolnick gives kudos to Dell on its response time -- the order for the cluster was placed in June. "It's a non-trivial logistical task to haul in 80,000 pounds of computers, assemble them, and lay two miles of wiring," he says. "I'm pleased with how quickly it's come to life."