January 7, 2003

Can Linux help save SGI?

- By Robin 'Roblimo' Miller -
One of Linux's supposed barriers in high performance computing is the "eight processor limit." SGI says its new Altix 3000 line, running a patched 2.4.19 kernel, handily breaks this barrier -- it can run up to 64 Intel Itanium 2 microprocessors -- and that "superclusters" built with SGI's Linux-based products can outperform generic Linux clusters in some applications by a large enough margin to justify their additional cost.

Addison Snell, SGI's product manager for high performance computing, says, "We've taken the high end architecture of our SGI Origin 3000 and we run it on Intel Itanium 2."

In the past, SGI has concentrated on proprietary operating systems, notably its own IRIX, and its proprietary MIPS microprocessors. These have long been regarded as great products from a technical standpoint, but have been getting beaten consistently in many segments of the high performance computing (HPC) marketplace by lower-cost computing clusters running Linux and generic -- usually Intel -- microprocessors.

The company has been eyeing Linux for several years and has steadily moved toward becoming a Linux-oriented shop, to the point of assuming responsibility for respected Linux International leader Jon 'maddog' Hall's salary and expenses in late 2002.

SGI's Linux embrace may not have been completely voluntary. The company has been on the ropes financially for several years. It sold its Cray supercomputer business in early 2002. Before that, SGI transferred a number of its 3-D graphics patents to Microsoft in a deal that netted SGI some much-needed cash and closed out some lingering graphics patent conflicts with NVidia and its partners. The deal also apparently gave Microsoft a big piece of the intellectual property it needed to get the XBox out the door on time without any threat of patent conflicts.

Speaking of Microsoft, for a while SGI tried hard to get into the Win2K server and workstation market -- and lost massive amounts of money trying. Now, in 2003, SGI is back to concentrating on its original strengths: advanced visualization, managing complex data sets, and high-performance computing. The company still has massive amounts of debt, but it's not in nearly as rough shape as it was a year ago, when the words "SGI" and "bankruptcy" often appeared together in IT business news articles.

Into the future with Linux

The high performance computing market is increasingly dominated by Linux clusters built from generic parts, but all such clusters tend to suffer from data transfer latency problems since, after all, they are essentially linked groups of individual computers that each run independently, and data must be moved from one to another through a network.

SGI's Altix 3000 architecture treats up to 64 processors as a single unit, running a single copy of Linux. It also treats huge quantities of memory as a single unit; Addison Snell says it can deal with "up to half a terabyte of memory."

Another claim Snell points to with pride: "We're getting over two gigabytes per second of I/O to disc with Linux. This shatters all previous claims. No one else has even claimed one gigabyte per second."

Snell notes that this is all "built on industry-standard Linux," specifically the 2.4.19 kernel. And, he says, "The Linux portion and patches to enable our system are all licensed under the GPL."

(SGI is keeping some code proprietary, notably portions of its SGI ProPack for Linux, details of which were not yet on the company's Web site when this article was written. But there is a comprehensive Open Source at SGI page that gives details of the company's Open Source efforts, well worth checking out for anyone interested in "big iron" computing with Linux and Open Source.)

Imagine a Beowulf cluster of these

One of SGI's big pitches is for groups of Altix 3000s linked in what the company calls superclusters. This is where SGI is going with its repeated talk about scalability. These superclusters are not cheap; an SGI press release draft we have obtained says, "A 64-processor SGI Altix 3000 system starts at $1,129,262 (U.S. list)." The draft press release also claims this price is "roughly one-third the price of a 64-processor IBM eServer pSeries 690-based system and 57 percent less than the HP Superdome," so while it is high by many Linux users' standards, the Altix 3000 line is apparently a bargain compared to its competition.
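
For readers who want to put those numbers side by side, here is a rough back-of-the-envelope sketch in Python. Only the $1,129,262 list price, the 64-processor count, and the "one-third" and "57 percent less" ratios come from the draft press release. The competitor figures the script prints are merely what those ratios imply, not quoted IBM or HP prices, and the function names are made up for illustration.

```python
# Back-of-the-envelope arithmetic on the list price quoted in the draft
# press release. The competitor figures are IMPLIED by SGI's own ratios;
# they are estimates for illustration, not published IBM or HP prices.

ALTIX_LIST_PRICE = 1_129_262  # U.S. list price, 64-processor Altix 3000
PROCESSORS = 64


def price_per_processor(total_price, processor_count):
    """Average list price per processor."""
    return total_price / processor_count


def implied_price_from_discount(price, discount):
    """If `price` is `discount` (e.g. 0.57) less than a competitor's
    price, return the competitor price that claim implies."""
    return price / (1 - discount)


per_cpu = price_per_processor(ALTIX_LIST_PRICE, PROCESSORS)

# "Roughly one-third the price" implies a competitor at about three times.
ibm_implied = ALTIX_LIST_PRICE * 3

# "57 percent less than" implies the Altix costs 43 percent as much.
hp_implied = implied_price_from_discount(ALTIX_LIST_PRICE, 0.57)

print(f"Altix 3000 per processor:      ${per_cpu:,.2f}")
print(f"Implied IBM p690 system price: ${ibm_implied:,.0f}")
print(f"Implied HP Superdome price:    ${hp_implied:,.0f}")
```

By this arithmetic the Altix works out to a bit under $18,000 per processor at list, and SGI's own ratios put the competing systems somewhere in the $2.6 million to $3.4 million range. Since "roughly one-third" is itself an approximation, treat the IBM figure in particular as a ballpark.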

You can probably still perform many large computational tasks for a lot less with a "white box" 32-bit Linux cluster. But Snell says the extra price for extra speed and performance is worthwhile "if you have categories of applications that don't easily decompose into smaller islands of memory." He mentions "global climate simulations, a lot of engineering analysis, and airflow turbulence studies" as examples of problems where an SGI supercluster can outperform a generic Linux cluster by a large enough factor to be worth the extra cost.

Will Linux help save SGI?

Snell and SGI's promotional material both boast that SGI's move to Open Source (along with open standards for data handling and the use of generic Intel chips instead of SGI-made custom ones) combines the best of the Open Source development pattern with the best of SGI's high performance computing experience. The result, they say, is a new level of cost-effective supercomputing that can do things like take all the datasets from a bioinformatics company, read them into a single active memory space, and manipulate them there instead of constantly reading and writing data from a disc.

This is another one of Snell's examples of situations where an SGI supercluster's performance edge over a white box cluster is worth its additional cost.

Snell and other SGI marketing people can reel off whole strings of areas where their products offer advantages others do not. They get paid to come up with and deliver these spiels, of course, so we expect nothing less from them. It all seems wonderful when we listen to them or look at slides they've prepared. But HP's and IBM's (and everyone else's) presentations are just as full of reasons why their high performance computing products are better than all others, too.

Admittedly -- and this may be one of the dumbest possible purchasing criteria for industrial-strength computing equipment -- SGI turns out some of the best-looking hardware around. Besides that, SGI has a huge reservoir of goodwill and a fair amount of what we might call "coolness factor" sympathy to draw upon, since it was the company that came up with many of the first truly jaw-dropping computer graphics tricks.

Whether these intangibles can be translated into enough sales to keep SGI afloat is another question, after a number of horrible years during which its market share eroded and most of the new products it introduced were out-and-out flops.

SGI's new Linux-powered Altix 3000 line certainly looks -- on paper -- like it has enough oomph per dollar to make some waves in the high performance computing marketplace, and since the amount of data industry and government bodies want to manipulate keeps growing, demand for high performance computing systems is also growing.

Differentiation from competitors is going to be the key to SGI's future, but use of Open Source software and open standards, which inevitably leads to more application portability and, therefore, less lock-in, will make it hard for SGI to maintain its cachet as a company that consistently produces unique products.

On the other hand, SGI has not done well recently -- in a business sense -- with unique products like its ultra-cool IRIX workstations, even though many of the people who have been lucky enough to get their hands on these lovely boxes say they are the best desktop graphics tools they have ever used.

SGI's move toward more Linux, less differentiation, and lower system prices may be the only rational course for the company to follow in today's ambivalent IT business climate. Whether or not this direction will lead to success probably won't be apparent until 2004 at the earliest, since high performance computing system sales can take a year or more from first contact to signed contract.

2004 will be a critical year for SGI, since that's when most of the company's $200 million-plus debt comes due.

According to its latest quarterly report, SGI isn't losing as much money as it was a year ago, which is certainly good news. But even if SGI's Linux superclusters are as market-eating as the company's employees and shareholders hope they will be, the company is going to need to sell an awful lot of them to pay off its corporate debt and turn a profit.


