August 19, 2011

A Look at Oregon State University's Open Source Lab

One of the key themes at LinuxCon North America 2011 is the ubiquity of Linux. Many people use Linux in many ways, often totally unaware that they're depending on Linux. Likewise, those of us in the open source community depend heavily on Oregon State University's Open Source Lab (OSUOSL), but may not even realize just how much. Thanks to one of the final talks at LinuxCon by Lance Albertson, it's much clearer now just how important OSUOSL is.

If you're reading this article, you're making use of OSUOSL. But that's getting ahead of the story a bit. In the beginning, Albertson says that the Oregon State University president saw a sign with "," and didn't want to have the "worst" edu domain. (Read it again.) That led OSU's president to get in touch with the folks running the Web site. Once the Web presence had the attention of the university leadership, OSU paid more attention to hosting and eventually got into the business of hosting open source projects.

Timing was everything. After the dotcom bust, there was a lot of "dark fiber" nearby, and the university managed to get "cheap" bandwidth by laying 11 miles of fiber to I-5 (at a cost of a mere $250,000 or so), where the dark fiber was. That would be key to OSUOSL's success, and is one of the reasons that the lab is not easily replicated at other universities.

A little detail on the lab itself is in order. Albertson says that OSU has a 2,770-square-foot data center with 76 racks (for OSU, not just OSL), 55-ton cooling capacity, and dual independent power feeds as well as a generator for power outages. Albertson says that, knock on wood, the data center hasn't had an outage since 2005.

The network for OSUOSL is provided by the Network for Education & Research in Oregon (NERO). The connection is 10Gbps to NERO, and the connection to the outside world is 2Gbps. "I hope by the end of the year we'll finally have IPv6," says Albertson. The holdup hasn't been a technical issue; it's been a legal one that requires updating some contracts.

The lab has been growing quite rapidly. Four years ago, Albertson says the lab had fewer than 10 racks, about 60 machines and 30 virtual machines. Now OSUOSL comprises 22 racks, 366 machines, and about 130 virtual machines. Of the systems, OSL's various hosted projects are powered by 266 servers, with about 100 systems just for MeeGo. Projects often purchase their own servers, and projects that have their own racks at OSUOSL include Drupal, Apache Software Foundation, Linux Foundation, and MeeGo. Most are running Debian or Ubuntu, but some run CentOS, and ASF runs FreeBSD. OSUOSL also hosts servers for Freenode, Xiph, Fedora, CentOS, Inkscape, and others.

And, of course, people make the machines go. But not many people. OSL has eight full-time employees, not all of whom work directly on system administration, and about eight undergrad students. The lab obviously works on hosting, but also on development, government outreach (including GOSCON), and the Oregon Virtual School District (ORVSD). The lab also has a funded developer for MeeGo, to help get students involved with MeeGo development at OSU.

The students are a huge part of the hosting effort at OSUOSL, and take a great deal of responsibility. Albertson says that students have full root access, which is unusual for universities. Does this lead to problems? Albertson says that there's a technical gap, but misbehavior? No. He says that students respect the importance of projects they manage. What is difficult is that companies often swoop in and hire students after time in the lab thanks to the experience they've gained. This makes it rough, since Albertson says it takes about six months for students to come up to speed in the lab.

More people will eventually be needed, says Albertson. "It's incredible we've been able to scale like this, but we're getting to the point of needing to hire more people." Tools and distributions used by OSUOSL? CFEngine and Git, Gentoo (115 machines), and CentOS (about 30 machines). Eventually, Albertson says, the lab will be moving to Puppet and doing more server virtualization. He also hopes to do a server refresh and deploy more data center management tools.

It might surprise many to know that OSUOSL actually hosts servers for Google's open source program office. Why? Because Google's infrastructure does not work well for servers that aren't part of the standard Google infrastructure.

And, of course, OSUOSL hosts including Remember what I said about making use of OSUOSL if you're reading this article? That's because is hosted with OSUOSL as part of the Linux Foundation's hosting with the lab. But if you use Linux and work with many open source tools, the odds are that at least some of those tools are hosted on OSUOSL infrastructure.

Let OSUOSL Host, You Code

What services does OSUOSL provide? Co-location hosting, smart hands support, virtual machines for small projects, managed hosting for some projects, and FTP mirror space. The lab also provides email forwarding and DNS hosting for projects hosted at the lab. Not all the machines managed by OSUOSL are hosted at the lab. The FTP mirrors are hosted in Oregon, Chicago, and New York.

According to Albertson, the hosting is tailored for each project. OSUOSL targets medium to large "high impact" projects, or projects that have potential to be high impact. Most of the projects have outgrown other hosting options, or had bad experiences with other hosting. This includes bad experiences with sites like SourceForge, but more often with companies that use a project and provide it with free hosting, where the project gets what it pays for. The idea, says Albertson, is to let the projects code and let OSUOSL handle the hosting.

Albertson also talked about Ganeti and Ganeti Web Manager. Ganeti is cluster-based virtualization management software, and the Ganeti Web Manager is a Django-based interface for Ganeti. Ganeti supports Xen and KVM, but Albertson says that the lab has switched over to KVM after having problems with Xen. The primary Ganeti cluster spans five machines and hosts 75 virtual machines, and it hasn't had an outage in years.

The lab is also encouraging other projects to buy servers for their own Ganeti clusters, and says that it works out very well for them.

OSUOSL is also using Ganeti to create "Supercell," a cluster donated by Facebook for projects to use for continuous integration testing. Supercell provides "on-demand virtualization" powered by Ganeti and KVM. It's currently in beta, but has a half-dozen or so projects using it.

Interested in using OSUOSL hosting? Albertson says that they look for healthy projects that have a good community. Communities that are experiencing "a lot of strife" are something they don't want to get into. The project also has to fit within OSUOSL's constraints and resources. Distros, says Albertson, are generally bad candidates because they have a lot of hefty requirements. Mozilla, for example, has outgrown OSUOSL with the exception of a few servers.


How is all this wonderfulness paid for? Most of the projects hosted at OSUOSL do not pay for hosting, but there are exceptions. For instance, the Linux Foundation (including pays "at cost" for hosting, as does MeeGo. The lab also gets donations from projects, corporations, and individual contributors. Very little comes from state funds. Albertson also says that the lab needs cash, not used hardware. "Don't send three-year-old hardware, give us money so we can buy what we need."

OSUOSL does need more hardware and support, because the lab continues to grow and serve new projects. The driver of the growth, says Albertson, is that "word spreads... we rarely go out to a project to offer hosting, they come to us."

OSL is fine with projects seeking it out. Albertson says that OSL wants to be "the" place for projects to come for hosting. Why not try to pull in other universities to help carry the load? He says that it's difficult because of the way that universities work, and that most of them do not have the same resources that OSU has in terms of bandwidth and so forth. It's something that OSL might do in the future, but it's not in the plan right now.

The number of projects that are being served by OSUOSL is deeply impressive, especially considering that the lab is working with tight resources and somewhat limited staff. It's no doubt a great deal of work, but given the enthusiasm displayed by Albertson and other OSL staff I've talked to, it's a labor of love.
