April 26, 2006

Wikis, gateways, and Garbee at LinuxWorld Toronto

Author: David Graham

TORONTO -- Yesterday's second day of the LinuxWorld Conference & Expo in Toronto saw the opening of the exhibit floor, two keynotes, and a variety of interesting but not entirely topical sessions.

I started the day off by attending a session by Peter Thoeny of TWiki, who led a discussion of what wikis are and why they are useful in the workplace. Thoeny said several different wiki implementations are available, including some in black boxes that can be plugged into a network ready to run. Among common wikis, the best known is MediaWiki, the engine for the popular Internet resource Wikipedia, an encyclopedia that can be maintained by anyone on the Internet who cares to contribute. At present, Thoeny says, it has approximately 1 million English entries and about 100,000 registered users. To demonstrate, he opened Wikipedia and entered a minor correction to an article without logging in.

Thoeny posed the rhetorical question: with anyone and everyone being able to edit Wikipedia, won't it descend into utter chaos? Thoeny says no -- that on Wikipedia, experts in most domains keep their eyes on articles relating to their fields. If something incorrect is entered, it is usually fixed within minutes. Pages that are esoteric or that cover topics of narrow interest are not watched as closely, and errors can survive longer on those pages before being corrected. Wikipedia also polices for copyrighted material to ensure that there are no violations.

Content on Wikipedia is released under the GNU Free Documentation License (GFDL), which allows the free redistribution of the content of the site with the condition that a notice of the license be included with it, very much like the GNU General Public License's (GPL) permissions and requirements related to code distribution.

Asked if it is possible to restrict access to a wiki, Thoeny says that it is designed to be world writable, but in a corporate environment, for example, it is relatively easy to lock a wiki down as needed.

Graffiti and spam on Wikipedia are also usually resolved quickly, Thoeny says, as they're easily identifiable, and revision control allows a previous version to be recovered and reposted relatively simply. On public wikis, spam and graffiti are not uncommon, but the problem, he noted, is non-existent on internal corporate wikis, as they are not viewable by outsiders.

Thoeny also described some of the many features of wikis, such as WikiWords, which automatically link to other articles by the name of the word, and the ability for most wikis to accept plugins to include special characters or features.
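As a rough illustration of the WikiWord idea, here is a minimal Python sketch. It assumes a WikiWord is two or more capitalized runs joined together (such as ProjectPlan); the exact rule, the `link_wikiwords` name, and the `/wiki/` URL prefix are assumptions made for this example, not TWiki's actual implementation.

```python
import re

# Assumed convention: two or more capitalized runs joined together,
# e.g. "ProjectPlan". Real wiki engines each define their own rule.
WIKIWORD = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b")

def link_wikiwords(text, base_url="/wiki/"):
    """Turn each WikiWord into an HTML link to the page of the same name."""
    def to_link(match):
        word = match.group(0)
        return f'<a href="{base_url}{word}">{word}</a>'
    return WIKIWORD.sub(to_link, text)

print(link_wikiwords("See ProjectPlan for details."))
```

Ordinary capitalized words such as "See" are left alone because they contain only one capitalized run.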

Thoeny's own background is as the lead developer of TWiki, a wiki oriented toward corporate environments and developed by five core developers, 20 more developers with write access, and around 100 more contributors.

The various wiki systems share no standard, though a conference to discuss the topic will take place in Denmark this August. About the only specific feature all wikis have, aside from being reader-editable, is that in text entry a blank line will create a new paragraph. As for how to create special text or make text bold, each wiki server has its own way of doing things, Thoeny says.
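The one shared convention, a blank line starting a new paragraph, is simple enough to sketch in a few lines of Python. The function name `split_paragraphs` is made up for this example, and real wiki engines layer their own markup rules on top of this step.

```python
import re

def split_paragraphs(text):
    """Split wiki source text into paragraphs at blank lines."""
    # A "blank line" here means a newline, optional whitespace, then another newline.
    blocks = re.split(r"\n\s*\n", text.strip())
    # Collapse the line breaks inside each paragraph into single spaces.
    return [" ".join(block.split()) for block in blocks if block.strip()]

print(split_paragraphs("First line\nstill first.\n\nSecond paragraph."))
```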

He demonstrated TWiki's capabilities extensively, noting that the program is downloaded around 350 to 400 times per day.

A geneographical keynote

Nearly halfway through the three-day conference, the show's opening keynote was delivered by IBM Computational Biology Centre researcher Dr. Ajay Royyuru. While interesting, his keynote had little to do with Linux or open source.

In brief, Royyuru discussed a joint project between National Geographic and IBM to use aboriginal DNA from all over the world to try to trace pre-modern human migratory patterns. The data analysis is performed on IBM-provided Linux systems, according to one of the slides that appeared briefly on the screens at the front of the hall, providing a tenuous link to the topic of the conference.

Multi-homing redundant network backup

Burhan Syed of Shaw Cable's Big Pipe subsidiary gave a late morning talk on multi-homed network routing solutions around Border Gateway Protocol (BGP) and Linux.

Syed discussed the fact that many companies want network connections that won't die. In the case of one e-business-based customer, connection down-time costs around $37,000 per hour in lost business. He touched on the many solutions companies have come up with to solve the problem. The cheapest and most common, Syed says, is for a company to buy two connections to the Internet. One of them is the primary connection on which the servers run, and the other sits disconnected on the floor waiting for an outage on the first. When the outage comes, the tech guy will come, hopefully relatively quickly, and unplug the downed connection, plug in the backup connection, and reconfigure the network to use the freshly connected alternate.

For a company running a Web site that is central to its business, this has some serious drawbacks. Aside from having to reconfigure its own network to use the different Internet service provider, the company's Domain Name Service records need to re-propagate to make the Web server accessible at its new location. DNS is what allows browsers or other services to translate a domain name into an IP address, the number that can be tracked down to a server. IP addresses are assigned by service providers, and when you're using multiple service providers in this way, the address changes, and anyone trying to connect will need the current information to succeed.
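The propagation delay comes from caching: resolvers remember an answer for the record's time-to-live (TTL) before asking again. The Python sketch below is a deliberately simplified model of that behaviour, not real DNS code. The `CachingResolver` class, the host name, and the one-hour TTL are all assumptions made for illustration, and the addresses come from ranges reserved for documentation.

```python
class CachingResolver:
    """Toy resolver that caches answers until their TTL runs out."""

    def __init__(self, authoritative):
        self.authoritative = authoritative  # name -> (ip, ttl_seconds)
        self.cache = {}                     # name -> (ip, expires_at)

    def resolve(self, name, now):
        cached = self.cache.get(name)
        if cached and now < cached[1]:
            return cached[0]                # still fresh: reuse the old answer
        ip, ttl = self.authoritative[name]
        self.cache[name] = (ip, now + ttl)
        return ip

zone = {"shop.example": ("192.0.2.10", 3600)}       # address from ISP A
resolver = CachingResolver(zone)
print(resolver.resolve("shop.example", now=0))      # cached for an hour

zone["shop.example"] = ("198.51.100.10", 3600)      # failover to ISP B
print(resolver.resolve("shop.example", now=600))    # still the old address
print(resolver.resolve("shop.example", now=3600))   # cache expired: new address
```

In this model, a visitor whose resolver cached the old address ten minutes before the switch keeps being sent to the dead connection until the hour is up.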

After discussing other better but still similar solutions, Syed explained how to do redundant Internet connections correctly.

You need a block of IP addresses assigned to your company by the American Registry for Internet Numbers (ARIN), BGP-aware routers, and an Autonomous System Number (ASN), also assigned by ARIN. All of this adds up to a lot of money, and IP addresses and ASNs are limited. In the case of ASNs, only around 65,000 are available for the entire world.

BGP works, he explained, as a network-to-network routing protocol. A BGP router will exchange routing tables with other BGP routers to determine the most direct way between any two networks on the Internet. It won't necessarily go for the fastest or cheapest route, just the most direct one. It ignores the number of actual hops needed to get through each network and out the other side, and counts only the networks themselves.
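The "most direct" rule can be sketched as choosing the route that crosses the fewest networks. The Python below is a minimal sketch of only that one criterion; real BGP implementations apply several other preferences and tie-breakers that are not modelled here. The `best_route` function, the route dictionaries, and the ISP labels are hypothetical, and the AS numbers come from the range reserved for documentation.

```python
def best_route(routes):
    """Pick the route whose path crosses the fewest autonomous systems."""
    # Only the number of networks in the path matters here, not how many
    # routers sit inside each of those networks.
    return min(routes, key=lambda route: len(route["as_path"]))

routes = [
    {"via": "ISP A", "as_path": [64496, 64499, 64500]},  # crosses three networks
    {"via": "ISP B", "as_path": [64497, 64500]},         # crosses two networks
]
print(best_route(routes)["via"])
```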

To apply this to your business, you need an IP address block of at least a /24 (256 addresses, 254 of them usable by hosts) -- otherwise Syed says that other BGP networks will ignore you for being too small. You then connect to two or more ISPs using your own IP addresses, which is made possible by your ASN. A separate BGP router is used to connect to each of the ISPs, and the routers also communicate directly with each other, so they know when a connection has gone down and can trade BGP routing tables.
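The size rule is easy to check with Python's standard `ipaddress` module. In this sketch the helper name `is_announceable` and the cutoff parameter are assumptions for illustration; the /24 threshold is the one Syed cited, and the address block is one reserved for documentation.

```python
import ipaddress

def is_announceable(prefix, longest_accepted=24):
    """Return True if the block is at least as large as a /24."""
    network = ipaddress.ip_network(prefix)
    # A smaller prefix length means a larger block of addresses.
    return network.prefixlen <= longest_accepted

block = ipaddress.ip_network("192.0.2.0/24")
print(block.num_addresses)                 # total addresses in the block
print(block.num_addresses - 2)             # minus network and broadcast addresses
print(is_announceable("192.0.2.0/24"))
print(is_announceable("192.0.2.0/25"))     # half the size: too small
```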

This is where Linux comes in, Syed explained. Hardware routers from the likes of Cisco can run into the thousands of dollars and often lack enough memory to hold a large BGP routing table, which can run to around 128MB for 136,000 routing entries, while many routers come with only 64MB of RAM to work with. You can save a lot of money by using Linux systems running a routing program called Zebra, which can be administered using Cisco-style commands.
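A quick back-of-the-envelope calculation in Python shows why 64MB falls short. It uses only the figures quoted in the talk and assumes, unrealistically, that 1MB is 2^20 bytes and that all of a router's RAM could be devoted to the routing table.

```python
TABLE_BYTES = 128 * 1024 * 1024       # "around 128MB" for the full table
ROUTES = 136_000                      # routing entries quoted in the talk
ROUTER_RAM_BYTES = 64 * 1024 * 1024   # RAM on many hardware routers

# Average memory cost of one routing entry.
bytes_per_route = TABLE_BYTES / ROUTES

# Entries that would fit if every byte of RAM went to the table.
routes_that_fit = ROUTES * ROUTER_RAM_BYTES // TABLE_BYTES

print(f"{bytes_per_route:.0f} bytes per route")
print(f"{routes_that_fit} of {ROUTES} routes fit in 64MB")
```

On those assumptions each entry costs a little under 1KB, and a 64MB router could hold only half the table.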

Syed warned that ISPs should not be charging for access to BGP through their service, and advised being wary of ones that do. Syed's own ISP offers the service and has an internal BGP-based system that uses ISP-assigned ASNs internally for a less expensive solution.

Corporate lunch

Having never been to one before, I dropped by IBM's sponsored Media Lunch. With a small handful of other members of the media, I ate a couple of sandwiches that made airplane food taste like gourmet fare and sat back to listen to what IBM's presenters had to say.

The first presenter was Dr. Ajay Royyuru of keynote fame, who gave an abbreviated 10-minute version of his keynote, followed by representatives from the Bank of Canada, the University of Toronto, and iStockphoto.com, all giving similar presentations about their success built on top of Linux-based IBM systems.

Reaping the benefits of open source

Bdale Garbee speaks about open source at HP

Following IBM's thinly disguised marketing event, I went upstairs for the day's second keynote speech, delivered by Hewlett-Packard's Linux Chief Technologist and former Debian Project Leader Bdale Garbee.

Garbee's keynote was interesting but poorly attended, with fewer than 100 people in the audience. Garbee described HP's role as he saw it in the Linux and open source community as market stewardship. It's HP's job, he said, to help companies deploy Linux systems.

He noted that HP is the only major Linux-supporting company not to write its own open source license. Instead, Garbee said, it's HP's policy to look through the GPL, BSD, Artistic, and other licenses to understand what they are and how they work and for HP to work within that framework.

As to why companies should use Linux, one reason Garbee gave is that companies can download and try it and its associated software off the Net before committing to a large-scale rollout, whereas many commercial programs must simply be purchased and rolled out. Open source also helps companies avoid vendor lock-in, he added.

Garbee addressed the issue of whether Linux and open source are less secure because malicious people can read code and find vulnerabilities that have not been fixed. His answer was that it works both ways. More eyes are indeed on the code, but many of them are good, and more security vulnerabilities are found and fixed than might otherwise be the case.

For commercial adoption of Linux, Garbee said we're at the point where nearly all companies use Linux in at least some capacity, whether it be for a simple DHCP, DNS, or Web server or for the entire company's database systems. Even if the CIO of a company is not aware of Linux being present in the company, it is generally there if you ask the tech guys on the ground.

Open source middleware is at an early stage of adoption, he says, and open source is beginning to be at the leading edge of full specialized applications for the corporate environment.

In the server market, Linux and Windows are both gaining ground from Unix, but Linux is growing faster than Windows, Garbee says, though Linux has some of what he termed "pain points," such as the perennial complaint about support accountability. In a traditional IT department, there is a need to identify the party that will take ownership for a particular problem, and that party is usually a vendor who will resolve the problem. The question may be harder to answer for Linux and open source.

At HP, Garbee says, Linux is not a hobby; it's a business strategy that is paying off. He described Linux and HP's relationship as symbiotic, and listed statistics relating to HP's involvement in the community. HP employs around 6,500 people in the OSS-related service sector and 2,500 as developers of open source software, has some 200 products based on open source software, and has instigated at least 60 open source projects, as well as releasing numerous printer drivers as open source.

Garbee spent an extended period of time following his presentation taking questions from the floor. Among the questions asked was one about HP's Linux support in its consumer desktop and laptop computers. Garbee described that side of the business as not high margin and indicated he was working on trying to get better Linux support from within the company for those products. Business-oriented machines, though, including workstations and laptops, are generally Linux-compatible, as he says HP uses better hardware in them.

The tradeshow floor

Following Garbee's keynote, I took a brief sweep of the tradeshow floor to see what was cooking, but the floor was not particularly busy. I took a few pictures of the proceedings and carried on, satisfied that I wasn't missing much.
