May 11, 2004

BitKeeper after the storm - Part 1

Author: Joe Barr

It has been a couple of years since the Linux kernel mailing list was debating the issues of Linus Torvalds' scalability and the use of a proprietary source management tool called BitKeeper to handle kernel patches. Now that the dust has settled, and intrigued by a press release from BitKeeper author Larry McVoy that claimed impressive productivity gains for Linus Torvalds and other kernel hackers using BitKeeper, NewsForge decided it was time to talk with McVoy on the current state of affairs between the free software hackers and his proprietary code. This is part one of that interview; part two will appear tomorrow.

NF: It has been about a month since the press release with the startling news
that BitKeeper more than doubled Linus Torvalds' productivity. What was the reaction to the news by other kernel hackers?

McVoy: It wasn't really news to the senior developers. They already knew.

Here's how that announcement came about. I asked someone we were
considering hiring why he wanted to come work for us. His response was,
"I hang out on the kernel list and it is obvious that Linus is ten times
more effective since he switched to BitKeeper." That sounded pretty nice,
but I didn't believe it. I knew things were better, but ten times better?
That sounded a little too good to be true.

I know some of the senior kernel people personally so I started asking
around. I spoke with Dave Miller, Jeff Garzik, Greg Kroah-Hartman,
Andrew Morton, and Linus about this. Dave was the first person I spoke
with and he said that he thought that 10x wasn't at all unlikely, and
it was certainly 8x. Interesting. So I talked to Jeff and
his comment was, "Oh, man, it's so much better, it has to be 10x."
Greg had a fairly similar reaction. I was having lunch with Linus,
Andrew and Ted Ts'o to talk about digital signatures for the kernel (those
are implemented now, by the way) and I brought this up as a question.
Andrew thought that anything would have been an improvement over what
Linus was doing before and he agreed that BitKeeper was a lot better
than CVS. But his take was that just a move to CVS would have been an
improvement. Linus disagreed. Linus was adamant that if he had moved
to CVS it would have slowed him down. So in Linus's mind, whatever
improvement had happened was due to BitKeeper.

Greg has written a paper about the rate of change since the switch to
BitKeeper. He has a lot to say about how BitKeeper has helped -- you
might ping him for details. Some of the things I remember are:

  • It's a lot easier to track what Linus is doing because you can see
    his tree long before an initial release. Linus pushes close to daily to linus.bkbits.net.
  • Independent development works much better. You can just use BK to
    do the merges and track what is in Linus's tree.
  • It's trivial to see if Linus has merged your changes.
  • The whole system is asynchronous; you can do work while someone else does work and BK will merge it for you when you sync up.

The senior developers were well aware that things are better. The 2x
announcement wasn't news at all. From their point of view, the 2x claim
is an understatement because for them the improvement is bigger than that.
But any claim is likely to be challenged, so what we did to arrive at that
number was to simply measure the amount of change over the two-year period
in BitKeeper and contrast that with the two-year period before BitKeeper.
It worked out to about 2.5x more change. The metric I'd love to have
is the number of patches integrated. We're all in agreement that it is
far more than 2.5x more than it was before. Linus is processing around
50 patches a day, 365 days a year. That's an amazingly high number.
Nobody in the software industry has ever processed that much change to
my knowledge and I have worked at SCO, Sun, SGI, and Google as well as
a few smaller companies.

Before we made the productivity announcement we talked it over with Linus.
It was Greg who suggested the idea of measuring the diffs as a way of
getting a quantitative handle on the problem. I showed the numbers to
Linus and asked him if he agreed and he did. So that's how it happened.

NF: What is the level of acceptance for BitKeeper now as opposed to when
Torvalds first announced he was going to use it?

McVoy: Most people use it so the acceptance level seems high. There was concern
at the beginning that maybe we were trying to exploit the kernel team
somehow. Our position has always been that we were and are sincere
in our desire to help the community. Nobody believes in a free lunch,
so many people try to figure out "what's the catch?" The more vocal
we were regarding our sincerity, the more suspicious people became.
That's human nature and there isn't much we can do about it other than
continue to demonstrate that we will do the right thing. I used to
work at Google and their "do no evil" motto is something that I took
away from them. It's a good way to run a business, but it makes people
wonder a bit. People expect corporations to be "evil," but not all of
the corporations are evil. Google is a very visible company trying to
do the right thing; we're a far less visible company but we are also
trying to do the right thing. It is possible to do the right thing and
make money and maybe Google's example will inspire other companies to
follow suit.

I believe that a lot of the concerns have faded away because it is
years later and we are still here and still supporting the free use
of BitKeeper. Linus has used BK for more than two years, but the PowerPC
folks have been in BK since 1999, so we have been supporting kernel
people in BK for at least four years. The MySQL folks have been using BK
for about the same amount of time, so it ought to be clear that we are
committed to helping the free software community.

It's worth pointing out that we are profitable and have no outside
investors. That means that we, the employees and owners of BitKeeper,
decide if it is a good idea to support the kernel and the other free
users. We, not some outside money-focused investors, decide if what
we are doing is a good thing. And we like the free software community.

There are some people who will always be worked up about any
infrastructure that isn't GPLed. We understand their concerns and
that's why we built the BK2CVS gateway. That way people know that, no
matter what, they have the history in a GPLed tool. We do the export
nightly and mirror the CVS root to master.kernel.org so there is simply
no question that the data is available in a free form.

Along with BitKeeper itself, we provide bkbits.net, a free hosting service
for BK repositories (Linux is there, MySQL is there, so are lots of
other projects), and we provide a free public server (kernel.bkbits.net)
that anyone can use if they are working on the kernel, BK user or not.
The amount of service that we provide for free should, in theory, help
convince people that our intentions are good and we are really trying to
help the community of free software developers. People didn't trust that
initially, but the longer we keep helping the more people tend to trust us.

NF: Is your pro bono work for Linux kernel development paying off in
sales of your proprietary product?

McVoy: Absolutely. People look at how the kernel is being managed in BK and
they believe that if BK can do that then it can handle their problems
as well. A big marketing win for us is bkbits.net, our free hosting
service. Managers look at that and at the sheer volume of the data
(6 million files in 55GB of data) and when they learn that we spend
less than a man-week a year on supporting bkbits.net, they are sold.
That's a good thing for everyone; we're providing a useful service and
we get some marketing value from it.

We derive benefit from the pro bono work in other ways as well. When we
are testing out a new release we can put it on bkbits.net and we know in
seconds if we have broken something important; people use old versions
of BK to talk to bkbits.net every few seconds.

We are strongly committed to helping the Linux kernel community and
other open source projects. Not everyone may believe this, but we'd be
doing it even if there was no benefit to us. It is our way of giving
back some value for all the great free software we use every day.
We run our business on free software and we develop our product with free
software; the free software community has been great for our business.
All companies who benefit from free software ought to find a way to help
the people who are producing that software.

I'm aware that some in the community would prefer that we gave back by
adding to the pool of free software, but our product space doesn't seem to
work well in that model. So we give back in other ways. The majority
of the people in the community have come to trust that we will continue
to do so.

Category:

  • Linux