The problem was first noted by Larry McVoy of BitMover, whose BitKeeper products are used heavily by Linus Torvalds and other Linux kernel developers. Only the kernel's public CVS tree was modified. The BitKeeper tree was not, and it was the difference between the two that made the malicious attempt obvious.
In an email to NewsForge this morning, Larry McVoy gave his account of the situation:
As a service to the kernel community, we provide a public machine.
That machine hosts BitKeeper, CVS, and Subversion trees of the kernel;
its name is kernel.bkbits.net and it is aliased as cvs.kernel.org and
svn.kernel.org. That machine was broken into.
BitMover created a gateway from BitKeeper to CVS for those people who
prefer CVS. We run and maintain that gateway on internal BitMover
machines and mirror the BK->CVS converted files to kernel.bkbits.net.
After the mirroring is complete, just to be sure, we run a little remote
comparison script over the BitMover data and the kernel.bkbits.net data.
That script runs each night and sends mail if there is a problem.
The problem got flagged yesterday, I looked into it, corrected the
problem after saving the bad version of the file, and then sent mail
to the lkml.
As some of the kernel developers have noted, this was only caught because
we're paranoid. BitKeeper users have taught us that if anything can go
wrong, it will go wrong, so you need to have safety checks built into your
application. We did the same sort of thing with the CVS gateway, and it
caught this attempt.
It's also worth noting that Linus himself pointed out that it was unlikely
that this sort of thing could be done in BitKeeper because of BitKeeper's
distributed design and built-in integrity checks.
A few minutes later, this email from Linus Torvalds arrived:
It wasn't really bad at all - except of course in the sense that it's
always a nasty surprise that somebody would try something like that. But
on a scale from "insignificant" to "very very serious" I'd call the hack
closer to the "insignificant" end.
Inserting a back-door into a project CVS tree could be very serious in
theory, and in that sense everything was done "right" - the code was made
to look innocuous, and the CVS history looked fairly sane. Successfully
doing something like that into the core CVS tree would quite potentially
be very hard to detect, and two lines of code out of five million is
obviously not a sore thumb that sticks out.
But the thing is, unlike a lot of other projects, Linux kernel development
isn't actually done using a central CVS tree _at_all_. That CVS-tree was
just an automatically generated read-only export for people who want to
use CVS, and the back-door was never in any "real" tree in that sense. The
real trees aren't public, and never have been.
So no damage done, and the thing was noticed when the CVS tree was
re-generated. We're certainly taking it seriously in the sense that I
think it's serious that somebody tried to insert a back-door, but it
didn't matter in a _technical_ sense, if you see what I mean..
We're talking with the people whose machine was apparently used to try to
do the thing, but it was a university machine so in a fairly relaxed
environment. It's been cordoned off and they'll be taking a look at it.
The problem of coders inserting (or attempting to insert) malicious features in operating systems, compilers, and other low-level programs is far from new. Unix creator Ken Thompson demonstrated the possibility of inserting a Trojan into the C language compiler way back in 1984 in an article titled Reflections on Trusting Trust that contained these words:
You can't trust code that you did not totally create yourself. (Especially code from companies that employ people like me.) No amount of source-level verification or scrutiny will protect you from using untrusted code. In demonstrating the possibility of this kind of attack, I picked on the C compiler. I could have picked on any program-handling program such as an assembler, a loader, or even hardware microcode. As the level of program gets lower, these bugs will be harder and harder to detect. A well installed microcode bug will be almost impossible to detect.
In other words, no matter how secure your code development process may be, at some point there must be some level of trust. While Thompson's statement, "No amount of source-level verification or scrutiny will protect you from using untrusted code," may have some truth to it, right now such scrutiny is the best protection we have, especially when human code-vetting efforts are augmented by automated audits and code comparisons performed by software tools that exist today but didn't in the 1980s.
In its latest security test -- which is probably not what the would-be attacker intended his or her efforts to become -- the Linux development process passed with flying colors, and the experience will make it even more attack-resistant in the future.
Could your code development process -- proprietary or open source -- pass a similar test?