This week marks the 10-year anniversary of the day Linux creator Linus Torvalds released the first version of the Git distributed revision control system. Thousands of open source projects now rely on this popular software development tool, which fuels the growth of the collaborative development model that now dominates tech innovation.
In celebration of this milestone we've asked open source project maintainers and leaders to share with us throughout the week how and why they use Git, tell their Git success stories, and give some pro tips for getting the most out of the tool.
Linus Torvalds himself started the series yesterday. Today we'll hear from Paolo Bonzini, a principal software engineer at Red Hat, QEMU contributor and maintainer of KVM – the Linux kernel-based virtual machine. And stay tuned for the Git stories behind Qt, Drupal, Puppet, Wine, and Tor.
Linux.com: Why does KVM use Git?
Paolo Bonzini: KVM is "just" a subsystem of the Linux kernel, so we use Git just like other parts of the kernel. I am currently the overall maintainer, so I apply patches (from myself or others) and process "pull requests" with patches already vetted and tested by my submaintainers. My tree has three branches: one for patches ready for the current version of Linux, one for patches ready for the next version of Linux, one for patches that should be okay for the next version of Linux but haven't been tested enough yet.
But you cannot use KVM alone; you need some other code that uses it, and that's why most KVM contributors also work on QEMU. In terms of how Git is used, QEMU works roughly the same as Linux. Only one person, currently Peter Maydell, commits to the official QEMU repository; patches mostly come from submaintainers through pull requests. We found that the model works well, to the point that Peter (who unlike Linus is also a submaintainer) will send pull requests to himself!
What makes Git such a great tool?
The obvious answer would be its distributed nature. It allows submaintainers to work in parallel and provides a very easy match for the hierarchical structure of the projects. However, many very large projects work well without such a hierarchical structure.
Therefore, I'll say that Git is great because it provides version control in a very non-intrusive way, and because it provides version control very easily for individual projects, too. I'm using the word "projects" in a very broad sense; for example, it includes features that are to be included later in Linux or QEMU. You don't have to be connected to the Internet, you don't have to set up a server, you don't even need a separate directory. You don't need to tell the world in advance what you're doing.
"git init" or "git checkout -b" are enough to start a project or a feature, and enjoy version control from the very beginning. I think that this leads to code that is better and more maintainable.
How many developers do you have collaborating on Git?
Each release of QEMU has contributions from roughly 170 people. The distribution has a very long tail: about 40 percent of those 170 people contribute only one patch, and about 60 percent contribute fewer than five.
KVM is smaller, with about 25 people contributing to each release. The same "long tail" effect is visible there: about half of the people contribute only one or two patches.
The long tail is very important. A lot of those "drive-by" patches are bug fixes.
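Figures like these can be gathered with "git shortlog" over a release range. The sketch below is one possible way to do it on a toy repository; the tag names, authors, and the commit_as helper are invented for the example and are not the actual QEMU or KVM history.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Maintainer"
git config user.email "maintainer@example.com"

# Hypothetical helper: record an empty commit attributed to the given author.
commit_as() {
    git commit -q --allow-empty --author="$1 <$2>" -m "$3"
}

commit_as "Alice" "alice@example.com" "Initial import"
git tag v1.0
commit_as "Alice" "alice@example.com" "Add feature"
commit_as "Alice" "alice@example.com" "Refactor feature"
commit_as "Bob" "bob@example.com" "Fix typo"
commit_as "Carol" "carol@example.com" "Fix crash"
git tag v2.0

# Patches per author between the two tags, most active first.
per_author=$(git shortlog -s -n v1.0..v2.0)
echo "$per_author"

# Total contributors, and how many of them sent exactly one patch.
contributors=$(echo "$per_author" | wc -l | tr -d ' ')
one_patch=$(echo "$per_author" | awk '$1 == 1' | wc -l | tr -d ' ')
echo "$one_patch of $contributors contributors sent a single patch"
```

On a real tree, only the shortlog line and the two counts are needed, with the range replaced by the release tags of interest.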
How much do you personally use it?
Of the 1000 commands I have in my shell history, about 400 are invocations of git! (The runner-up is vi, with a bit less than 200 invocations.)
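A tally like this can be produced by counting the first word of each history line. The sketch below assumes a history file with one command per line and uses a made-up six-line sample in place of a real file such as ~/.bash_history.

```shell
# Made-up sample standing in for a real shell history file.
history_file=$(mktemp)
printf '%s\n' 'git status' 'vi main.c' 'git diff' 'make' \
    'git commit -a' 'vi main.c' > "$history_file"

# Count the first word of every line, then sort by count (descending)
# and by command name to break ties.
ranking=$(awk '{ count[$1]++ } END { for (cmd in count) print count[cmd], cmd }' \
    "$history_file" | sort -k1,1nr -k2,2)
echo "$ranking"

top=$(echo "$ranking" | head -n 1)
```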
What's KVM's most active git repo right now and why?
It's difficult to say. Of course all the "action" ultimately becomes part of the top level repository; for KVM that would be mine, for QEMU the official one.
But development happens in the submaintainers' repos as well; in the case of QEMU, in practice it only happens there. For QEMU, the most active repositories are probably Peter's ARM repository and the "block device" repository. ARM is very active because there are so many kinds of ARM boards and people use QEMU for emulating them, not just for virtualization. Block devices are very active because... well, because there's a ton of work to do!
What is your favorite pro tip for using Git?
I have several "aliases" that simplify some git tasks. Here are the simplest of them:
changes = diff --name-status -r
diffstat = diff --stat -r
whatis = show -s --pretty='tformat:%h (%s, %ad)' --date=short
pwhatis = show -s --pretty='tformat:%h, %s, %ad' --date=short
The two "-r" haven't been necessary for several years, but those two aliases are 8 years old and I've never bothered to update them! The "changes" name comes from Arch, a distributed version control system from which I switched to git.
"whatis" and "pwhatis" convert a commit id to a format that can be pasted in an email. "pwhatis" is for pasting inside parentheses, "whatis" works outside parentheses. When discussing a patch it helps a lot to refer to past commits, and it's good to use a consistent format (id, subject, date).
Any Git success stories you can share?
I use it so much that I cannot think of any success story. It's just the reliable tool you use daily and you cannot live without anymore. Perhaps that already counts as a success story?!?
Anything else you'd like to say to mark the 10-year anniversary?
Just a shout-out to my former colleague Jeff Rose, who convinced me to switch to git. That was back in 2007!