Git has changed the way that software is built, including the Ceph open source distributed storage platform, says Ceph creator Sage Weil. Ceph has used the Git revision control system for more than seven years, since it switched from SVN. Git has changed the project’s workflow and how its developers think about code.
“Instead of thinking in files and lines, you think in flow of changes. Instead of having a single repository that everyone feeds from and into, everyone now has their own repository, their own branches. The meaning of branch changed,” said Weil, Ceph principal architect at Red Hat. “Everything just fell in place, as if the people who designed it really knew software development at scale.”
In our final “Git Week” profile in celebration of Git’s 10-year anniversary, Weil discusses how and why Ceph uses Git, tells their Git success stories, and gives his best pro tip for getting the most out of the popular tool.
Linux.com: Why does Ceph use Git?
Sage Weil: There are a number of reasons that Ceph chose Git, but in general the flexibility and power that come with Git are really hard to beat. Of course, the distributed nature of the tool also appeals to a team that is working on distributed software.
Ceph has been using Git for more than 7 years. We switched from Subversion to Git when we started developing the Ceph kernel module, and never looked back. The simplest and most important feature at the time was a sane representation of merges. Now that the Ceph contributor base has grown we are completely dependent on distributed version control.
Ceph is deeply embedded in the Linux open-source world and everything there uses Git. There is no other tool which meets our needs for maintaining branches and developing features in parallel over long (and short!) periods of time.
What makes Git such a great tool?
The highlights are definitely the branching workflow (which we use for everything; also check out git stash), its distributed nature (cherry picking and sharing between developers outside of the central repo), better history management (the ability to make local commits, rebase, clean up the history, and then submit a series of commits for merge to the central repo), and flexibility (version control shouldn’t get in the way). There is a learning curve, but it is worth the pain.
Looking at its history, it still amazes us how it came into the world, and the fact that they were able to pull it off, seemingly off the cuff, when we needed to move away from BitKeeper. At the time, thinking about version control, at least the free and open source solutions that existed, you ended up with either Subversion or CVS. That was what developers were used to, and we worked within these frameworks. Git changed the whole workflow. Instead of thinking in files and lines, you think in flow of changes. Instead of having a single repository that everyone feeds from and into, everyone now has their own repository, their own branches. The meaning of branch changed. It’s so cheap now. Everything just fell in place, as if the people who designed it really knew software development at scale.
How many developers do you have collaborating on Git?
At last check there were 242 contributors identified in our core repository. However, with our metrics dashboard (http://metrics.ceph.com/), we track about 460 developers across all of our sub-projects.
How much do you personally use it?
Git is part of the daily workflow for all developers. It’s the single most used tool other than the editors and the compiler toolchain.
What's Ceph's most active Git repo right now and why?
The most active repo is definitely the core ceph.git repo. While we have sub-projects on GitHub for some of our associated development, most of the development happens in Ceph itself.
What is your favorite pro tip for using Git?
The combination of git gui’s ability to quickly stage and unstage lines or hunks into a commit and git rebase -i’s ability to reorder and combine commits is invaluable. This allows you to work with the history as a series of patches and rearrange the content of those patches into a clean history for submission upstream.
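For readers who want to try the history-cleanup half of this tip, here is a minimal sketch, not Ceph's actual workflow. It assumes git is installed and on the PATH, builds a throwaway repository in a temporary directory, and folds a follow-up commit into the commit it corrects. The helper names (git, demo_autosquash), the file notes.txt, and the placeholder identity are all hypothetical. Because an interactive rebase normally opens an editor, the sketch substitutes git commit --fixup plus git rebase -i --autosquash, with GIT_SEQUENCE_EDITOR set to "true" so the pre-arranged todo list is accepted unchanged. On the command line, git add -p offers hunk-by-hunk staging similar to what git gui provides.

```python
import os
import subprocess
import tempfile


def git(repo, *args, env=None):
    """Run a git command inside `repo` and return its stdout as text."""
    result = subprocess.run(
        ["git", *args], cwd=repo, env=env, check=True,
        capture_output=True, text=True,
    )
    return result.stdout.strip()


def demo_autosquash():
    """Build a throwaway repo, then fold a fixup commit into its target."""
    repo = tempfile.mkdtemp(prefix="rebase-demo-")
    git(repo, "init", "-q")
    # Placeholder identity and settings, local to this temporary repo only.
    git(repo, "config", "user.name", "Example Dev")
    git(repo, "config", "user.email", "dev@example.com")
    git(repo, "config", "commit.gpgsign", "false")
    path = os.path.join(repo, "notes.txt")

    def commit(text, *commit_args):
        # Append one line to the file, stage it, and commit it.
        with open(path, "a") as handle:
            handle.write(text + "\n")
        git(repo, "add", "notes.txt")
        git(repo, "commit", "-q", *commit_args)

    commit("base", "-m", "Add notes")
    commit("feature", "-m", "Add feature line")
    # --fixup records a commit titled "fixup! <subject of the target commit>".
    commit("follow-up", "--fixup=HEAD")

    before = git(repo, "log", "--format=%s").splitlines()

    # --autosquash reorders the todo list so the fixup follows its target;
    # the no-op "true" editor accepts that list without opening an editor.
    env = dict(os.environ, GIT_SEQUENCE_EDITOR="true")
    git(repo, "rebase", "-q", "-i", "--autosquash", "HEAD~2", env=env)

    after = git(repo, "log", "--format=%s").splitlines()
    return before, after


before, after = demo_autosquash()
print("before:", before)
print("after: ", after)
```

Run as a script, this should show the history shrinking from three commits to two, with the follow-up change absorbed into the feature commit. In day-to-day use you would drop the GIT_SEQUENCE_EDITOR override and edit the todo list by hand to reorder, squash, or reword commits before submitting them upstream.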
Any Git success stories you can share?
Just a couple of months ago we merged a patch set that had diverged for six months, with hundreds of commits and many conflicts. That merge would not have ended successfully using any other tool. Trying to do the same thing with SVN (at least as it existed when we made the switch) would have been very challenging and would not have provided a clear view of the code changes. The whole way of thinking about code flow is now different.
Anything else you'd like to say to mark the 10-year anniversary?
It's hard to believe Git is 10 already. Ten years ago using Git was quite a pain. There were all these wrappers and frontends that made it easier to contribute to a project that was using Git. There is no such need anymore. Usability has improved and users have learned to understand what it’s all about (and internalize the UI inconsistencies and idiosyncrasies). I think that’s a success story.