Email2git: Matching Linux Code with its Mailing List Discussions
TL;DR: Email2git is a patch retrieval system built for the Linux kernel.
It exists in two forms:
As a cregit plugin: retrieves the patches for selected source code tokens
As the email2git search tool: retrieves the patches for entered commit IDs
The Linux project's email-based reviewing process is highly effective in filtering open source contributions on their way from mailing list discussions towards Linus Torvalds' Git repository. However, once a contribution has been integrated, it can be difficult to link its Git commit back to the review comments in the mailing list discussions. This is especially true for commits that went through multiple versions (and hence review rounds), that belong to a multi-patch series, or that were cherry-picked.
To address these and other issues, we created email2git, a patch retrieval system built for the Linux kernel. For a given commit, the tool finds the email patch as well as the email conversation that took place during the review process. We are currently improving the system with support for multi-patch series and cherry-picking.
Email2git is available through two interfaces: as a cregit extension, and as a simple commit ID based search tool.
Email2git on cregit
The online cregit interface displays the Linux source code with highly accurate contributor information. cregit divides the source code into “code tokens” and reports the author of each token, which is more accurate than the line-level granularity of git blame.
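To make the difference in granularity concrete, here is a minimal sketch in Python. It is not cregit's implementation: the `Token` type, the author names, and the `commit_order` field are hypothetical and exist only for illustration. It shows how a one-token rename makes line-level blame credit the whole line to the last editor, while token-level attribution keeps the original author for the untouched tokens.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    text: str          # the token as it appears in the source
    author: str        # hypothetical author of the commit that introduced it
    commit_order: int  # hypothetical ordering; higher means more recent


# "alice" wrote `int count = 0;`, then "bob" renamed `count` to `total`.
line = [
    Token("int", "alice", 1),
    Token("total", "bob", 2),
    Token("=", "alice", 1),
    Token("0", "alice", 1),
    Token(";", "alice", 1),
]


def line_level_blame(tokens):
    """Line-level view: the whole line goes to whoever changed it last."""
    return max(tokens, key=lambda t: t.commit_order).author


def token_level_blame(tokens):
    """Token-level view: each token keeps the author who introduced it."""
    return [(t.text, t.author) for t in tokens]


line_author = line_level_blame(line)
per_author = Counter(author for _, author in token_level_blame(line))
print("line-level:", line_author)
print("token-level:", dict(per_author))
```

In this toy example the line-level view credits the entire line to the author of the rename, whereas the token-level view still attributes four of the five tokens to the original author.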
As of today, email2git extends cregit by linking the tokens in a particular kernel release with the original patch introducing them and the email discussions reviewing them.
In your browser, go to https://cregit.linuxsources.org, then click through the directories until you find a source code file of interest. Click on the file to open it, then click on any code token to display the links to the patch and its discussion.
Image 1: Linux source code as displayed on cregit.linuxsources.org.
The different colors in the source code represent the different contributors. The interface lets users hover over the “tokens” to display basic information such as commit date, commit message summary, and author.
Image 2: Tooltip displaying detailed information about the commit that introduced the token.
Now, users can also click on the token to display a list of links to the reviewed patch versions and email discussions (reviews) that introduced that token into the main kernel tree.
Image 3: Links to patches being displayed after user clicks on a token.
Since patches are often sent to several mailing lists, we provide links to all the different copies of a patch (when available) to give you access to as much discussion as possible.
Email2git commit search
We believe that email2git is a great addition to cregit, since it provides easy access to in-depth authorship information to anyone browsing the source code.
However, we understand that developers may be looking for reviews and discussion about a specific commit ID instead of having to browse the full code base on cregit. To address this, we created a simple commit ID based search. Paste the commit ID into the search box to retrieve the patch and discussion.
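If you script around the search, it can help to sanity-check a commit ID before pasting it. The sketch below is only an illustration and is not part of email2git: the function name `normalize_commit_id` is hypothetical, and accepting abbreviated IDs of 7 or more hexadecimal characters is an assumption (the search tool itself may expect the full 40-character ID).

```python
import re

# Assumption: a commit ID is 7 to 40 hexadecimal characters
# (an abbreviated or full SHA-1 hash).
_COMMIT_ID = re.compile(r"[0-9a-f]{7,40}")


def normalize_commit_id(raw):
    """Return a trimmed, lowercased commit ID, or None if it is not hex."""
    candidate = raw.strip().lower()
    if _COMMIT_ID.fullmatch(candidate):
        return candidate
    return None


# Arbitrary example strings, not real kernel commits.
print(normalize_commit_id("  0123456789ABCDEF0123456789abcdef01234567\n"))
print(normalize_commit_id("not-a-hash"))
```

Stripping whitespace and lowercasing first avoids failed lookups caused by copy-and-paste artifacts.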
Image 4: commit-based patch search.
The source code for this work can be found at: https://github.com/alexcourouble/email2git
Do not hesitate to email us at firstname.lastname@example.org if you have any questions. We welcome any feedback and suggestions!
If you would like to speak with us in person, we will be presenting email2git at Linux Plumbers 2018 on September 13 in the talk “email2git: A Cregit Plugin to Link Reviews to Git Commits”.
Alexandre Courouble is a Master’s student working under the supervision of Dr. Bram Adams at Polytechnique Montreal. As part of his degree, he is working on email2git and on a research project that aims to measure Linux developers’ expertise using dedicated metrics. Alex gave a related talk on cregit titled “Token-level git blame” at the 2016 Linux Plumbers Conference.