Measuring the Health of Open Source Communities

Abstract: Tracking different types of metrics is essential for free and open source communities. Metrics give projects insight into specific efforts and help them get a feel for the community’s general perception. Tools that pull data from various sources and visualize it help projects make informed decisions.

If you manage or want to be part of an open source project, you might have wondered if the project is healthy or not and how to measure key performance indicators relating to project health. 

You could choose to analyze different aspects of the project: its technical health (for example, the number of forks on GitHub, the number of contributors over time, and the number of bugs reported over time), its financial health (donations and revenue over time), its social aspects (social media mentions, post shares, and sentiment analysis across social media channels), and its diversity and inclusion aspects (having a code of conduct, creating inclusion activities at events, and using color-blind-accessible materials in presentations and project front-end designs).

The question is, how do you measure such aspects? To determine a project’s overall health, metrics should be computed and analyzed over time. It’s helpful to have such metrics in a dashboard to facilitate analysis and decision-making.
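
As a minimal sketch of what computing a metric over time can look like, the Python snippet below counts the distinct contributors active in each month. The commit records, author names, and the `contributors_per_month` helper are invented for illustration; a real project would pull this data from its version control history or a dashboard tool.

```python
from collections import defaultdict
from datetime import date

# Hypothetical commit records as (author, commit date); made-up sample data.
COMMITS = [
    ("alice", date(2021, 1, 5)),
    ("bob", date(2021, 1, 20)),
    ("alice", date(2021, 2, 3)),
    ("carol", date(2021, 2, 14)),
    ("dave", date(2021, 2, 27)),
    ("alice", date(2021, 3, 9)),
]


def contributors_per_month(commits):
    """Return {"YYYY-MM": number of distinct authors}, ordered by month."""
    authors_by_month = defaultdict(set)
    for author, day in commits:
        # A set ensures each author is counted once per month.
        authors_by_month[day.strftime("%Y-%m")].add(author)
    return {month: len(authors) for month, authors in sorted(authors_by_month.items())}


if __name__ == "__main__":
    for month, count in contributors_per_month(COMMITS).items():
        print(month, count)
```

Plotting these monthly counts gives the kind of trend line a dashboard would show, which is usually more informative than a single snapshot.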

Why do metrics matter?

“The goal here is not to construct an enormous vacuum cleaner to suck every tiny detail of your community into a graph. The goal is instead to identify what we don’t know about our community and to use measurements as a means to understand those things better.”

The Art of Community – Jono Bacon

Open source software needs community. By knowing more about the community through different metrics, stakeholders can make informed decisions. For example, developers can select the best project to join, maintainers can decide which governance measures are effective, end-users can select the healthier project that will live longer (and prosper), and investors can select the best project to invest in [1]. 

Furthermore, Open Source Program Offices (OSPOs), i.e., offices inside companies that manage the open source ecosystems the company depends on [5], can assess a project’s health and sustainability by analyzing different metrics. OSPOs are becoming very popular because around 90% of the components of modern applications are open source [6]. Thus, measuring the risks of consuming, contributing to, and releasing open source software is very important to an OSPO [5].

How do we define which metrics to evaluate?

  • Set your goals: Measuring without a goal is pointless. Goals are concrete targets that state what the community wants to achieve [3].
  • Find reliable statistical sources: After defining your goals, identify the sources that can help you track them. It is essential to find ways to get statistics on the most important goals [4]. Some statistics are readily available: on GitHub, for example, you can collect the number of stars, forks, and contributors for a repository. It is also possible to get the number of mailing list subscribers and project website visits. Other statistics are not so obvious, and you might need tools to help extract them.
  • Interpret the statistics: Interpret the statistics regarding the “4 P’s”: People, Project, Process, and Partners [4]. 
    • Look at the numbers mostly related to the People in the community, such as contributors’ productivity, which channels have the most impact, etc. 
    • Then, look at the velocity and maturity of your Project, such as the number of PRs, and the number of issues. 
    • After that, look at the maturity of your Process, i.e., what’s your review process? How long does it take to solve an issue? 
    • Finally, look at the ecosystem view regarding your Partners, that is, statistics on project dependencies and projects that depend on you.
  • Use dashboards to evaluate your metrics: Many existing tools, such as LFX Insights, Bitergia, and GrimoireLab, help create dashboards to analyze and measure open source community health.
  • Make changes: After measuring, it is necessary to make changes based on those measurements.
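
To make the "find statistics" and "interpret" steps above more concrete, here is a small Python sketch. It reads popularity figures from a repository record and answers the Process question "How long does it take to solve an issue?" with a median. The field names (`stargazers_count`, `forks_count`, `created_at`, `closed_at`) follow those used in GitHub's REST API responses, but the values are invented sample data, the helper functions are hypothetical, and no network request is made.

```python
from datetime import datetime
from statistics import median

# Timestamp layout assumed for the sample data (ISO 8601, UTC, "Z" suffix).
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

# Made-up sample shaped like a repository record.
SAMPLE_REPO = {"stargazers_count": 120, "forks_count": 35}

# Made-up sample shaped like a list of issue records.
SAMPLE_ISSUES = [
    {"created_at": "2021-03-01T09:00:00Z", "closed_at": "2021-03-03T09:00:00Z"},
    {"created_at": "2021-03-02T12:00:00Z", "closed_at": "2021-03-10T12:00:00Z"},
    {"created_at": "2021-03-05T08:00:00Z", "closed_at": "2021-03-06T08:00:00Z"},
    {"created_at": "2021-03-07T10:00:00Z", "closed_at": None},  # still open
]


def popularity_summary(repo):
    """Pick the 'apparent' statistics out of a repository record."""
    return {"stars": repo["stargazers_count"], "forks": repo["forks_count"]}


def days_to_close(issue):
    """Days between opening and closing a single closed issue."""
    opened = datetime.strptime(issue["created_at"], TIMESTAMP_FORMAT)
    closed = datetime.strptime(issue["closed_at"], TIMESTAMP_FORMAT)
    return (closed - opened).total_seconds() / 86400


def median_days_to_close(issues):
    """Median resolution time over closed issues; None if none are closed."""
    durations = [days_to_close(issue) for issue in issues if issue.get("closed_at")]
    return median(durations) if durations else None


if __name__ == "__main__":
    print(popularity_summary(SAMPLE_REPO))
    print(median_days_to_close(SAMPLE_ISSUES))
```

The median is used here rather than the mean because a few very old issues can otherwise dominate the figure; that is a design choice for this sketch, not something the cited sources prescribe.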

Learning from examples

Different projects use different strategies to measure the project’s health. 

The CHAOSS Community creates analytics and metrics to help understand project health. They have many working groups, each one focusing on a specific kind of metric. For example,

  • The Diversity and Inclusion working group focuses on the diversity and inclusion in events, how diverse and inclusive the governance of a community is, and how healthy the community leadership is. 
  • The Evolution working group creates metrics for analyzing the type and frequency of activities involved in software development, improving the project quality, and community growth. 
  • The Value working group creates metrics for identifying the degree to which a project improves people’s lives beyond the software project, the degree to which the project is valuable to a user or contributor, and the degree to which the project is monetarily valuable from an organization point of view. 
  • The Risk working group creates metrics to understand the quality of a specific software package, identify potential intellectual property issues, and assess how transparent a given software package is concerning licenses, dependencies, etc.

The Mozilla project collaborated with Bitergia and Analyse & Tal to build an interactive network visualization of Mozilla’s contributor communities. By visualizing different metrics, they found that Mozilla has not just one community but many communities concerning different areas of contribution, motivations, engagement levels, etc. Based on that, they built a report to visualize how these different communities are interconnected.

LFX Insights

Many projects such as Kubernetes and TARS use the LFX Insights tool to analyze their community. 

The LFX Insights dashboard helps project communities evaluate different metrics concerning open source development to grow a sustainable open source ecosystem. The tool has distinct features to support various stakeholders [2], such as

  • Maintainers and project leads can get multi-dimensional reporting on the project, avoid maintainer burnout, and ensure the project’s health, security, and sustainability.
  • Project marketers and community evangelists can use the metrics to attract new members, engage the community, and identify opportunities to increase awareness.
  • Members and corporate sponsors can know which community and software to engage in, communicate the impact within the community, and evaluate their employees’ open source contributions.
  • Open source developers can know where to focus their efforts, showcase their leadership and expertise, and manage their affiliations and their impact.

For source code repositories, the dashboard includes the number of commits in total and by contributor, the number of contributors, the top contributors by commits, and the companies that mainly contribute to the project. Users can extract pull requests (PRs) from many tools such as Gerrit and GitHub. Furthermore, users, maintainers, and contributors to Linux Foundation projects, such as TARS, can extract various metrics from LFX Insights.

Similarly to commits, the tool reports the number of PRs in total, by contributor, and by company. It also calculates the average time to review a PR and lists the PRs that are still to be merged. You can also extract metrics for issues and continuous integration tools. Besides that, LFX Insights allows projects to collect communication and collaboration information from different communication channels such as mailing lists, Slack, and Twitter.
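
For readers who want to see how such PR metrics can be derived, the following Python sketch computes an average time to merge and lists unmerged PRs. It is not how LFX Insights is implemented: the `PullRequest` record, the sample data, and both helper functions are hypothetical, and time from opening to merging is used as a simple stand-in for review time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class PullRequest:
    """Hypothetical PR record; merged is None while the PR is still open."""
    author: str
    company: str
    opened: datetime
    merged: Optional[datetime] = None


# Made-up sample data.
SAMPLE_PRS = [
    PullRequest("alice", "ExampleCorp", datetime(2021, 4, 1, 9), datetime(2021, 4, 1, 21)),
    PullRequest("bob", "ExampleCorp", datetime(2021, 4, 2, 9), datetime(2021, 4, 3, 21)),
    PullRequest("carol", "OtherOrg", datetime(2021, 4, 3, 9)),
]


def average_hours_to_merge(prs):
    """Mean hours from opening to merging, over merged PRs only."""
    hours = [
        (pr.merged - pr.opened).total_seconds() / 3600
        for pr in prs
        if pr.merged is not None
    ]
    return sum(hours) / len(hours) if hours else None


def unmerged(prs):
    """PRs that are still waiting to be merged."""
    return [pr for pr in prs if pr.merged is None]


if __name__ == "__main__":
    print(average_hours_to_merge(SAMPLE_PRS))
    print([pr.author for pr in unmerged(SAMPLE_PRS)])
```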

Projects might have different goals when using LFX Insights. The TARS project, part of the TARS Foundation, uses the LFX Insights tool to get a big picture of each sub-project (such as TARSFramework, TARSGo, etc.). Through the dashboards created by the LFX Insights tool, the TARS community can know the statistics of each project and the community as a whole (see Figures 1 and 2).

Using the LFX Insights tool, the TARS community analyzes how many people contribute to each project and which organizations contribute to TARS. Additionally, they extract the number of commits and lines of code contributed by each contributor. The TARS community believes that by analyzing such metrics, they can attract and retain more contributors.
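
The aggregation behind such per-contributor and per-organization numbers can be sketched in a few lines of Python. The commit tuples, names, and the `summarize` helper below are invented for illustration and do not reflect real TARS data or the LFX Insights implementation.

```python
from collections import Counter

# Made-up sample: (contributor, organization, lines changed) for each commit.
SAMPLE_COMMITS = [
    ("alice", "ExampleCorp", 120),
    ("bob", "ExampleCorp", 30),
    ("alice", "ExampleCorp", 45),
    ("carol", "OtherOrg", 200),
]


def summarize(commits):
    """Tally commits and lines per contributor, and commits per organization."""
    commits_by_contributor = Counter()
    lines_by_contributor = Counter()
    commits_by_org = Counter()
    for contributor, org, lines in commits:
        commits_by_contributor[contributor] += 1
        lines_by_contributor[contributor] += lines
        commits_by_org[org] += 1
    return commits_by_contributor, lines_by_contributor, commits_by_org


if __name__ == "__main__":
    by_commits, by_lines, by_org = summarize(SAMPLE_COMMITS)
    print(by_commits.most_common(1))  # top contributor by commit count
    print(by_lines.most_common(1))    # top contributor by lines changed
    print(dict(by_org))
```

Note that ranking by commits and ranking by lines of code can disagree, as they do in this sample, so it helps to look at both before drawing conclusions about who contributes most.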

About the authors: 

Isabella Ferreira is an Ambassador at the TARS Foundation, a cloud-native open-source microservice foundation under the Linux Foundation.

Mark Shan is the Chair at Tencent Open Source Alliance and also Board Chair of the TARS Foundation Governing Board. 

REFERENCES

[1] Jansen, Slinger. “Measuring the health of open source software ecosystems: Beyond the scope of project health.” Information and Software Technology 56.11 (2014): 1508-1519.

[2] https://www.youtube.com/watch?v=hwTOrDg3LsI

[3] https://opensource.com/bus/16/8/measuring-community-health

[4] https://dzone.com/articles/-measuring-metrics-in-open-source-projects

[5] https://opensource.com/article/20/5/open-source-program-office

[6] https://fossa.com/blog/building-open-source-program-office-ospo/