How SPDX Will Make FOSS License Compliance Easier



Compliance is a concept that’s been catching on with a lot of organizations and developers in the open source community these days, but it can be complicated. The Software Package Data Exchange (SPDX) working group aims to make compliance with FOSS licenses much easier.

The reasons for this are varied, but the core reason is very simple: free and open source software adoption is growing by leaps and bounds, and adopters have the challenge of properly complying with a whole new family of software licenses.

This is not to imply that compliance is strictly an open source issue. Simon Phipps, Chief Strategy Officer of ForgeRock, devoted an essay to the topic earlier this fall, in which he wrote that “[t]here are issues that companies who are shipping open source code as a part of products need to keep in mind, but in my view they are no more complex and burdensome than the issues arising from shipping proprietary software.”

So why the hubbub now?

While nothing inherent in free and open source software demands more compliance effort than proprietary software does, companies seem to be paying more attention to compliance for FLOSS-licensed software because the license is a bigger part of why the software was adopted in the first place. Another reason is that FLOSS licenses can require developers to actually do something with the changes they make to FLOSS code, such as releasing the modified source under the same license when they distribute it. With proprietary licenses, compliance is usually a one-way street: you either have permission to use the software (typically because you paid for it) or you do not.

There are over 60 open source licenses sanctioned by the Open Source Initiative, and those alone can interact in complex ways. Factor in the almost 2,000 known proprietary licenses, and the potential interactions become mind-boggling.

This is the kind of meat-and-potatoes work that keeps companies like OpenLogic and Black Duck, as well as in-house compliance teams, pretty busy. Often the work is exceedingly difficult because the code files themselves may not carry any information about their license. This is true even of open source code, despite the fact that the source is easily available to read. View any file of free and open source code and you may find license information in the file itself, or you may not. And that doesn’t even factor in the individual components that may be found within the code.

“Free” or “open,” it seems, does not always equate to “informative.”

The solution, then, is to provide some information with the code that will convey the proper licensing, copyright, and component information for any code — FLOSS or proprietary — in such a way that compliance policies can easily be created and monitored.
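To make the idea concrete, here is a minimal sketch of how a compliance policy might be checked once licensing and copyright information travels with the code. Everything in it is an assumption made for illustration: the `FileRecord` fields, the `APPROVED` license list, and the sample records are hypothetical and are not taken from the SPDX specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileRecord:
    """Hypothetical per-file metadata; not the SPDX data model."""
    path: str
    license_id: Optional[str]        # None when the file carries no license information
    copyright_holder: Optional[str]  # None when no copyright notice was found


# Hypothetical company policy: licenses approved for shipped products.
APPROVED = {"MIT", "BSD-3-Clause", "Apache-2.0"}


def check_policy(records, approved=APPROVED):
    """Return (path, problem) pairs for every record that needs review."""
    findings = []
    for rec in records:
        if rec.license_id is None:
            findings.append((rec.path, "no license information"))
        elif rec.license_id not in approved:
            findings.append((rec.path, f"license {rec.license_id} not approved"))
    return findings


# Invented sample data standing in for metadata shipped with a package.
records = [
    FileRecord("src/main.c", "MIT", "Example Corp"),
    FileRecord("src/util.c", None, None),
    FileRecord("lib/parser.c", "GPL-2.0", "A. Developer"),
]

for path, problem in check_policy(records):
    print(f"{path}: {problem}")
```

The point of the sketch is that, with the metadata in hand, the policy becomes a simple lookup. The hard part today is obtaining reliable records in the first place.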

This sort of metadata approach is the goal of the Linux Foundation’s SPDX working group, a FOSSBazaar-related team formed about the same time as the LF’s broader Compliance Program earlier this year.

On the surface, the mission of SPDX seems pretty simple: create a metadata framework that will properly convey the necessary information. Right away, an XML approach makes sense, since an extensible markup language gives developers the flexibility to add information-rich content while keeping it easy to mine for any data-gathering software tasked with keeping track of the metadata.

Indeed, that’s what the SPDX group is doing. Made up of about 20 organizations (including the aforementioned Black Duck and OpenLogic, as well as Canonical, HP, Red Hat, and the Mozilla Foundation, to name a few), it is building a Resource Description Framework (RDF) specification for XML-based metadata.
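As a rough illustration of why XML-based metadata is easy for tools to mine, the sketch below reads license information out of a small XML document using Python's standard library. The element and attribute names (`package`, `file`, `path`, `license`) are invented for this example; the actual SPDX format is still in beta and will differ.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample document; the structure is NOT the SPDX schema.
SAMPLE = """\
<package name="example-lib">
  <file path="src/main.c"><license>MIT</license></file>
  <file path="src/util.c"><license>MIT</license></file>
  <file path="lib/parser.c"><license>GPL-2.0</license></file>
  <file path="docs/readme.txt"/>
</package>
"""


def licenses_by_file(xml_text):
    """Map each file path to its declared license, or None if none is declared."""
    root = ET.fromstring(xml_text)
    result = {}
    for file_elem in root.findall("file"):
        # findtext returns None when the <license> child is absent.
        result[file_elem.get("path")] = file_elem.findtext("license")
    return result


found = licenses_by_file(SAMPLE)

# Tally licenses across the package, grouping undeclared files together.
summary = Counter(lic or "UNKNOWN" for lic in found.values())
for license_id, count in summary.items():
    print(license_id, count)
```

A tool built along these lines could walk many packages and report which licenses appear and which files declare nothing at all, without having to read the source files themselves.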

Based on the organization of the working group itself, there are three main paths to getting the specification developed: technical, legal, and business. There are teams for each of these aspects of the specification within the working group, with the Technical team handling the tools, documentation, and actual structure of the specification.

Currently, the specification is in beta, with plans for a release candidate in the first quarter of 2011. During the beta program, the SPDX working group is actively seeking pairs of partners who need to exchange information about the open source licenses associated with a software package, such as a company and a supplier, a Linux distribution and an upstream project, or a hardware supplier and a Linux distribution. Testing these sorts of relationships is key to making sure the metadata conveyed is actually useful for all parties.

The biggest advantage of the XML-based system that the SPDX group proposes is that, if done right, metadata can be conveyed about the contents of any code file, regardless of how proprietary the license for that code is. This will make compliance monitoring and policy-building much easier to manage for non-open source licenses, which form the vast majority of the licensing landscape.

Developer and corporate involvement will be the key to making a system like this work. Proprietary software vendors will only want to participate in such an infrastructure if their users are clamoring for compliance-assisting tools like SPDX. Given the still-rising adoption rate of FLOSS, the pressure on proprietary vendors should be enough to start industry-wide adoption.