May 7, 2008

Creative Commons promotes standard license expression

Author: Bruce Byfield

If Creative Commons (CC) has any say in the matter, the Web will soon have a standard machine-readable notation for licenses. Named the Creative Commons Rights Expression Language (ccREL), the notation has been under development for the last few years, partly with the cooperation of the World Wide Web Consortium (W3C). It is described in a paper by four Creative Commons employees and published by Communia, a European site that explores the relationship between technology and the public domain. Creative Commons plans future presentations of ccREL, and is also actively explaining the need for it -- which is what CC's Chief Technology Officer, Nathan Yergler, was doing when we caught up with him at the recent Open Web Conference in Vancouver.

As the preliminary paper explains, the goal is to develop a consistent structure for describing a license so that it can be used with a variety of markup languages. At the same time, the structure must be flexible enough to apply to a variety of media, and easy to modify as online works develop and the needs of their creators and users change.

This standardization effort is specialized but important because it defines exactly what a license is, and does so for all markup languages instead of leaving each to create its own definition. A consistent structure also means that separate tools for each language are not needed to extract license information.

Technically, Yergler says, calling ccREL a language is "a bit of a misnomer. ccREL is probably more accurately labeled ccRights Expression, Vocabulary, and Recommendations." In other words, ccREL is more of a definition than a language, although it already seems too late to consider changing the name.

The basis of CC's efforts to create this structure is the Resource Description Framework (RDF), a family of W3C specifications for describing metadata -- information about embedded objects -- that is a key part of the W3C's plans for the next stage in the evolution of the Web.

Originally, CC tried developing a licensing structure using a variation of RDF for XML called RDF/XML. However, RDF/XML's structure was verbose and did little to make extraction of the information consistent or extensible. "When we first started doing this," Yergler says, "the metadata was included inside an HTML comment, and while that seemed to be the best option at the time, there are lots of reasons why it's not great. Parsers are free to throw away comments, and typically humans never see them -- never mind that a lot of publishing software will escape all of your angle brackets." Clearly, an alternative was needed.

As a result, in 2004, CC began working with the W3C on RDFa, an RDF syntax that uses existing HTML structures as much as possible and adds a few new ones, so that the information is both human and machine readable. Yergler says RDFa is at the "last call" stage before being accepted by the W3C. The original RDF/XML approach is now officially deprecated by CC.

ccREL defines two types of properties for a license: Work properties and License properties. The standard requires that at least the license be defined in the Work properties, although, for a CC license, a statement will generally include a License property that defines what type of CC license is used. Users are also free to add other Work properties, such as title, and other License properties, such as jurisdiction.
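To make the idea of Work properties concrete, here is a minimal sketch in Python of how a tool might pull the license link and a couple of descriptive properties out of RDFa-style HTML. It is not a real RDFa parser: it uses only the standard library's html.parser, ignores namespace resolution and nested elements, and the page fragment, URLs, and the names LicenseExtractor and extract_license_info are hypothetical, chosen for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the work, author, and URLs are made up.
SAMPLE = '''
<div xmlns:cc="http://creativecommons.org/ns#"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <span property="dc:title">Sunset Photos</span> by
  <a rel="cc:attributionURL" property="cc:attributionName"
     href="http://example.org/alice">Alice</a> is licensed under a
  <a rel="license"
     href="http://creativecommons.org/licenses/by/3.0/">CC Attribution 3.0 License</a>.
</div>
'''


class LicenseExtractor(HTMLParser):
    """Collects rel/href links and property text from RDFa-style attributes."""

    def __init__(self):
        super().__init__()
        self.links = {}        # rel value -> href
        self.properties = {}   # property name -> element text
        self._current = None   # property of the element being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        # A rel attribute may hold several space-separated values.
        for rel in (attrs.get("rel") or "").split():
            if href:
                self.links[rel] = href
        self._current = attrs.get("property")

    def handle_data(self, data):
        if self._current:
            previous = self.properties.get(self._current, "")
            self.properties[self._current] = previous + data

    def handle_endtag(self, tag):
        # Simplification: assumes property-bearing elements are not nested.
        self._current = None


def extract_license_info(html):
    """Return the license URL plus any other properties found in the markup."""
    parser = LicenseExtractor()
    parser.feed(html)
    parser.close()
    return {
        "license": parser.links.get("license"),
        "links": parser.links,
        "properties": {k: v.strip() for k, v in parser.properties.items()},
    }


if __name__ == "__main__":
    print(extract_license_info(SAMPLE))
```

Because the same attributes sit on ordinary visible elements, the text a person reads and the data a crawler extracts come from the same markup, which is the point of moving away from metadata hidden in comments.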

According to Yergler, this flexibility beyond the bare essentials has the benefit of making individual users of the licenses the arbiters of what information goes into the definition, rather than making CC "the one true registry of copyright information," a position that it has neither the desire nor the ability to assert. Yergler says that the point is to allow users -- whether humans or crawlers and aggregators -- to quickly answer questions about your content, such as "How can I reuse it? Can I use it for commercial purposes? Can I make derivative works? And if I do reuse it, who do I actually attribute it to?"
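As a rough illustration of how software could answer those questions once the license information has been extracted, the sketch below models a license as three sets of terms (what it permits, requires, and prohibits) and derives the answers from them. The term names follow the style of the Creative Commons vocabulary, but the LicenseDescription class, the hand-written BY_NC data, and the answer_reuse_questions function are assumptions made for this example, not part of any published tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseDescription:
    """Hypothetical in-memory form of a license's properties."""
    permits: frozenset
    requires: frozenset
    prohibits: frozenset


# Hand-written description of an Attribution-NonCommercial style license,
# for illustration only; a real tool would read this from published metadata.
BY_NC = LicenseDescription(
    permits=frozenset({"Reproduction", "Distribution", "DerivativeWorks"}),
    requires=frozenset({"Notice", "Attribution"}),
    prohibits=frozenset({"CommercialUse"}),
)


def answer_reuse_questions(lic, attribution_name=None):
    """Turn a license description into answers to common reuse questions."""
    return {
        "commercial_use": "CommercialUse" not in lic.prohibits,
        "derivative_works": "DerivativeWorks" in lic.permits,
        # Only report a name to credit if the license requires attribution.
        "attribute_to": attribution_name if "Attribution" in lic.requires else None,
    }


if __name__ == "__main__":
    print(answer_reuse_questions(BY_NC, attribution_name="Alice"))
```

The answers are only a starting point: as Yergler notes, whether to trust the published assertions in the first place remains a human judgment.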

"Of course, humans are inherently better at making certain judgments than machines about such things," Yergler says. "For example, I might trust assertions that are published on somebody's own domain more than ones on GeoCities, just because I have this human bias about what GeoCities has on it these days. But we can let our software give us a starting point and have it answer a lot of simple things for us."


CC feels relatively confident about ccREL. "We have been taking comments and feedback for the past few years, so we've gotten good insight into this," Yergler says. "We also have a really large corpus of data about what people are doing with licenses."

All the same, the next step is to encourage the use of ccREL -- a process that might take years before the group can declare success or failure. "We are working now on establishing metrics, so that, as opposed to having just a general positive feeling about our performance, we can say, 'Yes, we did meet these goals,'" Yergler says. He and the other authors of the specification paper will continue to present it at conferences in the coming months, as well as work with tool makers to implement the standard and encourage content sites to use ccREL. The other authors are Hal Abelson, a member of the CC board of directors; Ben Adida, a member of the CC technical advisory board; and Mike Linksvayer, CC vice president.

Meanwhile, Yergler notes that ccREL already shows signs of gaining acceptance. "It has already been more successful than our previous recommendation," he says, "because we are getting promotional Web sites interested in it and how they can implement it. But it would make me think it truly successful if we got other organisations that also provide licenses, whether for software or content, to also start to describe their licenses in ccREL. That would be a huge measure of success."

Already, Yergler and other members of CC have approached the Free Software Foundation and Freedom Defined, a free culture group that he describes as having a more specific definition of "free" than CC. However, both of these discussions are in preliminary stages.

Talking about those who want to see the CC licenses become more limited and those who would see them relaxed, he says, "My personal feeling is that we have actually done a pretty good job of hitting the middle ground, where we have people pissed off at both ends. If anyone was completely happy with us, we'd probably be too far in one direction."

All the same, much remains to be done in order to promote ccREL. For now, the most that can be said is that Creative Commons has made a determined start.

