The issue of open data has been on the minds of many technology users and watchers of late, thanks to the media blitzkrieg surrounding the WikiLeaks Web site and the legal battles facing WikiLeaks founder Julian Assange. In the face of such a controversial issue, it's only natural to ask who owns data and what rights they have to use it.
A more pedestrian, and far less controversial, application of the open data issue is this: who owns the data you put on everyday Web services, such as Facebook, Yahoo!, or one of the many Google Web applications? The obvious answer, "Why, me, of course," may not actually be true.
This is because data on the various Web services is often locked into the service itself. If Gmail were to go off the air tomorrow, how many millions of users would lose not only the ability to communicate, but also all of their archived messages? Are messages regularly backed up? In fact, how does one back up Gmail? (Hint: use a standalone POP or IMAP client and pull down the messages to your local machine... and hope you have enough storage capacity.)
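For readers who would rather script that hint than configure a mail client, here is a minimal sketch using Python's standard imaplib module. The function names (connect, backup_mailbox) and the one-file-per-message layout are illustrative assumptions, not anything Google prescribes, and the sketch presumes IMAP access is enabled on the account and that the server accepts a password login.

```python
import imaplib
from pathlib import Path


def connect(host, user, password):
    """Open an encrypted IMAP session (hypothetical helper for this sketch)."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    return conn


def backup_mailbox(conn, mailbox, dest_dir):
    """Save each message in `mailbox` as <UID>.eml under `dest_dir`.

    `conn` is an already-authenticated imaplib connection.
    Returns the number of messages newly written.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Open read-only so the backup never alters flags on the server.
    status, _ = conn.select(mailbox, readonly=True)
    if status != "OK":
        raise RuntimeError(f"could not open mailbox {mailbox!r}")

    # Ask for UIDs rather than sequence numbers, so file names
    # stay meaningful from one run to the next.
    status, data = conn.uid("SEARCH", None, "ALL")
    if status != "OK":
        raise RuntimeError("UID SEARCH failed")

    saved = 0
    for uid in data[0].split():
        target = dest / f"{uid.decode()}.eml"
        if target.exists():
            continue  # already backed up on an earlier run
        status, parts = conn.uid("FETCH", uid, "(RFC822)")
        if status != "OK" or not parts or not isinstance(parts[0], tuple):
            continue  # skip anything the server did not return in full
        target.write_bytes(parts[0][1])  # raw message exactly as stored
        saved += 1
    return saved
```

Calling connect("imap.gmail.com", address, password) and passing the result to backup_mailbox(conn, "INBOX", "gmail-backup") would pull the Inbox into a local folder; other folders and labels would each need their own call. The storage caveat still applies, since every message is written to disk in full.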
And that's just Gmail: what about all the other Google services? Google's own solution, the Data Liberation Front, is a good start, but who has an open source stack equivalent to Google's to which to move the data? Without such a stack, what good is exporting a Google user's data?
Then there is the issue of control, another hallmark of ownership. Who sees the data on Gmail? What processes are looking at those messages in the Inbox, seeing a few occurrences of the word "habit," and concluding an ad for a twelve-step program would be appropriate to display?
For all the emphasis the open source community places on software freedom, that same community is often willing to sacrifice the same measure of freedom for its data, a far more personal aspect of its members' online lives.
Open data is a solution to a problem that many users may not recognize they have: how to keep their data accessible and under their control in an environment where that data is increasingly online, typically out of their control, and (at times) inaccessible.
Creating an Open Web
One strong proponent of open data on the Web is Stormy Peters, the former director of the GNOME Foundation and currently head of Mozilla's developer engagement program. Peters often speaks about the importance of open data, addressing the topic at this year's OSCON and at the Ohio LinuxFest, where this reporter attended her presentation. Peters cited several examples of services where data can get out of control. Facebook, for example, notoriously holds onto user data rather tightly, to the point where it's difficult for users to completely export their own data to another service, should they choose. Not to mention Facebook's ongoing privacy travails.
Using ReclaimPrivacy.org is simple: just drag the bookmarklet up to your bookmarks, log into Facebook, and surf to your privacy settings. Once on the page, click the bookmarklet link and the script will scan for slack privacy settings.
Facebook is just one example in a very wide world of data and privacy problems. Many Web-enabled services used today have a less-than-open data policy in place.
Peters notes some positive exceptions in her presentations: Identi.ca, the open source microblogging service, not only has open code, but open data policies as well. Similar openness is expected to be found on Diaspora, an open source version of Facebook's social services now in alpha development.
But will "open" be enough of a draw? Web service users might not want to go through the hassle of moving to more open systems just for privacy's sake, which will certainly lower customer pressure for open data. And, with so much money involved with mining and keeping customer data, don't expect openness to magically spread through service providers' policies without significant customer pressure.