When Fedora 7 was released, one of its standout features was Smolt, an opt-in program for collecting data about users' hardware. Since then, Smolt has provided a publicly available snapshot of systems running Fedora, and is in the process of being ported to other distributions. With features being rapidly added, Smolt has the potential to offer an unprecedented wealth of information, and to aid in quality assurance, tech support, and advocacy, not only for Fedora, but for GNU/Linux in general.
Smolt's scope has grown rapidly during development. The basic idea came to Mike McGrath, Smolt's lead developer, last January, when he saw a need for a hardware inventory management system at his day job. However, McGrath soon realized that such information would also be useful for quality assurance work with GNU/Linux.
"I was only going to target Fedora," he says. "But then I realized that other Linux operating systems wouldn't be too difficult to add. So I took a look and realized it was a two-line patch to get it working in SUSE." Since then, McGrath reports, he has heard from developers interested in porting Smolt to other distributions, including Debian, Ubuntu, and Mandriva.
When Smolt was announced, one concern that quickly emerged was the potential for invasion of privacy. As a result of early criticism, McGrath says, "I made sure to take every step I could to let people not use it and ensure that it really is private." Smolt runs in Fedora 7's first boot wizard, and can be run at any later time, but users must actively choose to transmit their hardware information. To further guard privacy, profiles are stored with a randomly generated identifier, rather than by name -- although users may choose to give up their privacy when talking to Fedora members about their configuration problems. In addition, no hardware IDs or Internet addresses are stored in the database, and "we don't store the Web logs and database on the same machine," McGrath says. "It's a very anonymous submission, and those still having concerns can always not submit."
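The safeguards described above (explicit opt-in, a random identifier in place of a name, and no hardware IDs or Internet addresses in the stored record) can be illustrated with a short Python sketch. It is not Smolt's actual code: the field names, the function names, and the choice of a UUID as the identifier are all assumptions made for illustration.

```python
import uuid

# Hypothetical field names; the article does not describe Smolt's real schema.
IDENTIFYING_FIELDS = {"hostname", "ip_address", "mac_address", "serial_number"}


def new_profile_id():
    """Return a random identifier that is not derived from the user or hardware."""
    return str(uuid.uuid4())


def anonymize(raw_profile):
    """Copy the profile, dropping fields that could identify the machine or owner."""
    return {k: v for k, v in raw_profile.items() if k not in IDENTIFYING_FIELDS}


def build_submission(raw_profile, opted_in):
    """Return the payload to send, or None when the user has not opted in."""
    if not opted_in:
        return None
    return {"id": new_profile_id(), "profile": anonymize(raw_profile)}
```

In this sketch, nothing is produced unless the user opts in, and the only link between a user and a stored profile is the random identifier, which the user can choose to share when asking for help.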
These safeguards seem to have removed much of the initial concern about Smolt. As of late July, the project had collected data on almost 90,000 systems. Individual profiles can be viewed if you know their identifiers, but the most fascinating aspect is the overall view of the systems that run Fedora.
For instance, according to the publicly available stats, 78% of respondents were running Pentium IIIs or equivalents, and 20% were running x86_64 systems. The dominant language was US English at 67%, with Japanese a distant second at 6.1%. Some 34% were running systems with less than half a gigabyte of RAM, 37% were using half to a gigabyte, and 24% over two gigabytes. This is concrete information of a sort rivaled only by a handful of projects, such as the Debian popularity-contest, which records what packages users download, and the recently announced Ingimp, which is collecting data on how people use the GIMP.
Using the information
McGrath sees several uses for the information collected by Smolt. At the most basic level, it can be used to improve quality assurance for the distributions that participate. In Fedora, McGrath says, "We've already seen bugzilla reports that have said, 'I'm having issues with hibernation, here's my Smolt profile.' It helps the developer because they know exactly what they're looking at, and can find an identical machine to test with." Moreover, Smolt can improve response time, because, instead of answering each help request individually, in many cases developers can refer users to a profile similar to their own for more information. Eventually, tech support notes could even be stored according to Smolt profiles, so that users could go to a profile and find a list of steps that were needed to overcome problems with a particular hardware configuration.
Similarly, the recorded information can help in planning new releases of a distribution. As an example, McGrath points to the language statistics collected so far, noting that only two-thirds of the systems recorded are using American English. "That blew me away," he says. "And that number continues to slowly creep down as other languages creep up." For McGrath, this statistic is evidence of "the need to make sure that everything is translated."
In much the same way, statistics about processor speed or RAM can help developers set the minimum system requirements for a new release. Instead of guessing, they can use actual statistics to learn what hardware their work will run on. Should they decide to increase the system requirements, they can see what percentage of users they might leave behind.
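As a rough illustration of that last calculation, the Python sketch below totals the share of reporting systems that would fall below a proposed RAM minimum. It uses only the two lowest RAM figures quoted earlier in the article. The bucket layout and the function name are assumptions, and each bucket's upper bound is treated as exclusive.

```python
# Shares of reporting systems by RAM bucket, from the figures quoted in the article.
# Each entry is (exclusive upper bound of the bucket in MB, percentage of systems).
RAM_BUCKETS = [
    (512, 34),   # less than half a gigabyte
    (1024, 37),  # half a gigabyte up to one gigabyte
]


def share_left_behind(min_ram_mb, buckets=RAM_BUCKETS):
    """Sum the percentages of buckets that lie entirely below the proposed minimum."""
    return sum(pct for upper_mb, pct in buckets if upper_mb <= min_ram_mb)


if __name__ == "__main__":
    for minimum in (256, 512, 1024):
        print(f"minimum {minimum} MB: {share_left_behind(minimum)}% left behind")
```

On these figures, a 512 MB minimum would exclude 34% of reporting systems, and a 1 GB minimum would exclude 71%.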
In the future, Smolt could also be used to encourage manufacturers to extend their GNU/Linux support. "We'll be able to say, 'Nvidia, look! All of these users can't fully use your cards because you won't open [source] your drivers,'" McGrath suggests. The more users and the more distributions that participate in Smolt, the more effective such advocacy work could be.
Future plans for Smolt
Smolt is still developing rapidly. In the future, McGrath hopes to refine the information that the program collects. He is thinking about dropping some collected statistics, such as whether a system has built-in speakers, and adding others, such as whether the system has been upgraded and whether it can suspend or hibernate with the distribution it is running.
Other refinements he would like to see include a rating system for profiles, so that people could see how well a given system works, and a query tool for the statistics Web page. Among other things, the query tool would let users filter information by distribution -- a feature that might help them decide which distribution is most suitable for their existing hardware.
For now, however, the main priority is to encourage the use of Smolt by other distributions. "I'd love to get it packaged for other distributions by people already in those communities," says McGrath, "and work with them to find things their developers would be interested in as well."