For years, Jeremy Allison has been one of the better-known names in free software development. The lead developer of Samba's implementation of the SMB file server protocol, he is also generally credited as the project's co-creator. True, he jokes that the description means that "Tridg [Andrew Tridgell] did all the hard bits, but I was there," and claims not to be current with all aspects of the project -- yet, all the same, few have more of an overview of Samba. Recently, Allison took time to give his personal view of the challenges involved in the recent Samba 4.0 release, and of the directions in which Samba might be heading next.
Jeremy Allison, Samba developer
Any change within Samba is major news in free software, especially for system administrators who face coordinating multiple servers that run different operating systems. Although Allison notes that "we're less important than we used to be, because essentially Windows is less important," Samba remains the key component for interaction between different versions of Windows on one hand, and Linux, BSD variants, Solaris, AIX, and other Unix-based systems on the other.
Samba also supports an extensive range of OEMs (Original Equipment Manufacturers). Proprietary rivals have virtually ceased to exist, according to Allison, who adds, "I like to joke that if you go into Fry's [Electronics] and you buy a server, if it costs less than twenty thousand dollars or more than several million, it's got Samba in it." The advantages of free software for low-end products are obvious, but Allison explains that it is also essential for high-end ones because access to the source code is necessary to make specialized changes.
However, Samba 4.0 is important even by the project's usual standards. Released on 11 December 2012, Samba 4.0 is noteworthy as the first release to include a free software implementation of Microsoft's Active Directory protocols. It also includes support for the SMB 2.1 file protocol, cluster support for file servers, and major rewrites of other components. The release announcement describes it as "the culmination of ten years' work."
Asked why the release took so long, Allison emphasizes that "Samba" is not a single project so much as a collective name for "a box of parts you can use to build solutions around" -- not only the SMB file server, but also an LDAP server, a Heimdal Kerberos authentication server, a secure DNS server, and remote procedure calls for Active Directory. Every commit to each of these parts is tested using the so-called smbtorture test suite, which Allison describes as Samba's "move towards professional tests and development."
Many of Samba's components are modifications of other existing projects, and Samba developers often try to use these existing code bases. However, Allison says that, "to put all the pieces together, to make it a single whole so that it just works out of the box, is extremely hard." He cites one case in which Andrew Tridgell spent two weeks setting up the cryptographic changes to integrate code from another project, only to conclude that expecting users to do the same, even with a detailed how-to, was unrealistic. Often, it was simply easier to provide Samba's own modifications.
"It would be lovely if we could eventually integrate with existing [projects]," says Allison, "and, yes, we could end up doing so in the future. But right now, all we can guarantee works are the pieces that we've shipped and tested."
Added to this innate complexity was the difficulty of the process as a whole. "There was a great deal of stuff that had to be written, some experiments done that ended up being thrown away, and a new file server written that ended up not being used," Allison says.
In fact, the release:
"started as a re-write, and almost ended up as a fork. Essentially, what happened was that, while it was being written, the rest of Samba caught up. So, a few years ago, we decided that the best thing to do was just to put it all back together again, and that probably cost more time than putting the technical pieces together would have done. There were relationships that had to be rebuilt, code that had to be merged back together, things that instead of being re-implemented had to be re-unified into one single thing -- which caused a little friction, but not much. We're actually starting to be unified again."
How Microsoft Documentation Does (And Doesn't) Change Things
Another major reorientation occurred in 2007, when, thanks to the settlement of a European anti-trust case, Microsoft was required to release the protocol documentation necessary for other software to integrate with its workgroup server products -- a process that many Samba members took part in, including Allison, Tridgell, and Volker Lendecke, as well as members of Eben Moglen's Software Freedom Law Center.
Before this settlement, "we were essentially working cold," Allison says, trying to reverse engineer Microsoft technology on Unix-like systems.
Often, the difficulty was not so much in functionality as in coordinating different components. For example, "Microsoft Kerberos stores everything in the same backend as the LDAP server, so if you update something by Kerberos, you also need the modification to be seen by LDAP. What was really needed was not so much changes to Kerberos itself -- although there was some of that needed because of Microsoft changes -- but mostly [the challenge] was creating that integrated backend."
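The idea of an integrated backend can be illustrated with a minimal Python sketch. Everything in it is hypothetical (the class names, the `key_version` attribute, the `alice` entry); it is not Samba's code or its schema, only a toy showing why a change made through one front end is immediately visible through the other when both share a single store.

```python
class DirectoryBackend:
    """Hypothetical single store shared by every protocol front end."""

    def __init__(self):
        self._entries = {}

    def update(self, name, **attrs):
        # Create the entry if needed, then merge in the new attributes.
        self._entries.setdefault(name, {}).update(attrs)

    def lookup(self, name):
        # Return a copy so callers cannot modify the store by accident.
        return dict(self._entries.get(name, {}))


class KerberosFrontEnd:
    """Toy stand-in for a Kerberos service writing to the shared store."""

    def __init__(self, backend):
        self.backend = backend

    def rotate_key(self, principal):
        current = self.backend.lookup(principal).get("key_version", 0)
        self.backend.update(principal, key_version=current + 1)


class LdapFrontEnd:
    """Toy stand-in for an LDAP service reading from the same store."""

    def __init__(self, backend):
        self.backend = backend

    def read(self, name):
        return self.backend.lookup(name)


backend = DirectoryBackend()
kdc = KerberosFrontEnd(backend)
ldap = LdapFrontEnd(backend)

backend.update("alice", display_name="Alice Example")
kdc.rotate_key("alice")
print(ldap.read("alice"))
```

Had each front end kept its own copy of the data, the key rotation would have needed a separate synchronization step before LDAP clients could see it, which is the coordination problem Allison describes.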
In many ways, the process became easier as Microsoft wrote and released its protocol documentation in the years following the settlement. According to Allison, "relationships have always been good between engineers" at Samba and Microsoft, but, as the protocol documentation was released, Microsoft employees like Sam Ramji began to realize the usefulness of helping other operating systems to interact with Windows. "We've worked with them fairly carefully, and a sign of that success is that they gave us a quote for the Samba 4 release. I asked for that, quite cheekily, because I thought I was never, ever going to get it, and I've got to say there's some people at Microsoft who really moved heaven and earth to make it happen. Things have definitely gotten better since the EU case got settled."
Yet even the documentation didn't solve all Samba's problems. For instance, "There are many cases where the documentation on the file server essentially says, 'and we passed it through to the Windows magic to make this happen.' And we can't do that. We need to say, 'What is that magic, exactly?'" Often, the necessary additional information can be tracked down, but, even then, because of the complexity of interactions, it may not be correct, and, for all Microsoft's new-found willingness to help, finding who is responsible for the correction and receiving it can take time.
Even now, Allison says:
"When we discover something new and interesting about the protocol, we write a protocol test, and we probe against Windows, saying, 'Well, how does Windows behave?' And at that point we can sometimes say, 'well, this is not actually something to do with the fileserver; this is something that NTFS does underneath, or this is something that the fileserver modifies before it's passed down.'"
Such a sense of the Microsoft code can be especially important in routine procedures such as error codes. As Allison says, "error codes matter." Often, error codes are passed on to applications, so Samba needs to duplicate the expected error codes exactly in order to run Windows applications.
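To make the point concrete, here is a minimal Python sketch of translating Unix error numbers into the NT status values a Windows client expects. The status constants are the values Microsoft publishes for these names; the mapping table and the function name `nt_status_from_errno` are illustrative assumptions, not Samba's actual table, and a real server would need many more cases plus context-dependent choices.

```python
import errno

# NT status values as published in Microsoft's protocol documentation.
STATUS_UNSUCCESSFUL = 0xC0000001
STATUS_ACCESS_DENIED = 0xC0000022
STATUS_OBJECT_NAME_NOT_FOUND = 0xC0000034
STATUS_OBJECT_NAME_COLLISION = 0xC0000035
STATUS_DISK_FULL = 0xC000007F

# Illustrative mapping only (an assumption for this sketch).
_ERRNO_TO_NT_STATUS = {
    errno.ENOENT: STATUS_OBJECT_NAME_NOT_FOUND,
    errno.EACCES: STATUS_ACCESS_DENIED,
    errno.EEXIST: STATUS_OBJECT_NAME_COLLISION,
    errno.ENOSPC: STATUS_DISK_FULL,
}


def nt_status_from_errno(err):
    """Return the NT status for a Unix errno, or a generic failure code."""
    return _ERRNO_TO_NT_STATUS.get(err, STATUS_UNSUCCESSFUL)


try:
    open("/nonexistent-directory/example.txt")
except OSError as exc:
    # The client sees a Windows-style status, not the Unix errno.
    print(hex(nt_status_from_errno(exc.errno)))
```

An application that checks specifically for "object name not found" before creating a file would misbehave if the server returned the generic failure code instead, which is why the exact value matters.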
As a result, although the protocol documentation is frequently useful, understanding the scope of a programming task often depends on the individual expertise of developers. Within Samba, certain developers are known as the experts on certain Microsoft subjects -- for instance, Allison for path names, and Andrew Bartlett and Andrew Tridgell for Active Directory.
"We have specific areas," Allison says. "Knowing who to ask is often a big thing."
What Comes Next
These are still early days for Samba 4.0, especially with many users going on holidays soon after the release. All the same, Allison is already anticipating a 4.1 release by the end of the year, due to the bugs that are already being discovered now that the code is in general release.
In addition, Allison explains that, while Samba 4.0 includes basic support for Active Directory, much remains to be done, including implementing or improving "parts of the Active Directory Control pieces." Similarly, while Samba 4.0 includes the beginnings of support for the SMB 3.0 protocol, Allison describes it as "basically a skeleton framework." Thanks to Microsoft's increasing habit of making components selectable options rather than necessities, Samba's skeleton framework is functional, but full support for Active Directory remains to be done.
Speaking personally, Allison also hopes eventually to see Samba reach the point where it can stop maintaining its own LDAP and Kerberos servers. Samba's own SMB clustering service may also be retired, as the project moves to replicate Microsoft's new implementation of the feature.
Other directions will be determined by user demand. To that end, Allison suggests that as many OEMs as possible should contact the project. "I get the feeling that there are a bunch of users and OEMs who are really frightened, because they think that, if they told us they were using it, we might be angry with them," says Allison. "No, no, no! We want you to use it, and if you actually talk to us, we can actually help you with advance notice of security issues, and you can tell us what you need it to do, and we can start to develop it."
Despite all his years working on Samba, Allison clearly remains committed to the project. "I went through a stage where I was saying that the best thing that could happen to Samba would be when we could close up shop and Windows has gone away. But I'm having too much fun. Yeah, I might be enabling Windows, but I'm having fun. I've never got much flak from people saying, 'You're enabling Windows.' Mainly because we are free software developers ourselves, and if we provide something that's not the purest, it's still damned useful. And 'damned useful' trumps ideological purity most of the time."