Say hello to XMPP

Author: Nathan Willis

The Extensible Messaging and Presence Protocol (XMPP) is the formalized incarnation of the Jabber instant message protocol. But what exactly does that formalization mean? And why should you care?

Andrew S. Tanenbaum said, “The nice thing about standards is that there are so many to choose from,” and that has certainly been true in the world of instant messaging. AOL, MSN, and Yahoo all control huge IM user bases, but they maintain completely isolated networks, so if you have friends on each, contacting everyone you know can mean establishing an account on all three systems. That means three sets of usernames and passwords and three separate applications to download, and really, who has time and screen real estate for that?

In 1998 Jeremie Miller decided he had had enough of such foolishness and started the Jabber project in response, creating an open IM system with a publicly documented, non-proprietary protocol to (he hoped) unite the IM world. It used XML for message streams, so it was easy to process, document, and extend. Just a couple of years later, Jabber was successful enough that the Jabber Software Foundation (JSF) decided to submit it to the IETF for official approval.

Last March, after internationalization work and a lot of refinement, the IETF approved the new XMPP standard, based on the Jabber protocol. The specification is outlined in four RFCs: RFC 3920: XMPP Core, RFC 3921: XMPP Instant Messaging and Presence, RFC 3922: Mapping XMPP to CPIM, and RFC 3923: End-to-End Signing and Object Encryption for XMPP.

The first two RFCs define the stream format and the message format, respectively. The third maps XMPP to an older standard called Common Presence and Instant Messaging, an abstract definition that the IETF created (in part) to try to promote standardized IM development. The last RFC, as its title suggests, specifies encryption and security.

Let your XML do the walking

If all of that sounds complicated, don’t worry. As with any instant messaging system, you use XMPP to send one-time messages, engage in chats with other users, and keep a roster of contacts visible so you can see who’s online.

What makes XMPP more interesting is that clients and XMPP servers communicate with each other by passing simple blocks of XML. All communications are encapsulated in a <stream /> element, and the blocks it contains are called stanzas. The spec defines just three stanza types: <message />, <presence />, and <iq />. Don’t fear for your privacy: iq in this case stands for info/query, and your intelligence will be revealed only through the messages you send.
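To make the stream-and-stanza idea concrete, here is a minimal Python sketch that sorts the contents of a stream into the three stanza kinds. The sample stream, the addresses, and the helper names are invented for illustration; the two namespace URIs are the ones RFC 3920 assigns to client streams. A real stream stays open for the whole session and would be parsed incrementally rather than all at once.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already-closed stream used only for illustration.
SAMPLE_STREAM = """\
<stream:stream xmlns="jabber:client"
               xmlns:stream="http://etherx.jabber.org/streams"
               to="example.com" version="1.0">
  <presence/>
  <message to="juliet@example.com"><body>Hello</body></message>
  <iq type="get" id="q1"/>
</stream:stream>
"""

STANZA_NAMES = {"message", "presence", "iq"}


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def stanza_kinds(stream_xml):
    """Return the kind of each top-level stanza inside the stream element."""
    root = ET.fromstring(stream_xml)
    return [local_name(child.tag) for child in root
            if local_name(child.tag) in STANZA_NAMES]


print(stanza_kinds(SAMPLE_STREAM))
```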

These XMPP stanzas carry “to” and “from” attributes for addressing, an “id” attribute for internal identification, and a data payload. Obviously, for a simple instant message, the payload will contain text. But the XMPP spec allows for more structure, including <subject />, <body />, and <thread /> elements and a “type” attribute (with values such as “chat” and “headline”), which enable the format to handle discussions potentially more complex than a simple one-to-one chat. There is a long-standing tradition of non-user “bots” on IM networks that perform all kinds of useful tasks, and XMPP must be friendly to their needs as well.
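A message stanza of the kind described above could be assembled roughly as follows. This is a sketch, not a client library: build_message, the default id value, and the addresses are all made up for the example, and "chat" is just one of the message types RFC 3921 defines.

```python
import xml.etree.ElementTree as ET


def build_message(sender, recipient, body, subject=None, thread=None,
                  msg_id="msg1"):
    """Build a <message/> stanza as an ElementTree element."""
    msg = ET.Element("message", {"from": sender, "to": recipient,
                                 "id": msg_id, "type": "chat"})
    # <subject/> and <thread/> are optional; add them only when supplied.
    if subject is not None:
        ET.SubElement(msg, "subject").text = subject
    ET.SubElement(msg, "body").text = body
    if thread is not None:
        ET.SubElement(msg, "thread").text = thread
    return msg


stanza = build_message("romeo@example.net", "juliet@example.com",
                       "Art thou not Romeo?", thread="t-42")
print(ET.tostring(stanza, encoding="unicode"))
```

The thread value lets a client group related messages into one conversation, which is what makes exchanges richer than a single one-to-one chat possible.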

Similarly, a <presence /> stanza can contain a <show /> element carrying one of the familiar away/available for chat/do not disturb settings. It can also carry a type of “subscribe” or “unsubscribe,” which is the mechanism XMPP uses to add one user to another’s roster of contacts. This system requires each person on your roster to approve your “subscription” to their presence, a requirement not all proprietary IM networks impose. Should you want at any particular moment to appear “out” when you are not, however, you can still do so: the subscription system only grants others permission to put you on their roster and gives them no way to circumvent your privacy.
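The two uses of presence can be sketched in a few lines of Python. The function names and the example address are hypothetical; the four <show /> values are those listed in RFC 3921, where plain availability is signaled by sending presence with no <show /> element at all.

```python
import xml.etree.ElementTree as ET

# <show/> values defined by RFC 3921: away, free to chat,
# do not disturb, and extended away.
SHOW_VALUES = {"away", "chat", "dnd", "xa"}


def build_presence(show=None, status=None):
    """Build an availability update, optionally with a status note."""
    if show is not None and show not in SHOW_VALUES:
        raise ValueError(f"unknown show value: {show!r}")
    pres = ET.Element("presence")
    if show is not None:
        ET.SubElement(pres, "show").text = show
    if status is not None:
        ET.SubElement(pres, "status").text = status
    return pres


def build_subscription_request(contact_jid):
    """Ask a contact for permission to see their presence."""
    return ET.Element("presence", {"to": contact_jid, "type": "subscribe"})


print(ET.tostring(build_presence("dnd", "Writing"), encoding="unicode"))
print(ET.tostring(build_subscription_request("juliet@example.com"),
                  encoding="unicode"))
```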

When a client initiates a session with a server, <iq /> payloads carry the queries and responses that set up the session, return the user’s roster of contacts, and exchange other details. Other operations performed with <iq /> stanzas include adding, deleting, and managing contacts on the roster, as well as many server-to-server operations.
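As a rough illustration of the request-and-response pattern, the sketch below builds a roster request and reads a roster result. The jabber:iq:roster namespace comes from RFC 3921; the helper names, the id value, and the sample reply are invented for the example.

```python
import xml.etree.ElementTree as ET

ROSTER_NS = "jabber:iq:roster"  # roster namespace from RFC 3921


def build_roster_request(request_id="roster1"):
    """Build an <iq type='get'/> asking the server for the roster."""
    iq = ET.Element("iq", {"type": "get", "id": request_id})
    ET.SubElement(iq, f"{{{ROSTER_NS}}}query")
    return iq


# Hypothetical server reply, written out by hand for illustration.
SAMPLE_RESULT = """\
<iq type="result" id="roster1">
  <query xmlns="jabber:iq:roster">
    <item jid="juliet@example.com" name="Juliet" subscription="both"/>
    <item jid="mercutio@example.net" name="Mercutio" subscription="from"/>
  </query>
</iq>
"""


def parse_roster(result_xml):
    """Map each contact's JID to its subscription state."""
    iq = ET.fromstring(result_xml)
    items = iq.findall(f"{{{ROSTER_NS}}}query/{{{ROSTER_NS}}}item")
    return {item.get("jid"): item.get("subscription") for item in items}


print(parse_roster(SAMPLE_RESULT))
```

The id attribute is what lets a client match a result to the request that triggered it, since replies can arrive interleaved with other stanzas on the same stream.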

Yes, that’s right: server-to-server. The chief difference between XMPP and the proprietary IM networks is that XMPP is decentralized. AOL, for example, controls not only the specification for the AIM protocols, but the exchange of messages as well. Every AIM message is sent into the walls of AOL before it emerges on its way to the recipient.

By contrast, anyone can run an XMPP server, and the XMPP servers are responsible for connecting to one another to pass messages. As in other decentralized networks, this design has the advantage of fault tolerance: If AOL goes offline, even momentarily, all AIM traffic is interrupted, but with XMPP, a problem at one server can affect only a small portion of the network at any given moment.

Another effect of decentralization is that the username-space is unregulated. AOL can ensure that no two users select the same screen name. But since anyone can run an XMPP server, the only way to provide the same guarantee among XMPP networks is to require identifiers (called JIDs) that include the server’s domain name, which basically means a username@domain.tld format, exactly like an email address. Some may find this similarity confusing, and you may decide the simplest solution is to make your JID the same as your email address, providing a safety net for your disoriented but well-intentioned friends.
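Splitting a JID into its parts is straightforward, as the following sketch suggests. It assumes the layout given in RFC 3920, which also allows an optional "/resource" suffix naming a particular connection; the class and function names are hypothetical, and the character normalization rules the RFC requires are skipped entirely.

```python
from typing import NamedTuple, Optional


class JID(NamedTuple):
    node: Optional[str]      # the "username" part, absent for server JIDs
    domain: str              # the server that hosts the account
    resource: Optional[str]  # a specific connection, such as "laptop"


def parse_jid(text):
    """Split 'node@domain/resource' into parts (no normalization)."""
    # The resource is everything after the first slash.
    bare, slash, resource = text.partition("/")
    if "@" in bare:
        node, _, domain = bare.partition("@")
        if not node:
            raise ValueError("empty node before '@'")
    else:
        node, domain = None, bare
    if not domain:
        raise ValueError("a JID needs a domain")
    if slash and not resource:
        raise ValueError("empty resource after '/'")
    return JID(node, domain, resource if slash else None)


print(parse_jid("juliet@example.com/balcony"))
print(parse_jid("example.com"))
```

The domain part is what a server would look at to decide whether a message is for a local user or has to be handed to another server.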

Xtra xtra, read all about it

If you can get over the fact that it takes its spelling cues from Mister Robinson’s Neighborhood, the eXtensible aspect of XMPP is pretty important: new feature sets can be added without breaking the existing protocol.

Extensions are managed through an open standards process at the JSF built around Jabber Enhancement Proposals (JEPs). Executive Director Peter Saint-Andre says drafting and managing extensions is the main task of the JSF. After the completion of the XMPP Core, the first set of JEPs drafted by the JSF re-implemented common features of the original Jabber system, such as SSL encryption (the XMPP core supports only TLS), service discovery, and in-band registration of new user accounts.

Additional JEPs extend the base XMPP functionality in ways familiar to users of other IM networks, adding functionality for file transfer, user “mood”, and offline message retrieval. If you’re coming to XMPP from Yahoo Messenger or MSN Messenger networks, you are more likely to appreciate these features than you are the beauty of XML streams.

Some of the extensions managed by the JSF do interesting things that other IM networks don’t support. JEP-0080: User Geolocation allows you to broadcast your location via GPS coordinates and query the locations of your friends. JEP-0108: User Activity gives you a flexible system to describe what you are doing. JEP-0118: User Tune lets you broadcast the tracklisting of your current music selection (though not the music itself) to the curious on your contact roster.

Lest you think that all the extensions are fluff, consider that the greatest thing about XML is that it is both human- and machine-readable. A well-defined mechanism like XMPP that allows communication of streams of XML across the Internet has a lot of potential beyond IM. Another quickly-adopted extension is JEP-0009: Jabber-RPC, enabling XML-RPC over an XMPP network, which lets applications exchange Remote Procedure Call packets with other computers over the Internet.
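To give a feel for Jabber-RPC, the sketch below wraps an ordinary XML-RPC call inside an <iq /> stanza using Python's standard xmlrpc.client module. The jabber:iq:rpc namespace and the use of type "set" for requests reflect my reading of JEP-0009 and should be checked against the proposal itself; the responder address, method name, and helper function are invented for the example.

```python
import xml.etree.ElementTree as ET
import xmlrpc.client

RPC_NS = "jabber:iq:rpc"  # namespace assumed from JEP-0009


def build_rpc_request(responder_jid, method, params, request_id="rpc1"):
    """Wrap an XML-RPC <methodCall/> in an <iq type='set'/> stanza."""
    # Let the standard library produce the XML-RPC payload.
    payload = xmlrpc.client.dumps(tuple(params), methodname=method)
    method_call = ET.fromstring(payload)
    # Move the payload into the Jabber-RPC namespace so it serializes
    # as a child of <query xmlns='jabber:iq:rpc'/>.
    for element in method_call.iter():
        element.tag = f"{{{RPC_NS}}}{element.tag}"
    iq = ET.Element("iq", {"type": "set", "to": responder_jid,
                           "id": request_id})
    query = ET.SubElement(iq, f"{{{RPC_NS}}}query")
    query.append(method_call)
    return iq


request = build_rpc_request("responder@example.com/rpc-server",
                            "examples.getStateName", [6])
print(ET.tostring(request, encoding="unicode"))
```

Because the payload is plain XML-RPC, the receiving application can hand it to any existing XML-RPC library once the stanza wrapper is removed.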

There is also a flexible data-gathering-and-reporting JEP called Data Forms, based on the same concept as XHTML’s XForms, but adapted to the streaming message world. JEP-0072: SOAP Over XMPP defines usage of XMPP for the Simple Object Access Protocol. JEP-0020: Feature Negotiation allows XMPP entities to negotiate advertised and unadvertised options for the services they support. JEP-0060: Publish-Subscribe is an extension to allow syndication and shared-content systems.

There are currently 75 JEPs in the approval process. Some are technical enhancements like the ones listed above, while others are procedural proposals that developers will want to get familiar with before working on their own extensions. You can follow the progress of all JEPs at www.jabber.org/protocol/jep.

Not your father’s global instant messaging network

It’s a common misconception that Jabber and XMPP are “multi-protocol,” allowing you to send instant messages to people on AIM or other networks. This isn’t actually the case; the misconception stems from the fact that many Jabber servers run gateways that can translate XMPP messages to the proper formats for other IM networks and then route them to the appropriate servers. Since XMPP is cleanly defined XML, a program that understands the other formats can do this translation automatically. Such behavior is not part of the XMPP specification, however, and in fact many public Jabber servers (including jabber.org) do not run any such gateways.

There are also applications — the popular GTK client Gaim for instance — that perform message-format translation on the client side. So you can use a single application to reach all of your IM friends and buddies, but it is not XMPP that brings you this power. When Jabber was new, many users needed such cross-network message forwarding simply to keep in touch with the broader IM market, dominated by the proprietary AIM. But as XMPP-compliant clients increase in market share, that need is likely to diminish.

Peter Saint-Andre says that although it is impossible to count the exact number of XMPP users, the public server at jabber.org has seen a steady increase in traffic, and at any given moment between 1,200 and 1,300 other Jabber servers are connecting to it to pass messages. Since these connections are built and taken down automatically on an as-needed basis, the actual number of XMPP servers (and thus users) is likely far higher.

Naturally, all this talk about protocols and extensions doesn’t do a lot of good without software that supports it. The JSF maintains a list of compliant server and client software on its Web site, which you can compare side-by-side on feature support, platform, and even license. My recommendation (and I feel confident the JSF would agree with me) is that you grab a client application, set up an account on one of the many public servers, and start learning about XMPP from experience. And if that just isn’t enough, remember that you can always run an XMPP server yourself — why let AOL have all the fun?

Several high-profile commercial software vendors — Apple and Hewlett-Packard among them — have announced support for XMPP in upcoming product releases. XMPP is poised to democratize instant messaging as SMTP and POP did for email.