Author: Scott Merrill
You may have seen the orange XML buttons on websites. Those buttons are links to eXtensible Markup Language, or XML, versions of the web pages we read. Although XML documents can be read by human beings (with a little effort), they’re designed to be read and processed by computer programs. Clicking one in your browser will produce different results depending on which web browser you’re using, but you’re not supposed to click on those buttons directly. Instead, you’re supposed to copy the destination of that link and paste it into your aggregator.
Aggregators, or feed readers as they’re also called, are the programs that do all the work of visiting and collecting web updates for you. When new content is available, the aggregator fetches the data and makes it available for you to read. Aggregators generally list new items first, so you can quickly skim all your feeds, or subscribed sites, in the order in which they were updated. There are many different aggregators available.
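As a rough illustration of the "new items first" behaviour described above, here is a minimal sketch that parses a feed and sorts its items by publication date. It assumes an RSS 2.0 feed whose items carry a pubDate element; the sample feed, its example.com URLs, and the function name newest_first are all made up for this example, and a real aggregator would download the XML rather than embed it.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feed; a real aggregator would download this from the site.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Site</title>
    <item>
      <title>Older post</title>
      <link>http://example.com/older</link>
      <pubDate>Mon, 03 Jan 2005 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Newer post</title>
      <link>http://example.com/newer</link>
      <pubDate>Tue, 04 Jan 2005 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def newest_first(feed_xml):
    """Return (date, title, link) tuples for every item, newest first."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        # RSS 2.0 dates use the e-mail (RFC 822) date format.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        items.append((published, item.findtext("title"), item.findtext("link")))
    return sorted(items, key=lambda entry: entry[0], reverse=True)


for published, title, link in newest_first(SAMPLE_FEED):
    print(published.date(), title, link)
```

Items that lack a pubDate would need extra handling; this sketch assumes every item has one.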
With desktop aggregators, your desktop computer needs to be on and connected to the Internet. If you use an aggregator on your PC at home and on your PC at work, you’ll have to skip past the content you’ve already read, which rather defeats the purpose of aggregating the content to begin with! Thankfully, there are also several web-based solutions. The advantage of using a web-based aggregator like Syndic8 or NewsIsFree is that their computers do all the work of polling websites and obtaining updates, and you can read your list of feeds from wherever you may be. This means that you can check your list of feeds while eating breakfast at home, and then see a complete list of updated information during your lunch break at the office.
Content syndication saves website readers time, and it can also save website operators money. Visitors using a web browser to read a website request and receive the entire page every time: text, graphics, layout information, advertising, and anything else that might be on (or in) the page. Web browsers can cache (or “remember”) some of this information, but there are a variety of reasons why this doesn’t always work, and the visitor ends up downloading almost everything from that page every time they visit, regardless of whether there’s anything new (advertising is often the cause, as the advertisements change every time the page is loaded). Aggregators visiting a site’s feed first check the timestamp of the feed: if it hasn’t been updated since the last visit, the aggregator immediately stops. When updates are available, aggregators receive only the content from the website, without advertising, background images, navigation buttons, and the like. Combined, these two behaviors can prevent vast amounts of unnecessary traffic, allowing content producers to reach their audience without incurring astronomically expensive web hosting bills.
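The timestamp check described above is commonly done with an HTTP conditional GET, and the sketch below assumes that mechanism: the aggregator sends back the Last-Modified (and, if available, ETag) value it saw on its previous visit, and a server that supports this replies with status 304 and no body when nothing has changed. The function names conditional_headers and fetch_if_changed are invented for this example, and not every feed server honours these headers.

```python
import urllib.error
import urllib.request


def conditional_headers(last_modified=None, etag=None):
    """Build request headers asking the server to reply only if the feed changed."""
    headers = {}
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    if etag:
        headers["If-None-Match"] = etag
    return headers


def fetch_if_changed(url, last_modified=None, etag=None):
    """Return (body, last_modified, etag), or None when the server answers 304."""
    request = urllib.request.Request(
        url, headers=conditional_headers(last_modified, etag)
    )
    try:
        with urllib.request.urlopen(request) as response:
            return (
                response.read(),
                response.headers.get("Last-Modified"),
                response.headers.get("ETag"),
            )
    except urllib.error.HTTPError as error:
        if error.code == 304:  # Not Modified: nothing new, so stop here
            return None
        raise
```

An aggregator would store the returned Last-Modified and ETag values next to each subscription and pass them back in on the next polling round.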
Several tangential benefits also arise from using content syndication. First, syndication-specific search engines look through feeds, allowing you to maintain a constantly updated list of links to information in as close to real time as currently possible. Second, the machine-readable format of syndication feeds makes it possible to create very interesting comparisons and analyses of syndicated data, which is much easier than personally visiting the pages to copy and paste the bits you want. Third, syndication can be used to include content from other sites in your own website.
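To make the third benefit concrete, here is a small sketch that turns the items of a feed into an HTML list you could drop into your own page. It assumes an RSS 2.0 feed with title and link elements on each item; the function name headlines_as_html and the default limit of five items are choices made for this example only.

```python
import html
import xml.etree.ElementTree as ET


def headlines_as_html(feed_xml, limit=5):
    """Render up to `limit` feed items as an HTML unordered list of links."""
    root = ET.fromstring(feed_xml)
    lines = ["<ul>"]
    for item in list(root.iter("item"))[:limit]:
        # Escape everything taken from the feed so it cannot break the page markup.
        title = html.escape(item.findtext("title", default="(untitled)"))
        link = html.escape(item.findtext("link", default="#"), quote=True)
        lines.append(f'  <li><a href="{link}">{title}</a></li>')
    lines.append("</ul>")
    return "\n".join(lines)
```

Escaping matters here because the text comes from someone else's site; whether you may republish it at all depends on that site's terms.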
Of course, there are plenty of challenges with content syndication. The biggest challenge is the machine-readable format used for the feeds. Although feeds are written in XML, there are a variety of popular dialects of that language. Some feed-reading programs can speak them all, while others are limited to just one or two. There’s a joke that succinctly explains the situation: “The great thing about standards is that there are so many to choose from!”
The most common syndication format is RSS (which stands for Really Simple Syndication, or Rich Site Summary, or maybe RDF Site Summary), which is itself a little misleading because there are nine different types of RSS. The history of RSS is complicated, with several competing parties vying to establish the definitive standard. In common practice, only two or three of these formats are regularly used, but even that’s too many.
The other dominant syndication format is Atom, a community-driven format that tries to avoid many of the perceived shortcomings of RSS. Atom isn’t (yet) as widely supported as RSS, but it’s quickly gaining ground. Lengthy discussions rage on about these competing standards. Thankfully, services exist that translate syndication formats, so you don’t need to worry about which format is winning the debate.
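A feed reader usually has to work out which dialect it has been handed before it can do anything else. The sketch below guesses the format from the root element of the document. It recognises only three cases: an rss root (RSS 0.9x and 2.0), an rdf:RDF root (RSS 1.0), and a feed root in the Atom 1.0 namespace; earlier Atom drafts used a different namespace and would be reported as unknown. The function name detect_feed_format is hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"  # Atom 1.0 namespace
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"  # used by RSS 1.0


def detect_feed_format(feed_xml):
    """Guess the syndication dialect from the document's root element."""
    root = ET.fromstring(feed_xml)
    if root.tag == "rss":
        return "RSS " + root.get("version", "(unknown version)")
    if root.tag == f"{{{RDF_NS}}}RDF":
        return "RSS 1.0 (RDF)"
    if root.tag == f"{{{ATOM_NS}}}feed":
        return "Atom"
    return "unknown"


print(detect_feed_format('<rss version="2.0"><channel/></rss>'))
print(detect_feed_format('<feed xmlns="http://www.w3.org/2005/Atom"/>'))
```

In practice most people rely on an existing feed-parsing library for this, since real feeds are often malformed in ways a strict XML parser rejects.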
Another issue with content syndication is that the syndication source (i.e., the website offering the feed) chooses whether to provide the full content of new posts or just an excerpt. Advertising-driven websites will often provide just an excerpt, or teaser, to tickle your fancy and get you to load the webpage in your browser, where you will see the ads displayed there. Some such sites will send out only the first couple of sentences in their feeds, which may or may not provide enough information for you to determine whether it’s worth your time to follow the link to the story. Other sites will carefully craft meaningful summaries of new items, which you can quickly skim to decide whether to read the in-depth report.
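The "first couple of sentences" style of excerpt is easy to picture in code. The sketch below is only a guess at how a site might produce such a teaser: it splits on sentence-ending punctuation followed by whitespace, which is a crude rule that will stumble on abbreviations such as "e.g." or "Dr.". The function name excerpt and the default of two sentences are assumptions for illustration.

```python
import re


def excerpt(text, sentences=2):
    """Return the first few sentences of text, with an ellipsis if anything was cut."""
    # Split after ".", "!" or "?" when followed by whitespace (a rough heuristic).
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    if len(parts) <= sentences:
        return " ".join(parts)
    return " ".join(parts[:sentences]) + " ..."


print(excerpt("Feeds are handy. They save time! They also save bandwidth. Try one."))
```

A hand-written summary, as the more careful sites provide, will nearly always serve readers better than a mechanical cut like this one.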
Many syndication feeds come from personal weblogs, monitored by Bloglines and Kinja.com, but big businesses are recognizing the value of the technology. Reuters offers feeds for its news items. The BBC offers categorized news feeds. Microsoft offers feeds for developer resources. Apple offers RSS feeds for its iTunes Music Store to display new releases and top rated songs or albums. What are you waiting for? Start aggregating!