Sunday Afternoon Thoughts on the Design of RSS Aggregators
At the Emerging Technology SIG, Doug Cutting gave a talk on Lucene and Nutch. Before the talk, Doug casually mentioned that he used a server-based RSS aggregator. Similarly, in the responses to my blog entry on RSS Aggregators, someone mentioned they use Bloglines.
This is interesting to me. In my mind, and I was probably guided by the intuition that a "web browser is a client," RSS Aggregators were naturally client side. By which I mean, my first inclination was that RSS Aggregators naturally run on the end-user's machine, rather than on a centralized server farm. There are counterexamples, though. For example, Bloglines is an RSS Aggregator that runs out there somewhere and returns your results as a web page (and, by the way, Scott Rosenberg likes Bloglines).
Which led me to spend some time pondering: what's the boundary line between a "standalone" application and a "server-based" application? That is, when should an application live entirely on an end-user's machine, and when should it live on a server and be accessed through a client program? (This distinction gets hazier in the case of RSS Aggregators, which are, in a loose sense, web clients anyway.)
The classic reasons for making an application a server-based application are:
The classic reasons for making an application stand-alone are:
Of course, I'm blurring the lines and ignoring fat clients that do more than provide a better GUI (e.g. those which slide some "server" functionality over to the client). It's a simple list. And there's nothing in here about P2P applications or the ways in which the faster release cycles engendered by web-based applications can be a significant competitive advantage. But I still think it captures a lot of the considerations and so I'd like to ask:
Obvious things
Let's start by making the easy comparisons. From the end-user's perspective, the standalone approach has the following advantages:
From the developer's perspective, the standalone approach has the following advantages:
From the end-user's perspective, the server-based approach has the following advantage:
From the developer's perspective, the server-based approach has the following advantages:
Applying the Server-Based / Standalone Bullet Points
With that out of the way, let's talk horse-racing. Given that you can build an RSS aggregator that's server-based or standalone, how do they compete with each other? How will they evolve?
How do you, as the designer of Bloglines, make your application compelling? Well, you want to build something that is a classic server-based application (because you're server-based and it makes sense to leverage that). You want to add features that require resource sharing, data sharing, or connectivity (you've already got the accessibility thing nailed).
What do those look like? You might think connectivity's a nice one. If you can stay up 24 x 7, and you can cache RSS feeds, then people can find out about blogs which are currently off-line, but have changed. The problem is: this assumes the feed indicated a change, but then the site went off-line. And if a user is interested enough to wonder whether a feed changed, they might want to be able to fetch the article. Which means this isn't that big an advantage (the feed, or the site, being down is pretty much a bummer, unless your aggregator's going to cache a lot of data for people).
Data sharing? Well, there's potential here in that the RSS feeds are fetched much less often. This is a very good thing for authors with low-capacity servers and interesting weblogs. But it's not so compelling for the end-user. Unless we run into a scenario where a significant percentage of weblogs are up, but responding slowly. Or, a scenario which is perhaps more likely, a significant percentage of weblogs decide to give higher priority to server-based RSS aggregators on the theory that doing so will decrease their overall load.
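To make the "fetched much less often" point concrete, here is a minimal sketch of a shared server-side feed cache. Everything in it is hypothetical (the SharedFeedCache class, the fetch callable, the example URL); it isn't how Bloglines works, just an illustration of why one fetch can serve any number of subscribers.

```python
from collections import defaultdict


class SharedFeedCache:
    """Hypothetical server-side cache: each feed is fetched once per
    refresh, no matter how many users subscribe to it."""

    def __init__(self, fetch):
        self._fetch = fetch                    # assumed callable: url -> feed body
        self._subscribers = defaultdict(set)   # url -> set of user names
        self._bodies = {}                      # url -> last fetched body
        self.fetch_count = 0                   # how many times we hit a weblog

    def subscribe(self, user, url):
        self._subscribers[url].add(user)

    def refresh(self):
        # One request per distinct feed, not one per (user, feed) pair.
        for url in self._subscribers:
            self._bodies[url] = self._fetch(url)
            self.fetch_count += 1

    def feeds_for(self, user):
        # Every subscriber is served from the same cached copy.
        return {
            url: self._bodies.get(url)
            for url, users in self._subscribers.items()
            if user in users
        }


if __name__ == "__main__":
    cache = SharedFeedCache(lambda url: "body of " + url)  # stand-in for a real HTTP fetch
    for user in ("alice", "bob", "carol"):
        cache.subscribe(user, "http://example.com/rss")
    cache.refresh()
    print(cache.fetch_count)
```

With three standalone aggregators, the weblog at that URL would see three requests per refresh; here it sees one.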
Resource sharing? Here's where the server-based designs have a chance to shine. Bloglines has features like Top Blogs, Blog Recommendations, and the ability to subscribe to a search which are hard to imagine incorporating into a standalone design.
I think these resource sharing functions are the compelling advantage Bloglines has. The interesting thing is, of course, that other applications which aren't RSS Aggregators (like Feedster) also offer some of them.
How about the other side? How do you, as the designer of FeedDemon, make your application compelling? Well, you want to build something that is a classic standalone application (because you're standalone). You want to add features that require significant personal application load, personal information, or enable you to run even when you're not connected to the net (you've already got the performance thing nailed).
The last of these is the easiest-- it probably means building a local database and having a "fetch my web" feature for offline RSS browsing. Given that even the FeedDemon help is on-line right now (the help system sends you to online help pages), it would appear that this isn't a priority (in spite of the "work offline" button, which seems to simply prevent FeedDemon from attempting to talk to the world).
"Fetch my web" seems nice even when you're on-line too. Wouldn't it be great to improve the performance of the web by having a predictive cache? Of course it would. And by subscribing to feeds, I'm telling the web browser exactly how to build the cache. The software gets simpler, and better.
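Here's a rough sketch of what "the feeds tell the cache what to fetch" might look like. It assumes a plain RSS 2.0 feed with un-namespaced item and link elements; the function name and the sample feed are made up for illustration, and the actual downloading of pages is left out.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed; a real aggregator would download this.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title>One</title><link>http://example.com/1</link></item>
<item><title>Two</title><link>http://example.com/2</link></item>
</channel></rss>"""


def links_to_prefetch(rss_xml, already_cached):
    """Return article links from the feed that are not in the cache yet,
    in feed order and without duplicates."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if not link:
            continue  # item without a usable link
        link = link.strip()
        if link not in already_cached and link not in links:
            links.append(link)
    return links


if __name__ == "__main__":
    cached = {"http://example.com/1"}
    for url in links_to_prefetch(SAMPLE_FEED, cached):
        print("would prefetch", url)
```

The point is that there's no guessing involved: the subscription list already says which pages are worth pulling down ahead of time.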
How about significant load or significant personal information? What could you add to an RSS Aggregator that would make it more useful along these lines? Well, the obvious thing is memory: Suppose the RSS Aggregator not only knew about your feeds, it knew about which articles you fetched over time, and was somehow taking advantage of that big database of information. Suppose you could search the database for old blog articles (though, in a shameless personal plug, I'll point out that you can do this for Bloglines by incorporating the toolbar I helped build into your web browser)?
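As a sketch of the "memory" idea, a standalone aggregator could keep every fetched article in a local SQLite database and search it later. The table layout and function names below are my own assumptions, and the search is a simple substring match rather than a real full-text index.

```python
import sqlite3


def open_archive(path=":memory:"):
    """Open (or create) the hypothetical local article archive."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "url TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    return db


def remember(db, url, title, body):
    """Record an article the user fetched; refetching overwrites the old copy."""
    db.execute(
        "INSERT OR REPLACE INTO articles (url, title, body) VALUES (?, ?, ?)",
        (url, title, body),
    )
    db.commit()


def search(db, term):
    """Substring search over titles and bodies (LIKE ignores ASCII case)."""
    pattern = "%" + term + "%"
    rows = db.execute(
        "SELECT url, title FROM articles "
        "WHERE title LIKE ? OR body LIKE ? ORDER BY url",
        (pattern, pattern),
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = open_archive()
    remember(db, "http://example.com/1", "Lucene notes", "Indexing with Lucene")
    remember(db, "http://example.com/2", "Lunch", "A sandwich")
    print(search(db, "lucene"))
```

Passing a file path instead of ":memory:" would keep the archive between runs, which is the whole point on a personal machine.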
Another point, which isn't necessarily client or server based, is that applications are platforms. By building a server-based application, and relying on a web browser for your client, you are doing two things: you are limiting the extensions that third parties can make to your application to browser-based plugins AND you are enabling the existing browser-based plugins to augment your application.
On the other hand, if you built a robust plug-in architecture into your standalone aggregator, it's possible that you could harness an intermediate-to-long-term competitive advantage-- as RSS grows in importance, and we all believe it will, people will want to customize their RSS experience (on the other hand, you have to support a developer community. Uuugh).