Thursday, January 31, 2008

Microsoft's FeedSync: what the heck is it and why would anyone care about a trivial problem like data synchronization?

Jacob Reider, the master of the terse post, apparently likes Microsoft's FeedSync.

Of course, Jacob, you didn't bother to say why you liked it. Or even what it might be good for!

It turns out that FeedSync was originally a Ray (Lotus Notes -> Microsoft CTO) Ozzie project. I don't know what it started out as, but it now claims to be an open source specification for enabling data synchronization.

Jacob is presumably interested for two reasons. One is general geekhood, the other healthcare related. First the geek stuff.

As a fellow geek, Jacob, like me, is constantly trying to synchronize data across platforms. Anyone who's been around the block with Outlook, Exchange, Palm, mobile phones, iPhones, Gmail, iSync, etc., will have learned that this is a non-trivial problem even in the relatively trivial domain of synchronizing address books.

We geeks would like, for example, to move our images and metadata readily from Picasa to Flickr and back again. Good luck. Even if Google claims they're opposed to Data Lock, enabling synchronization between competitors is a rather difficult proposition -- particularly when the services define photo collections differently (include by reference or by copy?).

Heck, we'd like to move our metadata from iPhoto to Aperture -- two desktop apps Apple controls. We can't even do that (e.g., photo book annotations). Forget Aperture to Lightroom!

How hard is this problem? I have long claimed that data synchronization issues between Palm and Outlook/Exchange were one of the top three causes of the collapse of the once-promising Palm OS ecosystem. OS X geeks know that Apple has a long history of messed-up synchronization even within the completely controlled OS X/.Mac environment. IBM has had several initiatives to manage this kind of issue (the last one I tracked was in the OS/2 era) -- all disasters. Anyone remember CORBA transaction standards? Same problem in a different form. The only experience I've had of synchronization working was with the original Palm devices synchronizing to the original Palm Desktop -- where everything was built to make synchronization work. Lotus Notes, of course, was into synchronization in a very big way -- that's how the different Notes repositories communicated with one another (hence Ozzie's interest). I don't know how well that really worked, but I'm told it took an army to make Notes work.

Personally, I think this problem gets fully solved about 10 milliseconds before Skynet takes over. There are too many nasty issues of semantics, of each system knowing what the other means by "place", to achieve perfect results between disparate systems. Even the imperfect results achieved by using language between mere humans require a semblance of sentience, shared language, and even shared culture.

Reason two for Jacob's interest is, of course, his health care IT background. HL-7. SNOMED terminfo models. HITSP and Continuity of Care Records. Even Google's fuzzy Personal Health Record interchange services. Microsoft's various healthcare IT initiatives. Many HCIT vendor transaction solutions. They're really all about data synchronization on a grand scale -- even if the realities tend to be fairly modest.

Jacob, btw, is fond of those loosely-coupled mashup thingies.

So what's "FeedSync"? (emphases mine)

Windows Live Dev FeedSync Intro

The creation of FeedSync was catalyzed by the observation that RSS and Atom feeds were exploding on the web, and that by harnessing their inherent simplicity we might enable the creation of a “decentralized data bus” among the world’s web sites. Just like RSS and Atom, FeedSync feeds can be synchronized to any device or platform.

Previously known as Simple Sharing Extensions, FeedSync was originally designed by Ray Ozzie in 2005 and has been developed by Microsoft with input from the Web community. The initial specification, FeedSync for Atom and RSS, describes how to synchronize data through Atom and RSS feeds.

The FeedSync specification is available under the Creative Commons Attribution-Share Alike License and the Microsoft Open Specification Promise.

... FeedSync lays the foundation for a common synchronization infrastructure between any service and any application.

... Everyone has data that they want to share: contact lists, calendar entries, blog postings, and so on. This data must be up-to-date, real-time, across any of the programs, services, or devices you choose to use and share with.

Too often today data is “locked up” in proprietary applications and services or on various devices. As an open extension to RSS and Atom, FeedSync enables you to “unlock” your data—making it easy to synchronize the data you choose to any other authorized FeedSync-enabled service, computer, or mobile device. FeedSync enables many compelling scenarios:

  • Collaboration over the web using synchronized feeds
  • Roaming data to multiple client devices
  • Publishing reference data and updates in an open format that can be synchronized easily

... FeedSync enables multi-master topologies,

... publish a subset of his calendar more broadly using a FeedSync feed. Consumers of the publish-only feed can only see a subset of the calendar, and don’t have permission to make changes. Because of the FeedSync information in the feed, though, they are reliably notified of updates to Steve’s shared calendar. And unlike current feeds, when Steve deletes an item from the calendar, the item is deleted on everyone’s calendar.

... RSS and Atom were designed as notification mechanisms, to alert clients that some new resource is available on a server. This is a great fit for simple applications like blogging.

But those feed formats are not a natural fit for representing collections of resources that change, such as a contact list, or a collection of calendar items. Atom Publishing Protocol is designed for resource collections, but it is a client-server protocol and isn’t suitable (by itself) for multi-master scenarios. FeedSync extends RSS and Atom so that FeedSync-enabled RSS and Atom feeds can be used for reliable, efficient content replication and multi-master data synchronization.

One of the great benefits of FeedSync is that it doesn’t attempt to replace technologies like RSS, Atom, or Atom Publishing Protocol. Instead, FeedSync is a simple set of extensions that enhances the RSS or Atom feeds that people are already using today...

There you go. Nerdvana indeed.
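
For fellow geeks, here is a minimal Python sketch of what multi-master merging with deletions might look like. It is an illustration under assumptions, not the FeedSync algorithm itself: the `Item` class and the `winner`, `merge`, and `visible` functions are hypothetical names, and the fields `updates`, `when`, `by`, and `deleted` only loosely echo the per-item sync metadata the specification describes. The real specification also detects and preserves conflicts, which this sketch skips.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One feed entry plus hypothetical sync metadata."""
    id: str                 # stable identifier shared by every replica
    updates: int            # version counter, bumped on each change
    when: str               # UTC timestamp of the latest change (ISO 8601)
    by: str                 # name of the endpoint that made the change
    deleted: bool = False   # tombstone: the entry stays in the feed
    title: str = ""


def winner(a: Item, b: Item) -> Item:
    """Pick one version deterministically, so all replicas agree."""
    # Compare the version counter first, then timestamp, then endpoint name.
    # Same-format UTC ISO 8601 strings sort correctly as plain strings.
    if (a.updates, a.when, a.by) >= (b.updates, b.when, b.by):
        return a
    return b


def merge(local: dict, incoming: dict) -> dict:
    """Merge an incoming feed into a local one without mutating either."""
    merged = dict(local)
    for item in incoming.values():
        current = merged.get(item.id)
        merged[item.id] = item if current is None else winner(current, item)
    return merged


def visible(feed: dict) -> list:
    """Titles a user would actually see: tombstones are hidden."""
    return sorted(i.title for i in feed.values() if not i.deleted)


# Two replicas of Steve's calendar that changed independently.
laptop = {
    "a": Item("a", 2, "2008-01-31T10:00:00Z", "laptop", title="Lunch with Ray"),
    "b": Item("b", 1, "2008-01-30T09:00:00Z", "laptop", title="Dentist"),
}
phone = {
    "a": Item("a", 1, "2008-01-30T09:00:00Z", "laptop", title="Lunch"),
    "b": Item("b", 2, "2008-01-31T11:00:00Z", "phone", deleted=True,
              title="Dentist"),
}

merged = merge(laptop, phone)
print(visible(merged))
```

In this example both merge orders give the same result, and the deleted entry survives as a tombstone instead of vanishing. That is the sort of mechanism that lets a deletion reach everyone's calendar rather than being silently resurrected by a stale copy.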


OK, I won't rain too hard on this parade. I said "perfect results" weren't feasible. We can't do synchronization for anything that's not trivial -- at least not without monstrous effort. The interesting question is whether there's some kind of "good enough" compromise that we can start with that, with a lot of time and evolution, might lead to some sort of emergent solution. Preferably without Skynet. Something that bears the same relationship to the original Palm synchronization that Google does to the original Memex/Xanadu vision...
