Wednesday, January 19, 2005

Managing complexity: the lifelong data repository

Faughnan's Tech: Yahoo! Desktop (X1) is the new champion

In my tech notes blog I posted a review of X1. I've been using it for a while. It needs work and isn't as polished in some ways as Lookout, but it's pretty good. We have a lot further to go, however.

Lookout works well because Outlook content has lots of metadata and context. Email has dates, links to people, descriptive text surrounding attachments, etc. Email tends by nature to provide focal chunks of context. Similarly, Google works well on the web because web pages have links that can be weighted, a robust form of metadata. Heck, web pages even have descriptive titles.

By comparison today's desktop file store is a barren desert. There's very little to go on to help search tools work. The most useful tool is probably the folder name -- pretty meager fare.

This wasn't such a big deal when we managed a few MBs of data. But what of the dataset that grows over a decade? That repository may be vast. Unfortunately, due to lack of supporting metadata, it's easier to find documents on the web than it is to find them on the desktop.

The good news is there is no lack of ideas to make things better. Heck, even as one uses today's software to search for items, one can be layering metadata atop the file system. If I do a search and open a file, then that file is clearly more valuable to me and might earn a higher value score. The list of ways to assign value is very long; it will be fun to see how they get instantiated. Some of those ideas are 60 years old (Vannevar Bush described most of them in 1945 or so). I doubt any of them are truly new, but the implementations will bring surprises.
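To make the "search, open, earn a higher score" idea concrete, here's a minimal sketch in Python. It isn't how X1 or Lookout works; the class name UsageScores and its methods are hypothetical, and it assumes the search tool can tell us which result the user opened.

```python
from collections import Counter


class UsageScores:
    """Hypothetical value scores built from which search results get opened."""

    def __init__(self):
        # Maps a file path to the number of times it was opened from a search.
        self._opens = Counter()

    def record_open(self, path):
        """Call this whenever the user opens a file from a result list."""
        self._opens[path] += 1

    def score(self, path):
        """Return the open count for a path (0 if it was never opened)."""
        return self._opens[path]

    def rerank(self, results):
        """Order results by open count, most opened first.

        sorted() is stable, so files with equal counts keep the order
        the search engine originally returned them in.
        """
        return sorted(results, key=lambda path: -self._opens[path])
```

A real tool would presumably decay old opens and mix this signal with text relevance; a raw count is only the crudest version of the idea.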

PS. This is an old interest of mine.

Update 2/21/05: I've taken to appending the string [_s#], where # is 1-5, to the end of filenames to provide some crude metadata value scores. Results from full text search programs that index file names can then be filtered by the suffix value.
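For anyone who wants to filter on the suffix outside of a search program, a small Python sketch follows. It rests on assumptions not spelled out above: that the suffix is written without the brackets, as "_s" plus a digit from 1 to 5, and that it sits at the end of the base name just before a single extension (for example "budget_s4.xls"). The function names are made up for illustration.

```python
import re
from pathlib import PurePath

# Assumption: the score is "_s" plus one digit 1-5 at the very end of the
# base name, before any single extension, e.g. "budget_s4.xls".
SCORE_SUFFIX = re.compile(r"_s([1-5])$")


def value_score(filename):
    """Return the 1-5 score encoded in the name, or None if there is none."""
    stem = PurePath(filename).stem  # name without its last extension
    match = SCORE_SUFFIX.search(stem)
    return int(match.group(1)) if match else None


def filter_by_score(filenames, minimum):
    """Keep only the names whose encoded score is at least `minimum`."""
    return [name for name in filenames if (value_score(name) or 0) >= minimum]
```

Unscored files are treated as score 0 here, so they drop out of any filtered list; that is a design choice, not part of the naming convention itself.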
