Friday, July 07, 2006

Google's weakness: devotion to the algorithm

I have four Google Blogger (Blogspot-hosted) weblogs that I post to regularly. This is one. Another is pure geekery and product reviews, and a third is dedicated to special needs children. The fourth is purely for internal family use; nobody else knows the URL. All are hosted on Blogger. All have only a modest number of regular readers (the tech blog generates the most traffic because people searching for answers to problems end up there fairly often).

Over the past few months all of them, 100%, have been tagged by "the blogger team" as spam blogs (splogs). Every one of them, when I filed an appeal, was subsequently cleared:
Your blog has been reviewed, verified, and cleared for regular use so that it will no longer appear as potential spam. If you sign out of Blogger and sign back in again, you should be able to post as normal. Thanks for your patience, and we apologize for any inconvenience this has caused.
I hope they're done now, but I wouldn't be surprised if they did it again. What the heck are they using for an algorithm? What do they consider an acceptable 'false positive rate'? How many miscategorized bloggers simply give up? I'd ask whether Blogger would be so inaccurate if it cost them user revenue, but the answer to that question is just too obvious.

Which gets to the point of my post. Google's religion is the Algorithm, the belief that they can write rules against their de facto 'neural network' (web backmaps) and produce results competitive with human analysts. In this case the algorithms are failing, but Google persists in their use. That's a weakness. They need to emulate Amazon's Mechanical Turk and use humans as their splog detectors ...

Update 7/9: Why are humans so good at detecting a splog, and computers so bad? One of the most common models for the evolution of intellect and sentience is that it's important for deception and deception detection. Spotting fakes and lies is fundamental to human cognitive function. A splog is nothing if not a lie. It's not surprising that humans will be very good at spotting them, and computers very weak ...
