Wednesday, April 07, 2010

Google's machine learning system

Classifiers are a classic component of traditional AI systems...
Official Google Research Blog: Lessons learned developing a practical large scale machine learning system

... Several years ago we began developing a large scale machine learning system, and have been refining it over time. We gave it the codename “Seti” because it searches for signals in a large space. It scales to massive data sets and has become one of the most broadly used classification systems at Google.

After building a few initial prototypes, we quickly settled on a system with the following properties:

Binary classification (produces a probability estimate of the class label)

Scales to process hundreds of billions of instances and beyond

Scales to billions of features and beyond

Automatically identifies useful combinations of features

Accuracy is competitive with state-of-the-art classifiers

Reacts to new data within minutes...

I can think of several reasons why they named it Seti. For one, HAL was taken.
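
To make a few of the listed properties concrete, here is a minimal sketch of my own. It is not Seti's algorithm, which the excerpt above does not describe. I am assuming online logistic regression with hashed features as a stand-in: it emits a probability for a binary label, keeps memory bounded no matter how many distinct features show up, and adjusts to each new example immediately. The class name, parameters, and feature strings are all hypothetical, and the sketch makes no attempt at the automatic feature-combination property.

```python
import math
import zlib


class OnlineLogisticRegression:
    """Toy online binary classifier with hashed features.

    Illustrative only: an assumed stand-in, not Google's Seti.
    """

    def __init__(self, num_buckets=2 ** 20, learning_rate=0.1):
        self.num_buckets = num_buckets
        self.learning_rate = learning_rate
        self.weights = {}  # sparse storage: bucket index -> weight

    def _buckets(self, features):
        # Hashing trick: map arbitrary feature strings into a fixed index
        # space so the model size does not grow with the feature vocabulary.
        return [zlib.crc32(f.encode("utf-8")) % self.num_buckets for f in features]

    def predict_proba(self, features):
        # Probability estimate that the label is 1.
        score = sum(self.weights.get(b, 0.0) for b in self._buckets(features))
        score = max(-30.0, min(30.0, score))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, features, label):
        # One stochastic gradient step on log loss for a single example,
        # so new data influences predictions right away.
        error = self.predict_proba(features) - label
        for b in self._buckets(features):
            self.weights[b] = self.weights.get(b, 0.0) - self.learning_rate * error


if __name__ == "__main__":
    model = OnlineLogisticRegression()
    # Hypothetical training stream: label 1 = spam, label 0 = not spam.
    for _ in range(200):
        model.update({"word:free", "word:winner"}, 1)
        model.update({"word:meeting", "word:agenda"}, 0)
    print(model.predict_proba({"word:free", "word:winner"}))
    print(model.predict_proba({"word:meeting", "word:agenda"}))
```

A system at the scale the post describes would presumably distribute both the data and the model across many machines. This single-process version only shows the shape of the interface: features in, probability out, one cheap update per example.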

