Thursday, June 14, 2007

Botnets, ovarian cancer, MySpace "offenders" and Bayes

What do botnets, the "symptom" definition for ovarian cancer, MySpace "sex offenders", and homeland security passenger screening have in common?

They all teach us that we need to start teaching Bayesian analysis in 7th grade.

Consider the FBI's guideline for knowing your home PC is a zombie bot:

BBC NEWS | Technology | FBI tries to fight zombie hordes

...The organisation said it was difficult for people to know if their machine was part of a botnet.

However it said telltale signs could be if the machine ran slowly, had an e-mail outbox full of mail a user did not send or they get e-mail saying they are sending spam.

Of these only the last is a useful clue, and it's a stupid bot that leaves such obvious traces. Does any Windows machine not run slowly? It's the very nature of XP that machines slow as they age, a disturbingly familiar trait. The emails "you have sent spam" are either the result of forged headers or they're traps by bot harvesters to recruit victims.

In other words, these tests have weak sensitivity, very weak specificity, and no predictive value. The advice is worse than worthless because, if followed, it would cause vast expense and produce no value. (ISPs can detect bots however, and they should be held liable for failing to detect and notify.)
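To put a number on "no predictive value", here is a minimal sketch of the Bayes calculation in Python. Every figure in it is made up for illustration: I'm assuming 10% of home PCs are bots, and that "runs slowly" is true of 80% of infected machines and 80% of clean ones. None of these come from the FBI or the BBC article.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(condition | positive sign), computed with Bayes' theorem."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs, chosen only to illustrate the argument.
prevalence = 0.10   # assumed share of home PCs that are bots
sensitivity = 0.80  # assumed share of bots that run slowly
specificity = 0.20  # assumed share of clean PCs that do NOT run slowly

ppv = positive_predictive_value(sensitivity, specificity, prevalence)
print(f"Chance a slow PC is a bot: {ppv:.1%} (base rate {prevalence:.1%})")
```

With these assumed inputs the sign is equally common in infected and clean machines, so seeing it leaves the probability where it started, at the base rate.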

Ovarian cancer?

...The symptoms to watch out for are bloating, pelvic or abdominal pain, difficulty eating or feeling full quickly and feeling a frequent or urgent need to urinate. A woman who has any of those problems nearly every day for more than two or three weeks is advised to see a gynecologist, especially if the symptoms are new and quite different from her usual state of health...

Gynecologist? Sigh. Family medicine is truly dead. Anyway, this translates to a very inefficient but reimbursable screening program. The symptoms are completely nonspecific, so we'll end up doing massive amounts of vaginal ultrasound. It would probably be better to simply start a screening program, but focus on persons with known risk factors. Lots of easy money for gynecologists though. Gee, I wonder who wrote up the recommendation?

MySpace? We've covered that one before. It's a test with low specificity, low sensitivity, and lousy predictive value, and it may be used by law enforcement too.

Passenger screening? See MySpace. Same techniques, same problems. The new regulation requiring every US traveler to Canada or Mexico to have a passport, aka a true national ID card, will make matches less unreliable, however. The test will then become more specific.
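Why specificity matters so much when the thing you're hunting is rare can be sketched with simple head counts. The numbers below are invented for illustration: a million travelers, ten genuine targets, a match process that catches 90% of them, and two hypothetical specificity levels standing in for "name matching" and "passport matching". They are assumptions, not real screening statistics.

```python
def flagged_counts(population, targets, sensitivity, specificity):
    """Return (true positives, false positives) as expected head counts."""
    innocents = population - targets
    true_pos = targets * sensitivity
    false_pos = innocents * (1.0 - specificity)
    return true_pos, false_pos


# Hypothetical inputs, chosen only to illustrate the argument.
population = 1_000_000
targets = 10
sensitivity = 0.90

# Assumed specificity before and after a stronger identifier is required.
tp_before, fp_before = flagged_counts(population, targets, sensitivity, 0.99)
tp_after, fp_after = flagged_counts(population, targets, sensitivity, 0.9999)

# Share of flagged travelers who are genuine targets.
ppv_before = tp_before / (tp_before + fp_before)
ppv_after = tp_after / (tp_after + fp_after)

print(f"Before: {fp_before:,.0f} innocents flagged, {ppv_before:.2%} of flags genuine")
print(f"After:  {fp_after:,.0f} innocents flagged, {ppv_after:.2%} of flags genuine")
```

Under these assumptions the more specific test flags far fewer innocent travelers, yet most flags are still false alarms because genuine targets are so rare.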

Bayes, Bayes, Bayes. We need to start teaching it in 7th grade.
