Wednesday, January 13, 2010

Innovations in comment spam

Comment spam continues its rapid evolution. Despite my reluctant surrender to the Captcha I'm seeing novel mutations every few months.

A recent technique is to write a reasonably detailed comment about a fairly specific topic, like "junk DNA". A query engine then identifies all blog posts that closely match the comment. An automated posting process, perhaps with some tool-assisted, human-powered captcha processors (via Amazon's Mechanical Turk?), submits the comment to thousands of blogs.

Even with human review, the comment submissions will be a good quality match to a meaningful number of blog posts. The comment gets posted, and the spammers get something of value (link referrals?).

The one I rejected today was clumsily written, so it was fairly easy to spot. It contained an unnecessarily specific reference to a "first post", the author name was a marketing phrase, and the grammar and phrasing could have been better. I've probably missed better ones!

We can expect rapid improvement. In time they might evolve to transiently novel insights statistically applied to the right spot at the right time. At that point, would we not welcome them?

In the meantime we do need Google to start filtering these comments the same way they filter email. This particular approach lends itself to statistical filters, and of course the use of author reputation in filtering algorithms. Alas, Google has forgotten all about poor Blogger ...
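Since the same comment goes out to thousands of blogs, one plausible statistical filter is near-duplicate detection across blogs. Here's a minimal sketch of that idea, not anything Google or Blogger is known to do. The function names, the 3-word shingle size, the 0.6 threshold and the sample comments are all my own assumptions for illustration.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def looks_like_mass_post(comment, seen_comments, threshold=0.6):
    """True if comment closely matches any comment already seen elsewhere.

    threshold is a hypothetical cutoff; a real filter would tune it.
    """
    target = shingles(comment)
    return any(jaccard(target, shingles(old)) >= threshold
               for old in seen_comments)

# Hypothetical comments, made up for this example.
seen = ["Great first post! Junk DNA is not really junk, as recent "
        "studies of regulatory regions show."]
new = ("Great first post! Junk DNA is not really junk, as recent "
       "studies of regulatory regions have shown.")
unrelated = "I disagree about the Nexus One battery life; mine lasts all day."

print(looks_like_mass_post(new, seen))
print(looks_like_mass_post(unrelated, seen))
```

A lightly reworded copy still shares most of its shingles with the original, which is why this kind of filter would only work for a service that sees comments across many blogs at once.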
--
My Google Reader Shared items (feed)

Tuesday, January 12, 2010

Brave new world: China attacks Google

Based on the phrasing and response, it's clear that Google believes this attack was launched by parties working for the government of China. We can also assume that the "relevant US authorities" (FBI) agree with them. I wonder if the targeted companies used software with similar vulnerabilities.
Official Google Blog: A new approach to China

Like many other well-known organizations, we face cyber attacks of varying degrees on a regular basis. In mid-December, we detected a highly sophisticated and targeted attack on our corporate infrastructure originating from China that resulted in the theft of intellectual property from Google. However, it soon became clear that what at first appeared to be solely a security incident--albeit a significant one--was something quite different.

First, this attack was not just on Google. As part of our investigation we have discovered that at least twenty other large companies from a wide range of businesses--including the Internet, finance, technology, media and chemical sectors--have been similarly targeted. We are currently in the process of notifying those companies, and we are also working with the relevant U.S. authorities.

Second, we have evidence to suggest that a primary goal of the attackers was accessing the Gmail accounts of Chinese human rights activists. Based on our investigation to date we believe their attack did not achieve that objective. Only two Gmail accounts appear to have been accessed, and that activity was limited to account information (such as the date the account was created) and subject line, rather than the content of emails themselves...

... We launched Google.cn in January 2006 in the belief that the benefits of increased access to information for people in China and a more open Internet outweighed our discomfort in agreeing to censor some results. At the time we made clear that "we will carefully monitor conditions in China, including new laws and other restrictions on our services. If we determine that we are unable to achieve the objectives outlined we will not hesitate to reconsider our approach to China."

These attacks and the surveillance they have uncovered--combined with the attempts over the past year to further limit free speech on the web--have led us to conclude that we should review the feasibility of our business operations in China. We have decided we are no longer willing to continue censoring our results on Google.cn, and so over the next few weeks we will be discussing with the Chinese government the basis on which we could operate an unfiltered search engine within the law, if at all. We recognize that this may well mean having to shut down Google.cn, and potentially our offices in China.

The decision to review our business operations in China has been incredibly hard, and we know that it will have potentially far-reaching consequences. We want to make clear that this move was driven by our executives in the United States, without the knowledge or involvement of our employees in China who have worked incredibly hard to make Google.cn the success it is today. We are committed to working responsibly to resolve the very difficult issues raised.
This may be the end of Google's services in China. We should expect their share price to fall in the morning. Google's "evil score" has now dropped to the lowest possible level for a public corporation.

Update 1/13/10: There's a lot of commentary this morning, including comparisons to how the USSR hobbled itself by shutting out access to world knowledge. I'm wondering if Google's increasingly powerful and ubiquitous machine translation services played a precipitating role. Language has been the cultural equivalent of the Himalayas - preserving China from cultural invasion. I suspect the Chinese government is very concerned about widespread direct unmediated access to English language materials.

Dark matter DNA

Our universe is largely built with matter that shapes large structures, but doesn't interact with electromagnetic fields - including light. It's dark matter.

There's a funny similarity to our DNA ...
Borna Virus Discovered in Human Genome - Carl Zimmer - NYTimes.com

...Fossil viruses are also illuminating human evolution. Scientists estimate that 8.3 percent of the human genome can be traced back to retrovirus infections. To put that in perspective, that’s seven times more DNA than is found in all the 20,000 protein-coding genes in the human genome.
In the physical universe dark matter is only about 70% of all matter, but in humans "dark DNA" is 97%+ of all DNA. So our DNA is about 2% protein coding, 8% retrovirus, and 90% other - including non-retroviral virus origin and "structural". (Yes, I know that's "four times" and Zimmer says "seven times" - his numbers are more likely correct.)
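The gap between "four times" and "seven times" is just rounding. Here's a quick check, using only the 8.3% and "seven times" figures from the quote plus my rounded 2%; the variable names are mine.

```python
retrovirus_pct = 8.3      # Zimmer's figure for retrovirus-derived DNA
zimmer_ratio = 7          # "seven times more DNA" than protein-coding genes
rounded_coding_pct = 2.0  # the rounded protein-coding figure used in this post

# Protein-coding share implied by Zimmer's two numbers.
implied_coding_pct = retrovirus_pct / zimmer_ratio

# Ratio you get if you start from the rounded 2% instead.
post_ratio = retrovirus_pct / rounded_coding_pct

print(f"Protein-coding share implied by Zimmer: {implied_coding_pct:.2f}%")
print(f"Retrovirus to protein-coding ratio at 2%: {post_ratio:.2f}x")
```

So Zimmer's numbers imply protein-coding DNA is a bit over 1% of the genome, not 2%.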

So from a DNA perspective, are we basically an ambulatory viral ecosystem with a fraction of information capacity that does things like make brains and bodies? Seems a bit much, but it turns out even some of the most important protein coding DNA is of viral origin. In a companion post on his blog Zimmer writes ...
... a virus protein called syncytin ... is essential for placentas to develop. Cells push the protein to their surface, where it lets them latch onto other cells, fusing together to create a special layer through which nutrients can pass from mother to child. The protein got its start on viruses, which use it to latch onto host cells and fuse to them, allowing their genes to slip in.

But recent research has revealed an intriguing new twist to our viral legacy. It turns out that the viral surface protein in question has a second job. It also tamps down the immune system of its host...
So is there any non-structural DNA in humans that's not of fundamentally viral origin?

See also: Presser on the bornavirus article ... UTA News Center

PS. A search on Preeclampsia and bornavirus has 180 hits today, but they appear to be loose and coincidental relationships. I didn't see research on bornavirus-like superinfection triggering auto-immune placental disruption and thus pre-eclampsia / toxemia.

Update 1/30/2010: io9 quotes Frank Kelly: "[T]he human genome has evolved as a holobiontic union of vertebrate and virus... ". A coral holobiont is "the entire community of living organisms that make up a healthy coral head".

Sunday, January 10, 2010

Lessons from my leonine chat icon

If you inspect my profile on various OS X and Google systems lately, you'll see a theatrical yawn ...


There's a lesson in the yawn. When I created a new user account on my i5 running 10.6, I chose a standard animal icon. Since it's a family machine, I wanted to choose an icon that would impress the children (didn't work). Hence the lion.

I then connected that account to my MobileMe account and, just as I found on 10.5 eleven months ago, the login image on the iMac propagated to all my MobileMe associated machines, wiping out whatever I had there.

It ate them.

Then, after I fiddled with iChat and Adium, it propagated to Gmail and GoogleTalk/Video Chat and the wider world.

None of this is documented of course. It just happens. It's an emergent behavior; a side-effect. One bit of whimsy, and bam -- I'm a lion everywhere.

There will be more of these things in years to come. More strange leakages and propagations.

If you want something private, keep it on paper. And keep the paper out of range of Vicon Revue wearing lifebloggers ...

Update 1/12/10: Today I notice the OS X 10.6 lion has metastasized to my Google Reader Shared By ...
I'm sure this is violating all kinds of copyright laws, but all of my actions were entirely correct. I think I'll just have to get used to my emergent avatar. Maybe he'll appear on my virtual tombstone.

Update 1/18/10: Here it is on my Google Profile.
This is really silly. I'm going to try restoring the GP image and see if it propagates the other way.

Update 2/9/10: Now it's spread to Google Buzz.
Only it's no longer affixed to my Gmail address, it's attached to my corporate email!

If you're wondering where your money went ...

Still way down from the peak, almost 10 years later ...
Bubbleheads II - Grasping Reality with Opposable Thumbs:

...
S&P 500, June 30, 2000 close: 1455
S&P 500, December 31, 2009 close: 1145
Consumer Price Index, November 2009/June 2000: 1.26

Real price decline: -37.5%...
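The -37.5% figure is easy to reproduce from the three numbers quoted above. A minimal check; the variable names are mine, and the inputs are exactly the ones in the quote.

```python
sp500_june_2000 = 1455   # S&P 500 close, June 30, 2000
sp500_dec_2009 = 1145    # S&P 500 close, December 31, 2009
cpi_ratio = 1.26         # CPI, November 2009 relative to June 2000

# Nominal change in the index over the period.
nominal_ratio = sp500_dec_2009 / sp500_june_2000
nominal_change_pct = (nominal_ratio - 1) * 100

# Deflate by the rise in consumer prices to get the real change.
real_ratio = nominal_ratio / cpi_ratio
real_change_pct = (real_ratio - 1) * 100

print(f"Nominal change: {nominal_change_pct:.1f}%")
print(f"Real change:    {real_change_pct:.1f}%")
```

Note that this is price only. It ignores dividends, so the total real return was somewhat less bad.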

Saturday, January 09, 2010

How removing my car stereo gave me my Apple iSlate prediction

[Update: iPad is the name. My post-release verdict is even more flamboyant.]

Geeks are all tingly in the run up to Steve Jobs' iSlate/iPad/whatever announcement. The last time I remember this level of geek thrill was just before the Segway was announced.

Oh, you don't remember that? Well, it wasn't the Segue of a thousand jokes back then. It was a mysterious product that was going to transform the world. (Who knows, when gas is $12/gallon maybe it will.)

The Segway is a cautionary tale, but I'm rooting for Mr Jobs. Even his mistakes are interesting, and if anyone can make a slate exciting it's the man in the black shirt. Personally I'm much more interested in the $150 Chrome OS gBook, but I'll be tracking the fan sites nonetheless. I expect the slate to solve at least one problem I have, and to solve it in a way that will work for my iPhone and desktop too.

I expect Mr. Jobs to come up with a Digital Rights Management scheme for books that we can live with -- just as he (and his team) have done for video and apps. (BTW, do you think anyone notices that balanced DRM is the key to Apple's App Store windfall? The industry hasn't missed this, even though the media has.)

I want Apple to do this, because this morning I couldn't figure out how to get my ultra-geeky SONY car stereo out of my dying 1997 Subaru Legacy (we bought the Forester, not the whacked new Outback). I knew Crutchfield would have great directions, but they charge $10 for detailed directions unless you're buying a stereo -- and they US Mail them.

The price was a bit steep, but the real problem for me was US Mail. They do this, of course, because if they let users download a PDF they'd sell one copy of the directions.

What Crutchfield and I needed was a DRM approach that was a reasonable balance between their interests and mine. If they had that, they might sell the directions electronically for a more appealing $5.

That's my iSlate prediction. That Jobs/Apple will include a DRM solution for printed material that will, like their DRM for Apps, be a reasonable balance between the rights of publishers and the interests of consumers.

Inbox zero - mastering email

I'm doing a 1 hour session on mastering email at my day job. I get to do this because, after 20 years of struggling with email, I have finally figured out how to do it.

For what it's worth I'll add a link to my presentation here after Jan 24th, but there's no great mystery to it. The most important intervention was reducing inflow. Of course I got rid of all email lists, newsletters and the like -- if an organization can't figure out blogs they're unlikely to have anything useful to tell me. Most of all though, I reduced the number of email replies and misdirected emails that I get.

I reduced the number of email replies by, paradoxically, spending more time crafting precise responses, and by being quicker to convert dysfunctional email to a meeting or phone call. I craft my response to an email so that no further correspondence should be necessary. If an email discussion goes beyond two cycles that's a meeting. It's almost always, in this context, a brief, productive, and satisfying meeting. The body of the meeting appointment, by the way, includes the last email sent. (In Outlook drag and drop the email on the calendar icon.)

I reduced the number of emails I had to reply to by gently educating my correspondents about what goes on the To line. The To line should include only people with tasks - such as the single person who should respond.

I reduced the time required to process and triage email by gently teaching about the correct use of the subject line. It should tell the reader what the email is about and what's needed. I change the subject line when I reply to precisely describe my reply -- including an answer summary. This subject line also makes my full-text searchable email archives more valuable.

These days the email I get is satisfying. It's increasingly well written, targeted, and easy to respond to. I'm now in a virtuous feedback loop; good email begets good email. (Though example alone is not enough; cautious education is needed too.)

More after the 24th of January.

See also some other posts of mine:
Update 11/8/10: Here's the presentation I promised. It should have all the corporate references expunged.

Friday, January 08, 2010

Bermuda

I came across Bermuda cruising the ocean floor on Google Earth...



There's a lot down there.

IOT: Samarkand, the Sogdians and the Silk Road

Once it was Maracanda, ruled by Alexander. Centuries later, before Rome fell, the Persian-speaking Sogdians flourished there, at the heart of a historic trading empire that lasted from before 300 CE until after 700 CE. They were the traders of the Silk Road, and the conduits for Buddhism and much knowledge of China, India, Asia and places West.

Later their city became a place of Arab history - Samarkand.

Today Samarkand is in Uzbekistan ...


It's a hike, but it's a city of about 400,000 and it's open for tourism. In Google Earth you can see their photos.

You can learn the story of the Sogdians, and a surprising amount of China's endless story by listening to ...

BBC - Radio 4 - In Our Time - The Silk Road

In 1900, a Taoist monk came upon a cave near the Chinese town of Dunhuang. Inside, he found thousands of ancient manuscripts. They revealed a vast amount of evidence about the so-called ‘Silk Road’: the great trade routes which had stretched from Central Asia, through desert oases, to China, throughout the first millennium....

Most of what we know of this people comes from a small cache of lost Sogdian mail, and the stories the Chinese told of them. If not for that accident, we'd know almost nothing.

And yet, they changed the course of history.

Obama and the underwear bomber

I’ve not written much about the underwear bomber, mostly because the inanity of the public discussion is so depressing.

Schneier, as usual, has the most rational coverage. He points out that even our inevitably imperfect security measures do increase the challenges of bomb preparation, and thus the probability that an attack will fail. So even though metal-free recto-vaginal or intra-abdominal bombs can bypass millimeter-wave scanners or backscatter x-ray, these devices will still increase the cost of a successful attack. (Though there are probably more cost-effective measures to increase security.)

One lesson from this attack is that we need to make an understanding of positive predictive value a requirement for high school graduation. It’s also clear that the controversial ridiculous fashion for teaching Latin is a major distraction from a desperate need to teach logic.
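For anyone who missed that class, here's a sketch of why positive predictive value matters for screening. Every number below is a made-up assumption chosen only to show the shape of the problem; none of them comes from Schneier or from any real screening system.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a flagged person is a true positive (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

# Hypothetical values for illustration only.
prevalence = 1 / 10_000_000   # assumed: one attacker per ten million passengers
sensitivity = 0.99            # assumed: the screen catches 99% of attackers
specificity = 0.999           # assumed: it wrongly flags 0.1% of everyone else

ppv = positive_predictive_value(prevalence, sensitivity, specificity)
print(f"Chance a flagged passenger is an attacker: {ppv:.6%}")
print(f"Roughly one real hit per {round(1 / ppv):,} alarms")
```

When the thing you're screening for is this rare, even a very accurate test produces almost nothing but false alarms.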

Lessons aside, I think the response of the Obama administration is interesting to watch. They clearly know that there’s not much that could have been done to stop this attack, and they know that they have to placate our spine-free hysterical nation. More interestingly, it looks like they’re trying to use this to attack the incompetent intelligence network we’ve inherited – even though, in this case, even a very good network would have failed.

It’s the equivalent of jailing a mobster for tax evasion when you can’t get ‘em for murder and mayhem.

PS. I’m so glad our heroic savior is a leftie foreigner who makes “low budget films”. At least we’ve been spared the usual celebratory histrionics.

Update: On further reflection, inspired by a polite comment, I was a bit harsh on the teaching of Latin. I do think there are substantially better uses of educational resources, but "ridiculous" was unmerited.

Update b: Schneier has summarized his recommendations. Perfect, as usual.

Wednesday, January 06, 2010

The spooky power of Google Suggest

Some people buy new cars every few years.

We drive our cars into the ground. Our 12 yo Subaru wagon won't start, so it's a goner. There's not much to salvage, except our uber-geeky SONY CDX-GT610UI MP3, AAC, USB, iPod etc car stereo.

I didn't know how to remove it, so I started typing "how to remove car .." and Google gave me several good options, including one that talks about doing without a "DIN tool".

Where can I buy a DIN tool? Google Suggest throws up the Walmart.com: Scosche DIN Radio Removal Tool.

The searches themselves were anticlimactic. Google Suggest had already done the heavy lifting.

Amazing.

PS. Ok, so the reality isn't quite as magical as I make it out to be. I had the PDF installation guide and between it and some light Google work I figured out I actually need a special anti-theft SONY-specific "release key". Still, Google Suggest is seriously cool.

Update 1/9/10: Ok, so, in retrospect, my first search was wrong. Turned out that my 12 yo Subaru installation didn't use the standard kit at all. Wiki Answers told me how to remove the unit from my 1997 Subaru Legacy. I did, however, discover something interesting about the business of selling answers.

Archaic communications in 2010 - Gmail example

Dear Visitor from 2020:

I know you feel things haven't progressed very far, but you really need to take a look at how we did communications in 2010.

Believe it or not, in 2010 Google's Gmail could open 3 windows that looked like this ...

One was for something called email. Another was for something called "Chat" or "Instant Messaging". A third was for something called "SMS" or "Texting".

They all looked rather the same and did rather similar things, but they all worked somewhat differently with different phones and different computers. The SMS was the most restrictive; it was limited to fewer than 200 ASCII characters! Despite being so limited, it cost much more than the others. It worked, however, with the archaic phones that persisted in the US until 2012.

Pretty bad eh? It gets worse. I'd tell you about Twitter, but you wouldn't understand it at all.

Aren't you glad you're not living in the dark ages any more?

john

Personal computing 2020: More and less

OpenDoc was ambitious (emphases mine) ....
OpenDoc was a multi-platform software componentry framework standard for compound documents, inspired by the Xerox Star system ...
...The basic idea of OpenDoc was to create small, reusable components, responsible for a specific task, such as text editing, bitmap editing or browsing an FTP server. OpenDoc provided a framework in which these components could run together, and a document format for storing the data created by each component..
... OpenDoc was one of Apple's earliest experiments with open standards and collaborative development methods with other companies...
... OpenDoc components were invariably large and slow. For instance, opening a simple text editor part would often require 2 megabytes of RAM or more, whereas the same editor written as a standalone application could be as small as 32 KB...
... each part saved its data within Bento (the former name of an OpenDoc compound document file format) in its own internal binary format...
OpenDoc failed of course. It's easy to say it was ahead of its time, but it may be more correct to say it was a part of a future that will never come.

In recent years even the much more modest Open Document Format seems to be fading away. The modern trend is to simpler user environments with smaller feature sets and fewer user demands. In many ways, we're returning to the pre-multifinder world of MacOS Classic system 6.

This makes sense. It's increasingly difficult to live in the modern world without net access, but it's obvious that the vast majority of humans cannot live in the world of Win 7 or Office 2010 or OS X -- much less the virus infested XP boxes in most homes. My best guess is that less than 15% of the American population can keep a single net connected computer running well - much less a family network.

So what will things look like 10 years from now?

Simpler.

This will be hard on us geeks. We aren't going to get DateBk 6 on our iPhones. We're going to have to get used to a world in which computers are simultaneously more powerful and less capable. We will finally have a single integrated calendar view of personal and work calendars, but those calendaring and information management capabilities will be a shadow of what we once had in Ecco Professional or DateBk and other lost tools of the 80s and 90s.

I really don't know how the DRM wars will turn out; aggressive Digital Rights Management (copy protection) may ironically sustain the (rogue) classic personal computer.

Progress is funny. I think our computing world will be better and more productive in 10 years, but the geeks among us will have to get used to losing tools and capabilities along the way. We'll have to ... (yech) ... be flexible ...

Tuesday, January 05, 2010

From iPhone users to Google: Thank you for the Nexus One

I love you Google. Thank you for the Nexus One Phone.

Sold unlocked even when subsidized. Google Voice baked in. Navigation. Location sharing. Speech recognition text entry. Speech UI. OLED. Removable battery. Memory on Micro SD card (to 32GB). Noise cancellation. Ogg Vorbis.

I've got six months left on my iPhone AT&T contract. I'm in no hurry to get a new contract now. You can buy this phone without a data plan, stick in a pay-per-use voice/data cash card, and, once Google announces their Google VOIP service, use it largely free with home and office WiFi.

Apple and AT&T will need to be very sweet to keep me.


Update: Arrington review online. The battery life is very short, but even with the iPhone I'm always near a charger. It's life.

Update b: As I read reviews on the Nexus One I'm a bit surprised by admissions of how weak yesterday's crop of Android phones truly is. If I'd bought one I'd have been b*tching big time. Sadly, most geek bloggers are too committed to defending their purchases. In my blogs, I savagely attack the things I own :-).

Update c: The very best sort of competition. Reminds me of the golden age before Microsoft crushed all competition on the PC platform.

Update d: The Nexus is up to 50% cheaper than the iPhone. I think this comparison overstates the gap, but it's technically correct. My bet is the Nexus is closer to 30% cheaper for most users.

Update 1/6/10: Pogue is not amused. I think he missed out on the advantage of using WiFi for data and a very cheap minimal voice plan for voice.

Sunday, January 03, 2010

You too can visit North Korea

I was intrigued by this book review ...
What to Read - Inside the Hermit Kingdom Salon.com

.... the bizarre spectacle of the vacant Ryugyong Hotel (aka the "Hotel of Doom") towering over Pyongyang...

... If you went out on a moonless night in the years after the nation's electrical grid effectively collapsed, the only way you could tell anyone else was around was by the coal of their cigarette burning in the dark. There's the writing paper sold in state stores, made of corn husks that "would crumble easily if you scratched too hard," so that people wrote on paper scavenged from the margins of newspapers. And then there's Vinalon, "a stiff, shiny synthetic material unique to North Korea," of which the fatherland was ludicrously proud. Vinalon takes dye so poorly that everyone's clothes (which were mostly uniforms to begin with) were limited to drab grays, blues and browns...

...With the factories and electricity shut down, the air over Chungjin is pristine again, and you can see every star in the night sky. Doctors provide herbal remedies, but only because they have nothing else; furthermore, they are required to spend weeks camping out in the mountainous countryside, harvesting wild plants. Some resort to growing their own cotton in order to have bandages. Most North Koreans have never seen a mobile phone and don't know that the Internet exists....
A cross between Shangri-La and Auschwitz, forever mysterious, untouchable, inaccessi...

Oh. Wait. What about Google Earth?

Yep, it works. You can visit the construction site of the Hotel of Doom, and tour Pyongyang from the air. You can even see the USS Pueblo, the only American ship in enemy hands:


There are many more Panoramio images than one might expect (blue boxes above), though only in the tourist parts of the city. There are several attractive sites; the infamous hotel is atypically ugly.

There are very few vehicles in the satellite images or the Panoramio pictures. One nearby city seems to have no significant roads and no vehicles in most of the residential areas.

The North Korean images are not very high resolution. There are no economic incentives to image North Korea, so you don't see anything like standard Saint Paul resolution ...

Within a few years though, even the low res treatment will show playground structures and perhaps pedestrians. The flying tours of North Korea will only improve.

We can see much of them, and they cannot see us at all.

It's an eerie sensation.

Which brings me to my first (and, thus far, only) prediction for the next decade.

The North Korean government will collapse.