Amir Lev

Security Levity

Spam Judo: ultimate solution or academic reinvention?

I saw an interesting article in New Scientist this morning: "To beat spam, turn its own weapons against it". I thought I'd talk a little about it in this week's Security Levity...

 
Article summary

A team of academics from ICSI Berkeley and UC San Diego have come up with a way of analyzing the spam email messages sent by a 'captured' zombie PC. They're calling it "Spam Judo". After watching the zombie for about 10 minutes, they can work out the underlying template used to construct the spam. This allows them to instruct spam filters to watch out for messages that match the template.

 
To explain a little further...

The vast majority of spam is sent from botnets: networks of malware-compromised PCs (known as zombies). These zombies take their instructions from the spammers, sending spam based on template messages. These templates allow the spam messages to be different from each other in random ways, in order to evade simple spam content filters.

Using the academics' template-deriving technique, a spam filter could compare the content of a suspected spam message against known spam templates and decide whether it matches one.
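To make the idea concrete, here is a minimal Python sketch of template derivation and matching. It is not the researchers' algorithm, which the article doesn't describe in detail. It assumes that messages from one campaign differ only in whole whitespace-separated words, and the function names (infer_template, template_to_regex) and the sample messages are invented for illustration. The fixed words that survive across all samples become the template, and the varying parts become wildcards.

```python
import re
from difflib import SequenceMatcher

GAP = None  # marker for "some variable words go here"


def _merge(template, tokens):
    """Keep only the tokens shared (in order) by template and tokens."""
    sm = SequenceMatcher(a=template, b=tokens, autojunk=False)
    out, pos_a, pos_b = [], 0, 0
    for block in sm.get_matching_blocks():
        if block.size == 0:  # final sentinel block
            continue
        if block.a > pos_a or block.b > pos_b:
            out.append(GAP)  # something differed before this block
        out.extend(template[block.a:block.a + block.size])
        pos_a, pos_b = block.a + block.size, block.b + block.size
    if pos_a < len(template) or pos_b < len(tokens):
        out.append(GAP)  # trailing difference
    return out


def infer_template(samples):
    """Fold all samples into one list of fixed tokens and GAP markers."""
    template = samples[0].split()
    for sample in samples[1:]:
        template = _merge(template, sample.split())
    if not any(item is not GAP for item in template):
        raise ValueError("samples share no fixed text")
    return template


def template_to_regex(template):
    """Compile the template into a regex; each GAP matches zero or more words."""
    parts, seen_token, gap = [], False, False
    for item in template:
        if item is GAP:
            gap = True
            continue
        if seen_token:
            parts.append(r"\s+(?:\S+\s+)*?" if gap else r"\s+")
        elif gap:
            parts.append(r"(?:\S+\s+)*?")
        parts.append(re.escape(item))
        seen_token, gap = True, False
    if gap:
        parts.append(r"(?:\s+\S+)*?")
    return re.compile(r"\s*" + "".join(parts) + r"\s*")


# Made-up messages standing in for a captured zombie's output.
samples = [
    "Hello Bob, buy cheap pills at http://a1.example today",
    "Hello Alice, buy cheap pills at http://zz9.example today",
    "Hello Dr. Eve, buy cheap pills at http://q.example today",
]
template = infer_template(samples)
matcher = template_to_regex(template)
```

A filter would then call matcher.fullmatch(body) on each incoming message. A real campaign varies text at a finer grain than whole words, so treat this as an illustration of the principle only.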

 
An Ultimate Solution?

I congratulate the team; in many ways, it's similar to how our technology works. However, I'd like to suggest that the technique as described is going to be too simplistic for the real world. Ten minutes is far too long to derive the template: in ten minutes, a botnet can deliver millions of spam messages. The template can change quite frequently, too, rendering the work done to derive the template useless.

Spying on just one zombie at one location is a major limitation: you need a widely distributed system -- millions of nodes all around the internet -- in order to quickly capture sufficient breadth of data. And you need fast, automatic, efficient processing to collate all that information into spam signatures for filters to match against.
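As a rough illustration of what collating data from many vantage points could look like, the Python sketch below counts how many distinct sources have sent each four-word fragment and promotes a fragment to a signature once enough independent sources have been seen. This is a toy under stated assumptions, not a description of any real product: the class name FragmentCollator, the min_sources threshold and the fragment length are all hypothetical, and a production system would also need time windows, expiry and far more compact data structures.

```python
import hashlib
from collections import defaultdict


def fragments(text, n=4):
    """Return the set of n-word fragments (shingles) in a message."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def digest(fragment):
    """Hash a fragment so that raw message text need not be stored."""
    return hashlib.sha1(fragment.encode("utf-8")).hexdigest()


class FragmentCollator:
    """Hypothetical collator: tracks which sources sent which fragments."""

    def __init__(self, min_sources=3, n=4):
        self.min_sources = min_sources
        self.n = n
        self._sources = defaultdict(set)  # fragment digest -> source ids

    def observe(self, source_id, message):
        """Record one message seen from one source (e.g. a sending IP)."""
        for frag in fragments(message, self.n):
            self._sources[digest(frag)].add(source_id)

    def signatures(self):
        """Digests of fragments seen from at least min_sources sources."""
        return {d for d, s in self._sources.items()
                if len(s) >= self.min_sources}

    def score(self, message):
        """Fraction of the message's fragments that are known signatures."""
        frags = fragments(message, self.n)
        if not frags:
            return 0.0
        sigs = self.signatures()
        return sum(digest(f) in sigs for f in frags) / len(frags)


collator = FragmentCollator(min_sources=3)
collator.observe("bot-1", "Hello Bob, buy cheap pills at http://a1.example today")
collator.observe("bot-2", "Hello Alice, buy cheap pills at http://zz9.example today")
collator.observe("bot-3", "Hi Eve, buy cheap pills at http://q.example now")
```

The point of counting distinct sources rather than raw occurrences is that a single chatty sender cannot create a signature on its own; only text that recurs across many independent senders is flagged.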

By the way, the article dubbed this technique "effectively perfect". Whenever I hear claims like that, alarm bells start ringing in my head... loudly. But in fact the team didn't use this phrase, as UCSD Associate Professor Stefan Savage points out on Slashdot:

"We never suggested that we have a perfect spam filter per se, simply a new tool that has the benefit of being orthogonal to existing techniques. For existing botnets, our filters are extremely good, but the paper is also quite clear about the variety of ways that spammers might try to evade the approach."

Despite the team's claim that the technique is "orthogonal to existing techniques", looking for fragments of text from templates is essentially what many spam filters do all the time -- including those based on our OEM engine.

We have internet instrumentation all over the world collecting these samples -- not just at a single zombie. Our approach is called recurring pattern detection (RPD), and we've been doing it for nine years. RPD identifies the template-driven features of new spam campaigns in seconds, by examining billions of transactions from about a million different bots daily.

We'll be watching the progress of this work with interest; kudos to Professor Savage and the rest of the team: Andreas Pitsillidis, Kirill Levchenko, Christian Kreibich, Chris Kanich, Geoffrey M. Voelker, Vern Paxson and Nicholas Weaver. The academic paper will be presented in just over a month, at NDSS 2010. If Professor Savage and the team wish to contact us to discuss their ideas, I'll be happy to talk to them about it and share our perspective and experience.

 
[Thanks to TechMeme for the link to the article.]

 
Have you seen a new spam filtering technique that looks interesting? Feel free to add comments and ask Amir!

 
When he's not critiquing spam filtering techniques, Amir Lev is the CTO, President, and co-founder of Commtouch (NASDAQ:CTCH), an e-mail and Web defense technology provider.