Saturday, July 29, 2006


Conference on Email and AntiSpam

I attended the 2006 Conference on Email and AntiSpam (CEAS) on Thursday and Friday, and here's a summary of the conference.

Thursday opened with a fascinating invited talk by Rob Thomas about Internet crime. In "The Underground Economy" he told us about the many things that "miscreants" do to make money illicitly on the Internet. One of the most interesting bits of information to me was that the problem of compromised computers isn't limited to zombie computers, where Grandma's Windows machine is taken over by malware. In fact, there are a great many compromised servers and routers out there (boxes running FreeBSD and Linux, for example), and they are the more lucrative devices to control. The talk was full of interesting information and anecdotes, and the speaker was full of animation. For me, this was by far the highlight of the conference.

Next we had three talks under the category "abuse detection and control". We heard about a method of doing port-25 blocking dynamically, rather than statically, to work around the issue of legitimate use of direct connections over port 25. Yahoo! told us how they're using DomainKeys-signed mail to try to identify mail forwarding services, so they can avoid accidentally blacklisting them. And we saw a preliminary analysis of the effectiveness of DNS block lists in the face of zombie machines.

The next group of talks was in the rough category of "counter-attack". We saw how an artificial-intelligence team from the University of Illinois at Chicago made a system that would interact with spammers to try to take up the spammers' own time; it was particularly amusing in its exchanges with Nigeria-scam spammers. I presented our paper on "parasitic spam", the concept that spammers could attach spam to legitimate mail, making it more difficult to block (and New Scientist has an article on it, which came out on Thursday). And we heard about a study of techniques used in spam that help it fool users and spam filters, looking at how the use of the techniques has varied over time.

In the next segment there were discussions of learning-based spam filters, looking at differences among Bayesian filters, and other types of text analysis. In the "email enhancement" segment we heard about studies of using message similarity to reassemble discussion threads, and attempts to predict CC recipients and missing attachments based on analysis of the email message. The "social behaviour" group gave us talks about using game theory to predict spammer behaviour, a mechanism for developing sender-reputation information in a webmail service (Gmail), and a mechanism for using social behaviour to detect anomalous "antisocial" behaviour, indicative of spamming.

Friday's sessions began with an invited talk by Professor Hector Garcia-Molina of Stanford University, and his graduate student Zoltan Gyongyi. They talked about web-based spam issues — in particular, using link spam to "game" search engines, and mechanisms to detect that.

We followed that with a brainstorming session on what sorts of research are "missing" — that is, what work needs to be done, to try to encourage researchers to attack those issues. After that, the "corpora" segment had talks about subsetting, refining, and developing email corpora, with references to the Enron corpus, widely used in the spam-analysis community.

In the "organizational impact" group, we had a talk by John Aycock about a class on spam and spyware that he's teaching at the University of Calgary. I gave my talk about the deployment of our SpamGuru antispam system at IBM. And Richard Clayton, of Cambridge University, talked about a project that attempts to size the work in store for mail-abuse teams.

We finished the conference with a second "learning-based filters" session, with studies comparing the effectiveness of filters and investigating some learning techniques, and a talk by my colleague Rich Segal about work he's done on a fast mechanism for labelling large email corpora using learning techniques.

We wrapped up with the usual "business meeting", analyzing the conference and looking at ways to make it better next year. All in all, it was a better conference than the 2005 one, I think. And the Firehouse Brewery, in Sunnyvale, provided a good dinner and beer for eight of us before it was time for me to head to the airport and the flight home.

