I've just returned from the Conference on Email and AntiSpam (CEAS 2007), in Mountain View, CA. I've been on the program committee for the past two years (translation: I've been a paper-review slave), and have been asked to be program chair for next year (translation: slave driver). We're trying to expand the scope of the conference, but more on that later. First, a summary of the highlights of this year's conference... this is only my opinion of the highlights, with no attempt to cover everything. For more about any of these, see the associated conference papers.
We started with Wietse Venema as the invited speaker. Wietse's the author of the email program Postfix, and he talked about the development of Postfix, including the reasons he did it and the things he learned from it. It was a good talk, and I'm not saying that only because he works upstairs from me.
Other talks of note:
Mark Dredze from the University of Pennsylvania talked about the work he and his colleagues have done in cutting the overhead of analyzing email images for spamminess. Their method is basically to figure out which analysis techniques and features will take too long for a given image and eliminate them in favour of ones that are faster and work well enough, recognizing the characteristics of an image that let certain algorithms analyze it more efficiently.
Zhe Wang from Princeton University gave a complementary talk about improving efficiency of spam image analysis by separating out a set of “features” from the images, and then using the features to analyze the similarities among images. That allows them to do the analysis with a more limited set of features, streamlining the process.
Lisa Johansen of Penn State discussed a technique that's being used more and more, analyzing email to detect common social networks — in this case, communities with common interests.
Adam Thomason, of the company Six Apart, discussed blog spam and the issues it raises.
Enrico Blanzieri of the University of Trento (Italy) discussed their experiments with coupling a nearest neighbour algorithm with a support vector machine filter to improve effectiveness (and decrease the false-positive rate). The improvement comes at a high computational cost, though: every message has a different neighbour set, requiring recomputation of the SVM.
Zulfikar Ramzan, of Symantec, gave us a rundown on trends in “phishing” messages in 2006.
Aaron Zinman, of the MIT Media Lab, talked about spam-related issues in social-networking sites such as MySpace, showing that it's much harder to put a black-or-white evaluation on things in that universe.
Vijay Balasubramaniyan of Georgia Tech presented a method of using social network techniques to establish “reputation” for voice-over-IP calls, and using the reputation to fend off potential spam in VoIP telephony.
Victoria Bellotti of the Palo Alto Research Center (PARC) showed us a system they're working on, called TV-ACTA: an activity-centered interface for task management, tying together email, meeting agendas, calendars, and associated files in one user interface that allows you to manage all the information associated with a task or project in one place.
Of course, I won't mention the wonderful talks by my colleague Rich Segal and me. We kicked butt.
On to next year:
We still have to find one or two more program co-chairs, and then look at how to expand the reach of the conference, getting more papers on different kinds of spam and malware, and on other, non-spam aspects of email. We're also looking at venues for 2008... it might actually not be on the west coast next year, woo-hoo!