---
date: "2006-11-17T04:38:12Z"
title: Over a Decade of Spam and I Still Haven't Killed Anyone (Yet)
---

I've been using SpamProbe to separate the wheat from the chaff for the last four years. That, along with the fact that I rarely delete email, gives me a reasonable set of data to analyze the performance of a spam filter. So, how does SpamProbe stack up?

Graphs With Lines and Stuff

SpamProbe: Classifications per Month (Count)

The exponential increase has flattened the numbers we really care about, and the logarithmic scaling in Ploticus has failed me, so here's the same graph with correct classifications omitted:

SpamProbe: False Classifications per Month, 2002-2006 (Count)

That second graph is mildly depressing, but it reflects my day-to-day experience. Namely, more and more spam messages seem to be sneaking by SpamProbe and being incorrectly classified as legitimate messages. But how does the increase in false negatives stack up compared to the total amount of spam I'm getting? Let's take a look at the data again, but this time as a percentage rather than a count:

SpamProbe: Classifications per Month, 2002-2006 (Percent)

And the same data again, without the correctly classified spam:

SpamProbe: Classifications per Month, 2002-2006 (Percent)

As you can see from the graphs, the percentage of false positives, or legitimate mail incorrectly classified as spam, sits pretty steady around 0%, while the percentage of false negatives, or spam incorrectly classified as legitimate mail, has hovered below 5% for just over two years. Not too shabby for a lowly Bayesian classifier. By the way, the large peaks in the percentage graphs are mostly anomalous (see below).
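For the curious, here's roughly how a month of counts turns into the percentages plotted above. This is a minimal sketch, not the script I actually used: the `MonthlyCounts` class, its field names, and the sample numbers are all made up for illustration, and it assumes each percentage is a share of all mail received that month.

```python
from dataclasses import dataclass


@dataclass
class MonthlyCounts:
    """Hypothetical per-month tallies; the field names are illustrative."""
    true_spam: int        # spam correctly classified as spam
    true_ham: int         # legitimate mail correctly classified as legitimate
    false_positive: int   # legitimate mail incorrectly classified as spam
    false_negative: int   # spam incorrectly classified as legitimate mail


def percentages(counts: MonthlyCounts) -> dict[str, float]:
    """Return each category as a percent of that month's total mail.

    Assumption: the denominator is every message received in the month,
    not just the spam.
    """
    tallies = {
        "true_spam": counts.true_spam,
        "true_ham": counts.true_ham,
        "false_positive": counts.false_positive,
        "false_negative": counts.false_negative,
    }
    total = sum(tallies.values())
    if total == 0:
        # An empty month has no meaningful percentages; report zeros.
        return {name: 0.0 for name in tallies}
    return {name: 100.0 * n / total for name, n in tallies.items()}


# Made-up numbers, purely for illustration.
sample = MonthlyCounts(true_spam=900, true_ham=80,
                       false_positive=0, false_negative=20)
result = percentages(sample)
```

With those invented counts, false negatives come out to 2% of the month's mail and false positives to 0%. Dividing by the month's total is what makes a quiet month and a busy month comparable on the same axis.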

Caveats

Are aphorisms about liars and statistics bouncing around in your head right now? Good. Here are some of the gotchas with this data:

Conclusions

If I wanted to be scientific and objective and all that crap, or at least methodical and thorough, I would take several competing spam classifiers and feed them the same corpus, then compare the results. I'm not really trying to be objective, though; SpamProbe seems to be working pretty well, at least for now. Oh yeah, if you're interested in playing with the actual numbers, or if you're curious how I processed the data and generated the graphs, feel free to download the raw data.