Spam statistics and spamd
I discovered today that I had left my procmail deliveries logging all kinds of information. The logs went back a month and a half, so I thought, why not parse them and generate some stats?
My procmailrc sorts most of my mail into folders for me. When I was writing the parsing script, I decided to group my folders into categories to make the statistics more meaningful. This leaves me with 4 types of mail: work (automated reports, logs, and such), spam (mail flagged by SpamAssassin, plus discarded mail), lists (mailing lists), and Inbox (everything else).
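For the curious, here is a minimal sketch of how such a parser might look. It is not the script used for these stats. It assumes the standard procmail log abstract, where each delivery ends with a `Folder:` line giving the destination and the message size. The folder names and the `CATEGORY_PREFIXES` mapping are made up for illustration, since the real ones are not listed in this post.

```python
import re
from collections import Counter

# Hypothetical folder prefixes per category; adjust to match your own procmailrc.
CATEGORY_PREFIXES = {
    "work": ("cron", "logs", "reports"),
    "spam": ("spam", "/dev/null"),
    "lists": ("lists/",),
}

# Matches the "  Folder: <destination>   <size>" line of a procmail log abstract.
FOLDER_LINE = re.compile(r"^\s*Folder:\s+(.*?)\s+(\d+)\s*$")


def categorize(folder):
    """Map a delivery folder to one of the four categories."""
    for category, prefixes in CATEGORY_PREFIXES.items():
        if folder.startswith(prefixes):
            return category
    return "inbox"  # everything else


def count_deliveries(log_lines):
    """Count deliveries per category from an iterable of procmail log lines."""
    counts = Counter()
    for line in log_lines:
        match = FOLDER_LINE.match(line)
        if match:
            counts[categorize(match.group(1))] += 1
    return counts
```

To run it against a real log, something like `count_deliveries(open("procmail.log"))` would do, where `procmail.log` stands in for whatever path your `LOGFILE` setting points at.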
These stats turned out to be quite interesting, at least to me. Since I am the sysadmin for an ISP, I get tons of email. I get the output of any and all cron jobs, interesting snippets of logs, and all mail addressed to common aliases (postmaster, root, webmaster, abuse, daemon, security, etc.). This makes my work category quite large. You can see that my work mail accounts for more than half of all deliveries. If you leave out the work category, spam accounts for about 80% of all of my email, and that doesn’t count all the crap that neither SpamAssassin nor my own filters catch. Holy cow. Spam is a huge problem.
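The "leave out the work category" figure is just a ratio over the remaining categories. A small sketch of that arithmetic, assuming the per-category counts live in a plain dict (the function name `share` and its parameters are illustrative, not taken from the original script):

```python
def share(counts, category, exclude=()):
    """Fraction of deliveries in `category`, ignoring categories listed in `exclude`."""
    total = sum(n for name, n in counts.items() if name not in exclude)
    if total == 0:
        return 0.0  # avoid dividing by zero when nothing is left to count
    return counts.get(category, 0) / total
```

Calling `share(counts, "spam", exclude=("work",))` gives the spam fraction among non-work mail, while `share(counts, "work")` gives the work fraction of everything.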
The big dip this week is caused by my experimentation with new anti-spam techniques. I tried out OpenBSD’s spamd. It is amazing. It reduces spam quite a bit, as you can see here. It would show even better results, but I only used it on one of several load-balanced incoming mail servers. It is a great implementation of greylisting. However, this technique causes some legitimate mail to be delayed by anywhere from 5 minutes to a few hours. We had a few complaints from customers about delayed mail, so I had to turn it off. I highly recommend this technique for anyone who is battling spam, doesn’t have extremely picky users, and doesn’t mind slightly delayed mail from time to time.
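If you have not run into greylisting before, the idea is simple: the first delivery attempt from an unknown sender gets a temporary failure, and only a retry after some minimum wait is accepted. Real mail servers retry; most spam software does not bother. The sketch below shows the general technique only. It is not spamd’s actual implementation, and the class name, the timing values, and the in-memory storage are all assumptions made for illustration.

```python
import time


class Greylist:
    """Toy greylist keyed on the (client IP, sender, recipient) triplet."""

    def __init__(self, min_delay=300, grey_expiry=4 * 3600, clock=time.time):
        self.min_delay = min_delay      # seconds a sender must wait before a retry counts (assumed value)
        self.grey_expiry = grey_expiry  # seconds after which an unretried entry is forgotten (assumed value)
        self.clock = clock              # injectable clock, handy for testing
        self.first_seen = {}            # triplet -> time of first attempt
        self.whitelist = set()          # client IPs that have passed greylisting

    def check(self, ip, sender, recipient):
        """Return 'accept' or 'tempfail' for one delivery attempt."""
        if ip in self.whitelist:
            return "accept"
        key = (ip, sender, recipient)
        now = self.clock()
        first = self.first_seen.get(key)
        if first is None or now - first > self.grey_expiry:
            # New (or expired) triplet: remember it and ask the sender to retry.
            self.first_seen[key] = now
            return "tempfail"
        if now - first < self.min_delay:
            # Retried too quickly; keep deferring.
            return "tempfail"
        # Retried after the minimum delay: trust this client from now on.
        del self.first_seen[key]
        self.whitelist.add(ip)
        return "accept"
```

The delay customers complained about comes from that first `tempfail`: the message only arrives once the sending server decides to retry, and how long that takes is entirely up to the sender’s retry schedule.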
Posted: December 22nd, 2006.
Tags: Linux/BSD, Work
Comments
Comment from fungus
Time May 21, 2007 at 8:51 am
Hear, hear! I agree. Too bad many environments have lots of users who don’t get this.
Comment from Natalia Gorbski
Time May 21, 2007 at 12:49 am
That’s stupid - e-mail isn’t about being instant communication - if people wanted quick discussion, they’d use IRC. E-mail is about getting there within 7 days, not within minutes. Anyone that bitches about speed of e-mail should be reminded that the more than 25 year old standard is not made for speed, then slapped in the face for being stupid enough to complain.