A day in my spam life

November 28, 2009 – 10:23 pm

Just for laughs, I looked at statistics for my spam yesterday. Here are the results:

1) Spams caught by server-side SpamAssassin: 109

2) Spams caught by local bayes filter after passing SpamAssassin: 49

3) Spams that got through both filters and that I marked by hand: 2 (junkpercent scores were 63 and 66)

Total Spam: 160
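
To put those counts in perspective, here is a quick sketch of the arithmetic in Python. The variable names are my own and only the three counts come from the list above. It works out to roughly 68% caught by the server, about 96% of the remainder caught locally, and just under 99% caught overall.

```python
# Counts copied from the list above; everything else is derived.
caught_by_spamassassin = 109
caught_by_local_bayes = 49
missed_by_both = 2

total = caught_by_spamassassin + caught_by_local_bayes + missed_by_both


def percent(count, whole):
    """Return count as a percentage of whole."""
    return 100.0 * count / whole


# Share of all spam stopped at the server.
server_rate = percent(caught_by_spamassassin, total)

# Share of the spam that passed the server and was then stopped locally.
passed_server = total - caught_by_spamassassin
local_rate = percent(caught_by_local_bayes, passed_server)

# Share of all spam stopped by the two filters together.
combined_rate = percent(total - missed_by_both, total)

print(f"total spam: {total}")
print(f"server-side catch rate: {server_rate:.1f}%")
print(f"local catch rate on what got past the server: {local_rate:.1f}%")
print(f"combined catch rate: {combined_rate:.1f}%")
```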

For the server-side SpamAssassin filter, my spam detection limit is 5.0. This is the stock SpamAssassin filter supplied by default to all accounts by a large, inexpensive web hosting provider (hostmonster).

For my local bayes filter, my spam detection limit is set at 75.
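
The two limits work as a simple cascade, which can be sketched like this. This is only an illustration, not how either filter is actually implemented: the function name is made up, the SpamAssassin score of 2.1 in the usage lines is a hypothetical value, and treating a score exactly at the limit as spam is an assumption.

```python
SPAMASSASSIN_LIMIT = 5.0   # server-side limit from the post
LOCAL_BAYES_LIMIT = 75     # local junkpercent limit from the post


def classify(sa_score, junk_percent):
    """Illustrative two-stage decision (hypothetical helper).

    Assumption: a score equal to the limit counts as spam at both stages.
    """
    if sa_score >= SPAMASSASSIN_LIMIT:
        return "spam (server)"
    if junk_percent >= LOCAL_BAYES_LIMIT:
        return "spam (local)"
    return "inbox"


# The two misses had junkpercent scores of 63 and 66, both under 75.
# The SpamAssassin score of 2.1 is a made-up value below the 5.0 limit.
print(classify(2.1, 63))
print(classify(2.1, 66))
```

Under these assumptions both missed messages land in the inbox, since neither score reaches its limit.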

I *never* have emails falsely marked as spam. I train spam reliably using the Uncertain folders in my JunQuilla addon. I have a limit of 300,000 tokens (with a current count of 118,690), with 2097 good messages and 3748 junk messages trained.

Oh, and I use two customizations to tokenization, using new hidden preferences available in TB3. First, I tokenize the Received header into words (it is disabled by default). Second, I tokenize SpamAssassin’s X-SPAM-STATUS header into words (by default it is accepted as a single token, not broken into individual words). I don’t believe these are very important, but I do think that they help a little.
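
To show the difference between the two header modes, here is a rough Python sketch. It is not TB3's tokenizer: the function, the `name:word` token format, the splitting rule (whitespace and commas), and the sample header value are all assumptions made up for illustration.

```python
import re


def tokenize_header(name, value, split_into_words):
    """Hypothetical header tokenizer, for illustration only.

    With split_into_words=False the whole value becomes one token;
    with True, the value is split on whitespace and commas.
    """
    name = name.lower()
    value = value.strip().lower()
    if not split_into_words:
        return [f"{name}:{value}"]
    words = re.split(r"[\s,]+", value)
    return [f"{name}:{word}" for word in words if word]


# Made-up header value in the general style of a SpamAssassin status line.
sample = "No, score=1.3 required=5.0"

single = tokenize_header("X-Spam-Status", sample, split_into_words=False)
words = tokenize_header("X-Spam-Status", sample, split_into_words=True)

print(single)
print(words)
```

The point is that a single-token header only matches when the entire value repeats exactly, while word tokens let the filter learn from the individual pieces.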

Note to self: blog about customized tokenization settings in TB3, and try to do some analysis.
