Just what the doctor ordered.
May 14, 2009
3:24 pm
USA
Member
Members
Forum Posts: 6
Member Since:
May 14, 2009

This extension addresses most of the feature set that has kept me using K9 as a Windows Bayesian anti-spam POP3 proxy since 2004.  The ability to filter on a classification "probability number" inserted in the incoming mail headers lets me automatically delete email that scores above a threshold I feel comfortable with.  JunQuilla is obviously aimed at providing the same sort of capability, but within TBird.  Great!

The only other capability I rely heavily upon in K9 is the "Blacklist", with wildcards and/or regular expressions, which short-circuits the Bayesian logic.  That is, blacklisted emails are deleted before they are evaluated by the Bayesian filter.  This is important because there are instances where I want to block certain senders based on email address, subject, or something specific in the body of the message, but still retain the option of changing my mind at a later date.  In that instance I simply remove them from the blacklist.  Without a blacklist one has to worry about how to change all the tokens in the Bayes corpus that previously identified those emails as junk so that they now reflect a "good" classification.  In other words, flexible and targeted retraining.  This is not something I can see easily accomplished in TBird's junk mail system, so a blacklist is still necessary in order to accommodate my fickle notion of what is spam and what is good.  I've seen suggestions about how to configure a pseudo-blacklist using the TBird address book, but it's hardly flexible enough and terribly cumbersome.
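The flow described above (blacklist first, Bayesian score second, delete above a threshold) can be sketched roughly as follows. This is only an illustration under stated assumptions: the pattern lists, the `DELETE_THRESHOLD` value, and the function names are hypothetical, and nothing here reflects how K9 or Thunderbird is actually implemented.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical blacklist entries (assumptions for illustration only):
# wildcard patterns for the sender, regular expressions for the subject.
SENDER_WILDCARDS = ["*@*.example-ads.com", "promo-*@example.net"]
SUBJECT_REGEXES = [re.compile(r"\bcheap\s+meds\b", re.IGNORECASE)]

# Assumed cut-off for automatic deletion; the right value is a personal choice.
DELETE_THRESHOLD = 0.95


def is_blacklisted(sender: str, subject: str) -> bool:
    """Return True if the sender or subject matches any blacklist pattern."""
    sender = sender.lower()
    if any(fnmatchcase(sender, pattern) for pattern in SENDER_WILDCARDS):
        return True
    return any(rx.search(subject) for rx in SUBJECT_REGEXES)


def decide(sender: str, subject: str, spam_probability) -> str:
    """Decide what to do with a message.

    spam_probability is a zero-argument callable, so the Bayesian step is
    never run (and never trained on) for blacklisted mail.
    """
    if is_blacklisted(sender, subject):
        return "delete"  # short-circuit: the Bayesian filter is skipped
    if spam_probability() >= DELETE_THRESHOLD:
        return "delete"
    return "keep"
```

Because the blacklist is consulted before the classifier, removing an entry later changes the outcome for that sender without any retraining of the Bayes corpus.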

I've not taken the plunge and loaded the 3.0 Beta 2 of TBird because I've not found enough reviews indicating it's truly stable enough for my non-commercial but heavy-usage "home" environment.  This JunQuilla extension is causing me to think seriously about taking a chance, though.

Any thoughts or feedback?

Green just makes sense.

May 14, 2009
4:09 pm
Admin
Moderators
Forum Posts: 423
Member Since:
July 12, 2008

Blacklist can already be done using message filters. That is, you can create an addressbook called (for example) "Blacklist", and add addresses to it that you want to have marked junk. Then you need to add a message filter "if From is in addressbook Blacklist then mark as junk". That should do what you want for addresses. For other items such as Subject, message filters can also do this.

That may not be the user interface that you want, though. If you could define an easier-to-use UI for blacklisting, I might consider adding it to a future version of JunQuilla, since the core capability exists to support it.

I might also add that I have a patch to submit to the TB core soon that will allow filtering based on the bayes filter results. Currently, that cannot be done for incoming mail, as the bayes results are only calculated after the incoming filtering is done. That would allow you to use the "junk percent" field in the way that you describe. That feature will be a core feature, not part of the JunQuilla extension.
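The ordering issue described here can be illustrated with a toy pipeline. This is a sketch only: the `Message` class, `process`, and `delete_if_junk_at_least` are hypothetical names invented for the example and are not Thunderbird's actual code or API.

```python
from typing import Callable, List, Optional


class Message:
    """Toy message; junk_percent stays None until the classifier has run."""

    def __init__(self, sender: str, subject: str) -> None:
        self.sender = sender
        self.subject = subject
        self.junk_percent: Optional[int] = None
        self.actions: List[str] = []


Filter = Callable[[Message], None]


def delete_if_junk_at_least(threshold: int) -> Filter:
    """Build a filter that deletes mail scoring at or above the threshold."""
    def apply(msg: Message) -> None:
        # A filter can only act on a score that already exists.
        if msg.junk_percent is not None and msg.junk_percent >= threshold:
            msg.actions.append("delete")
    return apply


def process(msg: Message,
            classify: Callable[[Message], int],
            before: List[Filter],
            after: List[Filter]) -> Message:
    """Run filters, then classification, then after-classification filters."""
    for f in before:
        f(msg)                      # junk_percent is still None here
    msg.junk_percent = classify(msg)
    for f in after:
        f(msg)                      # junk_percent is now available
    return msg
```

Placing the same rule in `before` has no effect because the score does not exist yet, while placing it in `after` lets the rule see the score, which is the behaviour the planned core change is meant to enable.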

As to stability, I would trust TB3 beta 2 over any TB2 version myself, but then again I am more familiar with it. There are virtually no bug fixes going into TB2 anymore.

May 15, 2009
6:23 am
USA
Member
Members
Forum Posts: 6
Member Since:
May 14, 2009

My responses are embedded below...

rkent said:

Blacklist can already be done using message filters. That is, you can create an addressbook called (for example) “Blacklist”, and add addresses to it that you want to have marked junk. Then you need to add a message filter “if From is in addressbook Blacklist then mark as junk”. That should do what you want for addresses. For other items such as Subject, message filters can also do this.

This was the concept I referred to in my initial post.  I understand that a simplistic blacklisting capability can be obtained this way, but since we are using the address book I believe one must have a legitimate, single email address for each entry.  That is, you can't use wildcards or, better yet, regex syntax as the address matching criteria.  This is a necessary capability when a sender uses a myriad of variations in their sending email addresses (e.g. subdomains or usernames).  Similarly, using a filter with "contains" for subject and/or body restricts one to plain strings with no wildcard or regex matching.  Without flexible wildcard matches, the filter list requires too much, maybe even unmanageable, maintenance, because new variations of straight string matching criteria constantly have to be added.
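The maintenance difference between exact address entries, wildcards, and regular expressions can be seen in a small comparison. The addresses and patterns below are made up for illustration, and the matching uses Python's standard `fnmatch` and `re` modules rather than anything from K9 or Thunderbird.

```python
import re
from fnmatch import fnmatchcase

# Made-up sender variations of the kind described above.
SENDERS = [
    "deals@mail1.shop.example",
    "deals@mail2.shop.example",
    "offers-17@shop.example",
]

# Address-book style: one exact entry per address.
EXACT_BLACKLIST = {"deals@mail1.shop.example"}

# One wildcard pattern covering any user at any host ending in shop.example.
WILDCARD = "*@*shop.example"

# One regex that also pins down the user names and the domain boundary.
REGEX = re.compile(r"^(deals|offers-\d+)@([a-z0-9-]+\.)*shop\.example$")


def matches_exact(address: str) -> bool:
    return address in EXACT_BLACKLIST


def matches_wildcard(address: str) -> bool:
    return fnmatchcase(address.lower(), WILDCARD)


def matches_regex(address: str) -> bool:
    return REGEX.match(address.lower()) is not None


def count_blocked(matcher) -> int:
    """Count how many of the sample senders a given matcher blocks."""
    return sum(1 for address in SENDERS if matcher(address))
```

With these samples the exact list catches only the one address that was entered by hand, while either pattern catches all three. The wildcard is looser than it looks (it would also match `deals@notshop.example`), which is one reason regex support is the more precise option.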

rkent said:

That may not be the user interface that you want, though. If you could define an easier-to-use UI for blacklisting, I might consider adding it to a future version of JunQuilla, since the core capability exists to support it.

I'd be pleased and excited to provide input on such a capability.  It is something that a horde of former Eudora users are clamoring for also.  They had regex matching in the Eudora filtering and have lost it now.  It would actually be no more than the straight text file approach so nicely refined and reflected in the current implementation in K9.  I could forward my current blacklist file to you as an email attachment if you'd like to see it and it's not too much of an imposition.

rkent said:

I might also add that I have a patch to submit to the TB core soon that will allow filtering based on the bayes filter results. Currently, that cannot be done for incoming mail, as the bayes results are only calculated after the incoming filtering is done. That would allow you to use the “junk percent” field in the way that you describe. That feature will be a core feature, not part of the JunQuilla extension.

This is a really important point.  I had not realized that the Bayes calculations were done after the filtering.  In my original post I had been assuming that the numbers would be there to be used as a filtering criterion.  I had made the leap that it was already possible to filter based on the number JunQuilla was displaying because of the following sentence in the MOTIVATION paragraph on your JunQuilla page...

JunQuilla addresses this by presenting the classification percentage to the user, in a column that is available in normal message views, as well as used as a filter parameter in a search folder that shows only the uncertain messages.

Obviously I was making an unwarranted assumption that having the numbers available as a search filter equated to having the numbers available for incoming message filtering.

rkent said:

As to stability, I would trust TB3 beta 2 over any TB2 version myself, but then again I am more familiar with it. There are virtually no bug fixes going into TB2 anymore.

I think I'll give it a try based on this.  Maybe I'll hang on for Beta 3 though.  ;)



September 7, 2009
8:07 am
USA
Member
Members
Forum Posts: 6
Member Since:
May 14, 2009

Hi rkent,

This is a follow-up post to see if you did indeed do the submission to the TBird core of the code that would perform junk calculations before filtering as per your May response to this post?  Is it in the current Beta 3?

Max


September 7, 2009
8:58 am
Admin
Moderators
Forum Posts: 423
Member Since:
July 12, 2008

onlyme said:

Hi rkent,

This is a follow-up post to see if you did indeed do the submission to the TBird core of the code that would perform junk calculations before filtering as per your May response to this post?  Is it in the current Beta 3?

Max


The code to allow junk calculations to be used in filters will be in the upcoming beta 4, but is not in beta 3. I did an extensive blog posting about it as well here: http://mesquilla.com/2009/08/28/managing-spam-with-after-classification-filters/

Forum Timezone: UTC -8
