Considering MaxEnt
October 1, 2009
11:20 am
Jakub Kaplan
Guest

I wondered if you considered using MaxEnt instead of what I presume is Naive Bayes?

Although this would probably make training much more time-intensive, it could run in the background (most users' resource usage is minimal) and in batches, once a sufficient number of new e-mails have been classified.

Then features like the number of links in an e-mail, the average sentence length, etc. could be incorporated, which would probably improve the results.
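To make the suggestion concrete, here is a minimal sketch of the idea. It is not TaQuilla or Thunderbird code. For two classes, a MaxEnt model is equivalent to logistic regression, so the sketch trains one with plain gradient steps. The names `extract_features`, `train`, `predict`, and `BatchTrainer`, the feature scaling, and the batch size are all assumptions made for illustration.

```python
import math
import re


def extract_features(text):
    """Map an e-mail body to a small numeric feature vector (illustrative only)."""
    links = len(re.findall(r"https?://", text))
    # Crude sentence split; dots inside URLs will inflate the sentence count.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = text.split()
    avg_sentence_len = len(words) / len(sentences) if sentences else 0.0
    # Bias term, link count, and average sentence length (scaled down by 10).
    return [1.0, float(links), avg_sentence_len / 10.0]


def predict(weights, features):
    """Return the modelled probability that the message belongs to class 1."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=500, lr=0.1):
    """Fit binary MaxEnt (logistic regression) weights by stochastic gradient ascent."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, features)
            weights = [w + lr * error * x for w, x in zip(weights, features)]
    return weights


class BatchTrainer:
    """Queue newly classified e-mails and retrain only when a batch is full."""

    def __init__(self, batch_size=4):
        self.batch_size = batch_size
        self.pending = []   # classified since the last retrain
        self.seen = []      # everything used for training so far
        self.weights = None

    def add(self, text, label):
        """Record one classified e-mail; return True if this triggered a retrain."""
        self.pending.append((extract_features(text), label))
        if len(self.pending) < self.batch_size:
            return False
        self.seen.extend(self.pending)
        self.pending = []
        # In a real client this call would be handed to a background task.
        self.weights = train(self.seen)
        return True
```

Each call to `add` is cheap until the batch fills, at which point the model is refit on everything seen so far. Retraining from scratch keeps the sketch short; an incremental update would be the obvious refinement.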

Otherwise, thumbs up for this. I almost never use betas, but this pretty much compels me to go for it.

October 1, 2009
12:36 pm
Admin
Forum Posts: 323
Member Since: July 12, 2008

"I wondered if you considered using MaxEnt instead of what I presume is Naive Bayes?"

TaQuilla uses the same code as the internal Thunderbird (TB) junk processor currently uses, which is Naive Bayes. At the moment I do not have any plans to change that or to allow other options.

"I almost never use betas, but this pretty much compels me to go for it."

I have not updated TaQuilla for a couple of betas, and at the moment it is a little out of date. If you are at all risk-averse, I do not recommend using it until I have done some updates on it.

Forum Timezone: UTC -8
Moderators: rkent (323)

Administrators: rkent (323)