Junk management for newsgroups in Thunderbird 3

July 21, 2010 – 8:14 am

Thunderbird since version 3 has had experimental support for junk filtering in newsgroups. The feature basically works fine, but the user interface mostly fights against your attempts to use it. I’d like to give brief instructions here for anyone who wants to try it.

You’ll need to install my addon JunQuilla to enable one critical piece of user interface. JunQuilla supports a folder property that lets you selectively enable or disable junk processing for a tree of folders. So after you’ve installed JunQuilla, enable processing of junk for a newsgroup:

This will run future posts sent to the newsgroup m.d.a.thunderbird through the bayes junk filter in Thunderbird. After this is enabled, some of the junk management controls on the folder should be enabled. Try “Run Junk Controls on Folder” to process existing posts for junk.

But nothing will appear to happen when you do, because there is no functionality to delete or hide the junk messages for newsgroups built in. Still, you can see the results using JunQuilla’s “junk percent” and “junk status +” columns:

The next step is to remove those junk messages from your view. The easiest thing to do is to create a virtual folder that shows the messages in the newsgroup without the junk messages. This is where the user interface fights against you. But you can trick it in the following manner.

Create a virtual folder on a message folder with search criteria “Junk Percent < 80”. Save the folder as a subfolder of a local mail folder. Now open the virtual folder and change the folders that it scans, removing the original folder and replacing it with the newsgroup folder:

Now the spam posts are hidden from this virtual folder:

You may need to specifically train the junk filter using a few junk and good messages in news. But the good news is that newsgroup spammers are not really optimizing against bayes filters, so it seems to be a lot easier to detect newsgroup spam than mail spam.

Thunderbird’s Strategic Dilemma

July 15, 2010 – 4:58 pm

In a recent tb-planning post, neandr wrote:

With all respect for the people working at Mozilla/Thunderbird and fully understand the limitation they are faced with, I would like to see a more detailed mission statement for the products (TB/LG) and the future of it. Only expressing TB is for individual users, SOHO and not for the Enterprise is a very vague  statement

I was going to respond to that in the thread, but I got wordy so I posted this blog entry instead.

At the recently completed Mozilla Summit, variations of this request were made by many people that are close to the Thunderbird project (including myself). But after listening to several days of Firefox people extolling the virtues of moving everything to the browser, and being “more like the web”, I have a new appreciation of how difficult the development of a vision statement is for the Thunderbird team.

The standard game plan that Mozilla projects are expected to follow is to develop an application with a significantly high market share so that they can use their market influence to fight for the rights of individual users. Mozilla is a fascinating organization as a hybrid commercial/public interest organization, and they take their values quite seriously.

Unfortunately Thunderbird, which is the only real product that MoMo currently has, hovers around 6 million users, which is much lower than the number they believe is necessary to have the influence they would like (I have heard 100 million users as a goal). Nobody currently has a concrete plan to develop a product with 100 million users. So the current strategy (as I see it) is to run a series of experiments in the hope of developing some concepts that might be used to specify the 100 million user product. I can’t resist naming things (my wife calls me “an Adam”), so let’s call this future product by the code name “Gigabird”.

One such experiment is Raindrop. Other experiments are going on in extensions to Thunderbird, which seem to be focusing on changes to the user interface. Right now, that is where the vast majority of the developer resources are focused at MoMo.

So if your real strategic mission is to develop Gigabird, what do you do with your legacy product Thunderbird? The big problem is that the “ordinary users” that are the primary focus of Firefox (and by implication also MoMo) are migrating away from email for many forms of messaging to other media – Twitter, Facebook, text messages, web forums, blog comments, etc. The hardcore users of email (who are likely to continue to use a desktop client) are sitting in office cubicles, yet going after these “enterprise” users is counter-cultural for Mozilla.

So what are the strategic choices available to MoMo?

the HailMary

The goal here is to try to come up with one or more really clever innovations that will form the basis of Gigabird. This is, after all, the way that some of the new messaging formats have occurred, with Twitter as the poster child. Using these innovations as a base, the basic plan for Gigabird will be formulated at some point in the future. This is the current MoMo strategy, at least as I see it. Given the existing Mozilla culture, I would probably do this as well. (Warning: should this strategy ever become publicly revealed, the director will disavow any knowledge of these actions.)

the AboutFace

Here you notice that the values-driven direction that you felt so passionate about is actually not going to get you anywhere, so you make a major readjustment in values to allow you to pragmatically accept a new direction. Such moves have been done by Mozilla in the past, and are part of the standard corporate Myth propagated by Mitchell Baker (the story about how in the early days they were adamant about never shipping a binary). The application to MoMo could be to accept that what they have is an email product, and their future users are going to be sitting in cubicles. Users in cubicles should have rights too, so there could be a valid Mozilla Foundation purpose in fighting for the rights of these “enterprise users” and let Thunderbird develop into an enterprise product.

the SlowPlod

This is the direction that existing Thunderbird users are hoping for. The ultimate goal is to slowly improve Thunderbird until it is undeniably the best email client around. You fix any important bugs. You support all of the hot new messaging concepts. You spiff up the user interface, incrementally adding new features that provide small improvements to usability. You keep your power users happy with lots of extensions for specialized purposes. It’s pretty clear that dmose does not believe that he has sufficient resources to pursue this strategy, nor is he likely to have them in the foreseeable future. The Thunderbird code base is also really hard to adapt to these new media (witness the struggles that I have had or jcranmer’s blog). I think that the MoMo team wishes us well, but believes that the future lies elsewhere.

the VacuumTube

Just because you can’t change the direction of humanity does not mean that you have nothing. Vacuum tubes are long gone – yet the guitar player at my church proudly uses his fancy amp with glowing tubes showing through plexiglass. The company that bought one of my previous businesses had also previously purchased a manufacturer of vacuum tubes, which had morphed into specialized purposes like lamps for spectroscopy, and nuclear-warfare-resistant cathode ray tubes. Email clients will be with us forever, and in the hands of people who love them could have a useful future in various niches.

My Prediction

MoMo will pursue the HailMary until they have enough ideas to formulate a real plan. At that point, they will want to devote all of their resources to Gigabird, and be looking for an honorable way to retreat from Thunderbird – which will be a variation of the VacuumTube. The likely retreat will probably be some sort of future custodianship by a conglomeration of companies that provide a freemium strategy. So if there was a basic, free Thunderbird product that could be enhanced with addons with commercial value (like my Exchange Web Services product, or Postbox as a Thunderbird addon), then MoMo could pursue their vision without abandoning their Thunderbird users, and let companies like MesQuilla and Postbox support Thunderbird.

Sending a Message (Mailnews Exchange Support)

July 15, 2010 – 1:33 pm

I can now send a message through Exchange server from my Thunderbird installation.

Perhaps it would be interesting to show how I hooked into the sending function in the user interface. I asked the usual suspects, and it was not clear to anyone that it could be done without adding backend hooks – which I would like to avoid as much as possible to increase my chances of getting some initial alphas of this released to work with existing TB 3.1 users.

It turned out to be fairly straightforward. In MsgComposeCommands.js there is an observer notification that occurs, called “mail:composeOnSend”. This occurs right before the UI is ready to call SendMsg on the nsIMsgCompose object gMsgCompose. So what I needed to do was to intercept that call, and implement my own version of SendMsg rather than the SMTP/Rfc822-focused standard C++ code. To do that, I create a new object when I receive the notification with the old object as its prototype, then include a custom SendMsg that only applies if the sending account is an Exchange server. The overlay code ends up looking like this (greatly simplified):

function observe(subject, topic, data) {
 // wrap the gMsgCompose object so that we can detect attempts to
 //  send using ews.
 let newCompose = new ewsCompose(gMsgCompose);
 gMsgCompose = newCompose;
}

// ewsCompose provides a wrapper around the compose object, so that we
//  can override functions.
function ewsCompose(oldCompose) {
 this.oldCompose = oldCompose;
 this.SendMsg = function ewsSendMsg
     (msgType, identity, currentAccountKey, msgWindow, progress) {
   if (incomingServer instanceof Ci.msqIEwsIncomingServer) {
     // sending using Exchange Web Services (details not shown)
     // ...
     ewsCompose.sendMsg();
     return;
   }
   else
     return this.oldCompose.SendMsg
       (msgType, identity, currentAccountKey, msgWindow, progress);
 }
 this.__proto__ = oldCompose;
}

This seems to work just fine. It was a little tricky getting the compose code to believe that I had actually sent the message, so it could quit complaining of an unsent message. What I ended up doing in my “sending succeeded” callback is to add a few shutdown calls:

let ewsEventListener = {
  // msqIEwsEventListener implementation
  onEvent: function onEvent(aItem, aEvent, aData, aResult) {
    if (aEvent == 'StopRequest') {
      // ewsProgress was saved from the "progress" variable
      //   in MsgComposeCommands.js
      if (ewsProgress) {
        // the argument is true when the send failed
        ewsProgress.closeProgressDialog(aResult != Cr.NS_OK);
      }
      // tell the compose window that sending has finished, then close it
      stateListener.ComposeProcessDone(aResult);
      MsgComposeCloseWindow(true);
    }
  }
};
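
The wrapping idea from the first listing (a new object with the old one as its prototype, overriding only SendMsg) can be tried outside Thunderbird with plain objects. What follows is only a minimal sketch: baseCompose is a hypothetical stand-in for gMsgCompose, useEws is a hypothetical predicate standing in for the account-type check, and it uses Object.create rather than assigning __proto__ as the overlay does.

```javascript
// Hypothetical stand-in for gMsgCompose; the real object is an XPCOM component.
const baseCompose = {
  subject: "Hello",
  SendMsg(msgType) {
    return "smtp:" + msgType;
  },
  describe() {
    return "compose(" + this.subject + ")";
  },
};

// Return a wrapper whose prototype is oldCompose and which overrides only
// SendMsg. useEws is a hypothetical predicate replacing the real
// "is this an Exchange account?" check.
function wrapCompose(oldCompose, useEws) {
  const wrapper = Object.create(oldCompose);
  wrapper.SendMsg = function (msgType) {
    if (useEws()) {
      // the custom send path would go here
      return "ews:" + msgType;
    }
    // everything else falls through to the original implementation
    return oldCompose.SendMsg(msgType);
  };
  return wrapper;
}

const ewsWrapped = wrapCompose(baseCompose, () => true);
const smtpWrapped = wrapCompose(baseCompose, () => false);

console.log(ewsWrapped.SendMsg("now"));  // ews:now
console.log(smtpWrapped.SendMsg("now")); // smtp:now
console.log(ewsWrapped.describe());      // compose(Hello), inherited unchanged
```

Because the wrapper inherits from the original, every method and property that is not overridden keeps working, which is why the rest of the compose window does not need to know about the swap.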

With the ability now to send and receive messages, I’ve completed a single vertical pass through all of the key functionality. There are many, many details that I have passed over in the process. But it’s time to start thinking about what I would need to make work to get a usable alpha release of this, perhaps in about six weeks.

Data Persistence (Mailnews Exchange Support)

June 25, 2010 – 2:55 pm

My project to provide Exchange Web Services (EWS) support to applications based on the Mozilla mailnews codebase entered a new phase this week, where I am starting to consider the issue of local persistence of data downloaded from the server. (In the previous week, I got two other things working: display of HTML emails, and updating of UNREAD status from the local app to the server).

EWS messages do not come from the server in RFC-822 format, so it seems like a pity to store them that way, though that is the common method used in the rest of the mailnews codebase. Instead, I decided to implement a local storage scheme based on SQLite and Mozilla’s Storage interface. Andrew Sutherland has done a lot of great work setting up an environment similar to this for the gloda database, so there are lots of good examples to pull from. Also, because the datamodel for EWS includes not only messages but also Calendar and Contact items, I can have a common database infrastructure that I can leverage over those other pieces once I get it working for the messaging part.

I’ve now replaced my previous in-memory datastore for message metadata with an SQLite version. This is equivalent to the datastore module in gloda, and the data it is storing is like the RFC-822 headers. I still have to do the storage for the body, and also hook this up with folder change state so that the code knows that it has data it can trust.

As I have done this, I’ve had a new set of insights into the relationship of the various objects in the Mozilla mailnews world (which I sometimes call Skink). Previously, I had sort of expected that the natural progression of gloda would be to slowly displace the role of the message summary database, nsIMsgDBHdr. But now I see that a more natural progression would be for SQLite to be used as a replacement for the local mailstore (currently mbox, with maildir support moving forward as well). Really the main issue is the async nature of the SQLite calls, which sort of precludes its easy use as a replacement for nsIMsgDBHdr. But the datastores are typically accessed async anyway. If the message metadata in the message stores was stored primarily in SQLite format, as I will be doing, then it would be much easier to hook up an SQLite-based global search facility to all of these databases. Yes, that is what gloda does now, but it has to go through all of the work to maintain a separate version of everything. Why have three copies of everything (Mork, MBox, and Gloda) when you could have only two (Mork and Gloda)?

As another insight, while looking through the gloda code I noticed that a JSON object was being saved to store some of the items. I thought that was a good idea at first – but then I tried to write a simple serializer to convert from my internal native format to JSON objects, and saw that it was not going to be an easy project. But then I remembered that SOAP is really just a mechanism to serialize typed objects, and I already have a SOAP encoder and decoder! So instead of using JSON, I use objects serialized with my SOAP XML encoder to store unindexed items in my SQLite store. So a message (sans body) ends up looking like this as a TEXT item in SQLite:

<Message xmlns="http://schemas.microsoft.com/exchange/services/2006/types">
 <Subject>Postini First Junk Email Safely Quarantined</Subject>
 <DateTimeReceived>2010-06-04T22:19:30Z</DateTimeReceived>
 <Size>2612</Size>
 <Importance>Normal</Importance>
 <DisplayTo>Kent James</DisplayTo>
 <Culture>en-US</Culture>
 <Sender>
  <Mailbox>
   <Name>Postini Support</Name>
   <EmailAddress>noreply@hostedmsexchange.com</EmailAddress>
   <RoutingType>SMTP</RoutingType>
  </Mailbox>
 </Sender>
 <ToRecipients>
  <Mailbox>
   <Name>Kent James</Name>
   <EmailAddress>rkentjames@caspia.org</EmailAddress>
   <RoutingType>SMTP</RoutingType>
  </Mailbox>
 </ToRecipients>
 <From>
  <Mailbox>
   <Name>Postini Support</Name>
   <EmailAddress>noreply@hostedmsexchange.com</EmailAddress>
   <RoutingType>SMTP</RoutingType>
  </Mailbox>
 </From>
 <InternetMessageId>&lt;0c34b5a4-5f3c-4654-bf9d-99c9a8cb439b@HUB02.4emm.local&gt;
 </InternetMessageId>
 <IsRead>1</IsRead>
</Message>

At first it bothered me to save what is essentially a duplicate of what is coming over the wire, but why not? It’s not conceptually any different than RFC-822, or JSON, in function.
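
To give a feel for what serializing a nested item into an XML string for a TEXT column involves, here is a rough sketch. It is not my SOAP encoder: toXml and escapeXml are hypothetical helpers, the input is an untyped plain object, and namespaces, attributes and arrays are all ignored.

```javascript
// Escape the characters that would otherwise break element content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Recursively turn a plain object into nested elements. Each property name
// becomes an element name; non-object values become escaped text.
function toXml(name, value) {
  if (value !== null && typeof value === "object") {
    const children = Object.keys(value)
      .map((key) => toXml(key, value[key]))
      .join("");
    return "<" + name + ">" + children + "</" + name + ">";
  }
  return "<" + name + ">" + escapeXml(value) + "</" + name + ">";
}

// Illustrative data only, loosely shaped like the message above.
const message = {
  Subject: "Test",
  IsRead: 1,
  Sender: { Mailbox: { Name: "A <B>", RoutingType: "SMTP" } },
};

const xml = toXml("Message", message);
console.log(xml);
```

The resulting string is what would be bound to the TEXT column; reading it back needs a matching decoder, which is the part a real SOAP library already provides.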

Javascript filter action in Thunderbird with FiltaQuilla

June 16, 2010 – 3:54 pm

I received an email today asking that I add a feature to FiltaQuilla. Slightly edited, the author said:

Something I’ve found myself doing at work is creating a new filter for every folder I create. I work on technical cases and for each new case number I create a new folder and have all emails with that case number go into that folder.  The crappy part about it is that I literally have hundreds of cases I deal with, and hence hundreds of filters.  You’ve already got the regex match criteria in filtaqulla, I’d love to be able to take that match criteria and use whatever string it matches as the destination folder action criteria. Sample subject line contains the following: 2010-0609-518

One of the features added to my extension FiltaQuilla recently is the ability to add custom javascript actions. I thought that I would give a shot at doing this request as a custom javascript action. The author of the email was expecting this to get linked to the regex search term, but that is not the easiest way to do it. It is easier to just let the filter action also do the regex search.

After installing FiltaQuilla, you need to enable the custom javascript action, which is done on the Addon options page. After that, you can add a javascript custom action in the filter editor.

Then you need to write the javascript code for the action. I’ve done similar work in the past, so that took me about an hour to get correct. Then insert the code into the action field for the javascript action, and it’s ready to go! You have to have some sort of search in the filter as well. You could just search for everything if you wanted, or you could add a regex search for the precise term if you want. It doesn’t matter much, as all of the work is really being done in the action. You just need to make sure that the action gets called for each message that might need to be moved. There’s more documentation of this feature available at the FiltaQuilla page on this site.

The code I came up with is:

let digitsRegex = /20\d\d\-[0-1]\d[0-3]\d\-\d\d\d/;
let acctmgr = Cc["@mozilla.org/messenger/account-manager;1"]
                .getService(Ci.nsIMsgAccountManager);
let copyService = Cc["@mozilla.org/messenger/messagecopyservice;1"]
                    .getService(Ci.nsIMsgCopyService);
let folders = acctmgr.allFolders;
for (let index = 0; index < msgHdrs.length; index++)
{
  let hdr = msgHdrs.queryElementAt(index, Ci.nsIMsgDBHdr);
  // exec() returns null when the subject contains no case number
  let match = digitsRegex.exec(hdr.subject);
  if (!match)
    continue;
  let caseNumber = match[0];
  for (let i = 0; i < folders.length; i++) {
    let folder = folders.queryElementAt(i, Ci.nsIMsgFolder);
    if (folder.name == caseNumber) {
      let messages = Cc["@mozilla.org/array;1"]
                       .createInstance(Ci.nsIMutableArray);
      messages.appendElement(hdr, false);
      // fourth argument true = move rather than copy
      copyService.CopyMessages(hdr.folder, messages, folder, true,
                               null, msgWindow, false);
      // stop after the first matching folder so the message is moved only once
      break;
    }
  }
}

With this filter, if you create a folder with a name like “2010-0616-001”, and your message has that in the subject, the message will get moved to that folder.
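
If you want to check the case-number pattern before pasting anything into the filter editor, the regex can be tried on its own in any JavaScript console. This is just a sketch: caseFolderName is a hypothetical helper, not part of FiltaQuilla, and the subjects are made up apart from the sample case number quoted above.

```javascript
// Same pattern as in the filter action (the hyphens do not need escaping).
const caseNumberRegex = /20\d\d-[0-1]\d[0-3]\d-\d\d\d/;

// Return the folder name the filter would look for, or null if the
// subject contains no case number.
function caseFolderName(subject) {
  const match = caseNumberRegex.exec(subject);
  return match ? match[0] : null;
}

console.log(caseFolderName("Re: case 2010-0609-518 update")); // 2010-0609-518
console.log(caseFolderName("Lunch on Friday?"));              // null
```

Note that the pattern is deliberately loose: it only limits the first digit of the month and day, so something like 2010-1939-000 would also match.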

If you are having trouble getting it to work, an easy way to debug is just to insert statements like this into the javascript:

Cu.reportError("I am here");

Those printouts will show up on the error console, so you should be able to see if your filter action code is working or not.