Subject Regex and non-ASCII chars


5:43 am
May 12, 2010


t2m

Member

posts 7

I have been using FiltaQuilla with great pleasure, and Subject regex matching is wonderful. Thanks a lot!

 

However… I have a simple regex which is intended to match a French word with a non-ASCII char: /raté/i

… but this does not seem to work at all. :-(

 

Replacing the "é" by "…" seems to work in some cases, e.g. for emails where the Subject header is encoded in ISO-8859-1 as "Subject: =?iso-8859-1?Q?rat=E9?=". But this workaround does not work for UTF-8 encoded subjects (e.g. "Subject: =?UTF-8?B?cmF0w6k=?=").

By contrast, Thunderbird's basic string matching for Subject works as expected for subjects encoded in either of these ways.

 

Could it be that FiltaQuilla works on the non-decoded strings?

If so, it would be great to fix that!

(and I guess it applies to To, From and Cc headers too)
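
To illustrate the suspicion, here is a minimal sketch of what happens when the regex is tested against the raw header values quoted above versus their decoded forms. The helper decodeEncodedWords is hypothetical (it is not FiltaQuilla or Thunderbird code), it assumes Node.js for Buffer, and it only handles the two charsets mentioned in this post.

```javascript
// Hypothetical helper, NOT part of FiltaQuilla or Thunderbird: decodes
// RFC 2047 encoded-words such as =?UTF-8?B?cmF0w6k=?= using Node's Buffer.
// Assumption: only the two charsets seen in this thread are supported.
function decodeEncodedWords(header) {
  const charsets = { "utf-8": "utf8", "iso-8859-1": "latin1" };
  return header.replace(
    /=\?([^?]+)\?([BbQq])\?([^?]*)\?=/g,
    (whole, charset, encoding, text) => {
      const nodeCharset = charsets[charset.toLowerCase()];
      if (!nodeCharset) return whole; // unknown charset: leave untouched
      let bytes;
      if (encoding.toUpperCase() === "B") {
        bytes = Buffer.from(text, "base64");
      } else {
        // Q encoding: "_" stands for a space, "=XX" is one hex-encoded byte.
        const binary = text
          .replace(/_/g, " ")
          .replace(/=([0-9A-Fa-f]{2})/g, (m, hex) =>
            String.fromCharCode(parseInt(hex, 16)));
        bytes = Buffer.from(binary, "latin1");
      }
      return bytes.toString(nodeCharset);
    }
  );
}

const pattern = /rat\u00e9/i; // same as /raté/i
const rawLatin1 = "=?iso-8859-1?Q?rat=E9?=";
const rawUtf8 = "=?UTF-8?B?cmF0w6k=?=";

console.log(pattern.test(rawLatin1));                     // false
console.log(pattern.test(rawUtf8));                       // false
console.log(pattern.test(decodeEncodedWords(rawLatin1))); // true
console.log(pattern.test(decodeEncodedWords(rawUtf8)));   // true
```

If the filter sees the raw form, the "é" never appears as a character in the string being tested, which would explain the behaviour described above.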

 

Thanks

1:30 pm
May 12, 2010


rkent

Admin

posts 279

I'm going to have to check a little more to see whether the "subject" variable I am using has been decoded or not. I thought it had, but after looking at some other code I am starting to doubt that.

Anyway I'll add this as a bug to investigate, hopefully before the next release.

1:50 am
May 18, 2010


t2m

Member

posts 7

OK, don't hesitate to ping me if you want me to test!

Thank you.

5:57 am
July 2, 2010


t2m

Member

posts 7

Post edited 5:58 am – July 2, 2010 by t2m


After digging a bit through the Mozilla developer documentation, I found the following (from https://developer.mozilla.org/en/XPCOM_Interface_Reference/nsIMsgDBHdr):

subject (string): Indicates the subject of this message; the equivalent header is the Subject: header. The value here will effectively be the unparsed header content, so it will contain full MIME-encoded syntax.

….

mime2DecodedAuthor (AString, read-only)

mime2DecodedSubject (AString, read-only)

mime2DecodedRecipients (AString, read-only)

 

Knowing this, I cooked up the following patch, which fixes the subject matching issue for me:

 

--- filtaquilla.js.orig    2010-07-02 15:34:42.000000000 +0200
+++ filtaquilla.js    2010-07-02 15:37:57.000000000 +0200
@@ -834,7 +834,7 @@
       },
       match: function subjectRegEx_match(aMsgHdr, aSearchValue, aSearchOp)
       {
-        var subject = aMsgHdr.subject;
+        var subject = aMsgHdr.mime2DecodedSubject;
         let searchValue;
         let searchFlags;
         [searchValue, searchFlags] = _getRegEx(aSearchValue);

 

I can now use accents in my subject matching regexps.
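
For anyone who wants to see the effect of the one-line change outside Thunderbird, here is a rough sketch with a mock header object. The names splitRegexString, mockHdr and subjectRegexMatch are made up for illustration; in particular splitRegexString is only a guess at what a helper like _getRegEx might do, since its real body is not shown in this thread.

```javascript
// Hypothetical stand-in for FiltaQuilla's _getRegEx (real implementation
// not shown here): splits "/pattern/flags" into [pattern, flags].
function splitRegexString(searchValue) {
  const m = /^\/(.*)\/([a-z]*)$/.exec(searchValue);
  return m ? [m[1], m[2]] : [searchValue, ""];
}

// Mock of the two nsIMsgDBHdr attributes discussed above.
const mockHdr = {
  subject: "=?UTF-8?B?cmF0w6k=?=",   // raw header, still MIME-encoded
  mime2DecodedSubject: "rat\u00e9",  // decoded value: "raté"
};

// useDecoded = false mimics the code before the patch, true after it.
function subjectRegexMatch(hdr, searchValue, useDecoded) {
  const subject = useDecoded ? hdr.mime2DecodedSubject : hdr.subject;
  const [pattern, flags] = splitRegexString(searchValue);
  return new RegExp(pattern, flags).test(subject);
}

console.log(subjectRegexMatch(mockHdr, "/rat\u00e9/i", false)); // false
console.log(subjectRegexMatch(mockHdr, "/rat\u00e9/i", true));  // true
```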

 

Could you consider including this patch in a later revision?

 

Note that you may also want to generalise this fix to other parts of the code, such as:

 if (/@SUBJECT@/.test(parameter))
-      return parameter.replace(/@SUBJECT@/, hdr.subject);
+      return parameter.replace(/@SUBJECT@/, hdr.mime2DecodedSubject);

and the same for the author and recipients headers.
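
As a sketch only, the substitution above could look like this with a mock header. The function name expandSubject and the mock object are hypothetical. One design choice worth mentioning: the sketch passes a function to replace() rather than the string itself, because a decoded subject that happens to contain sequences like "$&" would otherwise be interpreted as a replacement pattern.

```javascript
// Mock header; only the attribute name mime2DecodedSubject comes from the
// nsIMsgDBHdr documentation quoted earlier. The value is invented.
const hdr = { mime2DecodedSubject: "rat\u00e9 $& co" };

// Hypothetical helper: expands the @SUBJECT@ placeholder with the decoded
// subject. The arrow function makes replace() insert the value literally.
function expandSubject(parameter, msgHdr) {
  if (/@SUBJECT@/.test(parameter))
    return parameter.replace(/@SUBJECT@/, () => msgHdr.mime2DecodedSubject);
  return parameter;
}

console.log(expandSubject("Re: @SUBJECT@", hdr)); // Re: raté $& co

// For comparison, a plain string replacement treats "$&" as "the matched text":
console.log("Re: @SUBJECT@".replace(/@SUBJECT@/, hdr.mime2DecodedSubject));
// Re: raté @SUBJECT@ co
```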

 

6:40 am
July 2, 2010


rkent

Admin

posts 279

I'll be happy to include that in a future release. Thanks for investigating this!
