
On 14 Sep 2018, at 09:57, Jostein Berntsen <jbernts@broadpark.no> wrote:
On 14.09.18,17:34, Andreas Schamanek wrote:
On Fri, 14 Sep 2018, at 15:40, Jostein Berntsen wrote:
When I use recipes like these to filter messages with Scandinavian characters (æ, ø, å) in the Subject, they fail to match. My locale is nb_NO.UTF-8. Is there a recipe that can be used to match these cases?
:0
* ^Subject:.*lån
innboks/IN-spam/
A proper message must have such characters encoded. Look at the source of messages. You will see something like (for "lån")
=?UTF-8?B?bMOlbg==?=
When you match against this (mind the ? characters and escape them as \?) it should work.
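
To find the encoded form of a word without digging through message sources, the encoding can be reproduced directly. The following is a minimal Python sketch, assuming the sender used UTF-8 with base64 ("B") encoding. The helper names encoded_word and procmail_condition are made up for illustration and are not part of procmail or formail.

```python
import base64


def encoded_word(text: str) -> str:
    """Build the base64 ("B") MIME encoded-word for text, assuming UTF-8."""
    payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return "=?UTF-8?B?" + payload + "?="


def procmail_condition(text: str) -> str:
    """Hypothetical helper: turn the encoded-word into a procmail condition line.

    '?' and '+' are regex metacharacters, so both are escaped. The other
    characters of the base64 alphabet need no escaping.
    """
    escaped = encoded_word(text).replace("?", r"\?").replace("+", r"\+")
    return "* ^Subject:.*" + escaped


print(encoded_word("lån"))        # =?UTF-8?B?bMOlbg==?=
print(procmail_condition("lån"))  # * ^Subject:.*=\?UTF-8\?B\?bMOlbg==\?=
```

The generated pattern only matches when the sender encoded exactly this text as a single base64 encoded-word. A different charset, "Q" encoding (=?UTF-8?Q?l=C3=A5n?=), or a longer subject produces different encoded text.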
Thanks. I solved it doing this:
:0 h
* ^Subject:.*=\?
SUBJECT=| formail -cXSubject: | perl -MEncode -ne 'print encode("UTF8",decode("MIME-Header",$_))'
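
For readers who want to see what the decoding step in the pipeline above produces, here is a rough Python equivalent of the perl one-liner, using only the standard library. It is a sketch, not a replacement for the recipe; decode_subject is a name chosen here for illustration.

```python
from email.header import decode_header, make_header


def decode_subject(raw_value: str) -> str:
    """Decode MIME encoded-words in a header value into a Unicode string.

    decode_header splits the value into (bytes, charset) parts, and
    make_header joins them back into a single decoded header.
    """
    return str(make_header(decode_header(raw_value)))
```

Calling decode_subject("=?UTF-8?B?bMOlbg==?=") returns "lån", and the "Q" form =?UTF-8?Q?l=C3=A5n?= decodes to the same string. Matching against the decoded value therefore works regardless of which encoding the sender chose.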
By rewriting the message to include UTF-8 characters in the headers you have just made your message invalid as the mail headers can only contain 7-BIT ASCII and anything else must be encoded. However, it's your mail, do as you will. You *will* have issues if you try to do something else with those messages, ever. Like, for example, import them into a different client. Or put them on an IMAP server.
Something for the manual maybe?
No. Andreas gave you the right solution: match against the encoded text in the subject.

:0
* ^Subject:.*=\?UTF-8\?B\?bMOlbg
{ do stuff }

Or, save your UTF-8 decoded subject into a variable like UTFSUB=| formail…

-- 
Space Directive 723: Terraformers are expressly forbidden from recreating Swindon.