On Sun, Mar 07, 2010 at 11:50:19PM -0500, Mike Devour wrote:
Here is a solution I've come up with which *SEEMS* to work:
:0 fhw
* ^content-type:(.*\<)?multipart.*\<(boundary=\/[^"; ]+| boundary="\/[^"; ]+)
{ TESTVAR = $MATCH }
As I recall, you wanted to lose the quote marks in the match; why not simply use a regex like:

* ... boundary="*\/[^"; ]+

In the simple tests I just ran, this extracts the boundary string, minus double quote chars, if present. Do you have examples of boundary strings for which this regex fails?
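Put into a complete recipe, that might look something like the sketch below. The leading part of the condition is borrowed from your own recipe to stand in for the "..." above, and the variable name BOUNDARY is made up for illustration; both are assumptions. No flags are given because the block only assigns a variable.

```procmail
# Sketch only: capture the MIME boundary, without surrounding quotes.
# BOUNDARY is a hypothetical variable name.
:0
* ^content-type:(.*\<)?multipart.*\<boundary="*\/[^"; ]+
{
  # MATCH holds whatever matched to the right of the \/ token
  BOUNDARY=$MATCH
}
```

The optional quote is consumed by the "* to the left of \/, so it never ends up in MATCH, and the bracket expression stops at the closing quote, a semicolon, or a space.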
... it would make the most sense to match anything except double-quote, semi-colon, or whitespace, except that we don't seem to be allowed to use control characters, POSIX character classes, or shortcuts (e.g. [:space:] or \s) inside bracket expressions in procmail. This seems to be impossible to do:
[^";<something that stands for all whitespace characters>]
I wish I could understand why.
Cary, you say that procmail doesn't allow escaping of control characters. Does that mean things like \f \n \r \t \v will not be understood in any context, or just inside bracket expressions? Can you or anyone point me to where that's documented, please?
man procmailrc, in the MISCELLANEOUS section says:

The regular expression engine built into procmail does not support named character classes.

I don't see any explicit mention of the \f... syntax but I don't think it's allowed.

What's wrong with matching "anything except double-quote, semi-colon, or whitespace" with [^";XXX] where "XXX" is your favorite set of space, tab, ^M, etc.?

Jim
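For what it's worth, here is a minimal sketch of the [^";XXX] idea with a space and a tab as the "XXX". The condition prefix is copied from Mike's recipe and the variable name BOUNDARY is hypothetical; treat both as assumptions. The tab has to be typed into the rcfile as a real tab character, since writing it as \t is what we are trying to avoid.

```procmail
# Sketch only. The bracket expression contains, in order:
# a double quote, a semicolon, a space, and a literal tab character
# (entered with the Tab key, not written as \t).
:0
* ^content-type:(.*\<)?multipart.*\<boundary="*\/[^"; 	]+
{
  BOUNDARY=$MATCH
}
```

A carriage return (^M) could be added to the bracket the same way if your editor lets you enter one literally, for example with Ctrl-V Ctrl-M in vi. Beware of editors that silently convert tabs to spaces, since the recipe would still load but would no longer exclude tabs.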