I've recently had several regular messages bounced with
X-Diagnostic: already on the subscriber list
even though the messages contain nothing even remotely resembling "subscribe". The logs show that for some reason rc.submit interprets them as administrative commands and passes them on to rc.request, which apparently treats anything it does not recognize as a subscription request.
What the offending messages have in common is that they are short and begin with a number or with a non-ASCII letter.
The regexps in rc.submit are a bit convoluted, but if I read them right, any message shorter than 640 characters and 6 lines that doesn't begin with a letter (after indentation, if any) is passed to rc.request.
I guess such messages are quite rare in English, but in Finnish they're much more common (both because several Finnish letters are not included in [a-z] and because sentences beginning with numbers are common for grammatical reasons).
If I understand correctly, the key regexp is this:
* -100^0 B ?? ^^([ ]|$)*\
((((archives?:?($|[ ]+)|\
((send|get)(me)?|gimme|retrieve|mail|ls|dir(ectory)?|\
list|show|search|[fe]?grep|find|maxfiles|version|help|info)\
([ ].*)?$)([ ]|$)*)+\
([^ a-z].*$(.*$(.*$(.*$(.*$)?)?)?)?)?^^|\
(help|info)[ ]*$|\
(add|join|leave|sign( [^ ]+ |-)?o(n|ff)|(un|de)?-?sub)>)|\
([^ a-z].*$(.*$(.*$(.*$(.*$)?)?)?)?)?^^|\
.*( (join|leave|add .* to|(delete|remove) .* from|\
(take|sign|get) .* off|(put|sign) .* on) .* [a-z-]*list|\
(un-?|sub?)scri(be|ption))>|\
^^)
and in particular the line
([^ a-z].*$(.*$(.*$(.*$(.*$)?)?)?)?)?^^|\
which, in conjunction with the first line there, matches if the message begins with anything other than a letter a-z or whitespace (or is empty) and is at most six lines long.
Am I reading this right?
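To make that reading concrete, here is a rough Python approximation of just the first line of the condition combined with the line quoted above. It is only a sketch, not procmail itself: the translation of procmail's `^^` and `$` is done by hand, the names `SHORT_NON_LETTER` and `looks_like_request` are made up for illustration, and I assume the whitespace classes contain a space and a tab. The 640-character limit comes from elsewhere in the recipe and is not modelled, and the line-count handling is approximate.

```python
import re

# Hand translation of the procmail fragment
#   ^^([ ]|$)*([^ a-z].*$(.*$(.*$(.*$(.*$)?)?)?)?)?^^
# Assumptions (not taken from procmail's source):
#   - "^^" is rendered as \A / \Z (very start / very end of the body)
#   - "$" is rendered as "a newline, or the end of the body"
#   - procmail matches case-insensitively, so [^ a-z] also excludes A-Z
EOL = r"(?:\n|\Z)"
LINE = r"[^\n]*" + EOL          # the rest of one line, i.e. ".*$"

SHORT_NON_LETTER = re.compile(
    r"\A(?:[ \t]|\n)*"                      # leading blanks and empty lines
    r"(?:[^ \ta-zA-Z\n]" + LINE             # first char is not an ASCII letter
    + "(?:" + LINE + "(?:" + LINE + "(?:" + LINE + "(?:" + LINE
    + ")?)?)?)?"                            # a few more optional lines
    + r")?\Z"                               # nothing else may follow
)


def looks_like_request(body: str) -> bool:
    """True if the approximated sub-pattern matches the whole body."""
    return SHORT_NON_LETTER.match(body) is not None


if __name__ == "__main__":
    samples = [
        "5 henkilöä ilmoittautui.\n",    # starts with a digit
        "Älä unohda kokousta.\n",        # starts with a non-ASCII letter
        "Hello, see you tomorrow.\n",    # starts with an ASCII letter
    ]
    for body in samples:
        print(repr(body), looks_like_request(body))
```

Under these assumptions, the two Finnish samples are matched and the English one is not, which is consistent with the bounces described above.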
Why are such messages being trapped? I thought request messages usually begin with letters. What exactly is that line trying to catch, and how much can I modify it without breaking something? If I remove that one line entirely, what will break?