No Multigram For Admin Please
"Richard Lynch" <richard@zend.com> writes:
I'm confused by this multigram stuff, and when it kicks in...
The lists I work with are pretty small, and it would be ideal for me if the X-Command to unsubscribe was just "perfect match".
OTOH, I'd like people who unsubscribe themselves (the blessed few with a clue) to be able to match foo@mail.whatever.com and foo@whatever.com successfully and just be taken off if they're that close... But I don't even know if that's what multigram does or not.
multigram is indeed what does the partial matching. It also inserts and deletes addresses from the dist file, as well as a bunch of other things. multigram is pretty much the most inscrutable part of SmartList and procmail.
So... if some kind soul would tell me when multigram kicks in and how I would alter rc.custom for the admin X-Command to be strictly a perfect match thing without screwing up auto-request commands from listees, I'd sure appreciate it...
Hmm, that's an interesting idea. There's no way to do it right now without altering the ".bin/unsubscribe" shell script. I've cc:ed the smartlist-dev mailing list: do people think this would be a worthwhile enhancement?
Oh yeah: One last thing:
The address that is about to be unsubscribed generates the following multigram matches to the current list:
177 rlynch@ignitionstate.com 32734 rlynch@ignitionstate.com
18 a 17066 rlynch@ignitionstate.com
118 kevin.erickson@bmge.com 3250 rlynch@ignitionstate.com
174 rfblitho@compuserve.com 3250 rlynch@ignitionstate.com
36 boche@billions.com 3247 rlynch@ignitionstate.com
167 rademaker@compuserve.com 3246 rlynch@ignitionstate.com
176 rjolech@herff-jones.com 3197 rlynch@ignitionstate.com
87 guns@scratchie.com 2527 rlynch@ignitionstate.com
Like, the address I'm trying to remove is there, and it should be a perfect match, and I don't understand why that first one isn't just 32766... [See, I read the docs enough to know it goes up to 32766 :-)] Or, rather, why is multigram kicking in at all? If it's a perfect match, isn't it supposed to just unsubscribe it?
Perfect matches don't always generate a score of 32766 due to roundoff error in the process that multigram uses to calculate the score: it breaks up the addresses into "grams" (regular sized chunks) and adds a bit to the score for each gram that occurs only once in the address being compared (rlynch@ignitionstate.com, in this case) but at least once in the address from the dist file. It repeats that process for gram lengths of four, three, and two characters (the -w flag sets the max gram length).
The roundoff occurs because it does all the arithmetic using integers, no floating point, even when it splits up the desired final score for a perfect match among all the partial matches that will take place. Actually, given the numbers, it would seem to be impossible to get a final score of 32766 if the address is more than three characters long. (Then again, I could be completely wrong.)
Philip Guenther
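To make the roundoff argument concrete, here is a minimal Python sketch of an integer-only gram score. It is a toy model built only from the description above, not multigram's actual code: the names `grams` and `toy_score`, the even split of the 32766 budget across the three gram lengths, and the even split within each length are all assumptions made for illustration.

```python
MAX_SCORE = 32766  # top of the score range quoted in the thread


def grams(text, n):
    """Return every contiguous n-character chunk of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def toy_score(dist_addr, wanted, max_len=4):
    """Toy integer-only gram score; an illustration, not multigram's code."""
    lengths = list(range(max_len, 1, -1))      # gram lengths 4, 3, 2
    per_length = MAX_SCORE // len(lengths)     # integer split of the budget
    total = 0
    for n in lengths:
        chunks = grams(wanted, n)
        # Only grams that occur exactly once in the wanted address count.
        unique = [g for g in chunks if chunks.count(g) == 1]
        if not unique:
            continue                           # no usable grams of this length
        per_gram = per_length // len(unique)   # remainder is dropped here
        # Credit each unique gram that appears anywhere in the dist address.
        total += sum(per_gram for g in unique if g in dist_addr)
    return total


addr = "rlynch@ignitionstate.com"
print(toy_score(addr, addr))                   # prints 32734
print(toy_score("guns@scratchie.com", addr))   # far lower: few shared grams
```

Under these assumptions a perfect match on the 24-character address loses 2, 10, and 20 points to integer division at gram lengths four, three, and two, and lands on 32734 rather than 32766. That this equals the top score reported in the thread is suggestive, but it has not been checked against multigram's source.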