author Barry Warsaw 2008-02-02 23:03:19 -0500
committer Barry Warsaw 2008-02-02 23:03:19 -0500
commit f03c31acb800d79c606ee3e206868aef8a08bfda
tree 15e0b72f129b6ee5f4515647c8c25e0c970a80d9 /Mailman/rules/docs/header-matching.txt
parent 7c5b4d64df6532548742460d405a8a64e35b22c2
parent 4823801716b1bf1711d63b649b0fafd6acd30821
Merge the 'rules' branch.
Give the first alpha a code name.

This branch mostly gets rid of all the approval-oriented handlers in favor of a chain-of-rules based approach. This will be much more powerful and extensible, allowing rule definition by plugin and chain creation via web page.

When a message is processed by the incoming queue, it gets sent through a chain of rules. The starting chain is defined on the mailing list object, and there is a built-in default starting chain, called 'built-in'. Each chain is made up of links, which describe a rule and an action, along with possibly some other information. Actions allow processing to take a detour through another chain, jump to another chain, stop processing, run a function, etc.

The built-in chain essentially implements the original early part of the handler pipeline. If a message makes it through the built-in chain, it gets sent to the prep queue, where the message is decorated and such before sending out to the list membership. The 'accept' chain is what moves the message into the prep queue.

There are also 'hold', 'discard', and 'reject' chains, which do what you would expect them to. There are lots of built-in rules, implementing everything from the old emergency handler to new handlers such as one not allowing empty subject headers.

IMember grows an is_moderated attribute.

The 'adminapproved' metadata key is renamed 'moderator_approved'. Fix some bogus uses of noreply_address to no_reply_address.

Stash an 'original_size' attribute on the message after parsing its plain text. This can be used later to ensure the original message does not exceed a specified size without having to flatten the message again.

The KNOWN_SPAMMERS global variable is replaced with HEADER_MATCHES. The mailing list's header_filter_rules variable is replaced with header_matches which has the same semantics as HEADER_MATCHES, but is list-specific.

DEFAULT_MAIL_COMMANDS_MAX_LINES -> EMAIL_COMMANDS_MAX_LINES.

Update smtplistener.py to be much better, to use maildir format instead of mbox format, to respond to RSET commands by clearing the maildir, and by silencing annoying asyncore error messages.

Extend the doctest runner so that it will run .txt files in any docs subdirectory in the code tree.

Add pluggable keys 'mailman.mta' and 'mailman.rules'. The latter may have only one setting while the former is extensible.

There are lots of doctests which should give all the gory details.

Mailman/Post.py -> Mailman/inject.py and the command line usage of this module is removed.

SQLALCHEMY_ECHO, which was unused, is removed.

Backport the ability to specify additional footer interpolation variables by the message metadata 'decoration-data' key.

can_acknowledge() defines whether a message can be responded to by the email robot.

Simplify the implementation of _reset() based on Storm fixes. Be able to handle lists in Storm values. Do some reorganization.
Diffstat (limited to 'Mailman/rules/docs/header-matching.txt')
-rw-r--r-- Mailman/rules/docs/header-matching.txt 145
1 files changed, 145 insertions, 0 deletions
diff --git a/Mailman/rules/docs/header-matching.txt b/Mailman/rules/docs/header-matching.txt
new file mode 100644
index 000000000..fbd0ff65f
--- /dev/null
+++ b/Mailman/rules/docs/header-matching.txt
@@ -0,0 +1,145 @@
+Header matching
+===============
+
+Mailman can do pattern-based header matching during its normal rule
+processing. There is a set of site-wide default header matches specified in
+the configuration file under the HEADER_MATCHES variable.
+
+ >>> from Mailman.app.lifecycle import create_list
+ >>> mlist = create_list(u'_xtest@example.com')
+
+Because the default HEADER_MATCHES variable is empty when the configuration
+file is read, we'll just extend the current header matching chain with a
+pattern that matches 4 or more stars, discarding the message if it hits.
+
+ >>> from Mailman.configuration import config
+ >>> chain = config.chains['header-match']
+ >>> chain.extend('x-spam-score', '[*]{4,}', 'discard')
+
+First, if the message has no X-Spam-Score header, the message passes through
+the chain untouched (i.e. no disposition).
+
+ >>> msg = message_from_string("""\
+ ... From: aperson@example.com
+ ... To: _xtest@example.com
+ ... Subject: Not spam
+ ... Message-ID: <one>
+ ...
+ ... This is a message.
+ ... """)
+
+ >>> from Mailman.app.chains import process
+
+A pass-through shows up as nothing being added to the log file by processing.
+
+ # XXX This checks the vette log file because there is no other evidence
+ # that this chain has done anything.
+ >>> import os
+ >>> fp = open(os.path.join(config.LOG_DIR, 'vette'))
+ >>> fp.seek(0, 2)
+ >>> file_pos = fp.tell()
+ >>> process(mlist, msg, {}, 'header-match')
+ >>> fp.seek(file_pos)
+ >>> print 'LOG:', fp.read()
+ LOG:
+ <BLANKLINE>
+
+Now, if the header exists but does not match, then it also passes through
+untouched.
+
+ >>> msg['X-Spam-Score'] = '***'
+ >>> del msg['subject']
+ >>> msg['Subject'] = 'This is almost spam'
+ >>> del msg['message-id']
+ >>> msg['Message-ID'] = '<two>'
+ >>> file_pos = fp.tell()
+ >>> process(mlist, msg, {}, 'header-match')
+ >>> fp.seek(file_pos)
+ >>> print 'LOG:', fp.read()
+ LOG:
+ <BLANKLINE>
+
+But now if the header matches, then the message gets discarded.
+
+ >>> del msg['x-spam-score']
+ >>> msg['X-Spam-Score'] = '****'
+ >>> del msg['subject']
+ >>> msg['Subject'] = 'This is spam, but barely'
+ >>> del msg['message-id']
+ >>> msg['Message-ID'] = '<three>'
+ >>> file_pos = fp.tell()
+ >>> process(mlist, msg, {}, 'header-match')
+ >>> fp.seek(file_pos)
+ >>> print 'LOG:', fp.read()
+ LOG: ... DISCARD: <three>
+ <BLANKLINE>
+
+For kicks, let's show a message that's really spammy.
+
+ >>> del msg['x-spam-score']
+ >>> msg['X-Spam-Score'] = '**********'
+ >>> del msg['subject']
+ >>> msg['Subject'] = 'This is really spammy'
+ >>> del msg['message-id']
+ >>> msg['Message-ID'] = '<four>'
+ >>> file_pos = fp.tell()
+ >>> process(mlist, msg, {}, 'header-match')
+ >>> fp.seek(file_pos)
+ >>> print 'LOG:', fp.read()
+ LOG: ... DISCARD: <four>
+ <BLANKLINE>
+
+Flush out the extended header matching rules.
+
+ >>> chain.flush()
+
+
+List-specific header matching
+-----------------------------
+
+Each mailing list can also be configured with a set of header matching regular
+expression rules. These are used to impose list-specific header filtering
+with the same semantics as the global `HEADER_MATCHES` variable.
+
+Suppose the list administrator wants to match on three plus signs rather than
+four stars, and only for the current mailing list.
+
+ >>> mlist.header_matches = [('x-spam-score', '[+]{3,}', 'discard')]
+
+A message with a spam score of two pluses does not match.
+
+ >>> del msg['x-spam-score']
+ >>> msg['X-Spam-Score'] = '++'
+ >>> del msg['message-id']
+ >>> msg['Message-ID'] = '<five>'
+ >>> file_pos = fp.tell()
+ >>> process(mlist, msg, {}, 'header-match')
+ >>> fp.seek(file_pos)
+ >>> print 'LOG:', fp.read()
+ LOG:
+
+A message with a spam score of three pluses does match.
+
+ >>> del msg['x-spam-score']
+ >>> msg['X-Spam-Score'] = '+++'
+ >>> del msg['message-id']
+ >>> msg['Message-ID'] = '<six>'
+ >>> file_pos = fp.tell()
+ >>> process(mlist, msg, {}, 'header-match')
+ >>> fp.seek(file_pos)
+ >>> print 'LOG:', fp.read()
+ LOG: ... DISCARD: <six>
+ <BLANKLINE>
+
+As does a message with a spam score of four pluses.
+
+ >>> del msg['x-spam-score']
+ >>> msg['X-Spam-Score'] = '++++'
+ >>> del msg['message-id']
+ >>> msg['Message-ID'] = '<seven>'
+ >>> file_pos = fp.tell()
+ >>> process(mlist, msg, {}, 'header-match')
+ >>> fp.seek(file_pos)
+ >>> print 'LOG:', fp.read()
+ LOG: ... DISCARD: <seven>
+ <BLANKLINE>
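
The (header, pattern, action) triples used throughout this doctest can be read as the check sketched below. This is a minimal standalone illustration that uses only the standard library `re` and `email` modules. `first_matching_action` is a hypothetical helper, not Mailman's implementation, and the choice of `re.search` with case-insensitive matching is an assumption that the doctest neither confirms nor rules out.

```python
import re
from email import message_from_string


def first_matching_action(msg, header_matches):
    """Return the action of the first triple that matches, else None.

    Hypothetical helper for illustration only; it is not part of Mailman.
    """
    for header, pattern, action in header_matches:
        # get_all() looks the header up case-insensitively and returns every
        # occurrence; the empty-list default covers a missing header.
        for value in msg.get_all(header, []):
            # Assumption: the pattern may match anywhere in the value.
            if re.search(pattern, value, re.IGNORECASE):
                return action
    return None


msg = message_from_string("""\
From: aperson@example.com
To: _xtest@example.com
Subject: Not spam
Message-ID: <one>

This is a message.
""")

# The same triples the doctest uses for the site-wide and list-specific cases.
site_matches = [('x-spam-score', '[*]{4,}', 'discard')]
list_matches = [('x-spam-score', '[+]{3,}', 'discard')]

# No X-Spam-Score header at all.
no_header = first_matching_action(msg, site_matches)

# Header present, but one star short of the pattern.
msg['X-Spam-Score'] = '***'
three_stars = first_matching_action(msg, site_matches)

# Delete before reassigning: item assignment on a message appends a header.
del msg['x-spam-score']
msg['X-Spam-Score'] = '****'
four_stars = first_matching_action(msg, site_matches)

del msg['x-spam-score']
msg['X-Spam-Score'] = '++'
two_pluses = first_matching_action(msg, list_matches)

del msg['x-spam-score']
msg['X-Spam-Score'] = '++++'
four_pluses = first_matching_action(msg, list_matches)
```

The `del` before each assignment mirrors the doctest: assigning to a header on an `email` message adds another occurrence instead of replacing the existing one, so a stale score would otherwise remain visible to the matcher.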