Diffstat (limited to 'src/mailman/pipeline/docs/subject-munging.rst')
-rw-r--r-- src/mailman/pipeline/docs/subject-munging.rst | 249
1 files changed, 249 insertions, 0 deletions
diff --git a/src/mailman/pipeline/docs/subject-munging.rst b/src/mailman/pipeline/docs/subject-munging.rst
new file mode 100644
index 000000000..e7a6553ce
--- /dev/null
+++ b/src/mailman/pipeline/docs/subject-munging.rst
@@ -0,0 +1,249 @@
+===============
+Subject munging
+===============
+
+Messages that flow through the global pipeline get their headers *cooked*,
+which basically means that their headers go through several mostly unrelated
+transformations. Some headers get added, others get changed. Some of these
+changes depend on mailing list settings and others depend on how the message
+is getting sent through the system. We'll take things one-by-one.
+
+ >>> mlist = create_list('_xtest@example.com')
+
+
+Inserting a prefix
+==================
+
+One thing header cooking does is *munge* the ``Subject`` header by
+inserting the subject prefix for the list at the front. If there's no subject
+header in the original message, Mailman uses a canned default. In order to do
+subject munging, a mailing list must have a preferred language.
+::
+
+ >>> mlist.subject_prefix = '[XTest] '
+ >>> mlist.preferred_language = 'en'
+ >>> msg = message_from_string("""\
+ ... From: aperson@example.com
+ ...
+ ... A message of great import.
+ ... """)
+ >>> msgdata = {}
+
+ >>> from mailman.pipeline.cook_headers import process
+ >>> process(mlist, msg, msgdata)
+
+The original subject header is stored in the message metadata. We must print
+the new ``Subject`` header because it gets converted from a string to an
+``email.header.Header`` instance which has an unhelpful ``repr``.
+
+ >>> msgdata['origsubj']
+ u''
+ >>> print msg['subject']
+ [XTest] (no subject)
+
+If the original message had a ``Subject`` header, then the prefix is inserted
+at the beginning of the header's value.
+
+ >>> msg = message_from_string("""\
+ ... From: aperson@example.com
+ ... Subject: Something important
+ ...
+ ... A message of great import.
+ ... """)
+ >>> msgdata = {}
+ >>> process(mlist, msg, msgdata)
+ >>> print msgdata['origsubj']
+ Something important
+ >>> print msg['subject']
+ [XTest] Something important
+
+``Subject`` headers are not munged for digest messages.
+
+ >>> msg = message_from_string("""\
+ ... From: aperson@example.com
+ ... Subject: Something important
+ ...
+ ... A message of great import.
+ ... """)
+ >>> process(mlist, msg, dict(isdigest=True))
+ >>> print msg['subject']
+ Something important
+
+Nor are they munged for *fast tracked* messages, which are generally defined
+as messages that Mailman crafts internally.
+
+ >>> msg = message_from_string("""\
+ ... From: aperson@example.com
+ ... Subject: Something important
+ ...
+ ... A message of great import.
+ ... """)
+ >>> process(mlist, msg, dict(_fasttrack=True))
+ >>> print msg['subject']
+ Something important
+
+If a ``Subject`` header already has a prefix, usually following a ``Re:``
+marker, another one will not be added but the prefix will be moved to the
+front of the header text.
+
+ >>> msg = message_from_string("""\
+ ... From: aperson@example.com
+ ... Subject: Re: [XTest] Something important
+ ...
+ ... A message of great import.
+ ... """)
+ >>> process(mlist, msg, {})
+ >>> print msg['subject']
+ [XTest] Re: Something important
+
+If the ``Subject`` header has a prefix at the front of the header text, that's
+where it will stay. This is called *new style* prefixing and is the only
+option available in Mailman 3.
+
+ >>> msg = message_from_string("""\
+ ... From: aperson@example.com
+ ... Subject: [XTest] Re: Something important
+ ...
+ ... A message of great import.
+ ... """)
+ >>> process(mlist, msg, {})
+ >>> print msg['subject']
+ [XTest] Re: Something important
+
+
+Internationalized headers
+=========================
+
+Internationalization adds some interesting twists to the handling of subject
+prefixes. Part of what makes this interesting is the encoding of i18n headers
+using RFC 2047, and lists whose preferred language is in a different character
+set than the encoded header.
+
+ >>> msg = message_from_string("""\
+ ... Subject: =?iso-2022-jp?b?GyRCJWEhPCVrJV4lcxsoQg==?=
+ ...
+ ... """)
+ >>> process(mlist, msg, {})
+ >>> print msg['subject']
+ [XTest] =?iso-2022-jp?b?GyRCJWEhPCVrJV4lcxsoQg==?=
+ >>> unicode(msg['subject'])
+ u'[XTest] \u30e1\u30fc\u30eb\u30de\u30f3'
+
+
+Prefix numbers
+==============
+
+Subject prefixes support a placeholder for the numeric post id. Every time a
+message is posted to the mailing list, a *post id* gets incremented. This is
+a purely sequential integer that increases monotonically. By adding a ``%d``
+placeholder to the subject prefix, this post id can be included in the prefix.
+
+ >>> mlist.subject_prefix = '[XTest %d] '
+ >>> mlist.post_id = 456
+ >>> msg = message_from_string("""\
+ ... Subject: Something important
+ ...
+ ... """)
+ >>> process(mlist, msg, {})
+ >>> print msg['subject']
+ [XTest 456] Something important
+
+This works even when the message is a reply, except that in this case, the
+numeric post id in the generated subject prefix is updated with the new post
+id.
+
+ >>> msg = message_from_string("""\
+ ... Subject: [XTest 123] Re: Something important
+ ...
+ ... """)
+ >>> process(mlist, msg, {})
+ >>> print msg['subject']
+ [XTest 456] Re: Something important
+
+If the ``Subject`` header had old style prefixing, the prefix is moved to the
+front of the header text.
+
+ >>> msg = message_from_string("""\
+ ... Subject: Re: [XTest 123] Something important
+ ...
+ ... """)
+ >>> process(mlist, msg, {})
+ >>> print msg['subject']
+ [XTest 456] Re: Something important
+
+
+And of course, the proper thing is done when post id numbers are included
+in the subject prefix and the subject is encoded as non-ASCII text.
+
+ >>> msg = message_from_string("""\
+ ... Subject: =?iso-2022-jp?b?GyRCJWEhPCVrJV4lcxsoQg==?=
+ ...
+ ... """)
+ >>> process(mlist, msg, {})
+ >>> print msg['subject']
+ [XTest 456] =?iso-2022-jp?b?GyRCJWEhPCVrJV4lcxsoQg==?=
+ >>> unicode(msg['subject'])
+ u'[XTest 456] \u30e1\u30fc\u30eb\u30de\u30f3'
+
+Even more fun is when the internationalized ``Subject`` header already has a
+prefix, possibly with a different posting number.
+
+ >>> msg = message_from_string("""\
+ ... Subject: [XTest 123] Re: =?iso-2022-jp?b?GyRCJWEhPCVrJV4lcxsoQg==?=
+ ...
+ ... """)
+ >>> process(mlist, msg, {})
+ >>> print msg['subject']
+ [XTest 456] Re: =?iso-2022-jp?b?GyRCJWEhPCVrJV4lcxsoQg==?=
+
+..
+ # XXX This requires Python email patch #1681333 to succeed.
+ # >>> unicode(msg['subject'])
+ # u'[XTest 456] Re: \u30e1\u30fc\u30eb\u30de\u30f3'
+
+As before, old style subject prefixes are re-ordered.
+
+ >>> msg = message_from_string("""\
+ ... Subject: Re: [XTest 123] =?iso-2022-jp?b?GyRCJWEhPCVrJV4lcxsoQg==?=
+ ...
+ ... """)
+ >>> process(mlist, msg, {})
+ >>> print msg['subject']
+ [XTest 456] Re:
+ =?iso-2022-jp?b?GyRCJWEhPCVrJV4lcxsoQg==?=
+
+..
+ # XXX This requires Python email patch #1681333 to succeed.
+ # >>> unicode(msg['subject'])
+ # u'[XTest 456] Re: \u30e1\u30fc\u30eb\u30de\u30f3'
+
+
+In this test case, we get an extra space between the prefix and the original
+subject, because the original is *crooked*. Note that a ``Subject`` starting
+with ``'\n '`` is generated by some versions of the Japanese edition of Eudora.
+
+ >>> mlist.subject_prefix = '[XTest] '
+ >>> msg = message_from_string("""\
+ ... Subject:
+ ... Important message
+ ...
+ ... """)
+ >>> process(mlist, msg, {})
+ >>> print msg['subject']
+ [XTest] Important message
+
+And again, with an RFC 2047 encoded header.
+
+ >>> msg = message_from_string("""\
+ ... Subject:
+ ... =?iso-2022-jp?b?GyRCJWEhPCVrJV4lcxsoQg==?=
+ ...
+ ... """)
+ >>> process(mlist, msg, {})
+
+..
+ # XXX This one does not appear to work the same way as
+ # test_subject_munging_prefix_crooked() in the old Python-based tests. I need
+ # to get Tokio to look at this.
+ # >>> print msg['subject']
+ # [XTest] =?iso-2022-jp?b?IBskQiVhITwlayVeJXMbKEI=?=
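
The prefixing rules documented in the file above (a canned default for a missing subject, moving an existing prefix to the front, and refreshing a ``%d`` post id) can be approximated with a few lines of standard-library Python. This is only a minimal sketch, not Mailman's ``cook_headers`` code: ``munge_subject`` and ``decode_subject`` are hypothetical helper names, the sketch targets Python 3 while the doctests use Python 2, and it collapses whitespace, so it does not reproduce the extra space described for crooked subjects.

```python
import re
from email.header import decode_header, make_header


def munge_subject(subject, prefix_template, post_id=0):
    """Put the list prefix at the front of ``subject`` (new-style prefixing).

    Hypothetical helper for illustration only; not Mailman's implementation.
    """
    if not prefix_template.strip():
        # No prefix configured: leave the subject alone.
        return subject
    # Render the prefix for this post: '[XTest %d] ' -> '[XTest 456] '.
    prefix = prefix_template.replace('%d', str(post_id))
    # Match any earlier rendering of the prefix: literal text is escaped and
    # the '%d' placeholder matches any run of digits.
    literal_parts = prefix_template.strip().split('%d')
    old_prefix = re.compile(
        r'\d+'.join(re.escape(part) for part in literal_parts) + r'\s*')
    # Drop existing prefixes wherever they sit, then collapse whitespace.
    # Assumption: collapsing is a simplification made for this sketch.
    rest = ' '.join(old_prefix.sub('', subject).split())
    return prefix + (rest or '(no subject)')


def decode_subject(value):
    """Decode RFC 2047 encoded words in a header value to text."""
    return str(make_header(decode_header(value)))


if __name__ == '__main__':
    print(munge_subject('Re: [XTest 123] Something important',
                        '[XTest %d] ', 456))
    print(decode_subject('=?iso-2022-jp?b?GyRCJWEhPCVrJV4lcxsoQg==?='))
```

Because the sketch treats encoded words as opaque text, the prefix is moved without decoding the header, and ``decode_subject`` is only needed to display the result.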