author bwarsaw 2003-09-13 06:00:43 +0000
committer bwarsaw 2003-09-13 06:00:43 +0000
commit 2aece1de2b702626c57a34381bc631f92a2bb024
tree 14d5cb452b66e4a27d46600e8991cfcbba264bb6
parent ce61a198d9a9d6028c4e9f7dda7e1b3d25435083
process(): In the msg.is_multipart() clause, inside the clause that
tries to convert t to something reasonable <wink>, we need to use errors='replace' when we encode from unicode to string. This is because the preceding unicode(t, 'ascii', 'replace') call can insert U+FFFD, which can't be encoded to ASCII.
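
The failure described above can be reproduced with a short sketch. It is written for Python 3, not the Python 2 of the patch, so bytes.decode() stands in for unicode(t, 'ascii', 'replace'); the sample payload and the variable names raw and strict_failed are made up for illustration and do not come from Scrubber.py.

```python
# Python 3 sketch of the problem this commit fixes (the patch itself is Python 2).
# Hypothetical payload: ASCII text containing one non-ASCII byte (0xE9).
raw = b"caf\xe9 menu"

# Rough equivalent of Python 2's unicode(t, 'ascii', 'replace'):
# the undecodable byte becomes U+FFFD (REPLACEMENT CHARACTER).
u = raw.decode("ascii", "replace")
assert "\ufffd" in u

# Old behaviour: a strict encode back to ASCII fails, because U+FFFD
# is not an ASCII character.
try:
    u.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Fixed behaviour: errors='replace' on the encode turns U+FFFD into '?'.
t = u.encode("ascii", "replace")
print(strict_failed, t)
```

With errors='replace' on both calls, the round trip cannot raise for any input bytes; every byte that is not ASCII ends up as a question mark.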
-rw-r--r-- Mailman/Handlers/Scrubber.py 7
1 files changed, 5 insertions, 2 deletions
diff --git a/Mailman/Handlers/Scrubber.py b/Mailman/Handlers/Scrubber.py
index b5be73dfc..7bc5f510d 100644
--- a/Mailman/Handlers/Scrubber.py
+++ b/Mailman/Handlers/Scrubber.py
@@ -301,8 +301,11 @@ Url : %(url)s
try:
t = unicode(t, partcharset, 'replace')
except (UnicodeError, LookupError):
- # Replace funny characters
- t = unicode(t, 'ascii', 'replace').encode('ascii')
+ # Replace funny characters. We use errors='replace' for
+ # both calls since the first replace will leave U+FFFD,
+ # which isn't ASCII encodeable.
+ u = unicode(t, 'ascii', 'replace')
+ t = u.encode('ascii', 'replace')
try:
# Should use HTML-Escape, or try generalizing to UTF-8
t = t.encode(charset, 'replace')