| author | bwarsaw | 2003-09-13 06:00:43 +0000 |
|---|---|---|
| committer | bwarsaw | 2003-09-13 06:00:43 +0000 |
| commit | 2aece1de2b702626c57a34381bc631f92a2bb024 (patch) | |
| tree | 14d5cb452b66e4a27d46600e8991cfcbba264bb6 | |
| parent | ce61a198d9a9d6028c4e9f7dda7e1b3d25435083 (diff) | |
process(): In the msg.is_multipart() clause, inside the clause that
tries to convert t to something reasonable <wink>, we need to use
errors='replace' when we encode from unicode to string. This is
because the preceding unicode(t, 'ascii', 'replace') could end up
inserting U+FFFD, which can't be encoded to ascii.
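
The failure described in the commit message can be reproduced with a small sketch. The patch itself is Python 2 (`unicode(t, 'ascii', 'replace')`); the version below is an assumed Python 3 analogue using `bytes.decode` and `str.encode`, and the sample bytes and helper names are hypothetical, chosen only for illustration.

```python
# Python 3 analogue of the Python 2 calls in the patch (an assumption for
# illustration; the original code uses unicode(t, 'ascii', 'replace')).
raw = b"caf\xe9"  # hypothetical sample: 0xE9 is not a valid ASCII byte

# Decoding with errors='replace' substitutes U+FFFD for the bad byte.
decoded = raw.decode("ascii", "replace")


def encode_strict(u):
    """Old behaviour: strict encode, which raises on U+FFFD."""
    return u.encode("ascii")


def encode_replace(u):
    """Patched behaviour: errors='replace' maps U+FFFD to '?'."""
    return u.encode("ascii", "replace")


try:
    encode_strict(decoded)
    strict_failed = False
except UnicodeEncodeError:
    # U+FFFD has no ASCII representation, so the strict encode fails.
    strict_failed = True

safe = encode_replace(decoded)
```

With `errors='replace'` on both the decode and the encode, the round trip cannot raise: every undecodable byte becomes U+FFFD, and every U+FFFD then becomes `?`.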
| -rw-r--r-- | Mailman/Handlers/Scrubber.py | 7 |
1 files changed, 5 insertions, 2 deletions
diff --git a/Mailman/Handlers/Scrubber.py b/Mailman/Handlers/Scrubber.py
index b5be73dfc..7bc5f510d 100644
--- a/Mailman/Handlers/Scrubber.py
+++ b/Mailman/Handlers/Scrubber.py
@@ -301,8 +301,11 @@ Url : %(url)s
 try:
     t = unicode(t, partcharset, 'replace')
 except (UnicodeError, LookupError):
-    # Replace funny characters
-    t = unicode(t, 'ascii', 'replace').encode('ascii')
+    # Replace funny characters. We use errors='replace' for
+    # both calls since the first replace will leave U+FFFD,
+    # which isn't ASCII encodeable.
+    u = unicode(t, 'ascii', 'replace')
+    t = u.encode('ascii', 'replace')
 try:
     # Should use HTML-Escape, or try generalizing to UTF-8
     t = t.encode(charset, 'replace')
