qrunner crashes on invalid unicode sequence
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
GNU Mailman | Fix Released | Low | Mark Sapiro |
mailman (Ubuntu) | Fix Released | Wishlist | Unassigned |
Bug Description
When a message contains an invalid unicode sequence in its header, qrunner crashes outright:
May 17 15:32:20 2015 (981) Uncaught runner exception: 'utf8' codec can't decode byte
0xe9 in position 18: invalid continuation byte
May 17 15:32:20 2015 (981) Traceback (most recent call last):
File "/var/lib/
self.
File "/var/lib/
keepqueued = self._dispose(
File "/var/lib/
more = self._dopipelin
File "/var/lib/
sys.
File "/var/lib/
i18ndesc = uheader(mlist, mlist.description, 'List-Id', maxlinelen=998)
File "/var/lib/
return Header(s, charset, maxlinelen, header_name, continuation_ws)
File "/usr/lib/
self.append(s, charset, errors)
File "/usr/lib/
ustr = unicode(s, incodec, errors)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 18: invalid
continuation byte
May 17 15:32:20 2015 (981) SHUNTING:
1431869540.
A solution for this specific case is to have Mailman/
I would say that this is actually a bug in python-email, since I think it doesn't make sense to set errors to "strict" rather than something like "replace" when the intention is to parse something as free-form, under-specified, and user-controlled as email. Nonetheless, Mailman already sets errors='replace' in some places, so it might as well add it here.
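To illustrate the difference between the two error modes, here is a minimal sketch in Python 3 (the traceback above is from Python 2, where the failing call is `unicode(s, incodec, errors)`). The sample bytes and the helper name `tolerant_header` are assumptions made up for this example; this is not Mailman's actual code.

```python
from email.header import Header

# Hypothetical raw description: 0xe9 is Latin-1 "é" and is not valid UTF-8 here,
# because the byte that follows it is not a continuation byte.
raw = b"Caf\xe9 list"


def tolerant_header(value, charset="utf-8", header_name="List-Id"):
    """Build a Header, replacing undecodable bytes with U+FFFD (hypothetical helper)."""
    return Header(value, charset, maxlinelen=998,
                  header_name=header_name, errors="replace")


# With the default errors="strict", the constructor raises.
try:
    Header(raw, "utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="replace", the bad byte becomes the replacement character.
decoded = raw.decode("utf-8", errors="replace")
header = tolerant_header(raw)
```

The trade-off is that the offending byte is silently turned into U+FFFD, so the header is slightly mangled, but the runner would no longer shunt the message.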
Changed in mailman:
status: In Progress → Fix Committed
Changed in mailman:
milestone: 2.1.21 → 2.1.21rc1
status: Fix Committed → Fix Released
Actually, the traceback shows that CookHeaders is trying to create the List-Id: header to be added to the message.
It tries to create a header of the form:
List-Id: list description <list.example.com>
And the exception occurs when trying to RFC 2047 encode the list's description in the charset of the list's preferred language. This exception should be occurring on every list post. Is that the case?
Also, what is the list's preferred_language and what is the raw value of the list's description attribute? Obtain this info with something like:
$ bin/withlist list1 language
Loading list list1 (unlocked)
The variable `m' is the list1 MailList instance
>>> m.preferred_
'en'
>>> m.description
'My List one'
>>>
(Of course, the list name and responses will be different in your case.)
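For reference, the List-Id: construction described in this comment can be sketched roughly as follows in Python 3. This is only an illustration of RFC 2047 encoding a description and appending the list ID; the function name `list_id_header` and the sample values are hypothetical, and Mailman's own CookHeaders code differs.

```python
from email.header import Header


def list_id_header(description, list_id, charset="utf-8"):
    """Return an encoded 'description <list-id>' value (hypothetical helper)."""
    # The description is encoded in the given charset, as the list's
    # preferred-language charset would be used in the real handler.
    h = Header(description, charset, maxlinelen=998, header_name="List-Id")
    # The list ID itself is plain ASCII and is appended unencoded.
    h.append("<%s>" % list_id, "us-ascii")
    return h.encode()


value = list_id_header("Caf\u00e9 list", "list.example.com")
print("List-Id:", value)
```

If the description were passed as bytes that are not valid in the chosen charset, the `Header` call is where the UnicodeDecodeError would surface, which matches where the traceback ends.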