Crashes on encoding errors
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
GNU Mailman | New | Medium | Unassigned |
Bug Description
I have seen crashes on some Danish-speaking lists where
users put Danish characters in the headers. It seems to
depend on the encoding used. I do not have much data,
as I cannot reproduce the crash myself and have to
ask some of the "troublemakers" to post a test message every
time I need to see it.
Here is a traceback:
Mar 30 15:17:03 2004 (339) Uncaught runner exception: unknown encoding
Mar 30 15:17:03 2004 (339) Traceback (most recent call last):
File "/home/
in _oneloop
self.
File "/home/
in _onefile
keepqueued = self._dispose(
File "/home/
line 130, in _dispose
more = self._dopipelin
File "/home/
line 153, in _dopipeline
sys.
File "/home/
line 75, in process
prefix_
File "/home/
line 262, in prefix_subject
h.append(s, c)
File "/home/
in append
s = s.encode(outcodec, errors)
LookupError: unknown encoding
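The last frame of the traceback can be illustrated in isolation. The following is a minimal sketch in current Python 3, not the Mailman 2.1.2 code: `codec_is_known`, `try_encode` and the charset name `x-no-such-charset` are made up for illustration. It only shows that `encode()` raises `LookupError` when the codec name is not registered, and that `codecs.lookup()` can be used to check a name beforehand.

```python
import codecs


def codec_is_known(name):
    """Return True if Python has a codec registered under `name`."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


def try_encode(text, charset):
    """Encode `text`; return (bytes, None) on success or (None, message)."""
    try:
        return text.encode(charset, "strict"), None
    except LookupError as exc:
        # Same exception type as in the traceback above.
        return None, str(exc)


# "x-no-such-charset" is a deliberately bogus name (an assumption for the demo).
encoded, error = try_encode("Korrektion", "x-no-such-charset")
```

Checking the name first, or catching `LookupError` at the call site, would at least let the caller decide what to do with the message instead of losing it.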
The sad thing is that these mails just disappear, and my
users wonder why they cannot post to the list. Even a bounce
would be preferable, or at least something in the log that tells
me whom I could send apologies to...
Here are the headers of one offending message, as sent
directly to my mailbox:
From <email address hidden> Tue Mar 30 15:47:23 2004
Return-path: <email address hidden>
Envelope-to: <email address hidden>
Delivery-date: Tue, 30 Mar 2004 15:47:23 +0200
Received: from smtp020.tiscali.dk ([212.54.64.104])
by bagel.index with esmtp (Exim 3.35 #1 (Debian))
id 1B8JaM-0003l8-00
for <email address hidden>; Tue, 30 Mar 2004 15:47:22
+0200
Received: from pikobryg.dk
(213.237.
by smtp020.tiscali.dk (8.12.10/8.12.10) with ESMTP id
i2UDlKsm001688
for <email address hidden>; Tue, 30 Mar 2004 15:47:20
+0200 (MEST)
Message-ID: <email address hidden>
Date: Tue, 30 Mar 2004 15:47:18 +0200
From: "Christian F. Behrens" <email address hidden>
Reply-To: <email address hidden>
Organization: Pikobryg
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1;
en-US; rv:1.6) Gecko/20040113
X-Accept-Language: da, en-us, en
MIME-Version: 1.0
To: <email address hidden>
Subject: Korrektion af
=?windows-
=?windows-
Content-Type: text/plain; charset=
format=flowed
Content-
X-MIME-
smtp020.tiscali.dk id i2UDlKsm001688
Note that the subject line carries encoding "windows-1252",
which comes from Mime-Autoconversion at tiscali.dk.
The sender claims (and I believe him) that on his end the
subject line looks something like this:
Subjekt: Test mail encoded
=?ISO-8859-
I am not asking Mailman to support mysterious Windows code
pages, but if it could handle unknown encodings better (i.e.
without crashing) I would be much happier.
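One way the requested behaviour could look is sketched below. This is an assumption-laden illustration using the Python 3 standard library `email.header.Header`, not a patch against Mailman: `append_with_fallback` and the fallback charset `utf-8` are hypothetical choices, and `x-no-such-charset` again stands in for whatever charset name the codec lookup rejects.

```python
from email.header import Header


def append_with_fallback(header, text, charset, fallback="utf-8"):
    """Append `text` to `header` in `charset`, or in `fallback` if that fails.

    Returns the name of the charset that was actually used.
    """
    try:
        header.append(text, charset)
        return charset
    except (LookupError, UnicodeError):
        # LookupError: the codec name is unknown to Python.
        # UnicodeError: the text cannot be represented in that charset.
        header.append(text, fallback)
        return fallback


subject = Header()
# Hypothetical list prefix in plain ASCII, followed by the problematic part.
subject.append("[Test-list] ", "us-ascii")
used = append_with_fallback(subject, "Korrektion af æøå", "x-no-such-charset")
```

With a fallback like this the message would still go out, with the subject re-encoded, rather than being dropped by an uncaught exception. Whether re-encoding or bouncing is the better policy is a separate question.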
I am also sending a polite letter to tiscali.dk, asking them to
get their act together.
[http://
Sorry, I forgot to mention that I run Debian stable, with a
hand-installed Mailman 2.1.2 from June 2003.
It may also be that the original mail in my example was written in
MS Word and pasted from there into Mozilla. Maybe that has
something to do with the encoding, perhaps more than tiscali.dk's
autoconversion.