Crashes on encoding errors

Bug #266047 reported by Heikkilevanto
Affects: GNU Mailman
Status: New
Importance: Medium
Assigned to: Unassigned

Bug Description

I have seen crashes on some Danish-language lists, where
users put Danish characters in the headers. It seems to
depend on the encoding used. I do not have much data,
as I cannot reproduce the crash myself, but have to
ask some of the "troublemakers" to post a test message every
time I need to see the crash.

Here is a traceback:
Mar 30 15:17:03 2004 (339) Uncaught runner exception: unknown encoding
Mar 30 15:17:03 2004 (339) Traceback (most recent call last):
  File "/home/mailman/Mailman/Queue/Runner.py", line 105, in _oneloop
    self._onefile(msg, msgdata)
  File "/home/mailman/Mailman/Queue/Runner.py", line 155, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File "/home/mailman/Mailman/Queue/IncomingRunner.py", line 130, in _dispose
    more = self._dopipeline(mlist, msg, msgdata, pipeline)
  File "/home/mailman/Mailman/Queue/IncomingRunner.py", line 153, in _dopipeline
    sys.modules[modname].process(mlist, msg, msgdata)
  File "/home/mailman/Mailman/Handlers/CookHeaders.py", line 75, in process
    prefix_subject(mlist, msg, msgdata)
  File "/home/mailman/Mailman/Handlers/CookHeaders.py", line 262, in prefix_subject
    h.append(s, c)
  File "/home/mailman/pythonlib/email/Header.py", line 285, in append
    s = s.encode(outcodec, errors)
LookupError: unknown encoding

The sad thing is that these mails just disappear, and my
users wonder why they cannot post to the list. Even a bounce
would be preferable, or at least a log entry that tells me
whom I should send apologies to...
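
To illustrate the kind of logging I mean, here is a minimal sketch in
present-day Python. It is not Mailman's actual runner code: the names
run_handler and log are made up for this example, and the handler is
any callable taking a message. The idea is only that a failing handler
leaves a log line naming the sender instead of the mail vanishing.

```python
import logging
from email.message import Message

# Hypothetical logger name, chosen for this sketch only.
log = logging.getLogger("pipeline-sketch")


def run_handler(handler, msg: Message) -> bool:
    """Run one handler on msg; return True on success, False on failure.

    On any exception, record the sender and Message-ID together with the
    traceback, so the list owner knows whose mail was affected.
    """
    try:
        handler(msg)
        return True
    except Exception:
        log.exception(
            "handler failed for message from %s (Message-ID: %s)",
            msg.get("From", "<unknown sender>"),
            msg.get("Message-ID", "<no message id>"),
        )
        return False
```

A caller could then decide to bounce or hold the message whenever
run_handler returns False.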

Here are the headers of one offending message, as sent
directly to my mailbox:

From <email address hidden> Tue Mar 30 15:47:23 2004
Return-path: <email address hidden>
Envelope-to: <email address hidden>
Delivery-date: Tue, 30 Mar 2004 15:47:23 +0200
Received: from smtp020.tiscali.dk ([212.54.64.104])
        by bagel.index with esmtp (Exim 3.35 #1 (Debian))
        id 1B8JaM-0003l8-00
        for <email address hidden>; Tue, 30 Mar 2004 15:47:22 +0200
Received: from pikobryg.dk (213.237.80.191.adsl.vbr.worldonline.dk [213.237.80.191])
        by smtp020.tiscali.dk (8.12.10/8.12.10) with ESMTP id i2UDlKsm001688
        for <email address hidden>; Tue, 30 Mar 2004 15:47:20 +0200 (MEST)
Message-ID: <email address hidden>
Date: Tue, 30 Mar 2004 15:47:18 +0200
From: "Christian F. Behrens" <email address hidden>
Reply-To: <email address hidden>
Organization: Pikobryg
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113
X-Accept-Language: da, en-us, en
MIME-Version: 1.0
To: <email address hidden>
Subject: Korrektion af =?windows-1252?Q?sukkerm=E6ngde_til_den_sids?=
 =?windows-1252?Q?te_nadver?=
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: quoted-printable
X-MIME-Autoconverted: from 8bit to quoted-printable by smtp020.tiscali.dk id i2UDlKsm001688

Note that the subject line is encoded as "windows-1252",
which comes from the MIME autoconversion at tiscali.dk.
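
For anyone who wants to inspect such a subject line, here is a small
sketch using the standard email.header module of a current Python (not
the 2004-era email package from the traceback). The helper name
decode_subject is made up for this example. On an interpreter that
knows the windows-1252 codec, the two encoded words decode and join
into one readable string.

```python
from email.header import decode_header, make_header


def decode_subject(raw: str) -> str:
    """Decode RFC 2047 encoded words in a raw header value to text."""
    # decode_header splits the value into (bytes, charset) pairs;
    # make_header reassembles them into a Header that str() can render.
    return str(make_header(decode_header(raw)))


# The Subject header from the report, with the folded line joined.
raw_subject = (
    "Korrektion af =?windows-1252?Q?sukkerm=E6ngde_til_den_sids?= "
    "=?windows-1252?Q?te_nadver?="
)
print(decode_subject(raw_subject))
```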

The sender claims (and I believe him) that on his end the
subject line looks something like this:
Subjekt: Test mail encoded
=?ISO-8859-1?Q?=5B=E6=F8=E5_=C6=C5=D8=5D?=

I am not asking Mailman to support mysterious Windows code
pages, but if it could handle unknown encodings better (i.e.
without crashing) I would be much happier.
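
As a rough illustration of "handling it better", here is a sketch in
present-day Python. It is not a patch against CookHeaders.py: the
helper names resolve_charset and append_safely and the utf-8 fallback
are my own assumptions. It checks whether the interpreter knows the
charset before handing it to Header.append, and falls back to a known
charset instead of letting the LookupError escape.

```python
import codecs
from email.header import Header


def resolve_charset(name: str, fallback: str = "utf-8") -> str:
    """Return name if Python has a codec for it, otherwise fallback."""
    try:
        codecs.lookup(name)
    except LookupError:
        return fallback
    return name


def append_safely(header: Header, text: str, charset: str) -> None:
    """Append text to header, substituting a known charset if needed."""
    header.append(text, resolve_charset(charset))
```

Whether to fall back silently, log a warning, or bounce the message is
a policy decision; the sketch only shows that the lookup failure can be
detected before it turns into an uncaught exception.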

I am also sending a polite letter to tiscali.dk, asking them to
get their act together.

[http://sourceforge.net/tracker/index.php?func=detail&aid=926034&group_id=103&atid=100103]

Tags: pipermail
Heikkilevanto (heikkilevanto) wrote:

Sorry, I forgot to mention that I run Debian stable with a
hand-installed Mailman 2.1.2 from June 2003.

It may also be that the original mail in my example was written in
MS Word and pasted from there into Mozilla. That may have more
to do with the encoding than tiscali.dk's autoconversion.
