Mail-in breaks for messages in unsupported character encodings

Bug #659329 reported by Chris Rossi
This bug affects 1 person
Affects: KARL3
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned
Milestone: (not shown)

Bug Description

An incoming message which is encoded using an unsupported character encoding winds up in the quarantine with this error:

Traceback (most recent call last):
  File "/opt/karl/osi/3.27-0/src/karl/karl/utilities/mailin.py", line 225, in handle_message
    text, attachments = self.dispatcher.crackPayload(message)
  File "/opt/karl/osi/3.27-0/src/karl/karl/adapters/mailin.py", line 244, in crackPayload
    data = data.decode(charset)
LookupError: unknown encoding: windows-874

(windows-874 is a Windows Thai character encoding.)
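
The failure mode in the traceback can be illustrated with a minimal sketch. It is written for Python 3 as an assumption (the deployment in the traceback may well have been Python 2), and the helper names `is_known_charset` and `decode_payload` are hypothetical, not part of KARL. It uses a deliberately made-up charset name, since which aliases a given Python version registers can vary.

```python
import codecs


def is_known_charset(charset):
    """Return True if Python has a codec registered under this name.

    Hypothetical helper, not part of KARL.
    """
    try:
        codecs.lookup(charset)
    except LookupError:
        return False
    return True


def decode_payload(data, charset):
    # Same kind of call as the failing line in the traceback:
    # bytes.decode raises LookupError when the codec name is unknown.
    return data.decode(charset)


if __name__ == "__main__":
    print(is_known_charset("utf-8"))
    print(is_known_charset("x-no-such-charset"))
```

Checking the charset name before decoding would let the mail-in code choose a policy (bounce, fall back, or quarantine) instead of failing with an unhandled exception.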

Chris Rossi (chris-archimedeanco) wrote :

We have a couple of options for fixing this, so we need some sort of policy decision:

1) We could bounce the message, informing the user that they should use Unicode.

2) We could coerce to ASCII, translating all 8-bit characters to '?' or some other fill character.

Opinions?
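
As a rough illustration of option 2, here is a minimal sketch in Python 3 (an assumption; the code in the traceback may be Python 2). The function name `decode_with_fallback` and the `fill` parameter are hypothetical and do not come from the KARL code base.

```python
def decode_with_fallback(data, charset, fill="?"):
    """Decode bytes with the declared charset.

    If the charset name is unknown to Python, coerce to ASCII instead:
    keep 7-bit bytes as-is and replace every 8-bit byte with `fill`.
    Hypothetical helper, shown only to illustrate option 2.
    """
    try:
        return data.decode(charset)
    except LookupError:
        # Unknown codec name: fall back to byte-by-byte ASCII coercion.
        return "".join(chr(b) if b < 128 else fill for b in data)


if __name__ == "__main__":
    print(decode_with_fallback(b"Regards, \xa1\xa2\xa3", "x-no-such-charset"))
```

The replacement works per byte, so a character stored as several bytes in a multi-byte encoding would turn into several fill characters. Option 1 would instead catch the same `LookupError` and send a bounce message.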

Paul Everitt (paul-agendaless) wrote : Re: [Bug 659329] Re: Mail-in breaks for messages in unsupported character encodings

Hmm, Launchpad is giving an error when I try to subscribe others to this bug, so we'll discuss things via email.

As background for Nat/Evan, an email came in that we couldn't handle. It was in a foreign language, using an "encoding" for characters in that language that isn't a standard. It happens once a quarter and seems to always revolve around Thai. :)

Chris gives two alternatives. I'll give a third: due to budget, we mark this bug as wontfix and count on it not happening again. If it does, we re-think it.

--Paul


Nat Katin-Borland (nborland) wrote :

I'm OK with not fixing this. Just out of curiosity, what happened to
the original email? Did we post it manually?

Thanks,
Nat


Chris Rossi (chris-archimedeanco) wrote :

On Wed, Oct 13, 2010 at 1:23 PM, Nat Katin-Borland <email address hidden> wrote:

> I'm OK with not fixing this. Just out of curiosity, what happened to
> the original email? Did we post it manually?

No, but we could. It is currently sitting in the quarantine. You can see
the email here:

https://karl.soros.org/po_quarantine/0

If you wanted to try to post manually, it looks like it was destined for the
'bp-internal-docs' community. FWIW, the Thai characters are in the sender's
signature; the body of the email is ASCII, so something like option 2) would
work well here and is easy to implement.

Chris

Paul Everitt (paul-agendaless) wrote :

Per Nat, we'll put this on the barge and bring it back if it happens again.

Changed in karl3:
milestone: none → m999
status: New → Won't Fix