Mailman changes attachment filenames

Bug #266402 reported by Madrobin
Affects: GNU Mailman
Status: New
Importance: Medium
Assigned to: Unassigned
Milestone: none listed

Bug Description

Hello!

I'm using Mailman version 2.1.9.cp2, hosted by site5.com.

When sending attachments with long filenames to a list, the file name
arrives with one character substituted for another. The attachments with
the changed filename cannot be opened in the FirstClass email client.

The attachment names were:

2007 September Using FirstClass To Book School Resources.pdf

and

2007 September Teaching Lab Expectations and Routines.pdf

In both cases, when the attachment filenames were viewed in FirstClass
after having been delivered by Mailman, the last space in the filename had
been converted to a character which looks like a square. I am attaching a
screenshot.

If I try to open the attachment in FirstClass, I receive an error dialog
box:

File Transfer Failed because

File not open error

[1210:4104]

I am not able to open the attachment.

If I send the same attachments with the same filenames "directly" (not
through Mailman) and open them in FirstClass, there is no problem. The
filenames do not change and the attachments can be opened without
difficulty.

If I shorten the attachment filenames (for example, I removed the "2007
September" part of the above filenames) and send them through Mailman,
there is also no problem; the filenames do not change and the attachments
can be opened.

I also tried viewing the long-filename attachments sent via Mailman in a
different email client (Pegasus Mail). The attachment filenames had again
been changed: instead of a square, the character substituted for the last
space looked more like a thick vertical bar. However, even with the
filename change, Pegasus Mail was still able to open the attachments.
(Unfortunately, my employer allows us to use only FirstClass.)

I was not able to find anything addressing this problem. However, if I
missed it and you can point me in the direction of a solution, I would
greatly appreciate it.

Thank you.

[http://sourceforge.net/tracker/index.php?func=detail&aid=1792087&group_id=103&atid=100103]

Mark Sapiro (msapiro) wrote :

Originator: NO

This appears to be a problem with header folding and unfolding, possibly
resulting in the <space> being replaced by a <tab>, which the mail client
shows as an 'invalid character' (the square symbol).

Can you save the attachment and then open the saved file outside of
FirstClass (perhaps after changing its name or 'saving as' and providing a
'good' name)?

I can't tell what is at fault here without seeing the raw headers of the
attachment part from the mail as sent and the mail as received from
Mailman.

It may be a Mailman issue, a cPanel issue or a FirstClass issue.

Also, please see
<http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq06.011.htp>.

Madrobin (madrobin) wrote :

Originator: YES

Hello!

Thanks very much for your reply. I have tried your suggestions about
renaming the attachments in FirstClass, but it does not seem to be
possible. If I try to "save" rather than "open" the attachments, I get an
immediate "file input/output error" and that's that.

I would be able to send you a copy of the "internet headers" from the
message as it was received in FirstClass, as well as the headers of the
copy of the message which I saved in Pegasus Mail when the message was
being sent. Is this what you need? I'm not sure that this is something I
would want to post in a public forum. Can I email them to you?

Thanks again for your help.

Mark Sapiro (msapiro) wrote :

Originator: NO

I have received the 'before and after' messages, and the problem is
caused by header folding and unfolding. There have been discussions of
this on the mailman-*@python.org email lists. One example is at
<http://mail.python.org/pipermail/mailman-users/2007-June/057499.html>
which happens to address Subject: headers, but the issue is the same.

In this particular case, the original message attachment contains the
header

Content-disposition: attachment; filename="Test document saved with a very
long file name to see if it can be determined what is messing up.pdf"

all on one line, even though it is probably wrapped here. In Mailman's
processing of this message, the underlying Python email library methods
determine that this header exceeds the 78-character recommended maximum
length and thus fold it into

Content-disposition: attachment;
<tab>filename="Test document saved with a very long file name
<tab>to see if it can be determined what is messing up.pdf"

where <tab> represents an ASCII tab character. The first fold (between
attachment; and filename=) occurs at a 'higher level syntactic break' as
recommended by the RFCs, but the filename itself is still too long, so it
is folded again onto a third line.
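
The folding described above can be sketched in a few lines of Python. This
is an illustration only, not the email library's actual code: the function
name fold_replacing_space is hypothetical, it folds greedily at the last
space that fits instead of preferring the break after 'attachment;', and it
ignores the leading tab when measuring continuation lines. The point is
only that each chosen space is replaced, not kept.

```python
CRLF = "\r\n"


def fold_replacing_space(header: str, max_len: int = 78) -> str:
    """Greedy fold: each chosen space is REPLACED by CRLF + tab."""
    lines = []
    rest = header
    while len(rest) > max_len:
        # Last space that keeps the current line within max_len.
        cut = rest.rfind(" ", 1, max_len + 1)
        if cut == -1:
            break  # no space to fold at; leave the remainder long
        lines.append(rest[:cut])
        rest = rest[cut + 1:]  # the space itself is discarded
    lines.append(rest)
    return (CRLF + "\t").join(lines)


header = (
    'Content-disposition: attachment; filename="Test document saved with a '
    'very long file name to see if it can be determined what is messing up.pdf"'
)
folded = fold_replacing_space(header)
print(folded.replace("\t", "<tab>"))
```

Because the space is gone from the folded form, the original header can
only be recovered by a reader that turns <CR><LF><tab> back into a space.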

The basic issue revolves around the rules for folding and unfolding long
header lines. The original standard was RFC 822
<http://www.faqs.org/rfcs/rfc822.html>, sec 3.1.1. The current
recommendation is RFC 2822 <http://www.faqs.org/rfcs/rfc2822.html>, sec
2.2.3.

Mailman, via the Python email library, is not doing the right thing
according to these standards. Mailman is replacing a <space> with
<CR><LF><tab> in order to fold the header. This works unambiguously for the
first fold at a syntactic break, but not for the second fold, which falls
inside the file name.

FirstClass is doing the right thing in unfolding by removing only the
<CR><LF>, but this leaves the <tab> in the middle of the file name which
causes a problem for FirstClass.
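
A minimal sketch of this kind of unfolding, assuming the folded header shown
earlier. The function name unfold_strict is hypothetical and this is not
FirstClass's code; it only removes a <CR><LF> that is followed by a space or
tab, as RFC 2822 describes.

```python
import re

CRLF = "\r\n"

# The header as folded by Mailman in this report.
folded = (
    "Content-disposition: attachment;" + CRLF
    + '\tfilename="Test document saved with a very long file name' + CRLF
    + '\tto see if it can be determined what is messing up.pdf"'
)


def unfold_strict(value: str) -> str:
    """Remove only a CRLF that is immediately followed by a space or tab."""
    return re.sub(r"\r\n(?=[ \t])", "", value)


unfolded = unfold_strict(folded)
filename = unfolded.split('filename="', 1)[1].rstrip('"')
print(repr(filename))  # the tab survives between 'name' and 'to'
```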

Because of the ambiguities between RFC 822 and RFC 2822, it is more common
for mail clients to remove a whitespace character when unfolding. In this
case, that would have the effect of removing the space between 'file name'
and 'to see'. This also is an incorrect result, but not as bad.
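
For comparison, here is a sketch of the more lenient unfolding, again with a
hypothetical function name and the same folded header as input. It drops the
<CR><LF> together with the one whitespace character that follows it.

```python
import re

CRLF = "\r\n"

folded = (
    "Content-disposition: attachment;" + CRLF
    + '\tfilename="Test document saved with a very long file name' + CRLF
    + '\tto see if it can be determined what is messing up.pdf"'
)


def unfold_dropping_whitespace(value: str) -> str:
    """Remove each CRLF and the single space or tab that follows it."""
    return re.sub(r"\r\n[ \t]", "", value)


lenient = unfold_dropping_whitespace(folded)
print(lenient)  # 'name' and 'to' are now joined with nothing between them
```

No tab is left in the file name, but the words on either side of the fold
run together.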

The correct (RFC 2822) thing for the Python email library to do would be
to fold by just inserting <CR><LF> immediately after 'attachment;' and also
immediately before ' to see'. This would not be a complete solution to
these issues as long as there were common mail clients that removed
whitespace when unfolding, but it might minimize the damage. Note that this
is really a Python email library issue, not a Mailman issue.
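
A sketch of folding that keeps the existing whitespace, to show why it
round-trips cleanly. Both function names are hypothetical, and the greedy
choice of fold positions is a simplification: it will not necessarily pick
the two positions named above. It is not a patch for the email library.

```python
import re

CRLF = "\r\n"


def fold_keeping_whitespace(header: str, max_len: int = 78) -> str:
    """Insert CRLF before an existing space; the space itself is kept."""
    lines = []
    rest = header
    while len(rest) > max_len:
        # Start at 1 so a leading space on a continuation line is skipped.
        cut = rest.rfind(" ", 1, max_len + 1)
        if cut == -1:
            break  # nothing to fold at
        lines.append(rest[:cut])
        rest = rest[cut:]  # continuation line begins with the original space
    lines.append(rest)
    return CRLF.join(lines)


def unfold_strict(value: str) -> str:
    """Remove only a CRLF that is immediately followed by a space or tab."""
    return re.sub(r"\r\n(?=[ \t])", "", value)


header = (
    'Content-disposition: attachment; filename="Test document saved with a '
    'very long file name to see if it can be determined what is messing up.pdf"'
)
refolded = fold_keeping_whitespace(header)
print(unfold_strict(refolded) == header)
```

A client that also strips the whitespace after the <CR><LF> would still join
the two words, which is the remaining damage mentioned above.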

In the meantime, the problem can possibly be avoided by avoiding spaces
in file names. Use underscores instead.
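
If the files are renamed by a script before sending, the workaround could
look like the sketch below. The helper name safe_attachment_name is
hypothetical; it only replaces runs of whitespace with a single underscore
so that the file name contains no space at which a fold could occur.

```python
def safe_attachment_name(name: str) -> str:
    """Replace each run of whitespace in a file name with one underscore."""
    return "_".join(name.split())


renamed = safe_attachment_name(
    "2007 September Teaching Lab Expectations and Routines.pdf"
)
print(renamed)
```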
