Newlines are removed from the end of bug attachments submitted by email

Bug #898227 reported by Leo Iannacone
This bug affects 1 person
Affects           Status        Importance  Assigned to  Milestone
Launchpad itself  Fix Released  High        Unassigned

Bug Description

Send an attachment by email in reply to some bug in Launchpad.

The attachment should end with a '\n' (newline), for example:

=== START ===
Hello world!

=== EOF ===

When you view it in LP, the final '\n' has been replaced by '' and the file becomes:
=== START ===
Hello world!
=== EOF ===

In "diff words", let "<" be the LP_gmail version and ">" be the original one:
Diff:
< +
\ No newline at end of file
---
> +
>

This problem appears only when I use Gmail to send my files; it does not occur when I upload through LP directly.

Anyway, I consider it really annoying, especially when the uploads are patches: they will be corrupted and you won't be able to apply them.

I'll prove it in the next messages.

Leo Iannacone (l3on) wrote :

File uploaded by LP web interface.

Leo Iannacone (l3on) wrote :
  • test.txt (13 bytes, text/plain; charset=US-ASCII; name="test.txt")

File sent by GMail.

summary: - Email attachments are corrupted when last line is '\n'. Replaced by
- blank space.
+ Email attachments are corrupted when last line is '\n'.
description: updated
description: updated
Leo Iannacone (l3on)
description: updated
description: updated
Graham Binns (gmb)
Changed in launchpad:
status: New → Triaged
importance: Undecided → Low
summary: - Email attachments are corrupted when last line is '\n'.
+ Newlines are removed from the end of bug attachments submitted by email
Gavin Panella (allenap) wrote :

Leo, can you save the original message as sent by Gmail and attach it here (i.e. by going to "Show original" in the drop-down menu near the top-right of a message in Gmail)? I want to see whether it's Gmail or Launchpad corrupting the message. Thanks.

Changed in launchpad:
status: Triaged → Incomplete
Leo Iannacone (l3on) wrote :

Of course!

I did the same before reporting this bug...

$ echo "SGVsbG8gd29ybGQhCgo=" | base64 -d
Hello world!

$
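The same check can be done in Python. This is only a sketch using the base64 string from the command above; the variable names are arbitrary.

```python
import base64

# Base64 text of the attachment, as shown in the shell command above.
encoded = "SGVsbG8gd29ybGQhCgo="

decoded = base64.b64decode(encoded)

# The decoded payload ends with two newline characters.
print(repr(decoded))   # b'Hello world!\n\n'
print(len(decoded))    # 14
```

So the message as sent still carries both trailing newlines.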

Gavin Panella (allenap) wrote :

That looks like Launchpad is the culprit then. Launchpad should not be corrupting attachments in any way, so I'm marking this as Critical until more is known.

Changed in launchpad:
status: Incomplete → Triaged
importance: Low → Critical
Francis J. Lacoste (flacoste) wrote :

Might also be a mail-transport in between.

Is the file preserved if you cc someone else (preferably outside of Gmail)?

Martin Pool (mbp) wrote :

Sent from mutt with no trailing eol.

--
Martin <http://launchpad.net/~mbp>

j.c.sackett (jcsackett)
Changed in launchpad:
assignee: nobody → j.c.sackett (jcsackett)
j.c.sackett (jcsackett)
Changed in launchpad:
status: Triaged → In Progress
j.c.sackett (jcsackett) wrote :
j.c.sackett (jcsackett)
Changed in launchpad:
status: In Progress → Triaged
Curtis Hovey (sinzui) wrote :

Gmail inserts a CRLF after every 76 characters in the base64 encoding of the attachment. They do this to ensure ALL mail readers can read the attachment, but it is tampering with the attachment nonetheless. I don't think this is a critical issue, but it would be nice to make Lp handle this.
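As a hedged aside: MIME (RFC 2045) limits base64-encoded lines to 76 characters, and line breaks inside the encoded text are ignored when decoding. A minimal sketch, with a made-up payload, showing that CRLF wrapping does not alter the decoded bytes:

```python
import base64

# Made-up payload, long enough to need several 76-character lines.
payload = b"Hello world!\n" * 20

# Encode on a single line, then wrap at 76 characters with CRLF,
# which is the layout described in the comment above.
one_line = base64.b64encode(payload).decode("ascii")
wrapped = "\r\n".join(
    one_line[i:i + 76] for i in range(0, len(one_line), 76)
)

# b64decode discards characters outside the base64 alphabet by default,
# so the CRLF separators do not affect the result.
assert base64.b64decode(wrapped) == payload
assert base64.b64decode(wrapped) == base64.b64decode(one_line)
```

If this holds for the message in question, the line wrapping alone would not explain a missing trailing newline.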

Changed in launchpad:
assignee: j.c.sackett (jcsackett) → nobody
importance: Critical → Low
Curtis Hovey (sinzui) wrote :

I think I have found the cause of this issue. Using the example from comment 2, I can see the entire message was stored in the librarian properly. The message was then passed to the Bugs incoming mail handler, which created the comment and attachment. The rules to create the comment create a Message object which also represents the message parts (encoded attachments) as chunks. If the chunk's mime-type is text, the chunk is decoded and stripped so that it can be placed in the librarian in an easy form to view. This is the fragment in

            if (mime_type == 'text/plain' and no_attachment
                and part.get_filename() is None):

                # Get the charset for the message part. If one isn't
                # specified, default to latin-1 to prevent
                # UnicodeDecodeErrors.
                charset = part.get_content_charset()
                if charset is None or str(charset).lower() == 'x-unknown':
                    charset = 'latin-1'
                content = self.decode(content, charset)

                if content.strip():
                    MessageChunk(
                        message=message, sequence=sequence,
                        content=content)
                    sequence += 1

I understand the desire to call strip() for the sake of the person looking at the file in a browser, but as this is a file attachment, the call to strip() is uncalled for.
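To illustrate what strip() can and cannot do to such content, here is a small sketch with made-up variable names (it is not Launchpad code). Calling strip() only as an emptiness test leaves the string untouched, whereas storing the stripped result loses the trailing newlines.

```python
content = "Hello world!\n\n"

# Using strip() only as an emptiness test does not modify `content`,
# because Python strings are immutable.
if content.strip():
    kept = content

# Storing the return value of strip() is what would drop the newlines.
stripped = content.strip()

print(repr(kept))      # 'Hello world!\n\n'
print(repr(stripped))  # 'Hello world!'
```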

Curtis Hovey (sinzui) wrote :

This is the real message from comment #2 and it is untainted: https://pastebin.canonical.com/76759/

Curtis Hovey (sinzui)
tags: added: email python-upgrade
Changed in launchpad:
importance: Low → High
Curtis Hovey (sinzui) wrote :

This issue is really a bug in Python 2.6 (http://bugs.python.org/issue7143). It will be fixed automatically when Launchpad's production servers update to Python 2.7.
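For anyone wanting to check their own interpreter, here is a minimal sketch, assuming a current Python 3 and reusing the base64 text quoted earlier in this report. The message headers are hypothetical, modelled on the attachment metadata shown above. It only checks that decoding the part through the standard email package keeps both trailing newlines.

```python
from email import message_from_string

# Hypothetical single-part message; only the base64 body comes from
# this bug report. There is no line break after the base64 text.
raw = (
    "MIME-Version: 1.0\r\n"
    'Content-Type: text/plain; charset=US-ASCII; name="test.txt"\r\n'
    'Content-Disposition: attachment; filename="test.txt"\r\n'
    "Content-Transfer-Encoding: base64\r\n"
    "\r\n"
    "SGVsbG8gd29ybGQhCgo="
)

msg = message_from_string(raw)

# decode=True undoes the Content-Transfer-Encoding and returns bytes.
decoded = msg.get_payload(decode=True)

print(msg.get_filename())
print(repr(decoded))
assert decoded.endswith(b"\n\n"), "trailing newline was lost"
```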

William Grant (wgrant)
Changed in launchpad:
status: Triaged → Fix Released