Comment 5 for bug 369150

In , Knightr (knightr) wrote :

These are quotes from the emails we exchanged (used with permission):

Keith Moore says:

<<_Any_ MUA that tries to "convert" something encoded in RFC 2047 to a form
without encoding is broken. There's no way to convert every 2047-encoded
message to an unencoded message with valid syntax. 2047 is only intended to be
used for _display_ purposes, not to convert messages into a form which can be
parsed by a normal mail parser.

Adding quotes around the name before encoding doesn't solve the problem, and
actually causes other problems. >>
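To illustrate the point for anyone following this bug, here is a minimal sketch using Python's standard email package (my choice for the demo, not something Keith mentioned). The sample header and address are made up. It shows how a decoded display name containing a comma no longer parses as a single address:

```python
from email.header import decode_header, make_header
from email.utils import getaddresses

# Hypothetical From: value; the display name "Doe, John" has its comma
# hidden inside the RFC 2047 encoded-word.
raw_from = "=?ISO-8859-1?Q?Doe=2C_John?= <john@example.com>"

# Decoding is fine for showing the header to a person...
decoded_from = str(make_header(decode_header(raw_from)))

# ...but feeding the decoded text back into an address parser breaks:
# the bare comma is now read as a separator between two mailboxes.
parsed_raw = getaddresses([raw_from])
parsed_decoded = getaddresses([decoded_from])

print(len(parsed_raw), parsed_raw)
print(len(parsed_decoded), parsed_decoded)
```

The undecoded header yields one mailbox, while the decoded text yields two entries, which is the kind of breakage described above.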

In the next part, the questions are mine and the answers are his:

<<
>> So in the case of the "From:" line, what should MUA authors do?
>>
>> - Convert RFC 2047 encoded messages to an unencoded message, quoting it as
>> necessary to make it look like RFC (2)822 compliant (except for the presence of
>> non-ASCII characters).

no. (or maybe, this is a last-resort option for patching legacy code where
it's infeasible to do it right)

>>
>> [otherwise it causes problem with the MUA email address parsers]
>>
>> - Convert RFC 2047 encoded messages to an unencoded message without bothering
>> about syntax for display but quoting it, as necessary, when using it to reply.
>> That quoting would actually be temporary as it would have to be removed before
>> encoding it using RFC 2047 (once again, this is to please these MUA address
>> parsers).

no.

>>
>> - Use the RFC 2047 encoded message for display but remove the RFC 2047 encoded
>> part when replying (keeping only what's between the "<" ">").

definitely not.
>>

This part is interesting:

<<
here's what should be done in MUAs:

always maintain the message in original format. the 2047-decoding routines
should be used only by the code that displays messages.

replies should be based on the unencoded message headers. that way, the
exact same 2047 encoding is used on a reply as was used on the message
being replied to.

the 2047-encoding routines should only be called from the code that handles
message composition.

message stores that do searching (as in, those that support IMAP) probably
need to do it differently. IMHO, they need to store messages in original
format but build indices based on text extracted from decoding headers
(and for that matter, body parts).
>>
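A rough sketch of how I read that advice, again in Python with its standard email package. The function names (display_text, reply_recipient, index_terms) and the sample header are hypothetical, just for illustration; this is not code from any actual MUA:

```python
from email.header import decode_header, make_header

def display_text(raw_header):
    """Decode RFC 2047 encoded-words. Meant for the display code only."""
    return str(make_header(decode_header(raw_header)))

def reply_recipient(original_from):
    """Build the reply's To: from the undecoded header, verbatim,
    so the exact same 2047 encoding goes back out."""
    return original_from

def index_terms(raw_header):
    """Search index built from decoded text; the stored message
    itself stays in its original format."""
    return display_text(raw_header).lower().split()

# Made-up stored header, kept exactly as it arrived.
stored_from = "=?ISO-8859-1?Q?Doe=2C_John?= <john@example.com>"

shown = display_text(stored_from)        # what the user sees
reply_to = reply_recipient(stored_from)  # what goes into the reply
terms = index_terms(stored_from)         # what the search index holds

print(shown)
print(reply_to)
print(terms)
```

The stored header is never rewritten: decoding happens only on the way to the screen or the index, and the reply reuses the original bytes.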

His permission to be quoted:

<<
>> Do I have your permission to quote your reply when contacting the companies in
>> question?

yes.
>>

There was a small typo in what he said above, so I asked for clarification:

<<
>> Your reply (or at least my interpretation of it) suggests that you might
>> mean "encoded message headers" (that would be the only way to be
>> *entirely* sure that we use the exact same 2047 encoding.

yes, I meant to say "undecoded message headers".
>>

He was a very nice person to deal with:

<<
>> Sorry for taking so much of your time...

it's really no problem. I'm very familiar with this topic and I don't
have to think much about it ... and I type reasonably fast :)

Keith
>>

There seems to be a small problem with how that RFC is being interpreted. If what
I copied above doesn't provide you with the answers you seek, I can forward your
question to him...

Let me know...