v1 headers are encoded as UTF-8

Bug #1108994 reported by Zane Bitter
This bug affects 3 people
Affects: Glance
Status: Won't Fix
Importance: Medium
Assigned to: Unassigned

Bug Description

In the v1.1 API, metadata headers are echoed with the values encoded as UTF-8 (since bug 1042078).

However, UTF-8 is not a valid encoding for HTTP header values. (Indeed, some ASCII characters are also forbidden.) The output should be encoded using the MIME header encoding rules from RFC 2047 (http://www.ietf.org/rfc/rfc2047.txt).

For reference the format of the header field contents is defined in section 4.2 of RFC 2616:

       field-content = <the OCTETs making up the field-value
                        and consisting of either *TEXT or combinations
                        of token, separators, and quoted-string>

http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2

...which must be further interpreted using section 2.2:

  The TEXT rule is only used for descriptive field contents and values
  that are not intended to be interpreted by the message parser. Words
  of *TEXT MAY contain characters from character sets other than
  ISO-8859-1 only when encoded according to the rules of RFC 2047.

       TEXT = <any OCTET except CTLs,
                        but including LWS>

http://www.w3.org/Protocols/rfc2616/rfc2616-sec2.html#sec2.2
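
To make the RFC 2047 rule quoted above concrete, here is a minimal sketch using Python's standard `email.header` module. The function names `encode_header_value` and `decode_header_value` are hypothetical and are not part of Glance; the sketch only shows what a MIME-encoded header value could look like on the wire, assuming the value is free-form text and the charset is UTF-8.

```python
from email.header import Header, decode_header, make_header


def encode_header_value(value: str) -> str:
    """Return an ASCII-only header value (hypothetical helper).

    Plain ASCII passes through unchanged; anything else is wrapped
    in an RFC 2047 encoded-word with charset UTF-8.
    """
    try:
        value.encode("ascii")
        return value
    except UnicodeEncodeError:
        # maxlinelen=0 disables line folding, so no newline is
        # inserted into the header value.
        return Header(value, charset="utf-8").encode(maxlinelen=0)


def decode_header_value(raw: str) -> str:
    """Decode any RFC 2047 encoded-words in a received header value."""
    return str(make_header(decode_header(raw)))


if __name__ == "__main__":
    encoded = encode_header_value("café")
    print(encoded)                       # an ASCII-only encoded-word
    print(decode_header_value(encoded))  # the original text
```

A receiver that does not know about RFC 2047 would see the encoded-word literally, which is the compatibility concern with clients that already expect raw UTF-8.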

The Location: header is a special case, since its value is a URI, which needs to be URL-encoded (this brings it into the acceptable subset of ASCII, and therefore no further encoding is required).
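
As an illustration of the Location: case, the following sketch percent-encodes the UTF-8 bytes of a URL with `urllib.parse.quote`, leaving URI delimiters and existing escapes alone. The helper name `encode_location` and the example URL are assumptions for illustration only; a non-ASCII host name would additionally need IDNA handling, which is not covered here.

```python
from urllib.parse import quote, unquote

# Characters left untouched: URI delimiters plus "%" so that
# already-escaped sequences are not escaped a second time.
_URI_SAFE = ":/?#[]@!$&'()*+,;=%"


def encode_location(url: str) -> str:
    """Percent-encode non-ASCII characters in a URL (hypothetical helper)."""
    return quote(url, safe=_URI_SAFE)


if __name__ == "__main__":
    location = encode_location("http://example.com/v1/images/café")
    print(location)           # http://example.com/v1/images/caf%C3%A9
    print(unquote(location))  # http://example.com/v1/images/café
```

The result is plain ASCII, so it needs no RFC 2047 wrapping.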

(A similar issue in python-glanceclient is bug 1108969. The encoding of these headers is treated differently on reception, as described in bug 1108979.)

Mark Washenberger (markwash) wrote :

You certainly have the right of it.

However, I'm not sure we can just start publishing headers in a different way. Prior to the UTF-8 fix (which was apparently incorrect) we could have made this change, but now old clients expecting UTF-8 won't know what to do with MIME encoded headers.

Unless there is some major problem caused by having the wrong encoding type, we might need to just leave this unfixed in Glance v1 and be sure to document it well.

Zane Bitter (zaneb) wrote :

I agree, this seems to be one of those standards nobody follows; frameworks don't seem to have built-in support for decoding the headers. I think making this clearly documented may well be the best fix for the bug(s); at the time I raised these issues I could not find any documentation other than the code.

After discovering that the v1 API was based on Swift, I thought I read somewhere that S3 uses URL-encoded (i.e. percent-encoded) UTF-8 for the equivalent headers, but I can no longer find the source :(

Changed in glance:
importance: Undecided → Medium
status: New → Triaged
tags: added: docs
Lawrance (jing)
Changed in glance:
assignee: nobody → Lawrance (jing)
Lawrance (jing)
Changed in glance:
assignee: Lawrance (jing) → nobody
Erno Kuvaja (jokke) wrote :

If we are not going to fix this, can we close the bug?

tags: added: propose-close
Changed in glance:
status: Triaged → Won't Fix