Comment 5 for bug 1102177

William Grant (wgrant) wrote : Re: [Bug 1102177] Re: URI() does not encode properly with percent-encoding

On 21/01/13 20:07, Xan wrote:
> What do you mean?
>
> Are lazr.URI('http://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B')
> and lazr_uri.URI('http://en.wikipedia.org/wiki/Operators_in_C_and_C++')
> equivalent?

"+" is a reserved character. From RFC 3986 section 2.2 "Reserved Characters":

"""
   Percent-encoding a reserved character, or decoding a percent-encoded
   octet that corresponds to a reserved character, will change how the
   URI is interpreted by most applications.
"""

From RFC 2616 section 3.2.3 "URI Comparison" (note that RFC 3986
supersedes the referenced RFC 2396):

"""
   Characters other than those in the "reserved" and "unsafe" sets (see
   RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.
"""

So, no, those two are not equivalent. "+" is a reserved character, so it
cannot be translated to or from "%2B" without changing the URL's meaning.
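
To illustrate, here is a rough sketch using only the Python standard library. It is not lazr.uri's implementation: normalize_unreserved is a hypothetical helper written for this example, and http://example.com/%7Euser is a made-up URL. The helper decodes a percent-encoded octet only when it maps to an unreserved character (RFC 3986 section 2.3), so "%7E" becomes "~" but "%2B" is left alone. The parse_qs calls show one common case where the difference is visible: in a form-encoded query string, a literal "+" is read as a space.

```python
import re
from urllib.parse import parse_qs

# Unreserved characters per RFC 3986 section 2.3.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_unreserved(uri):
    """Hypothetical helper: decode %XX only when it maps to an unreserved character."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        # Reserved (or any other) octet: keep it encoded, uppercase the hex digits.
        return "%" + match.group(1).upper()
    return _PCT.sub(repl, uri)


encoded = "http://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B"
literal = "http://en.wikipedia.org/wiki/Operators_in_C_and_C++"

# "%2B" stays encoded, so the two URLs remain distinct after normalization.
same_after_normalizing = normalize_unreserved(encoded) == normalize_unreserved(literal)

# "%7E" maps to the unreserved "~", so this one is safely decoded.
tilde = normalize_unreserved("http://example.com/%7Euser")

# In a form-encoded query string, "+" and "%2B" mean different things.
plus_query = parse_qs("q=C++")
encoded_query = parse_qs("q=C%2B%2B")
```

With this sketch, same_after_normalizing is False, while parse_qs reads "q=C++" as "C" followed by two spaces and "q=C%2B%2B" as "C++".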