Comment 2 for bug 78898

Stuart Bishop (stub) wrote :

In this case, the purpose is to mark up the user's input. It doesn't matter which characters are technically allowed: if it looks like a URL, we should mark it up as a URL, excluding any trailing punctuation.

For example, I might want to file a bug report like this:

    When I go to a URL like http://☣.net/, it works except Firefox rewrites the URL to the ASCII form in the URL bar and it looks fugly. I'm not sure if this is a bug or an anti-phishing feature.
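To make the rule concrete, here is a minimal sketch of that kind of linkifier in Python. It is not Launchpad's actual code: the names `URL_RE`, `TRAILING_PUNCTUATION`, `split_trailing_punctuation` and `linkify` are made up for illustration, and the set of trailing punctuation characters is an assumption. Anything after the scheme that is not whitespace or an HTML delimiter is treated as part of the URL, so non-ASCII hosts like the one above are accepted.

```python
import html
import re

# Hypothetical pattern: scheme, then anything that is not whitespace or an
# HTML-ish delimiter. No attempt is made to restrict this to ASCII.
URL_RE = re.compile(r"\b(?:https?|ftp)://[^\s<>\"]+", re.IGNORECASE)

# Assumed set of characters that end a sentence rather than a URL.
TRAILING_PUNCTUATION = ".,;:!?'\")]}"


def split_trailing_punctuation(candidate):
    """Return (url, trailer), where trailer is punctuation stripped off the end."""
    url = candidate.rstrip(TRAILING_PUNCTUATION)
    return url, candidate[len(url):]


def linkify(text):
    """Escape text for HTML and wrap anything that looks like a URL in <a>."""
    pieces = []
    last = 0
    for match in URL_RE.finditer(text):
        url, trailer = split_trailing_punctuation(match.group(0))
        # Plain text before the URL is escaped but otherwise untouched.
        pieces.append(html.escape(text[last:match.start()]))
        # The URL is shown as typed, non-ASCII characters included.
        pieces.append('<a href="{0}">{0}</a>'.format(html.escape(url)))
        # Trailing punctuation stays outside the link.
        pieces.append(html.escape(trailer))
        last = match.end()
    pieces.append(html.escape(text[last:]))
    return "".join(pieces)
```

With this sketch, `linkify("like http://☣.net/, it works")` links `http://☣.net/` and leaves the comma outside the anchor.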

(As an aside, I don't think Launchpad should use the technical definition of a URL circa 1993, but the real-world definition. The only time it needs to encode a URL as US-ASCII is when generating HTTP headers. Mail readers and web browsers do the right thing and correctly encode Unicode URLs embedded in HTML to US-ASCII for transport over HTTP, so there is no reason to display uglified URLs in our HTML output. But this is generally irrelevant, as we have standardized on ASCII URL components everywhere except for user-entered external URLs.)
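For the one case where an ASCII form is needed (say, a Location header), the conversion could look roughly like the sketch below. The function name `to_ascii_url` is hypothetical, and the sketch makes simplifying assumptions: no userinfo in the URL, Python's built-in `idna` codec (which implements the older IDNA 2003 rules) for the host name, and UTF-8 percent-encoding for the path, query and fragment.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Return an ASCII-only form of url, suitable for an HTTP header.

    Assumes the URL has a host and no userinfo component.
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError("URL has no host: %r" % (url,))

    # Host names go through IDNA (Punycode); ASCII hosts pass through unchanged.
    host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)

    # Everything else is percent-encoded as UTF-8. "%" is kept safe so that
    # already-encoded sequences are not encoded a second time.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")

    return urlunsplit((parts.scheme, host, path, query, fragment))
```

The HTML output would keep showing the URL as the user typed it; only the header value would go through a function like this.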