URL's include subsequent closing parentheses, punctuation, and text unseparated by spaces, even when opening parenthesis is before URL

Bug #788193 reported by Eliah Kagan
This bug affects 1 person
Affects: Launchpad itself
Status: Triaged
Importance: Low
Assigned to: Unassigned

Bug Description

When a URL is immediately followed by a closing parenthesis, then punctuation, then text, with no intervening spaces, the closing parenthesis, punctuation, and text are all considered to be part of the URL, *even* when the closing parenthesis is matched by an unlinkified opening parenthesis *before* the URL.

There are several classes of situations where such a presentation of a URL would occur. The parenthetical aside ending with the URL could be followed by an ellipsis and more text:

    Blah (blah https://launchpad.net)...foo bar.

(An example of this "in the wild," which is what motivated me to file this bug report, is https://bugs.launchpad.net/ubuntu/+source/gnome-panel/+bug/767095/comments/112.)

Or it could itself be part of a clause separated from another clause by "--" or "-":

    Blah--blah (blah https://launchpad.net)--foo bar.
    Blah-blah (blah https://launchpad.net)-foo bar.

(I'm not sure whether or not a single dash should be respected in this capacity; most of the time when a single dash is used this way, it is surrounded by spaces, and single dashes probably occur much more often in URL's than double dashes.)

Interestingly, when the parenthetical aside ending with the URL is part of a clause separated off with an em- or en-dash, the correct behavior is displayed:

    Blah–blah (blah https://launchpad.net)–foo bar.
    Blah—blah (blah https://launchpad.net)—foo bar.

Finally, as a class of arguably wrong behavior, the parenthetical aside could be separated from another clause or list element with a comma or semicolon, where the composer of the message accidentally neglects to include the space:

    Blah (blah https://launchpad.net),foo bar.
    Blah (blah https://launchpad.net);foo bar.

I'm not sure what should happen in that last situation, but since there is no opening parenthesis in the URL and there is one before it, arguably the closing parenthesis and the text following it should not be linkified. On the other hand, in the absence of punctuation (or whitespace), URL's should **not always** end right before the first right-parenthesis they may be interpreted to contain. For example, it is almost certainly desirable for URL's like these to be completely linkified:

    http://localhost.localdomain/considerations%20for%20handling%20)%20in%20urls/
    http://localhost.localdomain/silly)but)plausible)URL)/index.html

And it seems that should be the case even when they are in a sentence where there are opening parentheses before them:

    Blah (http://localhost.localdomain/considerations%20for%20handling%20)%20in%20urls/.
    Blah (blah (blah (blah (lol http://localhost.localdomain/silly)but)plausible)URL)/index.html, foo!

In the first line, it's still more likely that the closing parenthesis was accidentally omitted after the URL. The second line doesn't make sense, but it wouldn't make any more sense if the ")" characters in the URL were considered to match the "(" characters before it.
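
To make the behavior argued for above concrete, here is a minimal Python sketch of one possible heuristic. It is not Launchpad's linkifier, and the names `CANDIDATE`, `AFTER_PAREN`, `find_url_end`, and `linkable_urls` are made up for illustration. It assumes a URL candidate is any run of non-space characters after `http://` or `https://`, and ends the link at a ")" only when that ")" has no "(" inside the URL, an unmatched "(" precedes the URL, and the ")" is followed by ".", ",", ";", "--", an en- or em-dash, or nothing.

```python
import re

# Assumption: a candidate is every non-space character after the scheme.
CANDIDATE = re.compile(r"https?://\S+")
# A ")" followed by one of these (or by the end of the candidate) is taken
# to close a parenthesis opened before the URL. A single "-" is left out
# on purpose because it is common inside URLs.
AFTER_PAREN = re.compile(r"\)(?=[.,;\u2013\u2014]|--|$)")
TRAILING = ".,;:!?"  # sentence punctuation stripped from the end of a link


def find_url_end(text_before, candidate):
    """Return the index in candidate at which the link should stop."""
    # Number of "(" before the URL that are still waiting for a ")".
    open_before = text_before.count("(") - text_before.count(")")
    depth = 0  # "(" seen inside the URL and not yet closed
    for i, ch in enumerate(candidate):
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth > 0:
                depth -= 1  # closes a parenthesis inside the URL
            elif open_before > 0 and AFTER_PAREN.match(candidate, i):
                return i  # closes a parenthesis opened before the URL
    return len(candidate)


def linkable_urls(text):
    """Return the substrings of text that this heuristic would linkify."""
    urls = []
    for m in CANDIDATE.finditer(text):
        candidate = m.group()
        end = find_url_end(text[:m.start()], candidate)
        urls.append(candidate[:end].rstrip(TRAILING))
    return urls


print(linkable_urls("Blah (blah https://launchpad.net)...foo bar."))
# ['https://launchpad.net']
```

Under these assumptions, the "silly)but)plausible)URL)" and "%20)%20" examples stay fully linkified, because each ")" in them is followed by a letter, "%", or "/". The single-dash case (`)-foo`) is still linkified as a whole, which matches the uncertainty expressed above rather than resolving it.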

By the way, it seems that in deciding whether or not a parenthesis is part of a URL, it may be useful to consider whether or not it is enclosed in quotes. That is, if a closing parenthesis is enclosed in quotes inside a linkified URL, then even if its matching opening parenthesis is unlinkified before the URL, it still should probably remain linkified. This consideration is relevant if and only if bug 237609 is fixed (i.e., it should be taken into account when this and that bug are considered together).

Bug 600240 seems almost to be a polar opposite of this bug, and seems to be fixed, even though it is not marked fixed. Perhaps the fix for that bug was too aggressive in including closing parentheses, and introduced this bug?

Tags: comments
description: updated
Gary Poster (gary)
Changed in launchpad:
status: New → Triaged
importance: Undecided → Low
tags: added: comments