Highlight of links in terminal is not very smart

Bug #1688391 reported by teo1978
This bug affects 1 person
Affects: gnome-terminal (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Steps to reproduce:
- In a terminal, type this command:
    ls /srv/www/example.com/htdocs
- hit Enter
- move the mouse cursor over "www" or over "example.com"

Expected:
- nothing should happen
Observed:
- The substring "www/example.com/htdocs" is underlined and right-clickable as if it were a URL, which it obviously isn't.

It is questionable in the first place that you try to parse URLs that don't start with a scheme.

But in any case, you need to fix your regular expression I guess.

I see now that even wwwXexampleYcom gets recognized as a URL. Looks like somebody forgot to escape a "." in a regular expression.
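
To make the hypothesis above concrete, here is a minimal Python sketch of what an unescaped "." does in a regular expression. Both patterns are hypothetical and written only for illustration; neither is taken from gnome-terminal, and the sketch does not show that this is the actual cause of the behavior.

```python
import re

# Hypothetical patterns for illustration only; not gnome-terminal's real regex.
UNESCAPED = re.compile(r"www.example.com")    # "." matches any single character
ESCAPED = re.compile(r"www\.example\.com")    # "\." matches a literal dot only


def matches(pattern, text):
    """Return True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


for text in ("www.example.com", "wwwXexampleYcom"):
    # Print the candidate and whether each pattern accepts it.
    print(text, matches(UNESCAPED, text), matches(ESCAPED, text))
```

With the dot left unescaped, both strings are accepted; with the dot escaped, only the string containing literal dots is.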

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: gnome-terminal 3.18.3-1ubuntu1
ProcVersionSignature: Ubuntu 4.4.0-72.93-generic 4.4.49
Uname: Linux 4.4.0-72-generic x86_64
NonfreeKernelModules: nvidia_uvm nvidia
ApportVersion: 2.20.1-0ubuntu2.5
Architecture: amd64
CurrentDesktop: Unity
Date: Thu May 4 21:40:58 2017
InstallationDate: Installed on 2013-10-11 (1301 days ago)
InstallationMedia: Ubuntu 13.04 "Raring Ringtail" - Release amd64 (20130424)
SourcePackage: gnome-terminal
UpgradeStatus: Upgraded to xenial on 2016-12-11 (143 days ago)

teo1978 (teo8976) wrote :
description: updated
Egmont Koblinger (egmont-gmail) wrote :

> as if it were a URL, which it obviously isn't

Obviously to the human eye. Not so obviously to computer algorithms.

> It is questionable in the first place that you try to parse URLs that don't start with a scheme.

Indeed questionable. Which does not necessarily mean it's bad. It's a convenience feature, and URLs pretty often appear without a scheme. Not recognizing them would probably lead to way more feature requests and complaints than a few false positive matches.

> But in any case, you need to fix your regular expression I guess

"fix" in what sense exactly?

> I see now that even wwwXexampleYcom gets recognized as a URL

URLs often begin with www5 or similar strings, which are intentionally recognized. Not sure where to draw the line on which additional characters, and how many of them, to accept.

> Looks like somebody forgot to escape a "." in a regular expression.

Nope. As said, the given behavior is intentional. Also, the regular expression is backed up by more than a hundred unit tests.

Sure there is room for improvement, but any such request should be constructive, and should definitely refrain from calling an algorithm that by its nature needs to have heuristics "retarded".

teo1978 (teo8976) wrote :

This is not a matter of "having to be heuristic", this is plain wrong. Here you are matching something that can be easily ruled out by a non-heuristic rule without introducing any false negative, so it IS retarded.

How can this possibly make sense:
  example.com/foo/bar => not a URL
  wwwexamplecom/foo/bar => URL

What's the rationale? Ok, I see, it seems to be "ANYTHING that starts with www". Even "wwwlalala" is highlighted as a URL.
That's just plain stupid (sorry, it's just a fact, and note that I AM being constructive by pointing out a stupid error that can easily be fixed). There's no way that something that doesn't even contain a dot can be an absolute URL (detecting relative URLs would make no sense, right?)

At the very least, the "starts with www" criterion should be changed to "starts with www, optionally followed by letters, numbers, dashes and underscores, followed by a dot" (and then at least one non-space character).

This would eliminate a lot of false positives without adding a single false negative, and it's trivial.
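
The proposed rule can be sketched as a regular expression. This is only a rough illustration in Python: `PERMISSIVE` is a hypothetical model of the "anything that starts with www" behavior described above, `PROPOSED` is one possible reading of the suggested criterion, and neither is gnome-terminal's actual pattern. The sketch tests isolated tokens, whereas a real terminal matcher would search inside a whole line.

```python
import re

# Hypothetical model of the reported behaviour: anything starting with "www".
PERMISSIVE = re.compile(r"www\S*")

# One reading of the proposed rule: "www", then optional letters, digits,
# dashes and underscores, then a literal dot, then at least one non-space char.
PROPOSED = re.compile(r"www[A-Za-z0-9_-]*\.\S+")


def is_candidate(pattern, text):
    """Return True if the whole token would be treated as a URL candidate."""
    return pattern.fullmatch(text) is not None


samples = [
    "www.example.com/foo",
    "www5.example.com",
    "wwwlalala",
    "wwwexamplecom/foo/bar",
    "www/example.com/htdocs",
]

for token in samples:
    # Print the token and the verdict of each pattern side by side.
    print(token, is_candidate(PERMISSIVE, token), is_candidate(PROPOSED, token))
```

Under these assumptions, the stricter pattern still accepts hosts such as www5.example.com but rejects tokens that have no dot directly after the "www" label, including the path fragment from the original report.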

summary: - Highlight of links in terminal is retarded
+ Highlight of links in terminal is not very smart