When filing a new bug, the duplicate finder doesn't find an existing bug whose summary is an exact match

Bug #932956 reported by Guilherme Salgado on 2012-02-15
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Launchpad itself

Bug Description

Today gnome-control-center crashed and I was taken to https://bugs.launchpad.net/ubuntu/+source/gnome-control-center/+filebug<token> with the summary field pre-filled with "gnome-control-center crashed with SIGSEGV in g_type_check_instance_cast()", but when I clicked "Next" it didn't show me an existing bug which had that same string as its summary (bug 932644). This is probably the reason why that bug has so many duplicates even though it's only a few hours old (as of this writing it's 7 hours old and has 8 duplicates).

One can easily reproduce this by going to https://bugs.launchpad.net/ubuntu/+source/gnome-control-center/+filebug, typing "gnome-control-center crashed with SIGSEGV in g_type_check_instance_cast()" and confirming that bug 932644 is not among the results. If you search for it on https://bugs.launchpad.net/ubuntu/+source/gnome-control-center/+bugs, though, it is found.

Robert Collins (lifeless) wrote :

Interesting. The dupefinder does its own tokenisation to create the wider search string. This is another case where a separate search engine would likely have prevented the bug from occurring.

One easy fix would be to add an || exact-match test to the search.
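As a rough illustration of the "|| exact-match" idea, here is a minimal in-memory sketch in Python. It is not Launchpad's code: the function name find_candidate_dupes, the tokenise parameter and the list of (bug id, title) pairs are all hypothetical, and the real dupefinder runs its search in the database rather than over a Python list.

```python
def find_candidate_dupes(summary, bugs, tokenise):
    """Return ids of bugs matching the tokenised query OR the exact summary.

    `bugs` is a list of (bug_id, title) pairs and `tokenise` is whatever
    (possibly lossy) tokeniser the wider search uses. Both are assumptions
    made for this sketch.
    """
    terms = set(tokenise(summary))
    wanted = summary.strip().lower()
    matches = []
    for bug_id, title in bugs:
        # Exact-match test: independent of how the tokeniser behaves.
        exact = title.strip().lower() == wanted
        # Wider search: any shared token counts as a candidate.
        fuzzy = bool(terms & set(tokenise(title)))
        if exact or fuzzy:
            # Sort key puts exact matches ahead of fuzzy ones.
            matches.append((not exact, bug_id))
    return [bug_id for _, bug_id in sorted(matches)]


if __name__ == "__main__":
    title = ("gnome-control-center crashed with SIGSEGV in "
             "g_type_check_instance_cast()")
    bugs = [(932644, title), (1, "an unrelated bug")]
    # Even with a tokeniser that yields nothing, the exact match survives.
    print(find_candidate_dupes(title, bugs, lambda text: []))
```

The point of the extra test is that an exact title match is still returned even when the tokenised query misses it entirely.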

Changed in launchpad:
status: New → Triaged
importance: Undecided → High
tags: added: bugs dupefinder search
Robert Collins (lifeless) wrote :

('easy' only in code, may have perf implications)

Abel Deuring (adeuring) wrote :

My work on bug 29713 and bug 1020443 probably improved the situation a bit:

(1) Words with dashes are now no longer mapped to a search term like "gnomecontrolcenter | (gnome & control & center)".
(2) The dupe finder stemmed the search term twice. This affects the term "gnome-control-center", which is stemmed to "gnome-control-cent", which in turn is stemmed to "gnome-control-c"; the latter word is not part of the FTI data.
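The double-stemming problem in (2) can be reproduced with a toy example. The stemmer below is a deliberately simplified stand-in (it strips one of two hard-coded suffixes per call) and is not the stemmer Launchpad's full-text index uses; toy_stem and the index set are hypothetical names chosen for this sketch.

```python
SUFFIXES = ("ent", "er")  # toy suffix list, chosen only to mirror the example


def toy_stem(word):
    """Strip at most one known suffix from the end of `word`."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


# The index stores each word stemmed exactly once.
index = {toy_stem("gnome-control-center")}

once = toy_stem("gnome-control-center")   # "gnome-control-cent"
twice = toy_stem(once)                    # "gnome-control-c"

print(once in index)    # the singly stemmed term is in the index
print(twice in index)   # the doubly stemmed term is not
```

Because the index holds words that were stemmed once, a query term that has been stemmed twice no longer matches anything stored there.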

Bug 932644 is still not found though -- only a few other bugs with the title "gnome-control-center crashed with SIGSEGV in g_type_check_instance_cast()" appear in the list of possible dupes.
