[quicksearch] New 'quick search' is unable to search for dashes '-'

Bug #282995 reported by Brett Alton on 2008-10-14
This bug affects 15 people
Affects            Status        Importance  Assigned to              Milestone
Xapian             Invalid       Undecided   Unassigned
synaptic (Ubuntu)  Fix Released  Low         Jean-Baptiste Lallement

Bug Description

Binary package hint: synaptic

In Synaptic, when I search for something like 'ubuntu-res', it fails to find 'ubuntu-restricted-extras'. This appears to affect any package whose name includes a dash, since searching for 'ubuntu' alone does find 'ubuntu-restricted-extras'.

The same can be seen with 'abiword' versus 'abiword-', etc.

$ lsb_release -a && apt-cache policy synaptic
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu intrepid (development branch)
Release: 8.10
Codename: intrepid
synaptic:
  Installed: 0.62.1ubuntu9
  Candidate: 0.62.1ubuntu9
  Version table:
 *** 0.62.1ubuntu9 0
        500 http://archive.ubuntu.com intrepid/main Packages
        100 /var/lib/dpkg/status

UPDATE: For 0.62.1ubuntu10 this behaviour changed to "A quick search for 'word1-word2' will now find packages matching any of the words 'word1' or 'word2', instead of the hyphenated combination 'word1-word2'", crippling its usability, since *many* packages contain hyphens.
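
To make the difference concrete, here is a minimal sketch of the two behaviours. It is not Synaptic's code: the package list and both function names are made up for illustration, and plain substring matching stands in for the real Xapian-backed search.

```python
# Hypothetical package list, for illustration only.
PACKAGES = [
    "mozilla-mplayer",
    "mozilla-firefox",
    "mplayer",
    "sun-java6-jre",
    "ubuntu-restricted-extras",
]


def search_any_word(query, packages):
    """Mimic the 0.62.1ubuntu10 behaviour: split on '-' and match any part."""
    words = [w for w in query.split("-") if w]
    return [p for p in packages if any(w in p for w in words)]


def search_literal(query, packages):
    """What the reporters expected: the hyphen is part of the search term."""
    return [p for p in packages if query in p]


print(search_any_word("mozilla-mplayer", PACKAGES))
print(search_literal("mozilla-mplayer", PACKAGES))
```

With this list, the first call returns three packages (everything containing 'mozilla' or 'mplayer'), while the second returns only 'mozilla-mplayer'.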


Chris Coulson (chrisccoulson) wrote :

Thanks for reporting this bug and any supporting documentation. Since this bug has enough information provided for a developer to begin work, I'm going to mark it as confirmed and let them handle it from here. Thanks for taking the time to make Ubuntu better!

Changed in synaptic:
importance: Undecided → Low
status: New → Triaged
description: updated
Roman Polach (rpolach) wrote :

In Intrepid final, the behavior of quick search has changed:
when I use a dash, it now finds every package containing one of the parts of the search keyword
before or after the dash. E.g. if I search for "mozilla-mplayer", it finds every occurrence of
"mozilla" and also every occurrence of "mplayer". The correct behavior IMO would be to find
only real occurrences of "mozilla-mplayer".

Ameen Demidem (ameen.demidem) wrote :

Can you please first confirm that you still have the same problem with the latest update of synaptic, 0.62.1ubuntu10?
Both versions, in 8.04 and 8.10, are working fine for me.

It's better, but not fixed completely.

If you search for 'sun-java', it searches 'sun' and 'java' just like Roman Polach's 'mozilla-mplayer'.

Should it not actually use the '-' (dash) as a search term? If I want to search 'mozilla' and 'mplayer' or 'sun' and 'java', I'll add a space (e.g. 'sun java', 'mozilla mplayer').

Dashes are very important in package names and I believe they should be treated as such in the search.

Ameen Demidem (ameen.demidem) wrote :

Thanks for your input. Agree with you.

Thommitsch (thommitsch) wrote :

Why does the quick search show 14 results when I search for "ttf", but 150 when I enter "-ttf" or "ttf-"?

Daniel Pirch (dpirch) wrote :

There doesn't seem to be any relationship between the search term and the search results.

For example, typing libsdl1.2 into the quick search box lists 134 mostly unrelated packages, and typing libsdl1.2-dev returns even more: over 3600 packages.

Typing more than one word will also confuse the search. For example, searching for "xchat gnome" (without quotes) results in 1373 packages.

description: updated
Daniel Pirch (dpirch) wrote :

The bug is still present in Jaunty.

Patrice Vetsel (vetsel-patrice) wrote :

This bug is also present in Karmic.

AlexT (aletum) wrote :

Same here in Karmic on a freshly installed system

AlexT (aletum) wrote :
summary: - [intrepid] New 'quick search' is unable to search for dashes '-'
+ [quicksearch] New 'quick search' is unable to search for dashes '-'
tags: added: quicksearch

This report is tracked upstream, and a fix will most probably be released in Xapian 1.3.x.

Changed in xapian:
status: Unknown → Confirmed
Olly Betts (ojwb) wrote :

Xapian ticket #22 is not related to this bug. This is not a bug in Xapian, but in how it is being used.

Changed in xapian:
importance: Unknown → Undecided
status: Confirmed → New
status: New → Invalid
Olly Betts (ojwb) wrote :

Sorry, your bug link is wrong, so I've removed it as it just confuses the real issue. It's great to see people actively triaging tickets in Launchpad, but please resist the temptation to latch onto likely sounding upstream tickets without reading them carefully to see if they are actually related.

Xapian ticket #22 is about improving the handling of hyphenated phrases containing a single character component. You'd generally expect e-mail and email to match the same documents (at least in English).

The initial example in this ticket "ubuntu-res" clearly doesn't fall into that category, since neither "ubuntu" nor "res" are single characters.

The bug here is not in Xapian, but in how Xapian is being used. So it's a bug in synaptic, or perhaps apt-xapian-index. Or maybe both, since you need the indexer and searcher to agree on how the index is built.

If you use Xapian's TermGenerator and QueryParser classes, then a hyphen is indexed as if it were a space, but at search time it generates a phrase.
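
A toy model of that default behaviour shows why 'ubuntu-res' finds nothing. This is only a sketch: the tokenizer below is a simplification (split on anything non-alphanumeric), not Xapian's actual word-splitting rules, and `index_terms` and `phrase_match` are names invented for this example.

```python
import re


def index_terms(text):
    """Split on non-alphanumeric characters, so '-' acts like a space."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def phrase_match(query, document):
    """True if the query's terms occur as consecutive whole terms in document."""
    q = index_terms(query)
    d = index_terms(document)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


# 'ubuntu-restricted-extras' is indexed as: ubuntu, restricted, extras
print(phrase_match("ubuntu-restricted", "ubuntu-restricted-extras"))  # True
print(phrase_match("ubuntu-res", "ubuntu-restricted-extras"))         # False
```

In this model 'res' is never a term on its own, so the phrase 'ubuntu res' cannot match, whereas the complete words 'ubuntu restricted' do.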

But if you want to handle it differently, you can generate whatever terms you want, and parse queries however you want. It sounds from this ticket like for quick search, most people expect "-" to be part of a term, and many expect an implicit wildcard at the end (so ubuntu-res to match ubuntu-restricted-extras).
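
As a rough sketch of that expectation: keep each package name as a single term, hyphens included, and expand the typed text as a prefix. The term list and the `expand_wildcard` helper are hypothetical, and this is plain Python rather than the Xapian API; it only illustrates that terms sharing a prefix sit next to each other in a sorted term list.

```python
from bisect import bisect_left

# Hypothetical sorted term list: whole package names, hyphens kept.
TERMS = sorted([
    "abiword",
    "abiword-common",
    "ubuntu-desktop",
    "ubuntu-restricted-extras",
    "xubuntu-desktop",
])


def expand_wildcard(prefix, sorted_terms):
    """Return all terms starting with prefix (an implicit trailing '*')."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # terms with a common prefix are contiguous when sorted
        matches.append(term)
    return matches


print(expand_wildcard("ubuntu-res", TERMS))
print(expand_wildcard("abiword", TERMS))
```

Here 'ubuntu-res' expands to 'ubuntu-restricted-extras' only, and 'abiword' expands to both 'abiword' and 'abiword-common'.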

A fix is in progress for searching with a hyphen. However, an implicit wildcard when searching for a package name with a prefix (e.g. name:ubuntu-res) is not possible with the current version of Xapian:

Extract from the discussion on #xapian:
"It needs two new options: firstly, to make it handle wildcards (and implicit wildcards from FLAG_PARTIAL) for boolean fields and secondly to make it handle wildcards and implicit wildcards at the end of hyphenated phrases."

A workaround (but very bad practice) is to replace '-' with '* ' when the prefix 'name:' is used, and to replace '-' with ' ' when no prefix is used.
With this, a query such as 'name:ubuntu-res' will be parsed as XPubuntu* AND res* (with wildcard expansion),
but a query such as 'name:(ubuntu-res*)' won't return anything.
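
The string rewriting described here can be sketched as follows. This follows only the description in this comment, not the code that was eventually committed to synaptic; `rewrite_query` is a made-up name, and the rewritten string would still have to be handed to the query parser afterwards.

```python
def rewrite_query(query):
    """Rewrite hyphens before the query is parsed (workaround sketch).

    Words starting with 'name:' get '-' replaced by '* ';
    all other words get '-' replaced by a plain space.
    """
    words = []
    for word in query.split():
        if word.startswith("name:"):
            words.append(word.replace("-", "* "))
        else:
            words.append(word.replace("-", " "))
    return " ".join(words)


print(rewrite_query("name:ubuntu-res"))  # name:ubuntu* res
print(rewrite_query("sun-java"))         # sun java
```

The trailing 'res' carries no explicit '*' after rewriting; per the comment above, the wildcard on the last term comes from the parser's implicit (partial) expansion.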

Changed in synaptic (Ubuntu):
assignee: nobody → Jean-Baptiste Lallement (jibel)
status: Triaged → In Progress

Committed to my bzr branch.

Changed in synaptic (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package synaptic - 0.63.1ubuntu4

---------------
synaptic (0.63.1ubuntu4) lucid; urgency=low

  [ Michael Vogt ]
  * po/it.po:
    - updated, thanks to Milo Casagrande (closes: #575685)

  [ Jean-Baptiste Lallement ]
  * common/rpackage.{cc,h}:
    - Use simplified URI for third party changelogs (LP: #45129)
  * debian/patches/01_ubuntu_changelog.dpatch:
    - update patch to support third party changelogs
  * common/rpackage.{cc,h}:
    - Support third party changelogs by using ArchiveURI() (LP: #153966)
    - Display LP links when changelog is not available for download
      (LP: #452564)
  * gtk/rgmainwindow.cc:
    - check package flags when applying an action to a package list
      (LP: #513460)
  * common/rpackagelister.cc
    - workaround to allow searching for terms with an hyphen (LP: #282995)
  * common/rpackagelister.cc:
    - xapianSearch: do not expand the first term when replacing the hyphen
    to reduce size of the result set

  [ Oliver Joos ]
  * common/rconfiguration.cc:
    - Fix to store setting "Consider recommended packages as dependencies"
      (closes debian #440027 and LP: #154349)
 -- Michael Vogt <email address hidden> Thu, 15 Apr 2010 01:59:51 +0200

Changed in synaptic (Ubuntu):
status: Fix Committed → Fix Released