No Package(s) for Language-specific Stemming Dictionary and Affix Files

Bug #301770 reported by Duncan McGreggor
Affects                      Importance  Assigned to
Hardy Backports              Undecided   Unassigned
postgresql-8.3 (Ubuntu)      Undecided   Martin Pitt
postgresql-common (Ubuntu)   Medium      Martin Pitt

Bug Description

Binary package hint: postgresql-common

Currently, PostgreSQL 8.3 full text search only provides simple stemming support by default. postgresql-common does not install files needed to support full stemming. In order for Ubuntu-packaged PostgreSQL to support full stemming, ispell (or myspell or hunspell) dictionary and affix files for the desired languages need to be installed. They need to be UTF-8 files, and as of now, they need to be installed in the postgres "tsearch_data" directory.

If packaging support were provided, full text search with improved stemming could be supported in environments that require Ubuntu packages for all software/source code installations.
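
To make the two requirements above concrete (UTF-8 encoding, and a dictionary plus affix file in "tsearch_data"), here is a minimal Python sketch of a pre-flight check. The function name and the messages are hypothetical; the `.dict`/`.affix` extensions are the ones PostgreSQL looks for, and the path of the "tsearch_data" directory is left to the caller.

```python
from pathlib import Path


def check_stemming_files(tsearch_data: Path, name: str) -> list[str]:
    """Return a list of problems for one dictionary; an empty list means usable.

    Hypothetical helper, not part of PostgreSQL or postgresql-common.
    """
    problems = []
    for ext in ("dict", "affix"):
        path = tsearch_data / f"{name}.{ext}"
        if not path.is_file():
            problems.append(f"missing: {path.name}")
            continue
        try:
            # Decoding the whole file is the simplest way to validate UTF-8.
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            problems.append(f"not UTF-8: {path.name}")
    return problems
```

Calling `check_stemming_files(Path(sharedir) / "tsearch_data", "en_us")` with the server's share directory would list whatever still blocks an ispell-based dictionary for that language.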

Martin Pitt (pitti) wrote :

As per our email discussion:

  - can't directly use the hunspell dictionaries in /usr/share/myspell/dicts/, since they are often not UTF-8 encoded, which PostgreSQL requires

  - p-common gets a dpkg trigger which iconvs available hunspell dictionaries to /var/lib/postgresql/dicts/

  - p-common gets test cases based on Duncan's Launchpad-private scripts.

  - p-8.3 gets a patch which falls back to /var/lib/postgresql/dicts/ if no available dictionary is found in the postgres tsearch-data/ directory.
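
The conversion step in the plan above can be sketched roughly as follows. This is an illustration in Python, not the trigger script itself. It assumes the source encoding is declared on the affix file's `SET` line and that hunspell's default is ISO8859-1 when the line is absent; all function names are hypothetical.

```python
from pathlib import Path


def declared_encoding(aff_bytes: bytes, default: str = "ISO8859-1") -> str:
    """Return the encoding named on the affix file's SET line, if any."""
    for raw in aff_bytes.splitlines():
        parts = raw.split()
        if len(parts) >= 2 and parts[0] == b"SET":
            return parts[1].decode("ascii")
    return default  # assumption: hunspell's default when SET is absent


def to_utf8(data: bytes, encoding: str) -> bytes:
    """Re-encode bytes from the source encoding to UTF-8, as iconv would."""
    return data.decode(encoding).encode("utf-8")


def convert_pair(dic: Path, aff: Path, out_dir: Path, name: str) -> None:
    """Write UTF-8 copies under the .dict/.affix names PostgreSQL expects."""
    aff_bytes = aff.read_bytes()
    enc = declared_encoding(aff_bytes)
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / f"{name}.dict").write_bytes(to_utf8(dic.read_bytes(), enc))
    # The SET line itself is copied through unchanged in this sketch.
    (out_dir / f"{name}.affix").write_bytes(to_utf8(aff_bytes, enc))
```

Some hunspell encoding names may not map to a Python codec, so a real tool would need a translation table or would shell out to iconv as the plan describes.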

Changed in postgresql-common:
assignee: nobody → pitti
importance: Undecided → Medium
status: New → In Progress
Martin Pitt (pitti)
Changed in postgresql-8.3:
assignee: nobody → pitti
status: New → In Progress
Martin Pitt (pitti) wrote :

Fixed in bzr trunk.

For postgresql-8.3 I have a working patch, just its upstream inclusion still needs to be discussed.

Changed in postgresql-common:
status: In Progress → Fix Committed
Changed in postgresql-8.3:
status: In Progress → Fix Committed
Martin Pitt (pitti) wrote :

For the record, upstream wasn't happy with my original approach, so I updated the patch and the -common infrastructure, and sent it upstream again. I'll wait for some more feedback before I upload, just to avoid having people actually put it to use and then having to change it later.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package postgresql-common - 95

---------------
postgresql-common (95) experimental; urgency=low

  * Add automatic building of PostgreSQL tsearch/stem dictionaries:
    - Add pg_updatedicts: Build dictionaries and affix files from installed
      hunspell/myspell dictionary packages.
    - Add t/150_tsearch_stemming.t: Test cases for pg_updatedicts, tsearch
      functionality, and word stem handling.
    - t/001_packages.t: Ensure that hunspell-en-us is installed, above new
      test relies on it.
    - debian/postgresql-common.install: Install pg_updatedicts.
    - debian/rules: Create man page from pg_updatedicts POD.
    - Add debian/postgresql-common.triggers: Register interest on
      /usr/share/myspell/dicts.
    - debian/postgresql-common.postinst: Call pg_updatedicts on upgrade to
      this version, fresh install, and our trigger.
    - debian/postgresql-common.postrm: Remove /var/cache/postgresql on purge.
    - (LP: #301770)

postgresql-common (94) unstable; urgency=low

  * t/070_non_postgres_clusters.t: Test that all cluster configuration files
    are owned by the cluster superuser. Reproduces #481349.
  * pg_createcluster: Make the cluster configuration directory, "start.conf",
    and "environment" owned by the cluster superuser instead of root.
    (Closes: #481349)
  * t/030_errors.t: Check behaviour of starting of clusters with colliding
    ports. Reproduces #472627.
  * pg_ctlcluster: Error out with a port collision message if another cluster
    is already running on the port. (Closes: #472627)
  * t/090_multicluster.t: Don't reconfigure cluster on conflicting port, since
    that now fails with above fix.

 -- Martin Pitt <email address hidden> Sat, 06 Dec 2008 11:35:52 -0800

Changed in postgresql-common:
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package postgresql-8.3 - 8.3.5-2

---------------
postgresql-8.3 (8.3.5-2) experimental; urgency=low

  * Add 15-dict-fallback-dir.patch: If a tsearch/stem dictionary is
    not found in sharedir/tsearch_data/ll_cc.{dict,affix}, fall back
    to sharedir/tsearch_data/system_ll_cc.{dict,affix}, where
    postgresql-common creates them from system directories. (LP: #301770)

 -- Martin Pitt <email address hidden> Sat, 06 Dec 2008 11:39:31 -0800
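
The lookup order described in this changelog entry can be illustrated with a short Python sketch. The patch itself lives in the PostgreSQL server source, so this only shows the order in which the two file names are tried; `find_dict_file` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def find_dict_file(sharedir: Path, name: str, ext: str) -> Optional[Path]:
    """Prefer tsearch_data/<name>.<ext>, then tsearch_data/system_<name>.<ext>."""
    tsearch = sharedir / "tsearch_data"
    candidates = (
        tsearch / f"{name}.{ext}",         # file shipped or installed by the admin
        tsearch / f"system_{name}.{ext}",  # file generated by postgresql-common
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

With this order, a file placed manually in "tsearch_data" always takes precedence over the generated `system_` one.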

Changed in postgresql-8.3:
status: Fix Committed → Fix Released
Martin Pitt (pitti) wrote :

Subscribed backports team for approval of a hardy backport of both packages. I keep -common and all the server packages backportable all the time.

Martin Pitt (pitti) wrote :

Duncan, I assume you need this for hardy, not for dapper? The -8.3 package needs a small modification to be backportable to dapper, since dapper didn't have the new python world order yet.
