Pg 9.6 unaccent() changes how certain characters are normalized
Bug #1719986 reported by Galen Charlton
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Evergreen | Won't Fix | Low | Unassigned | |
Bug Description
The following test in t/lp1501781-
SELECT is(evergreen.
'euvres', 'oe ligature');
This is because Pg 9.6's unaccent() function was corrected so that unaccent('Œuvres') will now return 'OEuvres' rather than 'Euvres'.
The test case is easy enough to adjust, but it's probably worth poking at this a bit more to identify other cases where the normalization changed, as some REINDEXes on columns in actor.usr may be called for if patron names contain any of the affected ligatures.
Evergreen master
Changed in evergreen: | |
milestone: | none → 2.12.7 |
milestone: | 2.12.7 → 3.next |
importance: | Undecided → Low |
summary: |
- Test case in need of adjustment under Pg 9.6
+ Pg 9.6 unaccent() changes how certain characters are normalized |
description: | updated |
tags: | added: database |
This would potentially affect sites using pg_upgrade, but not those using pg_dumpall to perform the upgrade.
In addition to the changes to the default contrib/unaccent mapping in PostgreSQL 9.6, there are further changes currently committed to master and likely to appear in PostgreSQL 11:
https://commitfest.postgresql.org/14/1161/
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ec0a69e49bf41a37b5c2d6f6be66d8abae00ee05
It should be possible to write a check for potentially affected values, but the changes are numerous (between 915 and 1029 mappings added/changed):
$ git diff --shortstat REL9_5_STABLE...REL9_6_STABLE -- contrib/unaccent/unaccent.rules
1 file changed, 915 insertions(+), 53 deletions(-)
$ git diff --shortstat REL9_5_STABLE...master -- contrib/unaccent/unaccent.rules
1 file changed, 1029 insertions(+), 53 deletions(-)
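As a rough idea of what such a check could look like, here is a minimal Python sketch. It assumes the simple rules layout of one source character per line, followed by whitespace and the replacement, with a missing replacement meaning the character is dropped; it does not handle any quoting syntax. The function names and the two inline rule excerpts are hypothetical, made up for illustration from the Œ behavior described in this bug, and are not copied from the real unaccent.rules files.

```python
# Toy excerpts in the assumed "source<whitespace>replacement" layout.
# These are illustrative samples, NOT the real 9.5/9.6 unaccent.rules.
OLD_RULES = "À\tA\nŒ\tE\n"
NEW_RULES = "À\tA\nŒ\tOE\nﬁ\tfi\n"


def parse_rules(text):
    """Parse rules text into a {source: replacement} dict."""
    rules = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        # A line with only a source is treated as "remove this character".
        rules[parts[0]] = parts[1] if len(parts) > 1 else ""
    return rules


def changed_sources(old_text, new_text):
    """Return the sources that were added, removed, or remapped."""
    old, new = parse_rules(old_text), parse_rules(new_text)
    added_or_changed = {src for src in new if old.get(src) != new[src]}
    removed = {src for src in old if src not in new}
    return added_or_changed | removed


def is_potentially_affected(value, sources):
    """True if the value contains any source whose mapping changed."""
    return any(src in value for src in sources)
```

In practice the two inputs would be the contents of contrib/unaccent/unaccent.rules from each branch, and the values to test would come from whatever columns feed the unaccent-based indexes. A value flagged here is only a candidate: the check says the string contains a character whose mapping differs, not that any particular index is actually stale.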
Worth noting: with the fix for bug 1671150 we'll be doing a drop/create on the affected indexes in upcoming 2.12 and 3.0 point releases, as well as 3.1.
At a minimum, it might be helpful to document which indexes an admin should consider REINDEXing if using pg_upgrade to move to PostgreSQL 9.6.
Perhaps a general "things to keep in mind when upgrading PostgreSQL" admin document or section of the release notes?
I propose we fix the test and document the concern either in an existing section in the docs/release notes, or start a new place if no suitable section exists.