postgresql-8.3 cluster locale should not be utf8
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
postgresql-common (Debian) | Fix Released | Unknown | |
postgresql-common (Ubuntu) | Fix Released | Medium | Martin Pitt |
Bug Description
Binary package hint: postgresql-8.3
On Gutsy, the initdb command runs with locale=utf8.
I have nothing against using utf8, but it prevents you from using any other charsets which are not compatible with it (e.g. latin1).
I think it would be better to use the C or POSIX locale with utf8 as the default encoding, so you are still able to use charsets other than utf8 (initdb --locale=POSIX -E UTF8 ...).
Torsten
Torsten Krah (tkrah) wrote : | #2 |
It's not about avoiding utf8 as the default encoding for databases, only about not using it for the locale.
If you use e.g. de_DE.UTF8 as locale and issue this command:
createdb XX -E LATIN1
you get:
encoding LATIN1 does not match server's locale de_DE.UTF-8
DETAIL: The server's LC_CTYPE setting requires encoding UTF8.
This is what I mean by "prevents": you can't create another database with the latin1 charset on this cluster, as you suggested.
You are forced to use utf8, even if you have some old databases which have to use latin1.
Using POSIX or C as the locale does not show this problem.
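The rule behind the error message above can be approximated in a few lines of shell. This is only a sketch, not PostgreSQL's actual check: the function names `encoding_allowed` and `norm` are made up for illustration, a locale without a codeset suffix is simply rejected, and aliases such as ISO-8859-1 vs. LATIN1 are not mapped.

```shell
#!/bin/sh
# Hypothetical helpers approximating the 8.3 locale/encoding rule.

# Normalise a codeset or encoding name: upper-case, drop '-' and '_'.
norm() { printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | sed 's/[_-]//g'; }

# encoding_allowed LOCALE ENCODING
# Succeeds if a database with ENCODING could live in a cluster using LOCALE.
encoding_allowed() {
    case "$1" in
        C|POSIX) return 0 ;;            # C/POSIX accept every encoding
        *.*) codeset=${1#*.} ;;         # de_DE.UTF-8 -> UTF-8
        *)   return 1 ;;                # no codeset: unknown in this sketch
    esac
    codeset=${codeset%%@*}              # strip modifiers such as @euro
    [ "$(norm "$codeset")" = "$(norm "$2")" ]
}

encoding_allowed de_DE.UTF-8 LATIN1 || echo "LATIN1 rejected under de_DE.UTF-8"
encoding_allowed C LATIN1 && echo "LATIN1 accepted under C"
```

With a de_DE.UTF-8 cluster only UTF8 passes, while C or POSIX lets every encoding through, which matches the behaviour described above.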
I know how to create a new one (take the initdb script, set encoding, locale and path, and it's ready).
But upgrading existing installations might fail here, and it does not behave like previous versions (see the Debian bug report).
Found a similar Debian thread:
http://
Hope this helps.
Torsten Krah (tkrah) wrote : | #3 |
Reopened to discuss this further, as upgrading existing databases may fail.
Changed in postgresql-8.3: | |
status: | Invalid → New |
Martin Pitt (pitti) wrote : | #4 |
Martin Pitt (pitti) wrote : Reproduction recipe | #5 |
For the record, this is the reproduction recipe:
setup:
sudo pg_createcluster 8.2 main --start
sudo -u postgres createdb -E latin1 latintest
sudo -u postgres createdb utf8test
sudo -u postgres psql -c "create table t(x varchar); insert into t values(
sudo -u postgres psql -c "create table t(x varchar); insert into t values(E'A\xF6B');" latintest
verify that 8.2 DBs have correct encoding:
$ psql -Atc 'select * from t' utf8test
AöB
$ psql -Atc 'select * from t' latintest | iconv -f latin1
AöB
upgrade to 8.3 which breaks:
sudo pg_upgradecluster 8.2 main
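To make the two encodings in the recipe concrete, the following sketch prints the raw bytes of "AöB" in each form. It needs only `printf`, `od`, `tr` and `iconv`, no PostgreSQL; the helper name `hex` is made up for illustration.

```shell
#!/bin/sh
# Print the bytes of standard input as one lower-case hex string.
hex() { od -An -tx1 | tr -d ' \n'; }

# "AöB" as stored in the LATIN1 database: ö is the single byte 0xF6 (octal 366).
printf 'A\366B' | hex; echo                                # 41f642

# The same text converted to UTF-8, as a UTF8 database stores it:
# ö becomes the two bytes 0xC3 0xB6.
printf 'A\366B' | iconv -f LATIN1 -t UTF-8 | hex; echo     # 41c3b642
```

This is why the latintest output is piped through `iconv -f latin1` in the verification step: a UTF-8 terminal would otherwise receive the lone byte 0xF6, which is not valid UTF-8.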
I am not yet quite clear how to work around this problem. My
preferred approach would be to create the 8.3 databases with a proper
encoding on upgrade and convert the data on the fly. But at least I can
put the above recipe into the postgresql-common test suite.
Martin
--
Martin Pitt | http://
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Changed in postgresql-common: | |
importance: | Undecided → Medium |
status: | New → Confirmed |
Martin Pitt (pitti) wrote : | #6 |
Fixed in trunk.
Changed in postgresql-common: | |
assignee: | nobody → pitti |
status: | Confirmed → Fix Committed |
Launchpad Janitor (janitor) wrote : | #7 |
This bug was fixed in the package postgresql-common - 87
---------------
postgresql-common (87) unstable; urgency=medium
* Urgency medium since #472930 is an important bug fix.
* debian/
for "unknown status") instead of 0 (which means "service is running", but
it is debatable and confusing whether all clusters are running if there
are none at all). (LP: #203966)
* Update Spanish debconf translations, thanks Javier Fernández-Sanguino
Peña. (Closes: #473405)
* t/060_obsolete_
default_
"off", so both cases get tested. This replicates the problem report from
Karsten Hilbert.
* pg_upgradecluster: Work with default_
* debian/
only relevant for PostgreSQL versions earlier than 8.1. Thanks to Ross
Boylan for pointing this out.
* Add t/051_inconsist
pre-8.3 to 8.3 succeed and have correct encodings if the old DB had a
database whose encoding did not match the server locale. This reproduces
#472930.
* pg_upgradecluster: Fix handling of database encodings on upgrade, since
8.3 now forces DB encodings and server locale to match:
- With C locale, keep encoding of DBs on upgrade, just as in previous
versions. (C is compatible with all encodings, and causes lots of string
functions not to work correctly, but people still use it deliberately.)
- With other locales, create the target DB manually with a compatible
encoding, and call pg_restore in a way to not create the target DB and
automatically convert encoding.
- Closes: #472930, LP: #207779
-- Martin Pitt <email address hidden> Mon, 31 Mar 2008 14:15:25 +0100
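The encoding handling described in the changelog can be sketched as a dry run. This is not the actual pg_upgradecluster code: the function names `target_encoding` and `plan_upgrade` and the ports 5432 (old cluster) and 5433 (new cluster) are assumptions for illustration, and the commands are only printed, never executed.

```shell
#!/bin/sh
# Sketch only; see the hedges in the paragraph above.

# target_encoding NEW_LOCALE OLD_DB_ENCODING LOCALE_ENCODING
# Pick the encoding for a database in the new cluster.
target_encoding() {
    case "$1" in
        C|POSIX) printf '%s\n' "$2" ;;  # C accepts any encoding: keep the old one
        *)       printf '%s\n' "$3" ;;  # otherwise use the encoding the locale requires
    esac
}

# plan_upgrade NEW_LOCALE OLD_DB_ENCODING LOCALE_ENCODING DBNAME
# Print the commands for one database instead of running them.
plan_upgrade() {
    enc=$(target_encoding "$1" "$2" "$3")
    # Create the target database by hand with a compatible encoding.
    echo "createdb -p 5433 -E $enc $4"
    # No -C, so pg_restore loads into the database created above.
    echo "pg_dump -p 5432 -Fc $4 | pg_restore -p 5433 -d $4"
}

plan_upgrade de_DE.UTF-8 LATIN1 UTF8 latintest
```

Because pg_restore is called without `-C`, it does not try to recreate the database with its old encoding; the restore is expected to rely on the client encoding recorded in the dump so that the server converts the data as it is loaded.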
Changed in postgresql-common: | |
status: | Fix Committed → Fix Released |
Changed in postgresql-common: | |
status: | Unknown → Fix Released |
IMHO UTF-8 is absolutely the right thing nowadays. But even if you do not like it, you are welcome to pg_dropcluster the existing instance, and create a new one (man pg_createcluster) using a different default encoding. Or just create a new database with a different encoding.
What do you mean by "prevents you from using other charsets"?