db migrate script to set charset=utf8 for all tables

Bug #1279000 reported by Bhuvan Arumugam on 2014-02-11
48
This bug affects 9 people
Affects Status Importance Assigned to Milestone
Glance
Medium
Zhi Yan Liu
OpenStack + Chef
High
Unassigned
oslo-incubator
High
Zhi Yan Liu
glance (Ubuntu)
Medium
Unassigned
Trusty
Medium
Unassigned

Bug Description

2014-02-11 17:42:15.495 26564 CRITICAL glance [-] ValueError: Tables "image_locations,image_members,image_properties,image_tags,images,migrate_version,task_info,tasks" have non utf8 collation, please make sure all tables are CHARSET=utf8
2014-02-11 17:42:15.495 26564 TRACE glance Traceback (most recent call last):
2014-02-11 17:42:15.495 26564 TRACE glance File "/usr/local/csi/share/csi-glance.venv//bin/glance-manage", line 13, in <module>
2014-02-11 17:42:15.495 26564 TRACE glance sys.exit(main())
2014-02-11 17:42:15.495 26564 TRACE glance File "/usr/local/csi/share/csi-glance.venv/lib/python2.6/site-packages/glance/cmd/manage.py", line 220, in main
2014-02-11 17:42:15.495 26564 TRACE glance return CONF.command.action_fn()
2014-02-11 17:42:15.495 26564 TRACE glance File "/usr/local/csi/share/csi-glance.venv/lib/python2.6/site-packages/glance/cmd/manage.py", line 121, in sync
2014-02-11 17:42:15.495 26564 TRACE glance CONF.command.current_version)
2014-02-11 17:42:15.495 26564 TRACE glance File "/usr/local/csi/share/csi-glance.venv/lib/python2.6/site-packages/glance/cmd/manage.py", line 98, in sync
2014-02-11 17:42:15.495 26564 TRACE glance migration.db_sync(db_migration.MIGRATE_REPO_PATH, version)
2014-02-11 17:42:15.495 26564 TRACE glance File "/usr/local/csi/share/csi-glance.venv/lib/python2.6/site-packages/glance/openstack/common/db/sqlalchemy/migration.py", line 195, in db_sync
2014-02-11 17:42:15.495 26564 TRACE glance _db_schema_sanity_check()
2014-02-11 17:42:15.495 26564 TRACE glance File "/usr/local/csi/share/csi-glance.venv/lib/python2.6/site-packages/glance/openstack/common/db/sqlalchemy/migration.py", line 216, in _db_schema_sanity_check
2014-02-11 17:42:15.495 26564 TRACE glance ) % ','.join(table_names))
2014-02-11 17:42:15.495 26564 TRACE glance ValueError: Tables "image_locations,image_members,image_properties,image_tags,images,migrate_version,task_info,tasks" have non utf8 collation, please make sure all tables are CHARSET=utf8
2014-02-11 17:42:15.495 26564 TRACE glance

glance-manage fail to come up with above error. It's like due to following commit in oslo wherein we enforce charset=utf8 for all tables.
7aa94df Add a db check for CHARSET=utf8

I think we should have a migration script to change the charset for all tables.

Zhi Yan Liu (lzy-dev) on 2014-02-14
Changed in glance:
importance: Undecided → High
status: New → Confirmed
importance: High → Medium
Mark Washenberger (markwash) wrote :

Due to the potentially high cost:benefit ratio of such migrations for certain deployments, I think it would be great to simply remove this check from oslo-db, or otherwise make it optional.

But I'd love a gut-check on this opinion :-)

Bhuvan Arumugam (bhuvan) wrote :

This is specific to MySQL. I think, we should fix 2 things:
  1. documentation. fix documentation to configure MySQL with charset=utf8. The following lines should be added to /etc/my.cnf file [mysqld] section. This should fix this problem for new installations.
collation-server = utf8_general_ci
init-connect='SET NAMES utf8'
character-set-server = utf8
  2. for existing ones, I think a migration script would help. The script should scan through all tables and change charset, if it is not previously set as utf8.
  ALTER TABLE <table> CONVERT TO CHARACTER SET 'utf8';

FWIW, right now, we face this issue only in glance as oslo code is sync'ed to this project. We may face this problem in other components, once this change is merged in them.

Zhi Yan Liu (lzy-dev) wrote :

Reply base on above comments.

If we want to make utf-8 migration script approach work, we need to "optionlize" the utf-8 checking logic for db_sync() first. Otherwise we can only please operator (human) fix tables' charset to utf-8 instead of Glance, since any migration script can't work until the table charset be fixed to utf-8.

Changed in oslo:
assignee: nobody → Zhi Yan Liu (lzy-dev)
Changed in glance:
assignee: nobody → Zhi Yan Liu (lzy-dev)

Fix proposed to branch: master
Review: https://review.openstack.org/75356

Changed in oslo:
status: New → In Progress

Fix proposed to branch: master
Review: https://review.openstack.org/75865

Changed in glance:
status: Confirmed → In Progress

Fix proposed to branch: master
Review: https://review.openstack.org/75898

David Geng (genggjh) wrote :

Hi, I hint another error when set the character-set-server to utf8 when start neutron server:
INFO neutron.db.api [-] NT-8B9D93A Database registration exception: (OperationalError) (1071, 'Specified key was too long; max key length is 1000 bytes') '\nCREATE TABLE agents (\n\tid VARCHAR(36) NOT NULL, \n\tagent_type VARCHAR(255) NOT NULL, \n\t`binary` VARCHAR(255) NOT NULL, \n\ttopic VARCHAR(255) NOT NULL, \n\thost VARCHAR(255) NOT NULL, \n\tadmin_state_up BOOL NOT NULL, \n\tcreated_at DATETIME NOT NULL, \n\tstarted_at DATETIME NOT NULL, \n\theartbeat_timestamp DATETIME NOT NULL, \n\tdescription VARCHAR(255), \n\tconfigurations VARCHAR(4095) NOT NULL, \n\tPRIMARY KEY (id), \n\tCONSTRAINT uniq_agents0agent_type0host UNIQUE (agent_type, host), \n\tCHECK (admin_state_up IN (0, 1))\n)\n\n' ()

Doug Hellmann (doug-hellmann) wrote :

I'm having some trouble understanding how the proposed change fits into the bigger picture.

IIUC, we need all of the tables to have the same character set in order for foreign key constraints to work properly (https://bugs.launchpad.net/heat/+bug/1261605). The change proposed above would provide a switch to turn off character set checking, which would allow tables to not have the same character set. Is that wise? What effect would that have on future migrations?

It also appears that enabling the switch requires a change in the migration driving program for each project (glance-manage, etc.). Is there a way to define that option centrally so that all projects have it?

Why is glance the only project reporting this issue? Have we just not hit it in other projects? It looks like nova resolved the issue by providing an alter script that ensures the character sets for all the tables are utf8 (https://github.com/openstack/nova/blob/0a2ab98491cbf62e148e8289bc69c26357bd136c/nova/db/sqlalchemy/migrate_repo/versions/170_havana.py#L1395). Does glance need a similar change? I realize those sorts of changes can be rather expensive, so we would want to isolate the migration as a single step in the migration process and make sure it is highlighted in the release notes so deployers are aware of it.

Zhi Yan Liu (lzy-dev) wrote :

Hi Doug Hellmann,

Sorry for the delay response.

>>>> IIUC, we need all of the tables to have the same character set in order
for foreign key constraints to work properly
(https://bugs.launchpad.net/heat/+bug/1261605). The change proposed
above would provide a switch to turn off character set checking, which
would allow tables to not have the same character set. Is that wise?
What effect would that have on future migrations?

Actually I think we are on the same page on this point. The purpose for the
switch is just for provide a temporary disablement on the checking to give deployer a chance to
execute automatic utf8 table charset migration script, otherwise all the migrations will be blocked. So after that, the final charset of the tables will all under unified utf8. So this will not effect future migrations as well.

>>>> It also appears that enabling the switch requires a change in the
migration driving program for each project (glance-manage, etc.). Is
there a way to define that option centrally so that all projects have
it?

Yes, it's a worth way to go, but TBH IMHO seems there is not a common way. Like to know other folks input.

>>>> Why is glance the only project reporting this issue? Have we just not
hit it in other projects?

I think other projects should have this similar issue after they synced that utf8 check code from oslo-db.

>>>> It looks like nova resolved the issue by
providing an alter script that ensures the character sets for all the
tables are utf8
(https://github.com/openstack/nova/blob/0a2ab98491cbf62e148e8289bc69c26357bd136c/nova/db/sqlalchemy/migrate_repo/versions/170_havana.py#L1395).
Does glance need a similar change? I realize those sorts of changes can
be rather expensive, so we would want to isolate the migration as a
single step in the migration process and make sure it is highlighted in
the release notes so deployers are aware of it.

Yep, agreed. Actually I prepared a migration for Glance, and after my oslo change get mreger I will push it landing:
https://review.openstack.org/#/c/75898/5/glance/db/sqlalchemy/migrate_repo/versions/035_mysql_set_utf8_charset.py

Bhuvan Arumugam (bhuvan) wrote :

fwiw, we faced same issue in keystone.migrate_version table. The fix was to set default charset for all tables to 'utf8';

I think we should have migration script for all components and document mysql setting to use utf8 as default character set for new databases.

Zhi Yan Liu (lzy-dev) on 2014-03-13
Changed in glance:
milestone: none → icehouse-rc1
James Page (james-page) on 2014-03-17
Changed in glance (Ubuntu Trusty):
status: New → Triaged
importance: Undecided → High
James Page (james-page) on 2014-03-17
Changed in glance (Ubuntu Trusty):
importance: High → Medium
Changed in oslo:
assignee: Zhi Yan Liu (lzy-dev) → Ben Nemec (bnemec)
Ben Nemec (bnemec) on 2014-03-19
Changed in oslo:
milestone: none → icehouse-rc1
assignee: Ben Nemec (bnemec) → Zhi Yan Liu (lzy-dev)
Ben Nemec (bnemec) on 2014-03-21
Changed in oslo:
importance: Undecided → High

Reviewed: https://review.openstack.org/75356
Committed: https://git.openstack.org/cgit/openstack/oslo-incubator/commit/?id=f7705f3a7d6acd8319318afc0fd9d1b534285f7d
Submitter: Jenkins
Branch: master

commit f7705f3a7d6acd8319318afc0fd9d1b534285f7d
Author: Zhi Yan Liu <email address hidden>
Date: Sat Feb 1 06:24:16 2014 +0800

    Make table utf-8 charset checking be optional for DB migration

    Add an option to the migration db_sync call to allow a project to
    disable the check for utf8 encoding on all tables. This allows the
    project to fix problems automatically via a migration, which
    otherwise would not be possible because the utf8 check would stop
    them from running migrations.

    Change-Id: I7f271d846141ac72dde3fb0d12159b125018eb2c
    Closes-bug: #1279000
    Signed-off-by: Zhi Yan Liu <email address hidden>

Changed in oslo:
status: In Progress → Fix Committed
Thiago Martins (martinx) wrote :

Hello guys,

Just for the record, brand new installation, Ubuntu 14.04, today's upgrade, keystone version "1:2014.1~rc1-0ubuntu1" and, first time running "keystone-manage db_sync" command on an empty "keystone database", it returns:

"2014-03-29 03:09:43.521 18564 CRITICAL keystone [-] ValueError: Tables "migrate_version" have non utf8 collation, please make sure all tables are CHARSET=utf8"

...maybe it is related to this BUG, I'm not sure... Only one table got created on keystone db, which is "migrate_version", db_sync aborted.

My keystone/mysql server is a KVM Virtual Machine, IPv6-Only setup, name resolution okay. MySQL bind-listen on "::".

After following post #2 and restarting it from scratch, db_sync worked!

Best!
Thiago

Dolph Mathews (dolph) wrote :

Thiago, that exception looks to be caused by this patch to oslo.db:

  https://github.com/openstack/oslo-incubator/commit/7aa94df

Should there be an exception for migrate_version there? (why do we care about the charset of that table?) Alternatively, Nova appears to have a workaround:

    if migrate_engine.name == "mysql":
        # In Folsom we explicitly converted migrate_version to UTF8.
        sql = "ALTER TABLE migrate_version CONVERT TO CHARACTER SET utf8;"
        # Set default DB charset to UTF8.
        sql += "ALTER DATABASE %s DEFAULT CHARACTER SET utf8;" % \
                migrate_engine.url.database
        migrate_engine.execute(sql)

Changed in keystone:
status: New → Incomplete
Thierry Carrez (ttx) on 2014-04-01
tags: added: icehouse-rc-potential
Ben Nemec (bnemec) wrote :

If the charset of migrate_version doesn't matter (I don't know if that's true though), then we might just want to exclude it from the check in the oslo function since we would know that it never matters.

Lars Kellogg-Stedman (larsks) wrote :

Even with https://review.openstack.org/#/c/75356/ merged into glance, glance master will still fail with the original error if it encounters a non-utf8 database, because it calls db_sync with without modifying the default sanity_check=True. So we either need an option to set sanity_check=False when upgrading the database, or really the migration mechanism should take care of converting database tables to utf-8.

Ben Nemec (bnemec) wrote :

Lars, see https://review.openstack.org/#/c/75865/ That should allow migrations to continue for Icehouse without forcing utf8.

Unfortunately there's a lot of potential complexity around converting tables to utf8 so the Glance team doesn't want to add that this late in the cycle. This is a temporary workaround and hopefully we can come up with a permanent solution in Juno.

Lars Kellogg-Stedman (larsks) wrote :

Ben: Ah, thanks, I missed that. Is there a similar fix for keystone? Keystone has the same utf8 check, but does not yet appear to have merged (or have pending) the changes to make it optional and to expose that to the -manage command.

Reviewed: https://review.openstack.org/75865
Committed: https://git.openstack.org/cgit/openstack/glance/commit/?id=1a40831f1711bf46c878dd247679bb05d19ffb5a
Submitter: Jenkins
Branch: master

commit 1a40831f1711bf46c878dd247679bb05d19ffb5a
Author: Zhi Yan Liu <email address hidden>
Date: Mon Feb 24 18:47:51 2014 +0800

    Making DB sanity checking be optional for DB migration

    Adding a deprecated option to skip the DB sanity check to db_sync
    interface, to give deployer a chance to execute migration script.
    Without this change all migration scripts will be blocked before get
    execution.

    DocImpact

    Change-Id: If887ac6b814879a0140dc1065a060233aee7bc82
    Partial-Bug: #1279000
    Related-Id: I7f271d846141ac72dde3fb0d12159b125018eb2c
    Signed-off-by: Zhi Yan Liu <email address hidden>

Thierry Carrez (ttx) on 2014-04-01
Changed in glance:
status: In Progress → Fix Released
Ben Nemec (bnemec) wrote :

I'm not sure what the status is in Keystone. I seem to recall that they had already done a utf8 migration script, but to be safe you should probably check with them directly.

Thierry Carrez (ttx) wrote :

See bug 1301036 for the keystone issue

no longer affects: keystone
Thierry Carrez (ttx) on 2014-04-04
Changed in oslo:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2014-04-17
Changed in oslo:
milestone: icehouse-rc1 → 2014.1
Thierry Carrez (ttx) on 2014-04-17
Changed in glance:
milestone: icehouse-rc1 → 2014.1
Changed in glance (Ubuntu):
status: Triaged → Fix Released
Changed in glance (Ubuntu Trusty):
status: Triaged → Fix Released
Download full text (3.5 KiB)

Thanks a lot

2014-04-29 18:27 GMT+02:00 Corey Bryant <email address hidden>:

> ** Changed in: glance (Ubuntu)
> Status: Triaged => Fix Released
>
> ** Changed in: glance (Ubuntu Trusty)
> Status: Triaged => Fix Released
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1279000
>
> Title:
> db migrate script to set charset=utf8 for all tables
>
> Status in OpenStack Image Registry and Delivery Service (Glance):
> Fix Released
> Status in Oslo - a Library of Common OpenStack Code:
> Fix Released
> Status in “glance” package in Ubuntu:
> Fix Released
> Status in “glance” source package in Trusty:
> Fix Released
>
> Bug description:
> 2014-02-11 17:42:15.495 26564 CRITICAL glance [-] ValueError: Tables
> "image_locations,image_members,image_properties,image_tags,images,migrate_version,task_info,tasks"
> have non utf8 collation, please make sure all tables are CHARSET=utf8
> 2014-02-11 17:42:15.495 26564 TRACE glance Traceback (most recent call
> last):
> 2014-02-11 17:42:15.495 26564 TRACE glance File
> "/usr/local/csi/share/csi-glance.venv//bin/glance-manage", line 13, in
> <module>
> 2014-02-11 17:42:15.495 26564 TRACE glance sys.exit(main())
> 2014-02-11 17:42:15.495 26564 TRACE glance File
> "/usr/local/csi/share/csi-glance.venv/lib/python2.6/site-packages/glance/cmd/manage.py",
> line 220, in main
> 2014-02-11 17:42:15.495 26564 TRACE glance return
> CONF.command.action_fn()
> 2014-02-11 17:42:15.495 26564 TRACE glance File
> "/usr/local/csi/share/csi-glance.venv/lib/python2.6/site-packages/glance/cmd/manage.py",
> line 121, in sync
> 2014-02-11 17:42:15.495 26564 TRACE glance
> CONF.command.current_version)
> 2014-02-11 17:42:15.495 26564 TRACE glance File
> "/usr/local/csi/share/csi-glance.venv/lib/python2.6/site-packages/glance/cmd/manage.py",
> line 98, in sync
> 2014-02-11 17:42:15.495 26564 TRACE glance
> migration.db_sync(db_migration.MIGRATE_REPO_PATH, version)
> 2014-02-11 17:42:15.495 26564 TRACE glance File
> "/usr/local/csi/share/csi-glance.venv/lib/python2.6/site-packages/glance/openstack/common/db/sqlalchemy/migration.py",
> line 195, in db_sync
> 2014-02-11 17:42:15.495 26564 TRACE glance _db_schema_sanity_check()
> 2014-02-11 17:42:15.495 26564 TRACE glance File
> "/usr/local/csi/share/csi-glance.venv/lib/python2.6/site-packages/glance/openstack/common/db/sqlalchemy/migration.py",
> line 216, in _db_schema_sanity_check
> 2014-02-11 17:42:15.495 26564 TRACE glance ) % ','.join(table_names))
> 2014-02-11 17:42:15.495 26564 TRACE glance ValueError: Tables
> "image_locations,image_members,image_properties,image_tags,images,migrate_version,task_info,tasks"
> have non utf8 collation, please make sure all tables are CHARSET=utf8
> 2014-02-11 17:42:15.495 26564 TRACE glance
>
> glance-manage fail to come up with above error. It's like due to
> following commit in oslo wherein we enforce charset=utf8 for all tables.
> 7aa94df Add a db check for CHARSET=utf8
>
> I think we should have a migration script to change the charset for
> all tables.
>
> To manage notif...

Read more...

Jon Proulx (jproulx) wrote :

HI All,

I Just hit this bug during a test Upuntu 12.04 test upgrade , I see fix released for 14.04 should I have this fix in 12.04?

root@test-controller:~# dpkg -l glance
||/ Name Version
+++-==================================================
ii glance 1:2014.1-0ubuntu1~cloud0

root@test-controller:~# glance-manage db sync
2014-06-30 16:59:24.744 9119 CRITICAL glance [-] ValueError: Tables "image_locations,image_members,image_properties,image_tags,images,migrate_version" have non utf8 collation, please make sure all tables are CHARSET=utf8

JJ Asghar (d-jj) wrote :

Heres' a fix, it's gross but works.

# mysql -u root -pilikerandompasswords glance
mysql> alter table migrate_version convert to character set utf8 collate utf8_unicode_ci;
mysql> flush privileges;
mysql> quit

JJ Asghar (d-jj) wrote :

Here is the openstack + chef bug report: https://bugs.launchpad.net/openstack-chef/+bug/1356887

Changed in openstack-chef:
importance: Undecided → High
status: New → Triaged
Changed in openstack-chef:
status: Triaged → Fix Released

Change abandoned by Flavio Percoco (<email address hidden>) on branch: master
Review: https://review.openstack.org/109154
Reason: after all this time, I'd assume we don't need this patch anymore. The bug/blueprint was probably fixed already or invalidated. Please, feel free to re-open it if you think otherwise.

Reviewed: https://review.openstack.org/207297
Committed: https://git.openstack.org/cgit/openstack/glance/commit/?id=e5d10c3c1b52b62ed685b98500fc4416964d0951
Submitter: Jenkins
Branch: master

commit e5d10c3c1b52b62ed685b98500fc4416964d0951
Author: Sabari Kumar Murugesan <email address hidden>
Date: Wed Jul 29 10:06:46 2015 -0700

    Fix existing migrations to create utf-8 tables for MySQL DB

    oslo.db mandates all tables should be created with utf-8 charset
    when using MySQL DB. We added an option (deprecated) to skip this
    sanity check and it was subsequently removed in Liberty.

    This patch fixes existing migration scripts to create tables with
    utf-8 charset. Since oslo.db does a sanity check before running
    the migrations, it's safe to assume that only new tables created
    since the previous migration will be affected.

    Closes-bug: #1279000

    Change-Id: Ia31dc9e5d4494d1b9633f916fa6e2704d33dffb0

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers