neutron db migration fails, only attempted by n-c-c @ Juno

Bug #1524448 reported by Ryan Beisner
This bug affects 1 person
Affects: nova-cloud-controller (Juju Charms Collection)
Status: Fix Released
Importance: Medium
Assigned to: David Ames
Milestone: 16.04

Bug Description

neutron db migration fails, only attempted by the n-c-c charm when deploying Juno

http://bazaar.launchpad.net/~openstack-charmers/charms/trusty/nova-cloud-controller/next/view/204/hooks/nova_cc_hooks.py#L251

http://paste.ubuntu.com/13863246/

2015-12-09 16:52:38 INFO shared-db-relation-changed oslo.db.exception.DBDuplicateEntry: (IntegrityError) (1062, "Duplicate entry 'L3 agent-fat-machine' for key 'uniq_agents0agent_type0host'") 'ALTER TABLE agents ADD CONSTRAINT uniq_agents0agent_type0host UNIQUE (agent_type, host)' ()
2015-12-09 16:52:39 INFO worker.uniter.jujuc server.go:172 running hook tool "juju-log" ["-l" "INFO" "Retrying 'migrate_neutron_database' 1 more times (delay=15)"]
2015-12-09 16:52:39 INFO juju-log shared-db:65: Retrying 'migrate_neutron_database' 1 more times (delay=15)

...

2015-12-09 16:52:55 INFO shared-db-relation-changed subprocess.CalledProcessError: Command '['neutron-db-manage', '--config-file=/etc/neutron/neutron.conf', '--config-file=/etc/neutron/plugins/ml2/ml2_conf.ini', 'upgrade', 'head']' returned non-zero exit status 1
2015-12-09 16:52:55 INFO juju.worker.uniter.context context.go:579 handling reboot
2015-12-09 16:52:55 ERROR juju.worker.uniter.operation runhook.go:107 hook "shared-db-relation-changed" failed: exit status 1
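
The ALTER TABLE in the log fails because the agents table already holds more than one row with the same (agent_type, host) pair, so the unique constraint cannot be added. A minimal sketch of a check for such rows follows. It uses an in-memory SQLite database as a stand-in for the real MySQL database, and the table layout, the function name find_duplicate_agents, and the sample rows are assumptions for illustration only, not taken from the neutron schema or the charm.

```python
import sqlite3


def find_duplicate_agents(conn):
    """Return (agent_type, host, count) for each pair that occurs more than once."""
    cur = conn.execute(
        "SELECT agent_type, host, COUNT(*) FROM agents "
        "GROUP BY agent_type, host HAVING COUNT(*) > 1 "
        "ORDER BY agent_type, host"
    )
    return cur.fetchall()


def demo():
    # Hypothetical, simplified agents table; the real one has more columns.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE agents (id TEXT PRIMARY KEY, agent_type TEXT, host TEXT)"
    )
    conn.executemany(
        "INSERT INTO agents VALUES (?, ?, ?)",
        [
            ("a1", "L3 agent", "fat-machine"),
            ("a2", "L3 agent", "fat-machine"),  # same pair registered twice
            ("a3", "DHCP agent", "fat-machine"),
        ],
    )
    return find_duplicate_agents(conn)


if __name__ == "__main__":
    for agent_type, host, count in demo():
        print(agent_type, host, count)
```

Any pair this query returns would block a UNIQUE (agent_type, host) constraint until the extra rows are dealt with.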

Revision history for this message
David Ames (thedac) wrote :

According to James Page, this was due to a grace period that allowed Juno to be deployed without neutron-api; that grace period has now passed.

MPs to come removing migrate_neutron_database from nova-cc and running it unconditionally in neutron-api.

Changed in nova-cloud-controller (Juju Charms Collection):
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → David Ames (thedac)
milestone: none → 16.01
David Ames (thedac)
Changed in nova-cloud-controller (Juju Charms Collection):
status: Triaged → In Progress
Revision history for this message
Ryan Beisner (1chb1n) wrote :

FYI - Now re-testing with the proposed charm branches, on metal where the issue was observed.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Test on bare metal, where we observed the db migration fail, is now passing. Substituted the proposed n-api and n-c-c charms in a trusty-juno metal deploy, no more db migration woes.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

FYI, juju stat, with the proposed charms deployed: http://paste.ubuntu.com/13898125/

Revision history for this message
Ryan Beisner (1chb1n) wrote :

I'm no longer able to reproduce the n-c-c neutron db migration fail with unmodified charms (a good thing).

The mystery is that I don't know exactly what "fixed" it.

A number of packages were SRU'd and updated in the Juno cloud archive this week, so that may be the reason.

# neutron-api unit says (as expected):
2015-12-10 18:35:12 INFO juju-log shared-db:55: Not running neutron database migration as migrations are handled by the neutron-server process or nova-cloud-controller charm.

# nova-cloud-controller does the db migration successfully:
http://paste.ubuntu.com/13902181/

David Ames (thedac)
Changed in nova-cloud-controller (Juju Charms Collection):
status: In Progress → Invalid
Revision history for this message
Larry Michel (lmic) wrote :

We are hitting this with the following OpenStack pipeline parameters:
+ . ./pipeline_parameters
++ OPENSTACK_RELEASE=juno
++ COMPUTE=nova-kvm
++ BLOCK_STORAGE=cinder-vnx
++ IMAGE_STORAGE=glance-swift
++ BACKEND_DATABASE=mysql
++ NETWORKING=neutron-openvswitch
++ UBUNTU_RELEASE=trusty

From neutron-api unit log:

2016-01-18 14:54:43 INFO worker.uniter.jujuc server.go:172 running hook tool "juju-log" ["Not running neutron database migration as migrations are handled by the neutron-server process or nova-cloud-controller charm."]
2016-01-18 14:54:43 DEBUG worker.uniter.jujuc server.go:173 hook context id "neutron-api/0-shared-db-relation-changed-7941853246600621954"; dir "/var/lib/juju/agents/unit-neutron-api-0/charm"
2016-01-18 14:54:43 INFO juju-log shared-db:39: Not running neutron database migration as migrations are handled by the neutron-server process or nova-cloud-controller charm.

and in nova-cloud-controller unit log file:

2016-01-18 14:59:08 INFO shared-db-relation-changed cursor.execute(statement, parameters)
2016-01-18 14:59:08 INFO shared-db-relation-changed File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 174, in execute
2016-01-18 14:59:08 INFO shared-db-relation-changed self.errorhandler(self, exc, value)
2016-01-18 14:59:08 INFO shared-db-relation-changed File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
2016-01-18 14:59:08 INFO shared-db-relation-changed raise errorclass, errorvalue
2016-01-18 14:59:08 INFO shared-db-relation-changed oslo.db.exception.DBDuplicateEntry: (IntegrityError) (1062, "Duplicate entry 'L3 agent-hayward-02' for key 'uniq_agents0agent_type0host'") 'ALTER TABLE agents ADD CONSTRAINT uniq_agents0agent_type0host UNIQUE (agent_type, host)' ()
2016-01-18 14:59:08 INFO worker.uniter.jujuc server.go:172 running hook tool "juju-log" ["-l" "INFO" "Retrying 'migrate_neutron_database' 5 more times (delay=3)"]
2016-01-18 14:59:08 DEBUG worker.uniter.jujuc server.go:173 hook context id "nova-cloud-controller/0-shared-db-relation-changed-5106613301876849604"; dir "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm"
2016-01-18 14:59:08 INFO juju-log shared-db:16: Retrying 'migrate_neutron_database' 5 more times (delay=3)
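
The "Retrying ... 5 more times (delay=3)" and "1 more times (delay=15)" messages are consistent with a retry loop whose delay grows linearly with each attempt. The sketch below is inferred from those log lines only and is not the charm's actual retry helper; the names retry, num_retries, base_delay, and sleep are assumptions. It illustrates why retrying does not help here: the duplicate row is still present on every attempt, so the same error is raised each time.

```python
import time


def retry(func, num_retries, base_delay, sleep=time.sleep):
    """Call func, retrying up to num_retries times with a linearly growing delay."""
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            if attempt >= num_retries:
                raise  # retries exhausted, surface the last error
            attempt += 1
            sleep(base_delay * attempt)  # 3, 6, 9, ... for base_delay=3


if __name__ == "__main__":
    def migrate():
        # Stand-in for a migration that fails the same way every time.
        raise RuntimeError("Duplicate entry for key 'uniq_agents0agent_type0host'")

    waited = []
    try:
        retry(migrate, num_retries=5, base_delay=3, sleep=waited.append)
    except RuntimeError as exc:
        print("gave up after delays", waited, "->", exc)
```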

Changed in nova-cloud-controller (Juju Charms Collection):
status: Invalid → New
tags: added: oil
James Page (james-page)
Changed in nova-cloud-controller (Juju Charms Collection):
milestone: 16.01 → 16.04
Revision history for this message
Ryan Beisner (1chb1n) wrote :

Observed this again 2016 Jan 29, Trusty-Juno (stable charms), deployed to metal:

http://pastebin.ubuntu.com/14729769/

2016-01-29 04:44:13 INFO shared-db-relation-changed subprocess.CalledProcessError: Command '['neutron-db-manage', '--config-file=/etc/neutron/neutron.conf', '--config-file=/etc/neutron/plugins/ml2/ml2_conf.ini', 'upgrade', 'head']' returned non-zero exit status 1
2016-01-29 04:44:13 ERROR juju.worker.uniter.operation runhook.go:107 hook "shared-db-relation-changed" failed: exit status 1

Revision history for this message
David Ames (thedac) wrote :

Found the root cause. It is a race condition around the neutron-api relation. If the neutron-server service is started before the neutron-api relation is joined, it creates the MySQL tables itself, and subsequent migrations fail.

    @hooks.hook('neutron-api-relation-joined')
    def neutron_api_relation_joined(rid=None, remote_restart=False):
        with open('/etc/init/neutron-server.override', 'wb') as out:
            out.write('manual\n')

Proved by deploying without the neutron-api relation.
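
One way to picture closing this window is to write the Upstart override before anything can start the service, rather than waiting for the relation-joined hook. The sketch below only illustrates that idea and is not the fix that was merged; the function names hold_service and release_service are hypothetical, and the path is passed in as a parameter so the example can run against a scratch directory instead of /etc/init.

```python
import os


def hold_service(override_path):
    """Write an Upstart override containing 'manual' so the job is not auto-started."""
    with open(override_path, "w") as out:
        out.write("manual\n")


def release_service(override_path):
    """Remove the override so the job can be started normally again."""
    if os.path.exists(override_path):
        os.remove(override_path)


if __name__ == "__main__":
    import tempfile

    # Scratch directory standing in for /etc/init (assumption for the demo).
    scratch = tempfile.mkdtemp()
    path = os.path.join(scratch, "neutron-server.override")
    hold_service(path)
    print(open(path).read().strip())
    release_service(path)
```

Under this assumption, hold_service would be called from an early hook such as install, so neutron-server never gets the chance to create tables ahead of the migration.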

Solution coming.

Revision history for this message
Ryan Beisner (1chb1n) wrote :

Confirmed T-J on metal success with that merge: 3 of 3 all green. Thanks!

David Ames (thedac)
Changed in nova-cloud-controller (Juju Charms Collection):
status: New → Fix Committed
Revision history for this message
Ryan Beisner (1chb1n) wrote :

Issue still exists @ n-c-c stable. Need to evaluate for backport to stable.

tags: added: backport-potential
Revision history for this message
Ryan Beisner (1chb1n) wrote :

Issue has returned to n-c-c master(next). Observing consistently on bare metal deploys.

2016-03-18 07:06:29 INFO shared-db-relation-changed oslo.db.exception.DBDuplicateEntry: (IntegrityError) (1062, "Duplicate entry 'L3 agent-fat-machine' for key 'uniq_agents0agent_type0host'") 'ALTER TABLE agents ADD CONSTRAINT uniq_agents0agent_type0host UNIQUE (agent_type, host)' ()
2016-03-18 07:06:29 INFO juju-log shared-db:17: Retrying 'migrate_neutron_database' 1 more times (delay=15)
2016-03-18 07:06:44 INFO juju-log shared-db:17: Migrating the neutron database.

Juju stat, unit log snippet:
http://pastebin.ubuntu.com/15414454/

Also see the full n-c-c unit log as attached.

Changed in nova-cloud-controller (Juju Charms Collection):
status: Fix Committed → New
James Page (james-page)
Changed in nova-cloud-controller (Juju Charms Collection):
milestone: 16.04 → 16.07
Revision history for this message
David Ames (thedac) wrote :

This was resolved with the removal of neutron from nova-cloud-controller.

Changed in nova-cloud-controller (Juju Charms Collection):
status: New → Fix Released
milestone: 16.07 → 16.04