password update shared-db relation does not fire nova_cell_api_relation_changed

Bug #1883142 reported by David O Neill
This bug affects 2 people
Affects: OpenStack Nova Cloud Controller Charm
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

Short Description
=================
nova_cell_api_relation_changed does not fire on shared-db-relation-changed or on any other event.

Long description
================
Due to an issue with the mysql-percona charm, we had to remove the units and redeploy the charm, and then restore the database from backup.
During this time the shared-db relation fired many times, but the cell0 connection string was never updated.

https://docs.openstack.org/nova/latest/user/cells.html

Inspection of the unit log shows no such event over the entire lifecycle of the mysql shared-db relation changes (over 2 months). The grep below returns nothing:

grep "Cell registration data changed, triggering a remote restart" unit-nova-cloud-controller-0.log

ls -liah unit-nova-cloud-controller-0.log
98437571 -rw-r----- 1 syslog adm 96M Jun 11 16:01 unit-nova-cloud-controller-0.log

head -n 1 unit-nova-cloud-controller-0.log
2019-04-19 12:41:39 INFO juju.cmd supercommand.go:57 running jujud [2.5.4 gc go1.11.6]

tail -n 1 unit-nova-cloud-controller-0.log
2020-06-11 16:00:51 DEBUG juju-log 0 section(s) found
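
The same check can be scripted so that the covered time range and the number of matches come out in one pass. This is only a sketch: the function name `scan_log` is made up for illustration, and the sample lines are the head and tail lines quoted above rather than the full 96M log.

```python
import re
from typing import Iterable, List, Tuple

# Message logged by nova_cell_api_relation_changed when it triggers a restart.
MESSAGE = "Cell registration data changed, triggering a remote restart"
# Juju unit logs start each line with "YYYY-MM-DD HH:MM:SS".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def scan_log(lines: Iterable[str], message: str = MESSAGE) -> Tuple[str, str, List[str]]:
    """Return (first timestamp, last timestamp, lines containing message)."""
    first = last = ""
    hits: List[str] = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            if not first:
                first = match.group(1)
            last = match.group(1)
        if message in line:
            hits.append(line.rstrip("\n"))
    return first, last, hits


# Sample input: the first and last log lines quoted in this report.
sample = [
    "2019-04-19 12:41:39 INFO juju.cmd supercommand.go:57 running jujud [2.5.4 gc go1.11.6]\n",
    "2020-06-11 16:00:51 DEBUG juju-log 0 section(s) found\n",
]
first, last, hits = scan_log(sample)
print("log covers", first, "to", last, "with", len(hits), "matching line(s)")
```

Against the real file, `scan_log(open("unit-nova-cloud-controller-0.log", errors="replace"))` would stream the log line by line instead of loading it into memory.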

From the hooks file
===================
grep -ri "Cell registration data changed, triggering a remote restart" /var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/
/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/nova_cc_hooks.py: "Cell registration data changed, triggering a remote restart",

This left Heat unable to delete stacks and caused other strange behavior.

In nova.log the following was observed
======================================
2020-06-10 13:53:04.941 1112506 ERROR nova.context OperationalError: (pymysql.err.OperationalError) (1045, u"Access denied for user 'nova'@'10.116.52.84' (using password: YES)")
2020-06-10 13:56:02.805 1112506 ERROR nova.context [req-496dac2f-9a3d-4892-afeb-77e58561efde 9e30fe9de70041fcbdfba6216255806c bc26db3ae04948658687c7cde7c588f8 - 0d06495de92149f3a0b56a3fbe12249e 0d06495de92149f3a0b56a3fbe12249e] Error gathering result from cell 00000000-0000-0000-0000-000000000000: OperationalError: (pymysql.err.OperationalError) (1045, u"Access denied for user 'nova'@'10.116.52.84' (using password: YES)")
2020-06-10 13:56:02.805 1112506 ERROR nova.context OperationalError: (pymysql.err.OperationalError) (1045, u"Access denied for user 'nova'@'10.116.52.84' (using password: YES)")
2020-06-10 13:57:04.996 1112506 ERROR nova.context [req-99cb2747-3c94-487c-9cba-96367cf3f1e4 0b8f96bf476d477782ae5fb71fa78b68 bc26db3ae04948658687c7cde7c588f8 - 0d06495de92149f3a0b56a3fbe12249e 0d06495de92149f3a0b56a3fbe12249e] Error gathering result from cell 00000000-0000-0000-0000-000000000000: OperationalError: (pymysql.err.OperationalError) (1045, u"Access denied for user 'nova'@'10.116.52.84' (using password: YES)")
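
To confirm that every failure points at the same database user, client address and cell, the error lines can be summarized with a small script. This is a sketch with a hypothetical `summarize` helper; the two sample lines are copied from the excerpt above. The all-zero UUID is the one Nova uses for cell0, which matches the stale cell0 connection string described earlier.

```python
import re
from collections import Counter

# "Access denied for user '<user>'@'<host>'" as emitted by MySQL error 1045.
DENIED = re.compile(r"Access denied for user '([^']+)'@'([^']+)'")
# Cell UUID named in nova.context's "Error gathering result from cell" message.
CELL = re.compile(r"Error gathering result from cell ([0-9a-f-]{36})")


def summarize(lines):
    """Count denied (user, host) pairs and affected cell UUIDs."""
    denied = Counter()
    cells = Counter()
    for line in lines:
        denied_match = DENIED.search(line)
        if denied_match:
            denied[(denied_match.group(1), denied_match.group(2))] += 1
        cell_match = CELL.search(line)
        if cell_match:
            cells[cell_match.group(1)] += 1
    return denied, cells


# Two lines taken verbatim from the nova.log excerpt in this report.
sample_lines = [
    '''2020-06-10 13:53:04.941 1112506 ERROR nova.context OperationalError: (pymysql.err.OperationalError) (1045, u"Access denied for user 'nova'@'10.116.52.84' (using password: YES)")''',
    '''2020-06-10 13:56:02.805 1112506 ERROR nova.context [req-496dac2f-9a3d-4892-afeb-77e58561efde 9e30fe9de70041fcbdfba6216255806c bc26db3ae04948658687c7cde7c588f8 - 0d06495de92149f3a0b56a3fbe12249e 0d06495de92149f3a0b56a3fbe12249e] Error gathering result from cell 00000000-0000-0000-0000-000000000000: OperationalError: (pymysql.err.OperationalError) (1045, u"Access denied for user 'nova'@'10.116.52.84' (using password: YES)")''',
]
denied, cells = summarize(sample_lines)
for (user, host), count in denied.items():
    print(user, host, count)
for cell_uuid, count in cells.items():
    print(cell_uuid, count)
```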

Code in question
===================

@hooks.hook('nova-cell-api-relation-changed')
def nova_cell_api_relation_changed(rid=None, unit=None):
    data = hookenv.relation_get(rid=rid, unit=unit)
    ch_neutron.log("Data: {}".format(data, level=hookenv.DEBUG))
    if not data.get('cell-name'):
        return
    cell_updated = ncc_utils.update_child_cell(
        name=data['cell-name'],
        db_service=data['db-service'],
        amqp_service=data['amqp-service'])
    if cell_updated:
        hookenv.log(
            "Cell registration data changed, triggering a remote restart",
            level=hookenv.DEBUG)
        hookenv.relation_set(
            relation_id=rid,
            restart_trigger=str(uuid.uuid4()))
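
Two things in the snippet above matter for this report: the handler is registered only under the name nova-cell-api-relation-changed, and the restart trigger is set only when update_child_cell returns a truthy value. The toy model below illustrates both. It is not the charm's code: `ToyHooks` is a hypothetical stand-in for a name-based hook registry, `update_child_cell` here is a simplified placeholder that compares stored values, and the `shared_db_relation_changed` handler is assumed, for the sake of illustration, not to call the cell handler. Whether the real shared-db handler does so is the open question of this bug.

```python
import uuid


class ToyHooks:
    """Hypothetical name-based hook registry (illustration only)."""

    def __init__(self):
        self._registry = {}

    def hook(self, *names):
        def decorator(func):
            for name in names:
                self._registry[name] = func
            return func
        return decorator

    def execute(self, name, **kwargs):
        # Only the handler registered under this exact name runs.
        handler = self._registry.get(name)
        return handler(**kwargs) if handler else None


hooks = ToyHooks()
relation_settings = {}   # stands in for data set on the relation
registered_cells = {}    # stands in for the stored cell registration


def update_child_cell(name, db_service, amqp_service):
    """Placeholder: report a change only if the stored registration differs."""
    new = (db_service, amqp_service)
    if registered_cells.get(name) == new:
        return False
    registered_cells[name] = new
    return True


@hooks.hook("nova-cell-api-relation-changed")
def nova_cell_api_relation_changed(data):
    if not data.get("cell-name"):
        return False
    if update_child_cell(data["cell-name"], data["db-service"], data["amqp-service"]):
        relation_settings["restart_trigger"] = str(uuid.uuid4())
        return True
    return False


@hooks.hook("shared-db-relation-changed")
def shared_db_relation_changed(data):
    # Assumption for this sketch: the cell handler is NOT called from here.
    return "db config rewritten"


hooks.execute("shared-db-relation-changed", data={"password": "new"})
print("trigger after shared-db event:", relation_settings.get("restart_trigger"))
```

In this model a shared-db event leaves `restart_trigger` unset, and even a direct call to the cell handler sets it only when the registration data differs from what is stored.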

What we would like
==================
Should this hook fire on shared-db-relation-changed, and if so, why did it not?

Thank you

Tags: scaleback
summary: - password uopdate shared-db relation does not fire
+ password update shared-db relation does not fire
nova_cell_api_relation_changed
Ryan Beisner (1chb1n)
tags: added: scaleback
Ryan Beisner (1chb1n) wrote :

Given the situation (adding/removing database units mixed with rolling back database contents), we would expect the administrator to have to intervene manually in order to recover operations.

However, we think there is a bug to fix here in less-turbulent scenarios, and we'll look further into that.

Changed in charm-nova-cloud-controller:
milestone: none → 20.08
James Page (james-page)
Changed in charm-nova-cloud-controller:
milestone: 20.08 → none
Changed in charm-nova-cloud-controller:
status: New → Triaged
importance: Undecided → Medium
Chris MacNaughton (chris.macnaughton) wrote :

I believe that this bug should be resolved by another recent fix for bug https://bugs.launchpad.net/charm-nova-cloud-controller/+bug/1892904 so I'm marking this bug fix-released as well. If it can be reproduced with current charms, please re-mark this bug as new!

Changed in charm-nova-cloud-controller:
status: Triaged → Fix Released