Comment 4 for bug 1916609

Liam Young (gnuoy) wrote :

I think I have finally got to the bottom of this. When the nova-compute charm looks for the metadata-shared-secret, it inspects all the units of all the applications related to it on the neutron-plugin relation *1. There is usually only one such application, but after the OVN migration there are two (neutron-openvswitch and ovn-chassis).

Because of the way the relations are inspected, it's mostly luck whether the charm picks the secret from the old neutron-openvswitch relation or the new ovn-chassis relation. The relation ids seem to be sorted as strings *2, so although it's luck which relation the secret comes from, the choice should be consistent from then on. But, from the bug description, that is not what's happening. It appears that the old secret is picked up initially and then, after a config-changed hook, it switches to the new one. The only way I can account for this is if the neutron-openvswitch unit was removed after the upgrade but before the debug=true trigger. This is exactly what the documentation suggests that you do.
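To illustrate why string ordering makes the outcome look arbitrary, here is a small Python sketch. The relation ids are made up for the example (real ones have the form "<relation-name>:<number>"), and the helper names are hypothetical; this is not code from juju or the charm.

```python
# Hypothetical relation ids, listed in the order the relations were created.
relation_ids = ["neutron-plugin:9", "neutron-plugin:10", "neutron-plugin:101"]


def sort_as_strings(ids):
    """Plain lexicographic sort, character by character."""
    return sorted(ids)


def sort_numerically(ids):
    """Sort by the integer after the colon, i.e. by creation order."""
    return sorted(ids, key=lambda rid: int(rid.split(":")[1]))


print(sort_as_strings(relation_ids))
print(sort_numerically(relation_ids))
```

With a string sort, "neutron-plugin:9" lands after "neutron-plugin:10" because "9" compares greater than "1", so the older relation can end up last even though it was created first.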

I have manually simulated this and, sure enough, if the neutron-openvswitch relation id comes last in a string-ordered list of neutron-plugin relation ids then the wrong metadata secret will be written. Even removing the neutron-openvswitch application will not trigger the secret to be corrected. But manually triggering a config-changed hook will cause the compute charm to reinspect its relations and update nova.conf with the correct secret.
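The behaviour I simulated can be modelled roughly like this. This is a simplified sketch, not the charm's actual code: the function name, the data layout, the secret values and the "last non-empty value wins" rule are my assumptions, chosen to match what I observed.

```python
def pick_secret(relations):
    """Return the secret from the last relation (in string order) that sets one.

    relations: dict mapping relation id -> list of per-unit settings dicts.
    """
    secret = None
    for rid in sorted(relations):  # string ordering, as with relation-ids
        for unit_settings in relations[rid]:
            value = unit_settings.get("metadata-shared-secret")
            if value:
                secret = value  # a later relation overwrites an earlier one
    return secret


# Hypothetical state after the migration: both relations still exist.
relations = {
    "neutron-plugin:9": [{"metadata-shared-secret": "old-ovs-secret"}],
    "neutron-plugin:10": [{"metadata-shared-secret": "new-ovn-secret"}],
}
before_removal = pick_secret(relations)

# neutron-openvswitch removed; the secret only changes once this runs again.
del relations["neutron-plugin:9"]
after_removal = pick_secret(relations)

print(before_removal, after_removal)
```

In this model the old secret wins while both relations exist, and the new one is only picked up when the selection is re-run after the removal, which is what a config-changed hook does.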

*1 https://opendev.org/openstack/charm-nova-compute/src/branch/master/hooks/nova_compute_context.py#L127
*2 https://github.com/juju/juju/blame/develop/worker/uniter/runner/jujuc/relation-ids.go#L83