Comment 4 for bug 1959720

Drew Freiberger (afreiberger) wrote :

Ultimately, there are two separate assumptions in openstack-integrator's update_members function that lead to this failure mode.

First:

After adding some sanity checking around the members vs. self.members comparison, it appears that self.members is a local cache of data and is empty. Perhaps if self.members is empty when running an update_loadbalancer routine, we should query the LB provider/Octavia for the current list of members, rather than blindly creating members on the assumption that the LB has never been set up.

I added some debug log entries to the top of the function to print out the passed arg "members" vs the "self.members" object attribute.

unit-openstack-integrator-0: 16:19:12 INFO unit.openstack-integrator/0.juju-log members: {('192.168.200.139', '6443'), ('192.168.200.197', '6443'), ('192.168.200.123', '6443')}
unit-openstack-integrator-0: 16:19:12 INFO unit.openstack-integrator/0.juju-log self.members: set()

In the create method, we already check whether members exist, even though we just created the LB (probably for idempotency and error-handling purposes).

https://github.com/juju-solutions/charm-openstack-integrator/blob/a0363d0d103764418e6cf93fbbdbaa0b2b02e55a/lib/charms/layer/openstack.py#L525-L527

We may wish to add this same list_members logic into the update_members function:

if not self.members:
    self.members = self._impl.list_members()

This would help to prime the charm's cached information about the existing LB and avoid the Octavia 500 error that occurs when a duplicate member is created.

I think this may be specific to environments where openstack-integrator has been removed, re-added, or migrated: it detects the existing loadbalancer and attempts to update it, but has no cached info about the LB's pools, members, etc. in the local kv store.
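
As a rough illustration of the cache-priming idea, here is a minimal, self-contained sketch. FakeImpl and LoadBalancer are hypothetical stand-ins written for this comment, not the charm's actual classes, and the create_member/delete_member names are assumptions; only list_members and the "if not self.members" check come from the snippet above.

```python
class FakeImpl:
    """Hypothetical stand-in for the Octavia-backed implementation."""

    def __init__(self, members):
        self._members = set(members)

    def list_members(self):
        return set(self._members)

    def create_member(self, member):
        # Mimic the provider rejecting a duplicate member.
        if member in self._members:
            raise RuntimeError("duplicate member: {}".format(member))
        self._members.add(member)

    def delete_member(self, member):
        self._members.discard(member)


class LoadBalancer:
    """Hypothetical, simplified version of the charm's LB wrapper."""

    def __init__(self, impl):
        self._impl = impl
        self.members = set()  # local cache; empty after a redeploy/migration

    def update_members(self, members):
        # Prime the cache from the provider before diffing.
        if not self.members:
            self.members = self._impl.list_members()
        removed = self.members - members
        added = members - self.members
        for member in removed:
            self._impl.delete_member(member)
        for member in added:
            self._impl.create_member(member)
        self.members = set(members)
        return added, removed


existing = {("192.168.200.123", 6443)}
wanted = {("192.168.200.123", 6443), ("192.168.200.139", 6443)}
lb = LoadBalancer(FakeImpl(existing))
added, removed = lb.update_members(wanted)
```

With the cache primed, only the genuinely new member is created; without the priming step, the sketch would try to re-create the existing member and hit the duplicate error.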

Second:

When I tried this logic on my local failing unit, it ran into issues because the Python 3 set logic does not work the way this code assumes. The incoming tuples are (str, str), whereas self._impl.list_members() returns (str, int) tuples, so the two never compare equal.

We need either a more robust match than plain set logic, or to normalize the input to the update_loadbalancer function so that it matches the data structure used by the Octavia implementation.
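
One possible way to groom the input is to coerce every port to int before any set comparison. This is only a sketch: normalize_members is a hypothetical helper name, not something that exists in the charm, and it assumes every member is an (address, port) pair whose port is a numeric string or an int.

```python
def normalize_members(members):
    """Return a set of (address, port) tuples with the port coerced to int."""
    return {(str(address), int(port)) for address, port in members}


# Incoming data as seen in the logs: ports are strings.
incoming = {
    ("192.168.200.123", "6443"),
    ("192.168.200.139", "6443"),
    ("192.168.200.197", "6443"),
}
# Data in the shape list_members() returned in the logs: ports are ints.
existing = {
    ("192.168.200.123", 6443),
    ("192.168.200.139", 6443),
    ("192.168.200.197", 6443),
}

normalized = normalize_members(incoming)
to_add = normalized - existing
to_remove = existing - normalized
```

After normalization both differences are empty, so nothing would be removed or re-added for an unchanged member list.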

From this logic at the top of update_members:

        # prime the members cache before update of pre-existing LB lp#1959720
        log("members: {}", members)
        log("self.members: {}", self.members)
        if not self.members:
            self.members = self._impl.list_members()
            log("self.members updated: {}", self.members)

unit-openstack-integrator-0: 16:38:13 INFO unit.openstack-integrator/0.juju-log members: {('192.168.200.123', '6443'), ('192.168.200.139', '6443'), ('192.168.200.197', '6443')}
unit-openstack-integrator-0: 16:38:13 INFO unit.openstack-integrator/0.juju-log self.members: set()
unit-openstack-integrator-0: 16:38:16 INFO unit.openstack-integrator/0.juju-log self.members updated: {('192.168.200.197', 6443), ('192.168.200.123', 6443), ('192.168.200.139', 6443)}
unit-openstack-integrator-0: 16:38:18 INFO unit.openstack-integrator/0.juju-log Removed member: ('192.168.200.197', 6443)
unit-openstack-integrator-0: 16:38:24 INFO unit.openstack-integrator/0.juju-log Removed member: ('192.168.200.123', 6443)
unit-openstack-integrator-0: 16:38:29 INFO unit.openstack-integrator/0.juju-log Removed member: ('192.168.200.139', 6443)
unit-openstack-integrator-0: 16:38:42 INFO unit.openstack-integrator/0.juju-log Added member: ('192.168.200.123', '6443')
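
The remove-then-add churn in the log is consistent with plain set differences over mismatched tuple types. The snippet below is a standalone illustration using the values from the log; the variable names are mine and the real update_members code may compute the differences differently.

```python
# Requested members as passed in: ports are strings.
requested = {
    ("192.168.200.123", "6443"),
    ("192.168.200.139", "6443"),
    ("192.168.200.197", "6443"),
}
# Cached members after priming from list_members(): ports are ints.
cached = {
    ("192.168.200.197", 6443),
    ("192.168.200.123", 6443),
    ("192.168.200.139", 6443),
}

# In Python 3, '6443' != 6443, so no tuple in one set equals any tuple in the other.
to_remove = cached - requested
to_add = requested - cached
```

Every cached member ends up in to_remove and every requested member ends up in to_add, even though the addresses and ports describe the same three backends.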