Comment 8 for bug 1915512

Hua Zhang (zhhuabj) wrote :

Summary of some patches so far.

1, patchset4 [1] runs 'hookenv.departing_unit' on the leader unit.

reactive.flags.register_trigger(
    when_not='cluster.connected', set_flag='unit.departed')

...
@reactive.when('unit.departed')
def update_controller_ip_port_list():
    departing_unit = ch_core.hookenv.departing_unit()
    if not departing_unit:
        # do clean-up stuff here
        reactive.clear_flag('unit.departed')

The problem is that the flag 'cluster.connected' is still set on the leader unit even after running 'juju remove-unit xxx'.
I was told that the unit being removed is the one that gets that event; the leader unit is not leaving the cluster relation, so it retains the flag.

See comment #7 for more details - https://bugs.launchpad.net/charm-octavia/+bug/1915512/comments/7

2, so do we have to run 'hookenv.departing_unit' on the departed unit instead? That's what patchset3 [2] does.

Yes, we can successfully get departing_unit_name by running hookenv.departing_unit on the departed unit:

@reactive.when_not('cluster.connected')
def cluster_departed():
    if hookenv.local_unit() == hookenv.departing_unit():

But how do we pass departing_unit_name from the departed unit to the leader unit? update_controller_ip_port_list runs on the leader unit and needs this parameter. And how do we guarantee that cluster_departed runs before update_controller_ip_port_list?

See comment #6 for more details - https://bugs.launchpad.net/charm-octavia/+bug/1915512/comments/6
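The check in cluster_departed can be illustrated without Juju. This is only a sketch: it assumes that hookenv.departing_unit() returns the value of the JUJU_DEPARTING_UNIT environment variable, which Juju sets while a relation-departed hook runs, and it uses a hypothetical helper is_departing_self that takes the environment as a plain dict.

```python
def is_departing_self(local_unit, environ):
    """Return True if the unit running the hook is the one being removed.

    `environ` stands in for os.environ; JUJU_DEPARTING_UNIT is assumed to be
    what hookenv.departing_unit() reads. Hypothetical helper, not charm code.
    """
    departing = environ.get('JUJU_DEPARTING_UNIT')
    return departing is not None and departing == local_unit


# On the unit being removed, the departing unit is the local unit.
print(is_departing_self('octavia/3', {'JUJU_DEPARTING_UNIT': 'octavia/3'}))
# On any other unit, or outside a relation-departed hook, the check fails.
print(is_departing_self('octavia/0', {'JUJU_DEPARTING_UNIT': 'octavia/3'}))
print(is_departing_self('octavia/0', {}))
```

The sketch only shows that the name is known locally on the departed unit; it does not answer how to hand that name to the leader unit.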

3, building the departing_unit_name list on the leader unit itself to bypass problem 2). That's what patchset2 [3] does.

for u in ch_core.hookenv.iter_units_for_relation_name('cluster'):
    running_units.append(u.unit.replace('/', '-'))
for unit_name in db_units:
    if unit_name not in running_units:
        missing_units.append(unit_name)

But this approach has a problem too: after running 'juju remove-unit octavia/3', octavia/3 still appears in the output of iter_units_for_relation_name.
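The comparison from patchset2, and why a stale relation listing defeats it, can be sketched in isolation. This is a minimal sketch, not the charm's code: find_missing_units is a hypothetical helper, and plain unit-name strings stand in for the objects returned by iter_units_for_relation_name.

```python
def find_missing_units(db_units, relation_units):
    """Return names recorded in db_units that no longer appear in the relation.

    db_units uses the 'octavia-3' form and relation_units the 'octavia/3'
    form, mirroring the replace('/', '-') in the snippet above.
    Hypothetical helper for illustration only.
    """
    running_units = {name.replace('/', '-') for name in relation_units}
    return [name for name in db_units if name not in running_units]


db_units = ['octavia-0', 'octavia-1', 'octavia-3']

# Expected case: octavia/3 is gone from the relation, so it is reported.
print(find_missing_units(db_units, ['octavia/0', 'octavia/1']))

# Observed case: octavia/3 is still listed, so nothing is reported.
print(find_missing_units(db_units, ['octavia/0', 'octavia/1', 'octavia/3']))
```

In the observed case the result is empty, so the leader unit has nothing to clean up at the moment the hook runs.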

So we may have to cope with it through an action or something like that.

[1] https://review.opendev.org/c/openstack/charm-octavia/+/787700/4/src/reactive/octavia_handlers.py
[2] https://review.opendev.org/c/openstack/charm-octavia/+/787700/3/src/reactive/octavia_handlers.py
[3] https://review.opendev.org/c/openstack/charm-octavia/+/787700/2/src/reactive/octavia_handlers.py