List index out of range when departing a cluster

Bug #1881596 reported by Florian Guitton
This bug affects 3 people
Affects                       Status        Importance  Assigned to  Milestone
MySQL Router Charm            Fix Released  Medium      Unassigned
mysql-shared charm interface  Fix Released  Medium      David Ames

Bug Description

Hello everybody,

I ran into an issue today that produced the following error message, left my mysql-router unit in an error state, and prevented cleanup of the unit.

unit-keystone-mysql-router-1: 13:10:28 ERROR unit.keystone-mysql-router/1.juju-log shared-db:47: Hook error:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-keystone-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "/var/lib/juju/agents/unit-keystone-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "/var/lib/juju/agents/unit-keystone-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "/var/lib/juju/agents/unit-keystone-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-keystone-mysql-router-1/charm/reactive/mysql_router_handlers.py", line 102, in proxy_shared_db_responses
    instance.proxy_db_and_user_responses(db_router, shared_db)
  File "lib/charm/openstack/mysql_router.py", line 480, in proxy_db_and_user_responses
    unit = sending_interface.all_joined_units[0]
  File "/var/lib/juju/agents/unit-keystone-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/endpoints.py", line 582, in __getitem__
    return super().__getitem__(self._translate_key(key))
IndexError: list index out of range

The Juju model contains the following:

Every 5.0s: juju status --color keystone juju-playground: Mon Jun 1 13:14:09 2020

Model Controller Cloud/Region Version SLA Timestamp
dsi-r1-openstack dsi-juju-controller dsi-maas/default 2.7.6 unsupported 13:14:10Z

App Version Status Scale Charm Store Rev OS Notes
keystone 17.0.0 waiting 3 keystone jujucharms 314 ubuntu
keystone-ha blocked 2 hacluster jujucharms 68 ubuntu
keystone-ldap-ict 17.0.0 active 2 keystone-ldap jujucharms 29 ubuntu
keystone-mysql-router 8.0.20 error 3 mysql-router jujucharms 0 ubuntu

Unit Workload Agent Machine Public address Ports Message
keystone/1* blocked executing 0/lxd/9 172.30.200.6 5000/tcp Missing relations: database, Allowed_units list provided but this unit not present
  keystone-mysql-router/1* error idle 172.30.200.6 hook failed: "shared-db-relation-departed"
keystone/2 blocked idle 3/lxd/4 172.30.200.56 5000/tcp Database not initialised
  keystone-ha/1* blocked idle 172.30.200.56 Resource: res_ks_admin_hostname not running
  keystone-ldap-ict/2* active idle 172.30.200.56 Unit is ready
  keystone-mysql-router/2 active idle 172.30.200.56 Unit is ready
keystone/3 blocked idle 0/lxd/10 172.30.200.9 5000/tcp Database not initialised
  keystone-ha/2 blocked idle 172.30.200.9 Resource: res_ks_admin_hostname not running
  keystone-ldap-ict/3 active idle 172.30.200.9 Unit is ready
  keystone-mysql-router/3 active idle 172.30.200.9 Unit is ready

Machine State DNS Inst id Series AZ Message
0 started 10.80.0.10 rh-09 focal SFO-02 Deployed
0/lxd/9 started 172.30.200.6 juju-92b05d-0-lxd-9 focal SFO-02 Container started
0/lxd/10 started 172.30.200.9 juju-92b05d-0-lxd-10 focal SFO-02 Container started
3 started 10.80.0.3 rh-06 focal SFO-02 Deployed
3/lxd/4 started 172.30.200.56 juju-92b05d-3-lxd-4 focal SFO-02 Container started

Tags: scaleback
Alvaro Uria (aluria) wrote :

Same here,
https://pastebin.ubuntu.com/p/zHzkhkkmYc/

2020-07-17 12:05:27 DEBUG shared-db-relation-departed Traceback (most recent call last):
2020-07-17 12:05:27 DEBUG shared-db-relation-departed File "/var/lib/juju/agents/unit-keystone-mysql-router-0/charm/hooks/shared-db-relation-departed", line 22, in <module>
2020-07-17 12:05:27 DEBUG shared-db-relation-departed main()
2020-07-17 12:05:27 DEBUG shared-db-relation-departed File "/var/lib/juju/agents/unit-keystone-mysql-router-0/.venv/lib/python3.8/site-packages/charms/reactive/__init__.py", line 74, in main
2020-07-17 12:05:27 DEBUG shared-db-relation-departed bus.dispatch(restricted=restricted_mode)
2020-07-17 12:05:27 DEBUG shared-db-relation-departed File "/var/lib/juju/agents/unit-keystone-mysql-router-0/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 390, in dispatch
2020-07-17 12:05:27 DEBUG shared-db-relation-departed _invoke(other_handlers)
2020-07-17 12:05:27 DEBUG shared-db-relation-departed File "/var/lib/juju/agents/unit-keystone-mysql-router-0/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 359, in _invoke
2020-07-17 12:05:27 DEBUG shared-db-relation-departed handler.invoke()
2020-07-17 12:05:27 DEBUG shared-db-relation-departed File "/var/lib/juju/agents/unit-keystone-mysql-router-0/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 181, in invoke
2020-07-17 12:05:27 DEBUG shared-db-relation-departed self._action(*args)
2020-07-17 12:05:27 DEBUG shared-db-relation-departed File "/var/lib/juju/agents/unit-keystone-mysql-router-0/charm/reactive/mysql_router_handlers.py", line 102, in proxy_shared_db_responses
2020-07-17 12:05:27 DEBUG shared-db-relation-departed instance.proxy_db_and_user_responses(db_router, shared_db)
2020-07-17 12:05:27 DEBUG shared-db-relation-departed File "lib/charm/openstack/mysql_router.py", line 481, in proxy_db_and_user_responses
2020-07-17 12:05:27 DEBUG shared-db-relation-departed unit = sending_interface.all_joined_units[0]
2020-07-17 12:05:27 DEBUG shared-db-relation-departed File "/var/lib/juju/agents/unit-keystone-mysql-router-0/.venv/lib/python3.8/site-packages/charms/reactive/endpoints.py", line 582, in __getitem__
2020-07-17 12:05:27 DEBUG shared-db-relation-departed return super().__getitem__(self._translate_key(key))
2020-07-17 12:05:27 DEBUG shared-db-relation-departed IndexError: list index out of range

Alvaro Uria (aluria) wrote :

When the shared-db-relation-departed error is resolved with --no-retry, the unit also fails on:
* shared-db-relation-broken
* db-router-relation-departed
* db-router-relation-broken

Ryan Beisner (1chb1n)
tags: added: scaleback
Changed in charm-mysql-router:
status: New → Triaged
tags: added: scale-back
removed: scaleback
tags: added: scaleback
removed: scale-back
Changed in charm-mysql-router:
importance: Undecided → Medium
milestone: none → 20.08
James Page (james-page)
Changed in charm-mysql-router:
milestone: 20.08 → none
Aurelien Lourot (aurelien-lourot) wrote :

Hitting this on focal-ussuri while implementing a scaleback test for hacluster https://review.opendev.org/#/c/741592/ . In this test we're scaling back keystone. On focal-ussuri hacluster isn't the only subordinate of keystone: there is also mysql-router. When scaling back we end up with:

keystone/0* blocked executing 3 172.17.100.7 5000/tcp Missing relations: database, Allowed_units list provided but this unit not present
  keystone-mysql-router/2 error idle 172.17.100.7 hook failed: "shared-db-relation-departed"
keystone/1 blocked idle 4 172.17.100.15 5000/tcp Database not initialised
  hacluster/1 blocked idle 172.17.100.15 Insufficient peer units for ha cluster (require 3)
  keystone-mysql-router/1 active idle 172.17.100.15 Unit is ready
keystone/2 blocked idle 5 172.17.100.23 5000/tcp Database not initialised
  hacluster/0* blocked idle 172.17.100.23 Insufficient peer units for ha cluster (require 3)
  keystone-mysql-router/0* active idle 172.17.100.23 Unit is ready

2020-08-27 19:23:24 ERROR juju-log shared-db:4: Hook error:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-keystone-mysql-router-2/.venv/lib/python3.8/site-packages/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "/var/lib/juju/agents/unit-keystone-mysql-router-2/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "/var/lib/juju/agents/unit-keystone-mysql-router-2/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "/var/lib/juju/agents/unit-keystone-mysql-router-2/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-keystone-mysql-router-2/charm/reactive/mysql_router_handlers.py", line 102, in proxy_shared_db_responses
    instance.proxy_db_and_user_responses(db_router, shared_db)
  File "lib/charm/openstack/mysql_router.py", line 481, in proxy_db_and_user_responses
    unit = sending_interface.all_joined_units[0]
  File "/var/lib/juju/agents/unit-keystone-mysql-router-2/.venv/lib/python3.8/site-packages/charms/reactive/endpoints.py", line 582, in __getitem__
    return super().__getitem__(self._translate_key(key))
IndexError: list index out of range

https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline_func_full/openstack/charm-hacluster/741592/10/6706/index.html
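
For illustration only: all three tracebacks in this report fail on the same line, `sending_interface.all_joined_units[0]`, which raises IndexError once no units remain joined. The sketch below shows that failure mode and one way to guard against it. It is not the change proposed for this bug, a plain list stands in for `all_joined_units`, and the helper names `first_joined_unit` and `proxy_responses` are hypothetical.

```python
from typing import Optional, Sequence


def first_joined_unit(joined_units: Sequence[str]) -> Optional[str]:
    """Return the first joined unit, or None when no unit is joined."""
    # Indexing [0] on an empty sequence raises IndexError, which is the
    # failure shown in the tracebacks, so check for emptiness first.
    if not joined_units:
        return None
    return joined_units[0]


def proxy_responses(joined_units: Sequence[str]) -> str:
    """Hypothetical stand-in for a handler that proxies relation data."""
    unit = first_joined_unit(joined_units)
    if unit is None:
        # Nothing to proxy while the relation is departing; skip quietly.
        return "skipped: no joined units"
    return f"proxied responses from {unit}"
```

With this guard, a hook that runs after the last remote unit has departed returns early instead of raising.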

Changed in charm-mysql-router:
status: Triaged → In Progress
assignee: nobody → Aurelien Lourot (aurelien-lourot)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-mysql-router (master)

Fix proposed to branch: master
Review: https://review.opendev.org/748714

David Ames (thedac)
Changed in charm-interface-mysql-shared:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → David Ames (thedac)
milestone: none → 20.10
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-interface-mysql-shared (master)

Fix proposed to branch: master
Review: https://review.opendev.org/748717

Changed in charm-interface-mysql-shared:
status: Triaged → In Progress
David Ames (thedac) wrote :

The use of when_any in the shared interface is causing the original bug.

    @reactive.when_any('endpoint.{endpoint_name}.broken',
                       'endpoint.{endpoint_name}.departed')
    def departed(self):
        flags = (
            self.expand_name('{endpoint_name}.connected'),
            self.expand_name('{endpoint_name}.available'),
        )
        for flag in flags:
            reactive.clear_flag(flag)

This update to the mysql-shared interface duplicates a change made in mysql-router, which resolved scale-in issues with mysql-innodb-cluster:

https://review.opendev.org/748717
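
As a rough model of what the quoted handler does: it runs when either the broken or the departed flag is set, and then clears the connected and available flags. The toy below mimics that with a plain Python set. It does not use charms.reactive, the function name `clear_on_any` is hypothetical, and it says nothing about how the framework itself manages the trigger flags.

```python
from typing import Iterable, Set


def clear_on_any(flags: Set[str], triggers: Iterable[str],
                 to_clear: Iterable[str]) -> bool:
    """Clear `to_clear` from `flags` if at least one trigger is set.

    Returns True when the handler body ran, False otherwise.
    """
    if not any(trigger in flags for trigger in triggers):
        return False
    for flag in to_clear:
        flags.discard(flag)  # discard() is a no-op if the flag is not set
    return True


endpoint = "shared-db"
flags = {
    f"{endpoint}.connected",
    f"{endpoint}.available",
    f"endpoint.{endpoint}.departed",
}
ran = clear_on_any(
    flags,
    triggers=(f"endpoint.{endpoint}.broken", f"endpoint.{endpoint}.departed"),
    to_clear=(f"{endpoint}.connected", f"{endpoint}.available"),
)
```

In this toy the trigger flag stays in the set after the body runs; only the connected and available flags are removed.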

David Ames (thedac) wrote :

CORRECTION

"The use of when_any in the shared interface is causing the original bug."
The interface fix resolves the secondary bug; Aurelien's fix is still necessary.

Changed in charm-mysql-router:
milestone: none → 20.10
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-interface-mysql-shared (master)

Reviewed: https://review.opendev.org/748717
Committed: https://git.openstack.org/cgit/openstack/charm-interface-mysql-shared/commit/?id=06675c43c9f48ffa98b2abf2ddca137f8d18f31c
Submitter: Zuul
Branch: master

commit 06675c43c9f48ffa98b2abf2ddca137f8d18f31c
Author: David Ames <email address hidden>
Date: Fri Aug 28 09:33:57 2020 -0700

    Scale-in fixes

    Properly handle departed and broken hooks.

    Change-Id: Iecb1f943ffa5505617b5e5eb70f070458fc6645d
    Closes-Bug: #1881596

Changed in charm-interface-mysql-shared:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-mysql-router (master)

Reviewed: https://review.opendev.org/748714
Committed: https://git.openstack.org/cgit/openstack/charm-mysql-router/commit/?id=84bc5577c819b70202a4eaea6e2a7f694c194753
Submitter: Zuul
Branch: master

commit 84bc5577c819b70202a4eaea6e2a7f694c194753
Author: Aurelien Lourot <email address hidden>
Date: Fri Aug 28 13:23:25 2020 +0200

    Fix shared-db-relation-departed error when departing a cluster

    func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/399
    Change-Id: Ie86fca7289a0b04686841e78019b76a270a5b411
    Closes-Bug: #1881596
    Co-Authored-By: Dmitrii Shcherbakov <email address hidden>

Changed in charm-mysql-router:
status: In Progress → Fix Committed
Changed in charm-mysql-router:
assignee: Aurelien Lourot (aurelien-lourot) → nobody
Changed in charm-mysql-router:
status: Fix Committed → Fix Released