I'm still debugging the root cause, but during an upgrade from ceph-osd octopus to pacific, I observed this crash resulting in error status for the charm:
```
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log old_version: octopus
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log new_version: None
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log octopus to None is a valid upgrade path. Proceeding.
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log Making dir /var/lib/charm/ceph-osd ceph:ceph 555
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log Monitor hosts are ['10.17.4.19', '10.17.4.22', '10.17.4.28']
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log AZ Info: rack=zone1
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log roll_osd_cluster called with None
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log osd_sorted_list: [<charms_ceph.utils.CrushLocation object at 0x7fc7c1dde940>, <charms_ceph.utils.CrushLocation object at 0x7fc7c1dde0a0>, <charms_ceph.utils.CrushLocation object at 0x7fc7c1dde370>, <charms_ceph.utils.CrushLocation object at 0x7fc7c1ddeaf0>, <charms_ceph.utils.CrushLocation object at 0x7fc7c206e730>]
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log upgrade position: 0
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log monitor_key_set osd_pc6a-rb3-n1_None_start 1720064270.0629272
unit-ceph-osd-0: 03:37:50 WARNING unit.ceph-osd/0.config-changed set osd_pc6a-rb3-n1_None_start
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log Rolling
unit-ceph-osd-0: 03:37:50 WARNING unit.ceph-osd/0.juju-log DEPRECATION WARNING: Function one_shot_log is being removed : Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log Current ceph version is 15.2
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log Upgrading to: None
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log Installing [] with options: ['--option=Dpkg::Options::=--force-confold']
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed set osd_pc6a-rb3-n1_None_alive
unit-ceph-osd-0: 03:37:52 INFO unit.ceph-osd/0.juju-log Installing ['ceph', 'gdisk', 'radosgw', 'xfsprogs', 'lvm2', 'parted', 'smartmontools', 'btrfs-progs'] with options: ['--option=Dpkg::Options::=--force-confold']
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed Traceback (most recent call last):
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/config-changed", line 907, in <module>
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed hooks.execute(sys.argv)
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/charmhelpers/core/hookenv.py", line 962, in execute
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed self._hooks[hook_name]()
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/charmhelpers/contrib/hardening/harden.py", line 93, in _harden_inner2
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed return f(*args, **kwargs)
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/config-changed", line 472, in config_changed
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed check_for_upgrade()
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/config-changed", line 160, in check_for_upgrade
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed ceph.roll_osd_cluster(new_version=new_version,
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 2633, in roll_osd_cluster
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed lock_and_roll(upgrade_key=upgrade_key,
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 2326, in lock_and_roll
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed upgrade_osd(version, kick_function=dog.kick_the_dog)
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 2693, in upgrade_osd
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed with maintain_all_osd_states():
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed return next(self.gen)
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 2962, in maintain_all_osd_states
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed osd_states = get_all_osd_states()
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 2928, in get_all_osd_states
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed for osd_num in get_local_osd_ids():
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 698, in get_local_osd_ids
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed osd_id = osd_dir.split('-')[1]
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed IndexError: list index out of range
unit-ceph-osd-0: 03:37:53 ERROR juju.worker.uniter.operation hook "config-changed" (via explicit, bespoke hook script) failed: exit status 1
```
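The fatal frame is `osd_id = osd_dir.split('-')[1]` in `get_local_osd_ids`. A minimal sketch of that failing pattern, assuming the function iterates over entry names under `/var/lib/ceph/osd` shaped like `ceph-<id>` (the directory names and the defensive variant below are illustrative, not the charm's actual code):

```python
def get_local_osd_ids(osd_dirs):
    # Failing pattern from the traceback: assumes every entry is 'ceph-<id>',
    # so any name without a '-' raises IndexError on [1].
    return [d.split('-')[1] for d in osd_dirs]


def get_local_osd_ids_safe(osd_dirs):
    # Hypothetical defensive variant: skip entries that do not look like
    # 'ceph-<id>' instead of crashing the hook.
    ids = []
    for d in osd_dirs:
        parts = d.split('-')
        if len(parts) >= 2 and parts[1].isdigit():
            ids.append(parts[1])
    return ids
```

Any stray entry in the OSD data directory (for example `lost+found`) would be enough to trigger the IndexError with the naive version, while the defensive variant simply ignores it.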
Some config option must have been set to an invalid or unrecognized value, but the resulting None filtered through, and the ceph-osd charm somehow concluded that "octopus to None" was a valid upgrade path.
This should be caught by the charm.
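A hedged sketch of the kind of guard that would catch this before any packages are touched; the `UPGRADE_PATHS` table and function name here are illustrative, not the charm's actual data structures:

```python
# Illustrative map of supported upgrade paths (old release -> new release).
UPGRADE_PATHS = {
    'nautilus': 'octopus',
    'octopus': 'pacific',
}


def check_upgrade_path(old_version, new_version):
    """Refuse to proceed unless old -> new is a known-good path."""
    if new_version is None:
        # The missing guard in this bug: a None target must never be
        # treated as a valid upgrade path.
        raise ValueError('could not resolve target version; aborting upgrade')
    if UPGRADE_PATHS.get(old_version) != new_version:
        raise ValueError('{} to {} is not a valid upgrade path'
                         .format(old_version, new_version))
```

With a check like this, "octopus to None" would fail fast with a clear error instead of reaching `upgrade_osd` and crashing the config-changed hook.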