Comment 0 for bug 2071875

Samuel Allan (samuelallan) wrote:

I'm still debugging the root cause, but while upgrading ceph-osd from octopus to pacific I hit this crash, which left the charm in error status:

```
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log old_version: octopus
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log new_version: None
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log octopus to None is a valid upgrade path. Proceeding.
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log Making dir /var/lib/charm/ceph-osd ceph:ceph 555
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log Monitor hosts are ['10.17.4.19', '10.17.4.22', '10.17.4.28']
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log AZ Info: rack=zone1
unit-ceph-osd-0: 03:37:49 INFO unit.ceph-osd/0.juju-log roll_osd_cluster called with None
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log osd_sorted_list: [<charms_ceph.utils.CrushLocation object at 0x7fc7c1dde940>, <charms_ceph.utils.CrushLocation object at 0x7fc7c1dde0a0>, <charms_ceph.utils.CrushLocation object at 0x7fc7c1dde370>, <charms_ceph.utils.CrushLocation object at 0x7fc7c1ddeaf0>, <charms_ceph.utils.CrushLocation object at 0x7fc7c206e730>]
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log upgrade position: 0
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log monitor_key_set osd_pc6a-rb3-n1_None_start 1720064270.0629272
unit-ceph-osd-0: 03:37:50 WARNING unit.ceph-osd/0.config-changed set osd_pc6a-rb3-n1_None_start
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log Rolling
unit-ceph-osd-0: 03:37:50 WARNING unit.ceph-osd/0.juju-log DEPRECATION WARNING: Function one_shot_log is being removed : Support for use of upstream ``apt_pkg`` module in conjunctionwith charm-helpers is deprecated since 2019-06-25
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log Current ceph version is 15.2
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log Upgrading to: None
unit-ceph-osd-0: 03:37:50 INFO unit.ceph-osd/0.juju-log Installing [] with options: ['--option=Dpkg::Options::=--force-confold']
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed set osd_pc6a-rb3-n1_None_alive
unit-ceph-osd-0: 03:37:52 INFO unit.ceph-osd/0.juju-log Installing ['ceph', 'gdisk', 'radosgw', 'xfsprogs', 'lvm2', 'parted', 'smartmontools', 'btrfs-progs'] with options: ['--option=Dpkg::Options::=--force-confold']
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed Traceback (most recent call last):
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/config-changed", line 907, in <module>
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed hooks.execute(sys.argv)
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/charmhelpers/core/hookenv.py", line 962, in execute
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed self._hooks[hook_name]()
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/charmhelpers/contrib/hardening/harden.py", line 93, in _harden_inner2
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed return f(*args, **kwargs)
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/config-changed", line 472, in config_changed
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed check_for_upgrade()
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/config-changed", line 160, in check_for_upgrade
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed ceph.roll_osd_cluster(new_version=new_version,
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 2633, in roll_osd_cluster
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed lock_and_roll(upgrade_key=upgrade_key,
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 2326, in lock_and_roll
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed upgrade_osd(version, kick_function=dog.kick_the_dog)
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 2693, in upgrade_osd
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed with maintain_all_osd_states():
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed return next(self.gen)
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 2962, in maintain_all_osd_states
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed osd_states = get_all_osd_states()
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 2928, in get_all_osd_states
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed for osd_num in get_local_osd_ids():
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 698, in get_local_osd_ids
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed osd_id = osd_dir.split('-')[1]
unit-ceph-osd-0: 03:37:52 WARNING unit.ceph-osd/0.config-changed IndexError: list index out of range
unit-ceph-osd-0: 03:37:53 ERROR juju.worker.uniter.operation hook "config-changed" (via explicit, bespoke hook script) failed: exit status 1
```
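
The traceback ends in `IndexError` at `osd_dir.split('-')[1]`, which is what happens when a name contains no hyphen. A minimal sketch of a more defensive parse follows, assuming the OSD directories are named like `ceph-0`. `parse_osd_ids` is a hypothetical helper for illustration, not the charm's actual function:

```python
from typing import List


def parse_osd_ids(dir_names: List[str]) -> List[str]:
    """Extract OSD ids from names like 'ceph-0', skipping anything else.

    Hypothetical helper: assumes directory names follow '<cluster>-<id>'.
    """
    osd_ids = []
    for name in dir_names:
        # partition() never raises; sep is '' when there is no hyphen.
        _prefix, sep, osd_id = name.partition("-")
        if sep and osd_id.isdigit():
            osd_ids.append(osd_id)
    return osd_ids
```

With this approach an unexpected entry is skipped rather than crashing the hook, though it would not explain why such an entry exists in the first place.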

Some config option must have been set to an invalid or unknown value, so new_version resolved to None. That None passed through unchecked, and the ceph-osd charm somehow decided that "octopus to None" was a valid upgrade path.

The charm should catch this instead of proceeding with the upgrade.
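
As a rough sketch of the kind of check meant here: the validation could reject any version it does not recognise, including None, before rolling the cluster. The `RELEASES` list and `is_valid_upgrade_path` below are assumptions for illustration only; the charm's real upgrade table and function names may differ, and the real rules about which releases may be skipped are not modelled.

```python
from typing import Optional

# Hypothetical ordered list of Ceph release names (oldest first).
RELEASES = ["luminous", "mimic", "nautilus", "octopus", "pacific", "quincy"]


def is_valid_upgrade_path(old_version: Optional[str],
                          new_version: Optional[str]) -> bool:
    """Return True only when both versions are known and new is newer.

    None, or any unrecognised string, is rejected instead of being
    treated as a valid target.
    """
    if old_version not in RELEASES or new_version not in RELEASES:
        return False
    return RELEASES.index(new_version) > RELEASES.index(old_version)
```

Under these assumptions, `is_valid_upgrade_path("octopus", None)` returns False, so the hook could log the bad value and set a blocked status rather than starting the roll.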