error CrushLocation missing 2 required positional arguments: 'identifier' and 'name'

Bug #2044052 reported by Samuel Allan
This bug affects 1 person
Affects: Ceph OSD Charm
Status: Fix Committed
Importance: Undecided
Assigned to: Unassigned

Bug Description

The config-changed hooks threw an exception on a fresh deployment, with ceph-osd charm latest/edge:

```
unit-ceph-osd-0: 04:18:02 DEBUG unit.ceph-osd/0.juju-log Hardening function 'config_changed'
unit-ceph-osd-0: 04:18:02 DEBUG unit.ceph-osd/0.juju-log No hardening applied to 'config_changed'
unit-ceph-osd-0: 04:18:02 INFO unit.ceph-osd/0.juju-log old_version: None
unit-ceph-osd-0: 04:18:02 INFO unit.ceph-osd/0.juju-log new_version: None
unit-ceph-osd-0: 04:18:02 INFO unit.ceph-osd/0.juju-log Attempting to resume possibly failed upgrade.
unit-ceph-osd-0: 04:18:02 INFO unit.ceph-osd/0.juju-log Making dir /var/lib/charm/ceph-osd ceph:ceph 555
unit-ceph-osd-0: 04:18:02 INFO unit.ceph-osd/0.juju-log Monitor hosts are ['192.168.151.187']
unit-ceph-osd-0: 04:18:02 DEBUG unit.ceph-osd/0.juju-log Writing file /var/lib/charm/ceph-osd/ceph.conf ceph:ceph 644
unit-ceph-osd-0: 04:18:02 INFO unit.ceph-osd/0.juju-log roll_osd_cluster called with None
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed Traceback (most recent call last):
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/config-changed", line 1002, in <module>
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed hooks.execute(sys.argv)
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/charmhelpers/core/hookenv.py", line 963, in execute
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed self._hooks[hook_name]()
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/charmhelpers/contrib/hardening/harden.py", line 90, in _harden_inner2
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed return f(*args, **kwargs)
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/config-changed", line 544, in config_changed
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed check_for_upgrade()
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/hooks/config-changed", line 174, in check_for_upgrade
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed ceph.roll_osd_cluster(new_version=new_version,
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 2622, in roll_osd_cluster
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed osd_tree = get_osd_tree(service=upgrade_key)
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 637, in get_osd_tree
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed return [CrushLocation(**host) for host in roots]
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed File "/var/lib/juju/agents/unit-ceph-osd-0/charm/lib/charms_ceph/utils.py", line 637, in <listcomp>
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed return [CrushLocation(**host) for host in roots]
unit-ceph-osd-0: 04:18:02 WARNING unit.ceph-osd/0.config-changed TypeError: CrushLocation.__init__() missing 2 required positional arguments: 'identifier' and 'name'
unit-ceph-osd-0: 04:18:02 ERROR juju.worker.uniter.operation hook "config-changed" (via explicit, bespoke hook script) failed: exit status 1
```
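The failing expansion in the traceback can be reproduced in isolation. This is a minimal sketch, not the real class: the actual `CrushLocation` in `charms_ceph/utils.py` has more fields, but any class requiring `identifier` and `name` fails the same way when the node dicts from `ceph osd tree` lack those keys.

```python
from dataclasses import dataclass

# Hypothetical stand-in for charms_ceph.utils.CrushLocation; the real class
# requires 'identifier' and 'name' among its constructor arguments.
@dataclass
class CrushLocation:
    identifier: int
    name: str

def build(host):
    # Mirrors the list comprehension in get_osd_tree:
    #   [CrushLocation(**host) for host in roots]
    try:
        return CrushLocation(**host)
    except TypeError as err:
        return err

# A node dict missing the required keys triggers the same TypeError
# seen in the log above.
result = build({})
print(result)
```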

Bundle:

```
series: jammy
applications:
  ceph-dashboard:
    charm: ceph-dashboard
    channel: latest/edge
    revision: 49
  ceph-mon:
    charm: ceph-mon
    channel: latest/edge
    revision: 192
    resources:
      alert-rules: 1
    num_units: 1
    to:
    - lxd:0
    options:
      monitor-count: 1
    constraints: arch=amd64
  ceph-osd:
    charm: ceph-osd
    channel: latest/edge
    revision: 577
    num_units: 3
    to:
    - "0"
    - "1"
    - "2"
    options:
      osd-devices: /dev/vdb /dev/vdc
    constraints: arch=amd64
    storage:
      bluestore-db: loop,1024M
      bluestore-wal: loop,1024M
      cache-devices: loop,10240M
      osd-devices: loop,1024M
      osd-journals: loop,1024M
machines:
  "0":
    constraints: arch=amd64
  "1":
    constraints: arch=amd64
  "2":
    constraints: arch=amd64
relations:
- - ceph-dashboard:dashboard
  - ceph-mon:dashboard
- - ceph-mon:osd
  - ceph-osd:mon

```

description: updated
Revision history for this message
Luciano Lo Giudice (lmlogiudice) wrote :

Just to clarify: did this happen when migrating from quincy to reef, or just by changing the config on reef?

The reason I ask is that the only call to `CrushLocation` is in `check_for_upgrade`, which makes me think the external command we're calling has changed its output.

Revision history for this message
Samuel Allan (samuelallan) wrote :

This was a from-scratch deployment, e.g.:

```
juju add-model ceph
juju deploy ceph-osd --channel latest/edge -n 3
...
```

Revision history for this message
Luciano Lo Giudice (lmlogiudice) wrote :

I haven't been able to reproduce, unfortunately: https://paste.ubuntu.com/p/9f5nw8MT3f/

I did have to tweak the bundle slightly, though: The `alert-rules` resource for ceph-mon was removed (I got an error saying it needed to be a path, not a number), and I had to increase the `cache-devices` storage option to at least 10G, but it otherwise went fine.

Can you tell me what's the output of calling `ceph --id osd-upgrade osd tree --format=json`?

Revision history for this message
Samuel Allan (samuelallan) wrote (last edit ):

Hmm, I'm unable to run that: it seems ceph-mon hasn't set up the keyring yet, since the ceph-osd units are still down:

```
ceph maas-controller maas/default 3.2.3 unsupported 04:38:03Z

App Version Status Scale Charm Channel Rev Exposed Message
ceph-dashboard waiting 1 ceph-dashboard latest/edge 49 no Charm configuration in progress
ceph-mon 17.2.6 waiting 1 ceph-mon latest/edge 192 no Monitor bootstrapped but waiting for number of OSDs to reach expected-osd-count (3)
ceph-osd 17.2.6 error 3 ceph-osd latest/edge 577 no hook failed: "config-changed"
ceph-radosgw 17.2.6 waiting 1 ceph-radosgw latest/edge 566 no Incomplete relations: mon
easyrsa 3.0.1 active 1 easyrsa latest/stable 48 no Certificate Authority connected.

Unit Workload Agent Machine Public address Ports Message
ceph-mon/0* waiting idle 0/lxd/0 192.168.151.159 Monitor bootstrapped but waiting for number of OSDs to reach expected-osd-count (3)
  ceph-dashboard/0* waiting idle 192.168.151.159 Charm configuration in progress
ceph-osd/0 error idle 0 192.168.151.155 hook failed: "config-changed"
ceph-osd/1 error idle 1 192.168.151.156 hook failed: "config-changed"
ceph-osd/2* error idle 2 192.168.151.154 hook failed: "config-changed"
ceph-radosgw/0* waiting idle 2/lxd/0 192.168.151.157 80/tcp Incomplete relations: mon
easyrsa/0* active idle 1/lxd/0 192.168.151.158 Certificate Authority connected.

Machine State Address Inst id Base AZ Message
0 started 192.168.151.155 node-2 ubuntu@22.04 default Deployed
0/lxd/0 started 192.168.151.159 juju-062477-0-lxd-0 ubuntu@22.04 default Container started
1 started 192.168.151.156 node-3 ubuntu@22.04 default Deployed
1/lxd/0 started 192.168.151.158 juju-062477-1-lxd-0 ubuntu@22.04 default Container started
2 started 192.168.151.154 node-4 ubuntu@22.04 default Deployed
2/lxd/0 started 192.168.151.157 juju-062477-2-lxd-0 ubuntu@22.04 default Container started
```

(The config-changed hook ran because I needed to update the osd-devices value after deployment.)

Revision history for this message
Luciano Lo Giudice (lmlogiudice) wrote :

That's _very_ weird, as the only point at which CrushLocation objects are created is after fetching an OSD tree with the osd-upgrade key. Still looking into this.

Revision history for this message
Samuel Allan (samuelallan) wrote :

I did some digging, and I think I found some useful info.

In the logs:

```
unit-ceph-osd-0: 03:00:29 INFO unit.ceph-osd/0.juju-log old_version: None
unit-ceph-osd-0: 03:00:29 INFO unit.ceph-osd/0.juju-log new_version: None
unit-ceph-osd-0: 03:00:29 INFO unit.ceph-osd/0.juju-log Attempting to resume possibly failed upgrade.
```

These come from the `check_for_upgrade` function ( https://opendev.org/openstack/charm-ceph-osd/src/commit/1494d9a24527769064b819d5e1da4fdaebf6dcd2/hooks/ceph_hooks.py#L134 ).

It seems that resolving the Ceph versions is failing here for some reason, returning None for both old and new. So when execution reaches `if (ceph.UPGRADE_PATHS.get(old_version) == new_version) or resuming_upgrade:`, this evaluates as `if (None == None) ...`, and the upgrade logic begins, calling `roll_osd_cluster`, etc.
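The comparison described above can be sketched as follows. This is a simplified, illustrative version of the logic in `hooks/ceph_hooks.py` (the `UPGRADE_PATHS` entries here are just a plausible subset, not the charm's actual table):

```python
# Illustrative subset of upgrade paths; the real dict lives in charms_ceph.
UPGRADE_PATHS = {"pacific": "quincy", "quincy": "reef"}

def should_upgrade(old_version, new_version, resuming_upgrade=False):
    # When version resolution fails, both sides are None, so
    # UPGRADE_PATHS.get(None) == None evaluates to True and the upgrade
    # machinery runs even on a fresh deployment.
    return (UPGRADE_PATHS.get(old_version) == new_version) or resuming_upgrade

print(should_upgrade(None, None))           # the spurious trigger
print(should_upgrade("pacific", "quincy"))  # a genuine upgrade path
print(should_upgrade("quincy", "pacific"))  # not an upgrade path
```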

Revision history for this message
Samuel Allan (samuelallan) wrote :

OK, the root of the issue is that the default `source` config option was changed to 'quincy' here: https://opendev.org/openstack/charm-ceph-osd/commit/986981c6f47ad4ba5085a96f5683a0da650c421e/#diff-b5106b5f3da7575c9a8cb306418c20825103a4fe . Quincy is a Ceph release, and Ceph release names are not supported by the charm for resolving a Ceph version - it requires an OpenStack release name (e.g. 'yoga', 'zed'), 'distro', 'distro-proposed', 'proposed', 'cloud:*-OPENSTACK_RELEASE_NAME[/*]', 'deb*OPENSTACK_CODENAME*', or 'snap*OPENSTACK_CODENAME*'.

We need to change the default value for `source` back to 'yoga', or to another appropriate default.

The docs for the `source` config value should also be updated to be clearer and more accurate.
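A toy sketch of why the lookup fails (this is not the real charmhelpers code; the actual resolution walks `source` through an OpenStack-release table and the UCA pockets, and the mapping below is purely hypothetical):

```python
# Hypothetical subset of an OpenStack-release -> Ceph-release table; the real
# resolution lives in charmhelpers and handles 'distro', 'cloud:*', etc.
OPENSTACK_TO_CEPH = {"yoga": "quincy", "zed": "quincy"}

def resolve_ceph_version(source):
    # Anything not recognized as an OpenStack release name, including a Ceph
    # codename like 'quincy', falls through and yields None.
    return OPENSTACK_TO_CEPH.get(source)

print(resolve_ceph_version("yoga"))    # an OpenStack codename resolves
print(resolve_ceph_version("quincy"))  # a Ceph codename does not
```

With `source: quincy` as the default, both `old_version` and `new_version` end up None, which is exactly the state that trips the upgrade check above.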

Changed in charm-ceph-osd:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-ceph-osd (master)

Reviewed: https://review.opendev.org/c/openstack/charm-ceph-osd/+/901870
Committed: https://opendev.org/openstack/charm-ceph-osd/commit/d46e6b66d6a390ae4697e785ae925bff23c35db3
Submitter: "Zuul (22348)"
Branch: master

commit d46e6b66d6a390ae4697e785ae925bff23c35db3
Author: Luciano Lo Giudice <email address hidden>
Date: Fri Nov 24 21:00:04 2023 -0300

    Revert default source to 'yoga'

    The Openstack libs don't recognize Ceph releases when specifying
    the charm source. Instead, we have to use an Openstack release.
    Since it was set to quincy, reset it to yoga.

    Change-Id: Ie9d485e89bd97d10774912691d657428758300ae
    Closes-Bug: #2044052

Changed in charm-ceph-osd:
status: In Progress → Fix Committed