sunbeam cluster join fails with Error: Unable to list disks - Failed executing cmd: ['microceph', 'status'], error: Error: failed listing disks: Daemon not yet initialized

Bug #2065855 reported by Nobuto Murata
This bug affects 1 person
Affects                             Status    Importance  Assigned to  Milestone
Juju Charmed Operator - MicroCeph   New       Undecided   Unassigned
OpenStack Snap                      Triaged   Undecided   Unassigned

Bug Description

This might be related to, or the same as, https://bugs.launchpad.net/snap-openstack/+bug/2063223, but I haven't compared the logs carefully. In my case, at least, this is what happens.

$ snap list openstack
Name Version Rev Tracking Publisher Notes
openstack 2024.1 503 2024.1/edge canonical✓ -

$ ssh sunbeam-2.localdomain -- -t time sunbeam cluster join --role control --role compute --role storage --token eyJuYW1lIjoic3VuYmVhbS0yLmxvY2FsZG9tYWluIiwic2VjcmV0IjoiNzFjZDBhN2RjYTRjNzU3ODQ0OWRiZTUzNTJhNjgxOTY0NmEzY2E2NDVlZTI1YjRmOWIwZWQwY2UxNzZlYWEyOCIsImZpbmdlcnByaW50IjoiMjIyNTQyYjU3YTVmMGJhN2RmZTMyYjA1NDBjZWQyYzcwNjkzMDc3ZWU1NTg2YmFjZWM1MzI1Mjg4YjA4NGM4OSIsImpvaW5fYWRkcmVzc2VzIjpbIjEwLjAuMTIzLjI0MTo3MDAwIl19
Error: Unable to list disks

Based on the timestamps below, it looks like the list-disks action ran before microceph/1 had joined the cluster, and failed with "Error: failed listing disks: Daemon not yet initialized".

[sunbeam-20240516-061411.377443.log]
06:19:24,240 sunbeam.commands.microceph DEBUG Running list-disks on : 'microceph/1'
06:19:24,413 connector DEBUG Connector: closing controller connection
06:19:24,629 connector DEBUG Connector: closing controller connection
06:19:27,723 sunbeam.commands.microceph DEBUG {'return-code': 0}

[debug-log]
unit-microceph-1: 06:19:26 DEBUG unit.microceph/1.juju-log Emitting Juju event list_disks_action.

...

unit-microceph-1: 06:20:10 INFO unit.microceph/1.juju-log peers:1: Completed guarded section fully: 'Bootstrapping'
unit-microceph-1: 06:20:10 ERROR unit.microceph/1.juju-log peers:1: Failed executing cmd: ['microceph', 'status'], error: Error: failed listing disks: Daemon not yet initialized

unit-microceph-1: 06:20:10 WARNING unit.microceph/1.juju-log peers:1: Microceph not bootstrapped yet.
unit-microceph-1: 06:20:10 DEBUG unit.microceph/1.juju-log peers:1: Deferring <ConfigChangedEvent via MicroCephCharm/on/config_changed[21]>.
unit-microceph-1: 06:20:10 DEBUG unit.microceph/1.juju-log peers:1: Re-emitting deferred event <RelationJoinedEvent via MicroCephCharm/on/ceph_relation_joined[29]>.
unit-microceph-1: 06:20:10 INFO unit.microceph/1.juju-log peers:1: _on_relation_changed event
unit-microceph-1: 06:20:10 ERROR unit.microceph/1.juju-log peers:1: Failed executing cmd: ['microceph', 'status'], error: Error: failed listing disks: Daemon not yet initialized

unit-microceph-1: 06:20:10 WARNING unit.microceph/1.juju-log peers:1: Microceph not bootstrapped yet.
unit-microceph-1: 06:20:10 INFO unit.microceph/1.juju-log peers:1: Not processing request as service is not yet ready
unit-microceph-1: 06:20:10 DEBUG unit.microceph/1.juju-log peers:1: Deferring <RelationJoinedEvent via MicroCephCharm/on/ceph_relation_joined[29]>.
unit-microceph-1: 06:20:10 DEBUG unit.microceph/1.juju-log peers:1: Re-emitting deferred event <RelationJoinedEvent via MicroCephCharm/on/ceph_relation_joined[33]>.
unit-microceph-1: 06:20:10 INFO unit.microceph/1.juju-log peers:1: _on_relation_changed event
unit-microceph-1: 06:20:10 ERROR unit.microceph/1.juju-log peers:1: Failed executing cmd: ['microceph', 'status'], error: Error: failed listing disks: Daemon not yet initialized

unit-microceph-1: 06:20:10 WARNING unit.microceph/1.juju-log peers:1: Microceph not bootstrapped yet.
unit-microceph-1: 06:20:10 INFO unit.microceph/1.juju-log peers:1: Not processing request as service is not yet ready
unit-microceph-1: 06:20:10 DEBUG unit.microceph/1.juju-log peers:1: Deferring <RelationJoinedEvent via MicroCephCharm/on/ceph_relation_joined[33]>.
unit-microceph-1: 06:20:10 DEBUG unit.microceph/1.juju-log peers:1: Re-emitting deferred event <RelationChangedEvent via MicroCephCharm/on/ceph_relation_changed[37]>.
unit-microceph-1: 06:20:10 INFO unit.microceph/1.juju-log peers:1: _on_relation_changed event
unit-microceph-1: 06:20:11 ERROR unit.microceph/1.juju-log peers:1: Failed executing cmd: ['microceph', 'status'], error: Error: failed listing disks: Daemon not yet initialized

unit-microceph-1: 06:20:11 WARNING unit.microceph/1.juju-log peers:1: Microceph not bootstrapped yet.
unit-microceph-1: 06:20:11 INFO unit.microceph/1.juju-log peers:1: Not processing request as service is not yet ready
unit-microceph-1: 06:20:11 DEBUG unit.microceph/1.juju-log peers:1: Deferring <RelationChangedEvent via MicroCephCharm/on/ceph_relation_changed[37]>.
unit-microceph-1: 06:20:11 DEBUG unit.microceph/1.juju-log peers:1: Re-emitting deferred event <RelationChangedEvent via MicroCephCharm/on/ceph_relation_changed[41]>.
unit-microceph-1: 06:20:11 INFO unit.microceph/1.juju-log peers:1: _on_relation_changed event
unit-microceph-1: 06:20:11 ERROR unit.microceph/1.juju-log peers:1: Failed executing cmd: ['microceph', 'status'], error: Error: failed listing disks: Daemon not yet initialized

unit-microceph-1: 06:20:11 WARNING unit.microceph/1.juju-log peers:1: Microceph not bootstrapped yet.
unit-microceph-1: 06:20:11 INFO unit.microceph/1.juju-log peers:1: Not processing request as service is not yet ready

...

unit-microceph-1: 06:20:11 INFO unit.microceph/1.juju-log peers:1: {'snap-channel': 'reef/stable', 'public_net': IPv4Network('10.0.123.0/24'), 'cluster_net': IPv4Network('10.0.123.0/24'), 'micro_ip': IPv4Address('10.0.123.234')}
unit-microceph-1: 06:20:22 DEBUG unit.microceph/1.juju-log peers:1: Command microceph cluster join eyJuYW1lIjoibWljcm9jZXBoLzEiLCJzZWNyZXQiOiIxYTE3ZWVlMTBiNmFmM2IxODczMWEyYjhjZmIzNjc0YjQzNmY5NzVmOTM5MmUyYjZkZmIzNjZjZmEyYjc2OGE3IiwiZmluZ2VycHJpbnQiOiJlYmY1MWI4M2Q5NzJlOGQ2YWU3NTMxMTExY2NmZDlkY2VhZGM3ZDM4YTEwZWEzMDljZmQ3OTdhYzQ4YTg2ODU4Iiwiam9pbl9hZGRyZXNzZXMiOlsiMTAuMC4xMjMuMjQxOjc0NDMiXX0= --microceph-ip 10.0.123.234 finished; Output:
unit-microceph-1: 06:20:22 INFO unit.microceph/1.juju-log peers:1: Setting active status
unit-microceph-1: 06:20:22 INFO unit.microceph/1.juju-log peers:1: Completed guarded section fully: 'Bootstrapping'
unit-microceph-1: 06:20:22 DEBUG unit.microceph/1.juju-log peers:1: Command microceph status finished; Output: MicroCeph deployment summary:
- sunbeam-1 (10.0.123.241)
  Services: mds, mgr, mon, osd
  Disks: 2
- sunbeam-2 (10.0.123.234)
  Services: mds, mgr, mon
  Disks: 0
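
The ordering in the excerpts above can be made concrete with a few lines of Python. This is only an illustrative sketch: the two clock times are copied from the debug-log lines quoted above, and the helper name parse_clock is made up for this example.

```python
from datetime import datetime, timedelta


def parse_clock(stamp: str) -> datetime:
    """Parse an HH:MM:SS clock time as it appears in the unit debug-log."""
    return datetime.strptime(stamp, "%H:%M:%S")


# Timestamps copied from the debug-log excerpt in this report.
list_disks_emitted = parse_clock("06:19:26")  # "Emitting Juju event list_disks_action."
join_finished = parse_clock("06:20:22")       # "Command microceph cluster join ... finished"

# A positive gap means the action was dispatched before the join completed.
gap = join_finished - list_disks_emitted
print(f"list-disks was emitted {int(gap.total_seconds())}s before the join finished")
```

In other words, the list-disks action was dispatched to microceph/1 while the local MicroCeph daemon had not yet joined the cluster.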

Nobuto Murata (nobuto) wrote :

It looks like the sunbeam snap used wait_units_ready to wait for microceph/1 to reach workload status "active".

However, the microceph charm briefly reported "active" several times during the deployment phase, in between "Service not bootstrapped" statuses, so the wait may have returned before the unit was actually bootstrapped.

Adding the charm task.

$ juju show-status-log -m admin/controller microceph/1 --days 1
Time Type Status Message
16 May 2024 06:18:33Z juju-unit allocating
16 May 2024 06:18:33Z workload waiting waiting for machine
16 May 2024 06:18:33Z workload waiting installing agent
16 May 2024 06:18:33Z workload waiting agent initialising
16 May 2024 06:18:37Z workload maintenance installing charm software
16 May 2024 06:18:37Z juju-unit executing running install hook
16 May 2024 06:18:39Z workload waiting no status set yet
16 May 2024 06:18:39Z workload maintenance (bootstrap) Service not bootstrapped
16 May 2024 06:19:02Z juju-unit executing running ceph-relation-created hook
16 May 2024 06:19:02Z workload waiting no status set yet
16 May 2024 06:19:02Z workload active
16 May 2024 06:19:02Z workload maintenance (bootstrap) Service not bootstrapped
16 May 2024 06:19:03Z workload waiting no status set yet
16 May 2024 06:19:03Z workload active
16 May 2024 06:19:03Z workload maintenance (bootstrap) Service not bootstrapped
16 May 2024 06:19:03Z juju-unit executing running peers-relation-created hook
16 May 2024 06:19:04Z workload waiting no status set yet
16 May 2024 06:19:04Z workload active
16 May 2024 06:19:04Z workload maintenance (bootstrap) Service not bootstrapped
16 May 2024 06:19:05Z juju-unit executing running leader-settings-changed hook
16 May 2024 06:19:05Z workload waiting no status set yet
16 May 2024 06:19:05Z workload active
16 May 2024 06:19:05Z workload maintenance (bootstrap) Service not bootstrapped
16 May 2024 06:19:06Z juju-unit executing running config-changed hook
16 May 2024 06:19:06Z workload waiting no status set yet
16 May 2024 06:19:06Z workload active
16 May 2024 06:19:06Z workload maintenance (bootstrap) Service not bootstrapped
16 May 2024 06:19:07Z juju-unit executing running start hook
16 May 2024 06:19:09Z juju-unit executing running ceph-relation-joined hook for remote-05e55060365a487b8186893e3318108a/0
16 May 2024 06:19:11Z juju-unit executing running ceph-relation-joined hook for remote-ec265fd659364c568460c535967ef66f/0
16 May 2024 06:19:13Z juju-unit executing running ceph-relation-changed hook for remote-05e55060365a487b8186893e3318108a/0
16 May 2024 06:19:15Z juju-unit executing running ceph-relation-changed hook for remote-ec265fd659364c568460c535967ef66f/0
16 May 2024 06:19:17Z juju-unit executing running peers-relation-changed hook
16 May 2024 06:19:19Z juju-unit executing running peers-relation-joined hook for microceph/0
16 May 2024 06:19:21Z juju-unit executing running peers-relation-changed hook for microceph/0
16 May 2024 06:19:23Z juju-unit idle
16 May 2024 06:19:24Z juju-unit execu...
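
One way a caller could tolerate this kind of status flapping is to require "active" to be observed on several consecutive polls before proceeding. The sketch below is not sunbeam's wait_units_ready and does not use any real Juju API: the function wait_for_stable_active and the get_status callable are hypothetical and only illustrate the idea. Polling can still miss a flap that happens between two polls, so this is a mitigation on the caller side, not a replacement for the charm reporting "active" only once bootstrapped.

```python
import time
from typing import Callable


def wait_for_stable_active(
    get_status: Callable[[], str],
    required_consecutive: int = 3,
    poll_interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once get_status() has reported "active" on
    required_consecutive polls in a row, or False on timeout.

    get_status is a hypothetical callable returning the unit's current
    workload status as a string.
    """
    deadline = time.monotonic() + timeout
    streak = 0
    while time.monotonic() < deadline:
        if get_status() == "active":
            streak += 1
            if streak >= required_consecutive:
                return True
        else:
            # Any non-active status resets the streak.
            streak = 0
        sleep(poll_interval)
    return False


# Simulated status sequence loosely modelled on the status log in this report:
# "active" appears briefly between "maintenance" statuses before it sticks.
observed = iter(
    ["active", "maintenance", "active", "maintenance", "active", "active", "active"]
)
stable = wait_for_stable_active(
    lambda: next(observed),
    required_consecutive=3,
    poll_interval=0.0,
    sleep=lambda _seconds: None,  # no real sleeping in the simulation
)
print(stable)
```

With the simulated sequence, the first two "active" readings are discarded because a "maintenance" status follows each of them; only the final run of three counts.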


Hemanth Nakkina (hemanth-n) wrote :
Changed in snap-openstack:
status: New → Triaged