Path /dev/disk/by-dname/bcache1 does not exist - bailing during secrets storage hook
Bug #1883585 reported by Jason Hobbs
This bug affects 3 people
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Ceph OSD Charm | Invalid | Undecided | Unassigned | |
Bug Description
A ceph-osd unit got stuck in blocked with the message "No block devices detected using current configuration"
In the unit log, I see this:
2020-06-15 06:45:12 INFO juju-log secrets-
https:/
Looks similar to https://bugs.launchpad.net/charm-ceph-osd/+bug/1878752
for which James created a workaround: https://bugs.launchpad.net/charm-ceph-osd/+bug/1878752/comments/10
git --no-pager branch --contains=b1aab5d0e12e433b714e39f78945baf16e508a41
master
* stable/20.05
https://opendev.org/openstack/charm-ceph-osd/commit/b1aab5d0e12e433b714e39f78945baf16e508a41
But the workaround only helps if the symlinks are there in the first place: the code that hits the issue in this bug is executed earlier than the code in the charm that triggers the symlink disappearance.
From what I see, there was some activity around the time of the check which resulted in related messages being printed in the kernel log:
ceph-osd-1 (looks like there were no devices processed previously):
2020-06-15 06:45:04 INFO juju-log secrets-storage:275: Skipping osd devices previously processed by this unit: []
2020-06-15 06:45:04 DEBUG juju-log secrets-storage:275: Checking for pristine devices: "[]"
2020-06-15 06:45:04 INFO juju-log secrets-storage:275: ceph bootstrapped, rescanning disks
2020-06-15 06:45:12 INFO juju-log secrets-storage:275: Path /dev/disk/by-dname/bcache1 does not exist - bailing
kern.log on machine 11:
Jun 15 06:45:12 mudkip kernel: [13775.693310] bcache: register_bcache() error /dev/sdb: device already registered (emitting change event)
Jun 15 06:45:12 mudkip kernel: [13775.775323] bcache: register_bcache() error /dev/nvme0n1p1: device already registered
Jun 15 06:45:12 mudkip kernel: [13775.820840] bcache: register_bcache() error /dev/sda3: device already registered (emitting change event)
Jun 15 06:45:12 mudkip kernel: [13775.832105] bcache: register_bcache() error /dev/sda4: device already registered (emitting change event)
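To see at a glance which devices re-emitted change events at the time of the failure, the kern.log lines above can be filtered with a short script. This is only a rough sketch: the regular expression and the function name `parse_bcache_errors` are illustrative and not part of the charm or of any bcache tooling.

```python
import re

# Matches the register_bcache() messages quoted above. The optional group
# records whether the kernel said it was emitting a change event.
BCACHE_RE = re.compile(
    r"bcache: register_bcache\(\) error (?P<dev>\S+): device already registered"
    r"(?P<change> \(emitting change event\))?"
)


def parse_bcache_errors(lines):
    """Return (device, emitted_change_event) tuples for matching log lines."""
    results = []
    for line in lines:
        match = BCACHE_RE.search(line)
        if match:
            results.append((match.group("dev"), match.group("change") is not None))
    return results
```

Feeding it the four lines above would list each device together with a flag for whether a change event was emitted.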
* ceph-osd waits for all queued uevents to be processed before calling osdize:
https://opendev.org/openstack/charm-ceph-osd/src/commit/48144a14417e1832d1ced4998e13839671d1351e/hooks/ceph_hooks.py#L540-L542
ceph.udevadm_settle()
for dev in get_devices():
    ceph.osdize(dev, config('osd-format'),
* there is nothing in the charm to re-trigger uevents for bcache devices between that and the error message seen in the end:
osdize: https://opendev.org/openstack/charm-ceph-osd/src/commit/48144a14417e1832d1ced4998e13839671d1351e/lib/charms_ceph/utils.py#L1494-L1499
https://opendev.org/openstack/charm-ceph-osd/src/commit/48144a14417e1832d1ced4998e13839671d1351e/lib/charms_ceph/utils.py#L1508-L1540
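The sequence described in the two points above (settle the udev queue, then check that the device path exists) can be sketched as follows. This is not the charm's code: `settle_udev` and `wait_for_path` are hypothetical helpers, and the re-check loop is only an assumption about how a symlink that briefly disappears could be tolerated, not a fix anyone in this bug has proposed.

```python
import os
import subprocess
import time


def settle_udev(timeout_s=30):
    """Block until the udev event queue is empty (runs `udevadm settle`)."""
    subprocess.check_call(["udevadm", "settle", "--timeout={}".format(timeout_s)])


def wait_for_path(path, attempts=5, delay_s=1.0,
                  exists=os.path.exists, sleep=time.sleep):
    """Return True as soon as `path` exists, checking up to `attempts` times.

    A single existence check right after settle can miss a by-dname symlink
    that udev removes and recreates while handling a later change event.
    """
    for attempt in range(attempts):
        if exists(path):
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

For example, calling `settle_udev()` followed by `wait_for_path("/dev/disk/by-dname/bcache1")` would only report the path as missing after several checks rather than on the first one.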
There are several cases like this in the log but they don't give an indication of what exactly caused it.
https://paste.ubuntu.com/p/s2N7fSdthf/