[add-disk action, bcache] charm tries to add bcache OSD again

Bug #1985890 reported by Trent Lloyd
Affects: Ceph OSD Charm
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

When a disk has been added using the add-disk action, the charm later tries to re-initialize that disk even though it is already active as a bcache backing device.

[Steps to reproduce]
1) Deploy the ceph-osd charm using openstack-on-openstack and storage spaces

2) Add a bcache disk

juju add-storage ceph-osd/0 cache-devices=cinder,10G,1

3) Add a new OSD disk

juju add-storage ceph-osd/0 osd-devices=cinder,32G,1

4) Due to Bug #1985884, the added OSD didn't use bcache, so remove and re-add it

juju run-action --wait ceph-osd/0 remove-disk osd-devices="/dev/vdd"
juju run-action ceph-osd/0 add-disk osd-devices=/dev/vdd osd-ids=osd.3

5) The added OSD has its CRUSH weight set to 0 by the remove-disk action, so fix that
juju ssh ceph-mon/0 sudo ceph osd crush reweight osd.N 1

6) Confirm Ceph is healthy and everything is as expected

7) Add another disk
juju add-storage ceph-osd/0 osd-devices=cinder,32G,1

[Result]

When re-scanning the disks, the charm's list of devices already processed by the unit contains /dev/bcache0 but not /dev/vdd, the underlying disk. It therefore tries, and fails, to re-initialize /dev/vdd, because the device is already in use by bcache.

2022-08-12 07:42:04 INFO unit.ceph-osd/0.juju-log server.go:319 mon:41: Skipping osd devices previously processed by this unit: ['/dev/vdb', '/dev/bcache0']
2022-08-12 07:42:05 INFO unit.ceph-osd/0.juju-log server.go:319 mon:41: Device /dev/vdb already processed by charm, skipping
2022-08-12 07:42:05 WARNING unit.ceph-osd/0.mon-relation-changed logger.go:60 partx: /dev/vdd: failed to read partition table
2022-08-12 07:42:05 INFO unit.ceph-osd/0.juju-log server.go:319 mon:41: Can't get info for /dev/vdd: b''
2022-08-12 07:42:05 WARNING unit.ceph-osd/0.mon-relation-changed logger.go:60 Failed to find physical volume "/dev/vdd".
2022-08-12 07:42:05 WARNING unit.ceph-osd/0.mon-relation-changed logger.go:60 Failed to find physical volume "/dev/vdd".
2022-08-12 07:42:05 WARNING unit.ceph-osd/0.mon-relation-changed logger.go:60 Can't open /dev/vdd exclusively. Mounted filesystem?

The charm keeps retrying this, failing the same way every time.
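For reference, whether a device is already claimed as a bcache backing device can be detected from sysfs before any initialization attempt. A minimal sketch (this is not existing charm code; it assumes the standard bcache sysfs layout, where a registered backing device exposes /sys/block/<name>/bcache):

import os

def is_bcache_backing(dev):
    """Return True if 'dev' (e.g. /dev/vdd) is registered as a bcache backing device."""
    name = os.path.basename(os.path.realpath(dev))
    # A backing device registered with bcache gains a 'bcache' directory in
    # sysfs, e.g. /sys/block/vdd/bcache, holding its state and a link to the
    # /dev/bcacheN node it backs.
    return os.path.isdir('/sys/block/{}/bcache'.format(name))

A check like this ahead of the partx/pvcreate attempts would let the charm recognise /dev/vdd as the backing device of an already-processed bcache device instead of failing repeatedly.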

[Fix]

This same kind of issue has recurred on multiple occasions. The charm really needs to grow an awareness of multi-disk interactions: at runtime it should parse and understand the full block device tree, including LVM, bcache, vault encryption, db/wal devices, etc.
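As an illustration of the kind of runtime introspection meant here (again not charm code; it assumes lsblk from util-linux with JSON support), the full device tree is available from lsblk and can be walked to map any composite device back to the disks beneath it:

import json
import subprocess

def block_device_tree():
    """Return lsblk's device tree: disks with nested partitions, bcache, LVM, crypt."""
    out = subprocess.check_output(
        ['lsblk', '--json', '--output', 'NAME,TYPE,MOUNTPOINT'])
    return json.loads(out)['blockdevices']

def parents_of(target, nodes, parent=None, found=None):
    """Find devices directly above 'target', e.g. 'bcache0' -> its underlying disks."""
    if found is None:
        found = []
    for node in nodes:
        if node['name'] == target and parent is not None:
            found.append(parent)
        parents_of(target, node.get('children', []), node['name'], found)
    return found

Note that lsblk shows a bcache device under both its backing disk and its cache disk, so both turn up as parents; bcache-specific logic (or the sysfs 'slaves' directory) would still be needed to tell the two apart.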

The charm makes some attempt to keep track of this as 'osd-aliases', but bcache device names are not stable: they can change and reorder across boots. The charm really needs to actively associate OSDs with their devices at runtime.
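Rather than caching aliases, the OSD-to-device mapping could be rebuilt at runtime from ceph-volume, which already knows which physical devices each OSD sits on. A rough sketch, with the caveat that the exact JSON layout of `ceph-volume lvm list` output (in particular the per-OSD 'devices' key assumed below) should be verified:

import json
import subprocess

def osd_device_map():
    """Map OSD ids to the physical devices backing them, as reported by ceph-volume."""
    out = subprocess.check_output(
        ['ceph-volume', 'lvm', 'list', '--format', 'json'])
    # Each OSD id maps to a list of entries; 'devices' per entry is assumed
    # here and should be checked against the ceph-volume version in use.
    return {
        osd_id: sorted({dev
                        for entry in entries
                        for dev in entry.get('devices', [])})
        for osd_id, entries in json.loads(out).items()
    }

Combined with resolving bcache nodes to their stable backing devices, this would let the processed-device bookkeeping survive bcache renumbering across reboots.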
