Path /dev/disk/by-dname/bcache1 does not exist - bailing during secrets storage hook

Bug #1883585 reported by Jason Hobbs
This bug affects 3 people
Affects:     Ceph OSD Charm
Status:      Invalid
Importance:  Undecided
Assigned to: Unassigned

Bug Description

A ceph-osd unit got stuck in blocked with the message "No block devices detected using current configuration"

In the unit log, I see this:

2020-06-15 06:45:12 INFO juju-log secrets-storage:275: Path /dev/disk/by-dname/bcache1 does not exist - bailing

https://solutions.qa.canonical.com/#/qa/testRun/f5c5a786-796a-4519-b57a-dc2072815e60

description: updated
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Looks similar to https://bugs.launchpad.net/charm-ceph-osd/+bug/1878752

for which James created a workaround: https://bugs.launchpad.net/charm-ceph-osd/+bug/1878752/comments/10

git --no-pager branch --contains=b1aab5d0e12e433b714e39f78945baf16e508a41
  master
* stable/20.05

https://opendev.org/openstack/charm-ceph-osd/commit/b1aab5d0e12e433b714e39f78945baf16e508a41

But it only works if the symlinks are there in the first place: the code that triggers this bug runs earlier than the code in the charm that causes the symlinks to disappear.

From what I see, there was some activity around the time of the check which resulted in related messages being printed in the kernel log:

ceph-osd-1 (looks like there were no devices processed previously):

2020-06-15 06:45:04 INFO juju-log secrets-storage:275: Skipping osd devices previously processed by this unit: []
2020-06-15 06:45:04 DEBUG juju-log secrets-storage:275: Checking for pristine devices: "[]"
2020-06-15 06:45:04 INFO juju-log secrets-storage:275: ceph bootstrapped, rescanning disks

2020-06-15 06:45:12 INFO juju-log secrets-storage:275: Path /dev/disk/by-dname/bcache1 does not exist - bailing

kern.log on machine 11:

Jun 15 06:45:12 mudkip kernel: [13775.693310] bcache: register_bcache() error /dev/sdb: device already registered (emitting change event)
Jun 15 06:45:12 mudkip kernel: [13775.775323] bcache: register_bcache() error /dev/nvme0n1p1: device already registered
Jun 15 06:45:12 mudkip kernel: [13775.820840] bcache: register_bcache() error /dev/sda3: device already registered (emitting change event)
Jun 15 06:45:12 mudkip kernel: [13775.832105] bcache: register_bcache() error /dev/sda4: device already registered (emitting change event)

* ceph-osd waits for all queued uevents to be processed before calling osdize:
https://opendev.org/openstack/charm-ceph-osd/src/commit/48144a14417e1832d1ced4998e13839671d1351e/hooks/ceph_hooks.py#L540-L542
        ceph.udevadm_settle()
        for dev in get_devices():
            ceph.osdize(dev, config('osd-format'),

* there is nothing in the charm to re-trigger uevents for bcache devices between that and the error message seen in the end:

osdize:
https://opendev.org/openstack/charm-ceph-osd/src/commit/48144a14417e1832d1ced4998e13839671d1351e/lib/charms_ceph/utils.py#L1494-L1499

https://opendev.org/openstack/charm-ceph-osd/src/commit/48144a14417e1832d1ced4998e13839671d1351e/lib/charms_ceph/utils.py#L1508-L1540
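
For reference, the check that emits the "bailing" message is a plain existence test near the top of osdize_dev(); roughly (paraphrased and simplified from the utils.py linked above, not a verbatim copy):

    # Paraphrased/simplified from charms_ceph.utils.osdize_dev() at the
    # linked commit - illustrative only, not a verbatim copy.
    import os

    from charmhelpers.core.hookenv import log

    def osdize_dev(dev, *args, **kwargs):
        if not os.path.exists(dev):
            # dev is e.g. /dev/disk/by-dname/bcache1, a udev-managed symlink;
            # if it has been removed, the device is simply skipped.
            log('Path {} does not exist - bailing'.format(dev))
            return
        # ... otherwise the device is prepared and activated as an OSD ...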

There are several cases like this in the log but they don't give an indication of what exactly caused it.

https://paste.ubuntu.com/p/s2N7fSdthf/

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

In searching for the source of the extra bcache registration events, I considered snapd.

In particular, its udev backend may be the source of the problem.

https://github.com/snapcore/snapd/blob/2.45.1/interfaces/udev/backend.go#L55-L59 (a per-snap rules file is maintained under /etc/udev/rules.d/)
https://github.com/snapcore/snapd/blob/2.45.1/interfaces/udev/backend.go#L120 (and is written to when a ruleset changes).
https://github.com/snapcore/snapd/blob/2.45.1/interfaces/udev/backend.go#L130 (ReloadRules is called after that).

ReloadRules has an unconditional exec of `udevadm trigger --subsystem-nomatch=input`:

https://github.com/snapcore/snapd/blob/2.45.1/interfaces/udev/udev.go#L44-L47
// ReloadRules runs three commands that reload udev rule database.
//
// The commands are: udevadm control --reload-rules
// udevadm trigger --subsystem-nomatch=input
// udevadm settle --timeout=3
// and optionally trigger other subsystems as defined in the interfaces. Eg:
// udevadm trigger --subsystem-match=input
// udevadm trigger --property-match=ID_INPUT_JOYSTICK=1
func ReloadRules(subsystemTriggers []string) error {
// ...
 output, err = exec.Command("udevadm", "trigger", "--subsystem-nomatch=input").CombinedOutput()

Based on my understanding of snapd:

* backends are used by all snaps, so any installed snap might trigger this: https://github.com/snapcore/snapd/blob/2.45.1/interfaces/backends/backends.go#L38-L48
* ReloadRules in snapd may be called on snap installation or refresh.

I submitted a patch to juju-crashdump to collect udev rules, so that in the future we have more clarity on which rules are present.
https://github.com/juju/juju-crashdump/pull/56

I can see the register_bcache errors after the installation of both the "core" and "canonical-livepatch" snaps.

Jun 15 03:18:26 mudkip systemd[1]: Mounting Mount unit for core, revision 9289...
Jun 15 03:18:26 mudkip systemd[1]: Mounted Mount unit for core, revision 9289.
Jun 15 03:18:27 mudkip kernel: bcache: register_bcache() error /dev/nvme0n1p1: device already registered
Jun 15 03:18:27 mudkip kernel: bcache: register_bcache() error /dev/sdb: device already registered (emitting change event)
Jun 15 03:18:27 mudkip kernel: bcache: register_bcache() error /dev/sda4: device already registered (emitting change event)
Jun 15 03:18:27 mudkip kernel: bcache: register_bcache() error /dev/sda3: device already registered (emitting change event)
Jun 15 03:18:28 mudkip kernel: bcache: register_bcache() error /dev/sdb: device already registered (emitting change event)
...
Jun 15 03:18:38 mudkip systemd[1]: Mounting Mount unit for canonical-livepatch, revision 95...
Jun 15 03:18:38 mudkip systemd[1]: Mounted Mount unit for canonical-livepatch, revision 95.
Jun 15 03:18:40 mudkip kernel: bcache: register_bcache() error /dev/sdb: device already registered (emitting change event)
Jun 15 03:18:40 mudkip kernel: bcache: register_bcache() error /dev/sda3: device already registered (emitting change event)
Jun 15 03:18:40 mudkip kernel: bcache: register_bcache() error /dev/sda4: device already registered (emitting change event)
Jun 15 03:18:40 mudkip ke...


Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

I do see a consistent pattern across multiple nodes in this test run: there are 4 series of register_bcache errors in the kernel log:

1) one at kernel startup;
2) 3 more afterwards, around snapd- or lxd-related messages.

snapd also runs inside LXD containers and the effect of doing `udevadm trigger` inside a container is yet to be determined.

Revision history for this message
Alexander Litvinov (alitvinov) wrote :

Blocked by this on a customer deployment.
Exactly the same issue, during the vault hook.
/var/log/kern.log:
 bcache: register_bcache() error /dev/sde: device already registered (emitting change event)

Revision history for this message
Alexander Litvinov (alitvinov) wrote :

One observation:
Hosts without lxd containers, or the host with only an easyrsa container, are fine.
Affected are hosts with many lxd containers:
ceilometer, designate, rabbitmq-server, nova-cloud-controller, etc.

Revision history for this message
Alexander Litvinov (alitvinov) wrote :

Subscribed ~field-critical

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

On the environment that Alex provided I was able to confirm that installing the core and canonical-livepatch snaps does cause `udevadm trigger` to be run:
https://paste.ubuntu.com/p/48GBFbz2hP/

However, this is not the root cause of the problem.

A manual reproducer is simply running `udevadm trigger` by hand.
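
A minimal way to drive it and see the effect (a hypothetical helper script, not part of the charm):

    # Hypothetical reproducer helper - run as root on an affected machine.
    # Re-fires uevents just like a bare `udevadm trigger` and then lists
    # which curtin-created by-dname symlinks are (still) present.
    import glob
    import subprocess

    subprocess.check_call(['udevadm', 'trigger'])
    subprocess.check_call(['udevadm', 'settle'])
    print(sorted(glob.glob('/dev/disk/by-dname/*')))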

For debugging I inserted some logging into udev rules:

ubuntu@server-maas-6:~$ cat /etc/udev/rules.d/bcache0.rules
# Written by curtin
SUBSYSTEM=="block", ACTION=="add|change", RUN+="/usr/bin/logger 'Cached UUID for $kernel: $env{CACHED_UUID}'"
SUBSYSTEM=="block", ACTION=="add|change", ENV{CACHED_UUID}=="3d690e74-5c33-4030-8f54-7d40714a3406", SYMLINK+="disk/by-dname/bcache0"

and it shows that there are 2 sets of uevents fired for each bcache device:

1) one with CACHED_UUID set;
2) one without.

a) `udevadm trigger`: uevents without CACHED_UUID come after the ones with CACHED_UUID for all devices: https://paste.ubuntu.com/p/tPMY5qTfmP/

b) `sudo udevadm trigger --action=add`:

   - sometimes all uevents with CACHED_UUID come after all uevents without CACHED_UUID and the symlinks get created.
   - sometimes only some of the uevents with CACHED_UUID come after the uevents without CACHED_UUID - only some symlinks are present.
   See https://paste.ubuntu.com/p/YbSfKn3vBx/ for both cases.

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Just to close the remaining unclear item: going back to comment #3, the series of register_bcache messages are due to the following:

1) bootup (uevent replay during coldplug);
2) core snap installation (snapd does a `udevadm trigger` as described in https://bugs.launchpad.net/charm-ceph-osd/+bug/1883585/comments/2)
3) canonical-livepatch snap installation (snapd does a `udevadm trigger` as described in https://bugs.launchpad.net/charm-ceph-osd/+bug/1883585/comments/2)

4) after the `osdize_dev` method exits, `udevadm trigger --subsystem-match=block --action=add` is called in rescan_osd_devices, which triggers another set of register_bcache messages in the kernel log.

https://opendev.org/openstack/charm-ceph-osd/src/commit/48144a14417e1832d1ced4998e13839671d1351e/lib/charms_ceph/utils.py#L1540 (the "bailing" message is printed to the juju log)

https://opendev.org/openstack/charm-ceph-osd/src/commit/48144a14417e1832d1ced4998e13839671d1351e/hooks/ceph_hooks.py#L542-L547 (the execution in prepare_disks_and_activate continues)

https://opendev.org/openstack/charm-ceph-osd/src/commit/48144a14417e1832d1ced4998e13839671d1351e/hooks/ceph_hooks.py#L551 (start_osds is called with an empty list of devices)

https://opendev.org/openstack/charm-ceph-osd/src/commit/48144a14417e1832d1ced4998e13839671d1351e/lib/charms_ceph/utils.py#L959-L961 (rescan_osd_devices is called unconditionally)

https://opendev.org/openstack/charm-ceph-osd/src/commit/48144a14417e1832d1ced4998e13839671d1351e/lib/charms_ceph/utils.py#L976-L980
udevadm trigger --subsystem-match=block --action=add
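
For reference, rescan_osd_devices() in the linked utils.py boils down to roughly the following (paraphrased and simplified):

    # Paraphrased/simplified view of charms_ceph.utils.rescan_osd_devices()
    # at the linked commit - this is what re-fires the block 'add' uevents.
    import subprocess

    def rescan_osd_devices():
        subprocess.call(['udevadm', 'trigger',
                         '--subsystem-match=block', '--action=add'])
        subprocess.call(['udevadm', 'settle'])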

So, the symlinks go away as soon as the first snap gets installed (point 2); however, we only see the failure after (point 3). Adding a long delay between (point 3) and the failure does not help, since we do not wait for any uevents to recreate the symlinks - they get removed as described in comment #7.
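
For illustration only, a mitigation in the spirit of James's earlier workaround would be to re-fire the block 'add' uevents and settle before giving up on the path. This is a sketch, not the charm's actual code, and as shown in comment #7 the resulting uevent ordering is not guaranteed, so it narrows rather than closes the race:

    # Illustrative sketch only - NOT the charm's current behaviour.
    # Re-firing 'add' uevents gives curtin's bcache rules a chance to
    # recreate the by-dname symlink, but the CACHED_UUID ordering stays racy.
    import os
    import subprocess

    def ensure_by_dname_path(dev):
        if os.path.exists(dev):
            return True
        subprocess.call(['udevadm', 'trigger',
                         '--subsystem-match=block', '--action=add'])
        subprocess.call(['udevadm', 'settle'])
        return os.path.exists(dev)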

Revision history for this message
Konstantinos Kaskavelis (kaskavel) wrote :

Closing this due to inactivity (low number of occurrences, and no hits for more than one year).

Changed in charm-ceph-osd:
status: New → Invalid
tags: added: solutions-qa-expired