timeout removing bcache device when lvm is over bcache
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
curtin | Fix Released | High | Chad Smith |
curtin (Ubuntu) | Fix Released | Undecided | Unassigned |
Bug Description
Notes from a conversation with Ryan.
me:
I've been able to reproduce a bcache device removal timeout during maas/curtin installation on 4.15.0-62-generic #69-Ubuntu
I've left the machine up so we can investigate
Ryan, I imported your ssh key from launchpad, ubuntu@10.244.41.14 - can you have a look?
dmesg: http://
cloud-init-
Ryan:
I don't see any obvious oopses or tracebacks from the kernel. I'm not sure what else I should look for.
Looking at the shutdown tree, with dm-0 over the bcache device, it looks like curtin should have stopped
the LVM device on top of bcache0.
Shutdown Plan:
{'level': 2, 'device': '/sys/class/
{'level': 2, 'device': '/sys/class/
{'level': 1, 'device': '/sys/class/
Let me see if I can recreate this structure in our vmtest. The bcache shutdown works by stopping the cache of a bcache block device, and in this scenario bcache1 and bcache0 are related because they share a cacheset.
OK, I can recreate this issue in a vmtest with the config included in the logs (changed to replicate the LVM over bcache), and it's unrelated to the bcache kernel changes. The issue is the LVM over bcache0 (from ceph in the QA case), which shares a cacheset with bcache1.
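To illustrate why ordering matters here, the following is a minimal sketch of flattening a holder tree into a shutdown plan that handles the deepest holders first, so that an LVM volume (dm-0) is stopped before the bcache device underneath it. It is not curtin's actual implementation: the function names, the tree layout, and the device names `backing-disk` and `cache-disk` are assumptions made for illustration only.

```python
# Hypothetical sketch, not curtin's code: order devices so that holders
# (devices stacked on top of others) are shut down before what they sit on.

def flatten(tree, level=0):
    """Yield one {'level', 'device'} entry per node of a holder tree."""
    yield {'level': level, 'device': tree['device']}
    for holder in tree.get('holders', []):
        yield from flatten(holder, level + 1)


def plan_shutdown(trees):
    """Merge several holder trees into one plan, deepest level first."""
    deepest = {}
    for tree in trees:
        for entry in flatten(tree):
            known = deepest.get(entry['device'])
            # A device reachable from several trees keeps its deepest level.
            if known is None or entry['level'] > known['level']:
                deepest[entry['device']] = entry
    # sorted() is stable, so devices on the same level keep discovery order.
    return sorted(deepest.values(), key=lambda e: e['level'], reverse=True)


# Assumed layout: dm-0 (LVM) sits on bcache0; bcache0 and bcache1 share
# one cache device, which stands in for the shared cacheset.
lvm_over_bcache0 = {'device': 'bcache0',
                    'holders': [{'device': 'dm-0', 'holders': []}]}
example_trees = [
    {'device': 'backing-disk', 'holders': [lvm_over_bcache0]},
    {'device': 'cache-disk',
     'holders': [lvm_over_bcache0, {'device': 'bcache1', 'holders': []}]},
]

plan = plan_shutdown(example_trees)
for entry in plan:
    print(entry)
```

In this sketch dm-0 comes out ahead of both bcache devices, which matches the expectation stated above that the LVM device on top of bcache0 is stopped before the bcache devices sharing the cacheset are removed.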
curtin config: http://
Related branches
- Chad Smith: Approve
- Server Team CI bot: Approve (continuous-integration)
- Diff: 158 lines (+131/-1), 3 files modified
curtin/block/clear_holders.py (+1/-1)
examples/tests/bcache-ceph-nvme-simple.yaml (+107/-0)
tests/vmtests/test_bcache_ceph.py (+23/-0)
tags: added: cdo-qa foundations-engine
Changed in curtin:
status: New → Confirmed
assignee: nobody → Chad Smith (chad.smith)
Changed in curtin:
importance: Undecided → High
Changed in curtin (Ubuntu):
status: New → Triaged
curtin error logs