timeout removing bcache device when lvm is over bcache

Bug #1844543 reported by Jason Hobbs
Affects          Status        Importance  Assigned to  Milestone
curtin           Fix Released  High        Chad Smith
curtin (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

Notes from a conversation with Ryan.

me:
I've been able to reproduce a bcache device removal timeout during maas/curtin installation on 4.15.0-62-generic #69-Ubuntu

I've left the machine up so we can investigate

Ryan, I imported your ssh key from launchpad, ubuntu@10.244.41.14 - can you have a look?

dmesg: http://paste.ubuntu.com/p/hWvSF5VnzS/

cloud-init-output.log: http://paste.ubuntu.com/p/dv6bcnNW4Z/

Ryan:
I don't see any obvious oopses or tracebacks from the kernel. I'm not sure what else I should look for.

Looking at the shutdown tree, dm-0 sits over the bcache device; it looks like curtin should have stopped
the LVM device on top of bcache0.

 Shutdown Plan:
        {'level': 2, 'device': '/sys/class/block/bcache1', 'dev_type': 'bcache'}
        {'level': 2, 'device': '/sys/class/block/dm-0', 'dev_type': 'lvm'}
        {'level': 1, 'device': '/sys/class/block/bcache0', 'dev_type': 'bcache'}
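
As an illustration of how a plan like the one above can be read, here is a minimal sketch that orders the entries so that higher levels are stopped first. This is not curtin's implementation; `shutdown_order` is a hypothetical helper, and the only data used is the three entries shown in the plan.

```python
# Entries copied from the shutdown plan above.
plan = [
    {'level': 2, 'device': '/sys/class/block/bcache1', 'dev_type': 'bcache'},
    {'level': 2, 'device': '/sys/class/block/dm-0', 'dev_type': 'lvm'},
    {'level': 1, 'device': '/sys/class/block/bcache0', 'dev_type': 'bcache'},
]


def shutdown_order(entries):
    """Hypothetical helper: return device paths, highest level first.

    sorted() is stable, so entries on the same level keep their
    original relative order.
    """
    ordered = sorted(entries, key=lambda e: e['level'], reverse=True)
    return [e['device'] for e in ordered]


for device in shutdown_order(plan):
    print(device)
```

Under this ordering dm-0 (the LVM device) comes before bcache0, which matches the expectation that the LVM device on top of bcache0 should be stopped before bcache0 itself.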

Let me see if I can recreate this structure in our vmtest. The bcache shutdown works by stopping the cache of a bcache block device, and in this scenario bcache1 and bcache0 are related because they share a cacheset.

OK, I can recreate this issue in a vmtest with the config included in the logs (with a change to replicate the LVM over bcache), and it's unrelated to the bcache kernel changes. The issue is the LVM over bcache0 (from ceph in the QA case), which shares a cacheset with bcache1.

curtin config: http://paste.ubuntu.com/p/zmTBb2Bnxs/
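
To inspect a layout like the one described (LVM stacked on bcache0, with bcache0 and bcache1 sharing a cacheset), a rough sketch along these lines may help. It is not taken from curtin. Both helper names are hypothetical, and the assumption that `<dev>/bcache/cache` under sysfs is a symlink to the cache set directory should be checked against the running kernel. The `holders` directory is the standard sysfs listing of devices stacked on top of a block device.

```python
import os
from collections import defaultdict


def holders(dev, sysfs='/sys'):
    """Hypothetical helper: names of devices stacked on top of `dev`."""
    path = os.path.join(sysfs, 'class', 'block', dev, 'holders')
    return sorted(os.listdir(path)) if os.path.isdir(path) else []


def group_by_cacheset(devs, sysfs='/sys'):
    """Hypothetical helper: map cache set name -> bcache devices using it.

    Assumes <dev>/bcache/cache is a symlink to the cache set directory.
    Devices without that link are skipped.
    """
    groups = defaultdict(list)
    for dev in devs:
        link = os.path.join(sysfs, 'class', 'block', dev, 'bcache', 'cache')
        if os.path.exists(link):
            cset = os.path.basename(os.path.realpath(link))
            groups[cset].append(dev)
    return dict(groups)
```

On the affected machine, `holders('bcache0')` would be expected to list dm-0, and `group_by_cacheset(['bcache0', 'bcache1'])` would be expected to return a single group containing both devices.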

tags: added: cdo-qa foundations-engine
Jason Hobbs (jason-hobbs) wrote :

curtin error logs

description: updated
Chad Smith (chad.smith)
Changed in curtin:
status: New → Confirmed
assignee: nobody → Chad Smith (chad.smith)
Ryan Harper (raharper)
Changed in curtin:
importance: Undecided → High
Server Team CI bot (server-team-bot) wrote :

This bug is fixed with commit e174e1cd to curtin on branch master.
To view that commit see the following URL:
https://git.launchpad.net/curtin/commit/?id=e174e1cd

Changed in curtin:
status: Confirmed → Fix Committed
Ryan Harper (raharper) wrote : Fixed in curtin version 19.3.

This bug is believed to be fixed in curtin in version 19.3. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in curtin:
status: Fix Committed → Fix Released
Nobuto Murata (nobuto) wrote :

When can we expect this to be SRUed?

Ryan Harper (raharper) wrote :

We plan to start a curtin SRU next week.

Paride Legovini (paride)
Changed in curtin (Ubuntu):
status: New → Triaged
Jason Hobbs (jason-hobbs) wrote :

20:37 < rharper> powersj: jhobbs: re: curtin SRU; I had planned to SRU in Nov, but the fix that landed at the time was not complete; I was still able to recreate the failure. We have a more complete fix that's passing all of the vmtest scenarios with bcache; that's landed, so likely SRU will start at the start of 2020.

Jason Hobbs (jason-hobbs) wrote :

I tested with 19.3-787-gb022ed4-0ubuntu1+228~trunk~ubuntu18.04.1 and it can repeatedly install just fine - no issues with the setup described above.

Joshua Powers (powersj) wrote :

The SRU of curtin 19.3 is now complete to Xenial, Bionic, and Eoan. This bug is thought to be fixed in all releases now. I am marking this fix released.

Versions
---
19.3-26-g82f23e3d-0ubuntu1~16.04.1
19.3-26-g82f23e3d-0ubuntu1~18.04.1
19.3-26-g82f23e3d-0ubuntu1~19.10.1

Thanks!

Changed in curtin (Ubuntu):
status: Triaged → Fix Released