On 10/9/20 7:44 am, Jay Vosburgh wrote:
> wgrant, you said:
> > That :a-0000152 is meant to be /sys/kernel/slab/:a-0000152. Even a
> > working kernel shows some trouble there:
> >
> > $ uname -a
> > Linux 5.4.0-42-generic #46~18.04.1-Ubuntu SMP Fri Jul 10 07:21:24 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> > $ ls -l /sys/kernel/slab | grep a-0000152
> > lrwxrwxrwx 1 root root 0 Sep 8 03:20 dm_bufio_buffer -> :a-0000152
>
> Are you saying that the symlink is "some trouble" here? Because that
> part isn't an error, that's the effect of slab merge (that the kernel
> normally treats all slabs of the same size as one big slab with multiple
> references, more or less).

The symlink itself is indeed not a bug. But there's one reference, and
the thing it's referencing doesn't exist. I don't think that symlink
should be dangling.

> Slab merge can be disabled via "slab_nomerge" on the command line.

Thanks for the slab_nomerge hint. That gets 5.4.0-47 to boot, but
dm_bufio_buffer interestingly doesn't show up in /proc/slabinfo or
/sys/kernel/slab at all, unlike in earlier kernels. There's no 152-byte
slab:

$ sudo cat /sys/kernel/slab/*/slab_size | grep ^152$
$

I've also just reproduced this on a second host by rebooting it into the
same updated kernel -- identical hardware except for a couple of things
like SSDs, and a fairly similar software configuration.

... some digging later ...

The trigger on boot is the parallel pvscans launched by
lvm2-pvscan@.service in the presence of several PVs. If I mask that
service, the system boots fine on the updated kernel (without
slab_nomerge). And then this crashes it:

for i in 259:1 259:2 259:3 8:32 8:48 8:64 8:80; do
    sudo /sbin/lvm pvscan --cache --activate ay $i &
done

I think the key is to have no active VGs with snapshots, then
simultaneously activate two VGs with snapshots.
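As an aside on the dangling alias above: a quick way to spot links like
that is to walk /sys/kernel/slab and test whether each symlink's target
still resolves. This is just a generic sketch (the directory is a
parameter only so it can be pointed anywhere; /sys/kernel/slab is the
interesting case here, and reading it needs root):

```shell
#!/bin/sh
# List symlinks under a directory whose targets no longer exist.
# Pointed at /sys/kernel/slab this flags dangling slab-merge aliases
# such as dm_bufio_buffer -> :a-0000152.
find_dangling() {
    dir="${1:-/sys/kernel/slab}"
    for link in "$dir"/*; do
        # -L: it is a symlink; ! -e: its target does not resolve.
        if [ -L "$link" ] && [ ! -e "$link" ]; then
            printf 'dangling: %s -> %s\n' "$link" "$(readlink "$link")"
        fi
    done
}
```

On a healthy kernel this should print nothing for /sys/kernel/slab.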
Armed with that hypothesis, I set up a boring local bionic qemu-kvm
instance, installed linux-generic-hwe-18.04, and reproduced the problem
with a couple of loop devices:

$ sudo dd if=/dev/zero of=pv1.img bs=1M count=1 seek=1024
$ sudo dd if=/dev/zero of=pv2.img bs=1M count=1 seek=1024
$ sudo losetup -f pv1.img
$ sudo losetup -f pv2.img
$ sudo vgcreate vg1 /dev/loop0
$ sudo vgcreate vg2 /dev/loop1
$ sudo lvcreate --type snapshot -L4M -V10G -n test vg1
$ sudo lvcreate --type snapshot -L4M -V10G -n test vg2
$ sudo systemctl mask lvm2-pvscan@.service
$ sudo reboot
$ sudo losetup -f pv1.img
$ sudo losetup -f pv2.img
$ for i in 7:0 7:1; do sudo /sbin/lvm pvscan --cache --activate ay $i & done
$ # Be glad if you can still type by this point.

The oops is not 100% reproducible in this configuration, but it seems
fairly reliable with four vCPUs. When it doesn't trigger, a few cycles
of rebooting and rerunning those last three commands always worked for
me.

The console sometimes remains responsive after the oops, allowing me to
capture good and bad `dmsetup table -v` output. I'm not sure how helpful
that is, but I've attached an example (from a slightly different
configuration, where each VG has a linear LV with a snapshot, rather
than a snapshot-backed thin LV).

I've also been able to reproduce the fault on a pure focal system, but
there it doesn't always happen on boot: lvm2-pvscan@.service (or a
manual pvscan afterwards) sometimes fails to activate the VGs. Something
is creating /run/lvm/vgs_online/$VG too early, so pvscan thinks it's
already done, and I end up needing to activate the VGs manually later.
That seems unrelated, and only affects a subset of my VMs, but when it
happens it actually makes reproduction easier, since the system boots
without having the unit masked. You can then crash it with just:

$ for VG in vg1 vg2; do sudo vgchange -ay $VG & done

While debugging locally I also found that groovy with 5.8.0-18 is
affected.
That's because when I stopped a VM with PVs on real block devices, the
host (my desktop, on which I nearly lost this email, oops) dutifully ran
pvscan over them, got very sad, and needed to be rebooted with
slab_nomerge to recover:

[ DO NOT BLINDLY RUN THIS, it may well crash the host. ]

$ lxc launch --vm ubuntu:focal bug-1894780-focal-2
$ lxc storage volume create default lvm-1 --type=block size=10GB
$ lxc storage volume create default lvm-2 --type=block size=10GB
$ lxc stop bug-1894780-focal-2
$ lxc storage volume attach default lvm-1 bug-1894780-focal-2 lvm-1
$ lxc storage volume attach default lvm-2 bug-1894780-focal-2 lvm-2
$ lxc start bug-1894780-focal-2
$ lxc exec bug-1894780-focal-2 bash
# vgcreate vg1 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_lxd_lvm-1
# vgcreate vg2 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_lxd_lvm-2
# lvcreate --type snapshot -L4M -V10G -n test vg1
# lvcreate --type snapshot -L4M -V10G -n test vg2
# poweroff
$ # Host sadness here, unless you're somehow immune.
$ lxc start bug-1894780-focal-2
$ lxc exec bug-1894780-focal-2 bash
# for VG in vg1 vg2; do vgchange -ay $VG & done
# # Guest sadness here.

So that's reproduced on metal and in a VM, on 5.4.0-47 and 5.8.0-18, on
two different hosts (one an EPYC 7501 server, the other a Ryzen 7 1700X
desktop, both Zen 1, but I doubt that's relevant). Hopefully one of the
recipes works for you too.
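One more aside, on the too-early /run/lvm/vgs_online/$VG markers
mentioned above: since pvscan treats an existing marker file as "this VG
is already handled", you can check in advance which VGs it will skip. A
minimal sketch (the directory is a parameter purely so it can be tested;
the real location is /run/lvm/vgs_online):

```shell
#!/bin/sh
# Report which VGs pvscan would consider already activated, based on
# the marker files under /run/lvm/vgs_online. The directory argument
# exists only for testability; the real path is fixed.
check_vgs_online() {
    dir="${1:-/run/lvm/vgs_online}"
    shift
    for vg in "$@"; do
        if [ -e "$dir/$vg" ]; then
            echo "$vg: marker present, pvscan will skip activation"
        else
            echo "$vg: no marker"
        fi
    done
}
```

Usage would be something like `check_vgs_online /run/lvm/vgs_online vg1
vg2` before the pvscan runs; a "marker present" line for an inactive VG
is the symptom I was seeing.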