Comment 0 for bug 1894780

William Grant (wgrant) wrote : Oops when starting LVM snapshots on 5.4.0-47

One of my bionic servers with HWE 5.4.0 hangs on boot (apparently while starting LVM snapshots) after upgrading from Linux 5.4.0-42 to 5.4.0-47, with the following trace:

  [ 29.126292] kobject_add_internal failed for :a-0000152 with -EEXIST, don't try to register things with the same name in the same directory.
  [ 29.138854] BUG: kernel NULL pointer dereference, address: 0000000000000020
  [ 29.145977] #PF: supervisor read access in kernel mode
  [ 29.145979] #PF: error_code(0x0000) - not-present page
  [ 29.145981] PGD 0 P4D 0
  [ 29.158800] Oops: 0000 [#1] SMP NOPTI
  [ 29.162468] CPU: 6 PID: 2532 Comm: lvm Not tainted 5.4.0-46-generic #50~18.04.1-Ubuntu
  [ 29.170378] Hardware name: Supermicro AS -2023US-TR4/H11DSU-iN, BIOS 1.3 07/15/2019
  [ 29.178038] RIP: 0010:free_percpu+0x120/0x1f0
  [ 29.183786] Code: 43 64 48 01 d0 49 39 c4 0f 83 71 ff ff ff 65 8b 05 a5 4e bc 58 48 8b 15 0e 4e 20 01 48 98 48 8b 3c c2 4c 01 e7 e8 f0 97 02 00 <48> 8b 58 20 48 8b 53 38 e9 48 ff ff ff f3 c3 48 8b 43 38 48 89 45
  [ 29.202530] RSP: 0018:ffffa2f69c3d38e8 EFLAGS: 00010046
  [ 29.209204] RAX: 0000000000000000 RBX: ffff92202ff397c0 RCX: ffffffffa880a000
  [ 29.216336] RDX: cf35c0f24f2cc3c0 RSI: 43817c451b92afcb RDI: 0000000000000000
  [ 29.223469] RBP: ffffa2f69c3d3918 R08: 0000000000000000 R09: ffffffffa74a5300
  [ 29.230609] R10: ffffa2f69c3d3820 R11: 0000000000000000 R12: cf35c0f24f14c3c0
  [ 29.237745] R13: cf362fb2a054c3c0 R14: 0000000000000287 R15: 0000000000000008
  [ 29.244878] FS: 00007f93a04b0900(0000) GS:ffff913faed80000(0000) knlGS:0000000000000000
  [ 29.252961] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 29.258707] CR2: 0000000000000020 CR3: 0000003fa9d90000 CR4: 00000000003406e0
  [ 29.265883] Call Trace:
  [ 29.268346] __kmem_cache_release+0x1a/0x30
  [ 29.273913] __kmem_cache_create+0x4f9/0x550
  [ 29.278192] ? __kmalloc_node+0x1eb/0x320
  [ 29.282205] ? kvmalloc_node+0x31/0x80
  [ 29.285962] create_cache+0x120/0x1f0
  [ 29.291003] kmem_cache_create_usercopy+0x17d/0x270
  [ 29.295882] kmem_cache_create+0x16/0x20
  [ 29.300152] dm_bufio_client_create+0x1af/0x3f0 [dm_bufio]
  [ 29.305644] ? snapshot_map+0x5e0/0x5e0 [dm_snapshot]
  [ 29.310693] persistent_read_metadata+0x1ed/0x500 [dm_snapshot]
  [ 29.316627] ? _cond_resched+0x19/0x40
  [ 29.320384] snapshot_ctr+0x79e/0x910 [dm_snapshot]
  [ 29.325276] dm_table_add_target+0x18d/0x370
  [ 29.329552] table_load+0x12a/0x370
  [ 29.333045] ctl_ioctl+0x1e2/0x590
  [ 29.336450] ? retrieve_status+0x1c0/0x1c0
  [ 29.340551] dm_ctl_ioctl+0xe/0x20
  [ 29.343958] do_vfs_ioctl+0xa9/0x640
  [ 29.347547] ? ksys_semctl.constprop.19+0xf7/0x190
  [ 29.352337] ksys_ioctl+0x75/0x80
  [ 29.355663] __x64_sys_ioctl+0x1a/0x20
  [ 29.359421] do_syscall_64+0x57/0x190
  [ 29.363094] entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [ 29.368144] RIP: 0033:0x7f939f0286d7
  [ 29.371732] Code: b3 66 90 48 8b 05 b1 47 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 47 2d 00 f7 d8 64 89 01 48
  [ 29.390478] RSP: 002b:00007ffe918df168 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
  [ 29.398045] RAX: ffffffffffffffda RBX: 0000561c107f672c RCX: 00007f939f0286d7
  [ 29.405175] RDX: 0000561c1107c610 RSI: 00000000c138fd09 RDI: 0000000000000009
  [ 29.412309] RBP: 00007ffe918df220 R08: 00007f939f59d120 R09: 00007ffe918defd0
  [ 29.419442] R10: 0000561c1107c6c0 R11: 0000000000000202 R12: 00007f939f59c4e6
  [ 29.426623] R13: 00007f939f59c4e6 R14: 00007f939f59c4e6 R15: 00007f939f59c4e6
  [ 29.433778] Modules linked in: dm_snapshot dm_bufio dm_zero nls_iso8859_1 ipmi_ssif input_leds amd64_edac_mod edac_mce_amd joydev kvm_amd kvm ccp k10temp ipmi_si ipmi_devintf ipmi_msghandler mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mlx5_ib ib_uverbs ib_core bcache crc64 hid_generic crct10dif_pclmul mlx5_core crc32_pclmul ast ghash_clmulni_intel drm_vram_helper pci_hyperv_intf ttm aesni_intel mpt3sas nvme crypto_simd drm_kms_helper syscopyarea igb cryptd raid_class sysfillrect ahci tls sysimgblt glue_helper dca usbhid fb_sys_fops libahci nvme_core mlxfw i2c_algo_bit scsi_transport_sas drm hid i2c_piix4
  [ 29.507853] CR2: 0000000000000020
  [ 29.511174] ---[ end trace 43bd923f80cbdf52 ]---

That :a-0000152 is meant to be /sys/kernel/slab/:a-0000152. Even a working kernel shows some trouble there, with the dm_bufio_buffer alias pointing at a node that isn't listed:

  $ uname -a
  Linux <REDACTED> 5.4.0-42-generic #46~18.04.1-Ubuntu SMP Fri Jul 10 07:21:24 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  $ ls -l /sys/kernel/slab | grep a-0000152
  lrwxrwxrwx 1 root root 0 Sep 8 03:20 dm_bufio_buffer -> :a-0000152

So on 5.4.0-42 the named node doesn't get created, but at least it doesn't crash. The same missing node is visible on my 5.8.0-18 desktop. However, I can't reproduce the crash on other machines with snapshot thin volumes, even though it happens every time (even with maxcpus=1) on the affected system.
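
For anyone wanting to check other machines for the same symptom, here is a rough sketch that lists slab aliases whose target node is missing. The function name and the optional directory argument are my own, added only so it can be pointed at a test directory; by default it looks at /sys/kernel/slab.

```shell
# List symlinks in the slab sysfs directory whose target does not exist.
# dangling_slab_aliases and its directory argument are hypothetical helpers,
# not part of any existing tool; the default path is /sys/kernel/slab.
dangling_slab_aliases() {
    dir="${1:-/sys/kernel/slab}"
    for entry in "$dir"/*; do
        # -L: the entry is a symlink; ! -e: the link target does not resolve.
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s -> %s\n' "${entry##*/}" "$(readlink "$entry")"
        fi
    done
}
```

Calling dangling_slab_aliases with no argument checks the running kernel. On a system showing the problem above I would expect it to print the dm_bufio_buffer alias, and nothing on an unaffected one.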

It should be noted that LVM was not in use on this system until just before it was rebooted into the new kernel, but downgrading to -42 does work, so the timing seems to be a coincidence. Before I realised it was a recent regression I dug through mm/slub.c's history and found dde3c6b7 ("mm/slub: fix a memory leak in sysfs_slab_add()") kind of suspicious: it ostensibly fixes a leak from 80da026a ("mm/slub: fix slab double-free in case of duplicate sysfs filename"), which is exactly the codepath that seems to crash here.

There's clearly some existing bug causing the slab sysfs node to not be added, and I guess dde3c6b7 turns that into a crash on some systems. This is a test system, so I can do whatever debugging is required to narrow down the trigger.