[Intel Ubuntu 18.04 Bug] Null pointer dereference, when disconnecting RAID rebuild target
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | High | Unassigned |
Bionic | Fix Released | High | Unassigned |
Bug Description
Kernel panic with a NULL pointer dereference when the RAID10 rebuild target is disconnected during the rebuild. The issue is sporadic.
Steps to reproduce:
1) Create raid10 with mdadm
2) Wait for resync to end
3) Add spare drive
4) Fail one of the member drives
- The RAID becomes degraded and the rebuild to the spare from step 3 starts.
5) Disconnect the drive added in step 3 (the rebuild target)
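The steps above can be sketched as a small dry-run helper that only builds and prints the commands, without executing anything. This is a sketch under assumptions, not the reporter's actual procedure: the array name `/dev/md0`, the drive names `/dev/sdb` through `/dev/sdf`, and the use of the SCSI sysfs `delete` attribute as a software stand-in for a physical disconnect are all hypothetical and do not come from the report. The reporter's setup may differ (for example in metadata format or drive type).

```python
import os
import shlex


def build_repro_commands(array, members, spare, victim):
    """Return the commands for reproduction steps 1-5, in order.

    All device names are supplied by the caller; none come from the report.
    """
    spare_name = os.path.basename(spare)  # e.g. "/dev/sdf" -> "sdf"
    return [
        # 1) Create the RAID10 array from the member drives.
        ["mdadm", "--create", array, "--level=10",
         "--raid-devices={}".format(len(members))] + list(members),
        # 2) Block until the initial resync has finished.
        ["mdadm", "--wait", array],
        # 3) Add the spare drive.
        ["mdadm", array, "--add", spare],
        # 4) Fail one member; the rebuild to the spare starts.
        ["mdadm", array, "--fail", victim],
        # 5) ASSUMPTION: emulate the disconnect through sysfs (SCSI/SATA
        #    devices only); the report may refer to a physical hot-unplug.
        ["sh", "-c", "echo 1 > /sys/block/{}/device/delete".format(spare_name)],
    ]


if __name__ == "__main__":
    # Hypothetical device names, for illustration only.
    commands = build_repro_commands(
        "/dev/md0",
        ["/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"],
        "/dev/sdf",
        "/dev/sdb",
    )
    for cmd in commands:
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Because the issue is sporadic, step 5 probably needs to be issued while the rebuild is still in progress, and the whole sequence may have to be repeated several times before the panic appears.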
Trace:
[ 1022.872118] BUG: unable to handle kernel NULL pointer dereference at 00000000000000f0
[ 1022.881072] IP: raid10d+
[ 1022.886071] PGD 0 P4D 0
[ 1022.889033] Oops: 0002 [#1] SMP PTI
[ 1022.893056] Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_
[ 1022.973751] ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear dm_mirror dm_region_hash dm_log nouveau hid_generic mxm_wmi mgag200 video i2c_algo_bit usbhid ttm e1000e i40e crct10dif_pclmul hid ptp crc32_pclmul ghash_clmulni_intel pcbc drm_kms_helper aesni_intel syscopyarea aes_x86_64 raid1 sysfillrect crypto_simd sysimgblt glue_helper uas fb_sys_fops cryptd ahci vmd drm usb_storage pps_core libahci wmi
[ 1023.026580] CPU: 90 PID: 6373 Comm: md126_raid10 Not tainted 4.15.0-10-generic #11-Ubuntu
[ 1023.035831] Hardware name: Intel Corporation PURLEY/PURLEY, BIOS PLYDTRL1.
[ 1023.046913] RIP: 0010:raid10d+
[ 1023.052479] RSP: 0018:ffffb51787
[ 1023.058429] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff99b3bd0d5e20
[ 1023.066517] RDX: ffffffffc025f8c0 RSI: 0000000000000286 RDI: ffff99b7a1ed5c00
[ 1023.074605] RBP: ffffb5178747be90 R08: 0000000000000349 R09: 0000000000000000
[ 1023.082697] R10: ffffb5178747bd70 R11: 0000000000000365 R12: 0000000000000000
[ 1023.090790] R13: ffff99b3d97dbf70 R14: ffff99b3bd0d5e00 R15: ffff99b3bd0d5e00
[ 1023.098883] FS: 000000000000000
[ 1023.108051] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1023.114602] CR2: 00000000000000f0 CR3: 00000001d0c0a004 CR4: 00000000007606e0
[ 1023.122707] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1023.130804] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1023.138915] PKRU: 55555554
[ 1023.142086] Call Trace:
[ 1023.144964] ? __clear_
[ 1023.149105] ? __schedule+
[ 1023.153340] ? __clear_
[ 1023.157478] ? schedule+0x2c/0x80
[ 1023.161326] md_thread+
[ 1023.165273] ? raid10_
[ 1023.171349] ? md_thread+
[ 1023.175484] ? wait_woken+
[ 1023.179521] kthread+0x121/0x140
[ 1023.183267] ? find_pers+0x70/0x70
[ 1023.187204] ? kthread_
[ 1023.192986] ? do_syscall_
[ 1023.197505] ret_from_
[ 1023.201631] Code: e4 48 8b 57 48 0f 84 92 08 00 00 49 83 7c 24 48 00 0f 84 86 08 00 00 48 63 d8 48 c1 e3 05 48 85 d2 74 41 49 8b 46 08 48 8b 04 18 <f0> ff 80 f0 00 00 00 49 8b 46 08 48 8b 04 18 48 8b 40 30 48 8b
[ 1023.223245] RIP: raid10d+
[ 1023.230490] CR2: 00000000000000f0
[ 1023.234340] ---[ end trace 12e1280fca9f2646 ]---
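As a reading aid for the trace above, the sketch below decodes two pieces of it: the page-fault error code from the `Oops: 0002` line and the instruction starting at the byte marked `<f0>` in the `Code:` line. The helper names are hypothetical, and the instruction decoder deliberately recognises only the single byte pattern `f0 ff 80 <disp32>` (`lock incl disp32(%rax)`); it is not a general x86 disassembler. Mapping the faulting instruction back to a specific source line in `raid10d` is not attempted here.

```python
# "Code:" line copied from the trace in this report.
CODE_LINE = (
    "Code: e4 48 8b 57 48 0f 84 92 08 00 00 49 83 7c 24 48 00 0f 84 86 08 "
    "00 00 48 63 d8 48 c1 e3 05 48 85 d2 74 41 49 8b 46 08 48 8b 04 18 "
    "<f0> ff 80 f0 00 00 00 49 8b 46 08 48 8b 04 18 48 8b 40 30 48 8b"
)
RAX = 0x0000000000000000  # from the register dump
CR2 = 0x00000000000000F0  # faulting address reported by the CPU


def faulting_bytes(code_line):
    """Return the instruction bytes starting at the <..>-marked byte."""
    tokens = code_line.split("Code:", 1)[1].split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])


def decode_lock_inc_rax_disp32(insn):
    """Decode only 'f0 ff 80 <disp32>' and return the displacement.

    f0 = LOCK prefix, ff /0 = INC r/m32, ModRM 0x80 = [rax + disp32].
    """
    if insn[:3] != bytes([0xF0, 0xFF, 0x80]):
        raise ValueError("not the 'lock incl disp32(%rax)' pattern")
    return int.from_bytes(insn[3:7], "little", signed=True)


def decode_page_fault_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),
        "write": bool(code & 0x2),
        "user_mode": bool(code & 0x4),
    }


if __name__ == "__main__":
    disp = decode_lock_inc_rax_disp32(faulting_bytes(CODE_LINE))
    print("lock incl {:#x}(%rax), effective address {:#x}".format(disp, RAX + disp))
    print(decode_page_fault_error(0x0002))
```

With `RAX` equal to zero, the effective address `RAX + 0xf0` matches `CR2`, and the error code describes a kernel-mode write to a page that is not present. That is consistent with an atomic increment of a field at offset `0xf0` inside a structure reached through a NULL pointer.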
Additional information:
The following upstream patches solve the issue:
md: document lifetime of internal rdev pointer.
https:/
https:/
md: only allow remove_
https:/
https:/
information type: | Public → Private |
tags: | added: kernel-da-key |
Changed in linux (Ubuntu): | |
status: | New → Triaged |
importance: | Undecided → High |
assignee: | nobody → Canonical Kernel Team (canonical-kernel-team) |
tags: | added: bionic |
Changed in linux (Ubuntu Bionic): | |
assignee: | Canonical Kernel Team (canonical-kernel-team) → Joseph Salisbury (jsalisbury) |
status: | Triaged → In Progress |
information type: | Private → Public |
Changed in linux (Ubuntu Bionic): | |
status: | In Progress → Fix Committed |
I built a test kernel with commits f2785b527cda and 39772f0a7be. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1759279
Can you test this kernel and see if it resolves this bug?
Note, to test this kernel, you need to install both the linux-image and linux-image-extra .deb packages.
Thanks in advance!