[Regression] Failed to boot disco kernel built from master-next (kernel NULL pointer dereference)

Bug #1853981 reported by Po-Hsu Lin on 2019-11-26
This bug affects 1 person
Affects         Status        Importance  Assigned to   Milestone
linux (Ubuntu)  Invalid       Undecided   Unassigned
Disco           Fix Released  High        Stefan Bader

Bug Description

Build environment: Tangerine build server with dchroot disco-amd64
Build command: fakeroot debian/rules do_tools=0 no_dumpfile=1 clean binary-generic binary-headers

A Disco test kernel built from the master-next branch (head: e903d6a UBUNTU: upstream stable to v4.19.83, v5.3.10) fails with the following error when booting on an AMD64 KVM node:
[ 2.724234] [TTM] Initializing pool allocator
[ 2.725770] [TTM] Initializing DMA pool allocator
[ 2.727918] [drm] fb mappable at 0xFC000000
[ 2.729462] [drm] vram aper at 0xFC000000
[ 2.729980] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 2.730902] [drm] size 33554432
[ 2.735026] #PF error: [normal kernel read fault]
[ 2.735028] PGD 0 P4D 0
[ 2.735032] Oops: 0000 [#1] SMP PTI
[ 2.735036] CPU: 1 PID: 211 Comm: systemd-udevd Not tainted 5.0.0-37-generic #40
[ 2.736230] [drm] fb depth is 16
[ 2.738490] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 2.739526] [drm] pitch is 2048
[ 2.740751] RIP: 0010:attempt_merge+0x145/0x9d0
[ 2.740753] Code: 84 a2 02 00 00 41 8b 7c 24 24 4d 8b 44 24 30 c1 ef 09 89 f8 4c 01 c0 48 3b 43 30 0f 85 0e ff ff ff 48 8b 53 38 4d 8b 7c 24 40 <8b> 42 20 45 8b 5f 24 89 44 24 3c 4d 85 ff 74 20 41 8b 47 30 85 c0
[ 2.740754] RSP: 0018:ffffb4964067f2f8 EFLAGS: 00010246
[ 2.753490] input: VirtualPS/2 VMware VMMouse as /devices/platform/i8042/serio1/input/input4
[ 2.758417] RAX: 0000000000000000 RBX: ffff9a1b33261200 RCX: 0000000000000000
[ 2.758418] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 2.758418] RBP: ffffb4964067f378 R08: 0000000000000000 R09: 0000000000000004
[ 2.758419] R10: 0000000000000001 R11: 0000000000000800 R12: ffff9a1b33260000
[ 2.758420] R13: ffff9a1b330c7070 R14: ffff9a1b330c7070 R15: 0000000000000000
[ 2.758421] FS: 00007fd50d4748c0(0000) GS:ffff9a1b7db00000(0000) knlGS:0000000000000000
[ 2.758422] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.758423] CR2: 0000000000000020 CR3: 000000003377e000 CR4: 00000000000006e0
[ 2.758426] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2.786182] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2.786183] Call Trace:
[ 2.786188] ? insert_work+0x6c/0x80
[ 2.786193] ? __sbitmap_get_word+0x31/0x90
[ 2.786194] blk_attempt_req_merge+0xe/0x30
[ 2.786198] elv_attempt_insert_merge+0x35/0x90
[ 2.786201] blk_mq_sched_try_insert_merge+0x42/0x50
[ 2.786203] dd_insert_requests+0x90/0x1c0
[ 2.786206] blk_mq_sched_insert_request+0x12d/0x1a0
[ 2.786208] blk_mq_make_request+0x36c/0x4d0
[ 2.786210] generic_make_request+0x19e/0x400
[ 2.786212] submit_bio+0x49/0x140
[ 2.786215] ? guard_bio_eod+0x32/0x100
[ 2.786217] submit_bh_wbc+0x185/0x1b0
[ 2.786219] block_read_full_page+0x24d/0x330
[ 2.786222] ? check_disk_change+0x70/0x70
[ 2.786225] ? count_shadow_nodes+0x140/0x140
[ 2.786228] blkdev_readpage+0x18/0x20
[ 2.786232] do_read_cache_page+0x374/0x7a0
[ 2.786234] ? blkdev_writepages+0x10/0x10
[ 2.786237] ? get_page_from_freelist+0xefe/0x1440
[ 2.786240] ? __radix_tree_replace+0x59/0xf0
[ 2.786243] read_cache_page+0x12/0x20
[ 2.786244] read_dev_sector+0x2d/0xd0
[ 2.786246] read_lba+0xcd/0x220
[ 2.786248] efi_partition+0x1e4/0x6de
[ 2.786250] ? vsnprintf+0x103/0x520
[ 2.786252] ? snprintf+0x49/0x60
[ 2.786253] ? is_gpt_valid.part.7+0x470/0x470
[ 2.786255] check_partition+0x13e/0x238
[ 2.786256] rescan_partitions+0xaf/0x2a0
[ 2.786259] ? _cond_resched+0x19/0x30
[ 2.786260] __blkdev_get+0x393/0x550
[ 2.786262] blkdev_get+0x10c/0x330
[ 2.786264] ? wake_up_bit+0x42/0x50
[ 2.786267] ? unlock_new_inode+0x53/0x70
[ 2.786269] ? bdget+0x111/0x130
[ 2.786270] __device_add_disk+0x321/0x480
[ 2.786272] device_add_disk+0x13/0x20
[ 2.786275] virtblk_probe+0x4ef/0x79f [virtio_blk]
[ 2.786278] virtio_dev_probe+0x172/0x230
[ 2.786282] really_probe+0xfe/0x3b0
[ 2.786284] driver_probe_device+0xba/0x100
[ 2.786286] __driver_attach+0xf1/0x120
[ 2.786288] ? driver_probe_device+0x100/0x100
[ 2.786290] bus_for_each_dev+0x79/0xc0
[ 2.786292] ? kmem_cache_alloc_trace+0x1bd/0x1d0
[ 2.786294] driver_attach+0x1e/0x20
[ 2.786295] bus_add_driver+0x159/0x230
[ 2.786297] ? 0xffffffffc0410000
[ 2.786299] driver_register+0x70/0xc0
[ 2.786300] ? 0xffffffffc0410000
[ 2.786302] register_virtio_driver+0x20/0x30
[ 2.786304] init+0x56/0x1000 [virtio_blk]
[ 2.786308] do_one_initcall+0x4a/0x1c4
[ 2.786310] ? _cond_resched+0x19/0x30
[ 2.786311] ? kmem_cache_alloc_trace+0x167/0x1d0
[ 2.786314] do_init_module+0x60/0x220
[ 2.786316] load_module+0x1797/0x1a00
[ 2.786319] __do_sys_finit_module+0xbd/0x120
[ 2.786321] ? __do_sys_finit_module+0xbd/0x120
[ 2.786323] __x64_sys_finit_module+0x1a/0x20
[ 2.786325] do_syscall_64+0x5a/0x110
[ 2.786329] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2.786330] RIP: 0033:0x7fd50d7012e9
[ 2.786332] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 77 cb 0c 00 f7 d8 64 89 01 48
[ 2.786332] RSP: 002b:00007fff6dd12248 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 2.786334] RAX: ffffffffffffffda RBX: 000055b98c946aa0 RCX: 00007fd50d7012e9
[ 2.786335] RDX: 0000000000000000 RSI: 00007fd50d5e2cad RDI: 0000000000000005
[ 2.786335] RBP: 00007fd50d5e2cad R08: 0000000000000000 R09: 000055b98c926ab0
[ 2.786336] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000000000
[ 2.786337] R13: 000055b98c934800 R14: 0000000000020000 R15: 000055b98c946aa0
[ 2.786338] Modules linked in: psmouse cirrus(+) ttm virtio_blk(+) drm_kms_helper virtio_net(+) syscopyarea sysfillrect net_failover sysimgblt failover fb_sys_fops drm floppy i2c_piix4 pata_acpi
[ 2.786345] CR2: 0000000000000020
[ 2.786379] ---[ end trace 1c7dd9ecceb0cff9 ]---
[ 2.786382] RIP: 0010:attempt_merge+0x145/0x9d0
[ 2.786383] Code: 84 a2 02 00 00 41 8b 7c 24 24 4d 8b 44 24 30 c1 ef 09 89 f8 4c 01 c0 48 3b 43 30 0f 85 0e ff ff ff 48 8b 53 38 4d 8b 7c 24 40 <8b> 42 20 45 8b 5f 24 89 44 24 3c 4d 85 ff 74 20 41 8b 47 30 85 c0
[ 2.786384] RSP: 0018:ffffb4964067f2f8 EFLAGS: 00010246
[ 2.786385] RAX: 0000000000000000 RBX: ffff9a1b33261200 RCX: 0000000000000000
[ 2.786386] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 2.786386] RBP: ffffb4964067f378 R08: 0000000000000000 R09: 0000000000000004
[ 2.786387] R10: 0000000000000001 R11: 0000000000000800 R12: ffff9a1b33260000
[ 2.786388] R13: ffff9a1b330c7070 R14: ffff9a1b330c7070 R15: 0000000000000000
[ 2.786389] FS: 00007fd50d4748c0(0000) GS:ffff9a1b7db00000(0000) knlGS:0000000000000000
[ 2.786390] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.786391] CR2: 0000000000000020 CR3: 000000003377e000 CR4: 00000000000006e0
[ 2.786394] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2.786395] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2.786728] input: VirtualPS/2 VMware VMMouse as /devices/platform/i8042/serio1/input/input3
[ 2.787939] fbcon: cirrusdrmfb (fb0) is primary device
[ 2.802935] Console: switching to colour frame buffer device 128x48
[ 2.969597] virtio_net virtio1 ens4: renamed from eth0
[ 3.078798] cirrus 0000:00:02.0: fb0: cirrusdrmfb frame buffer device
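The faulting instruction can be read straight from the oops: the byte wrapped in `<...>` on the `Code:` line is where RIP pointed when the fault hit. The following is a minimal sketch, not part of the original report, that pulls out those bytes and the CR2 value; the helper names `faulting_bytes` and `fault_address` are made up for illustration. The bytes `8b 42 20` decode as `mov 0x20(%rdx),%eax`, and with RDX equal to 0 in the register dump that load targets address 0x20, which matches CR2.

```python
import re

# Two lines copied from the oops above (timestamps dropped).
CODE_LINE = (
    "Code: 84 a2 02 00 00 41 8b 7c 24 24 4d 8b 44 24 30 c1 ef 09 89 f8 "
    "4c 01 c0 48 3b 43 30 0f 85 0e ff ff ff 48 8b 53 38 4d 8b 7c 24 40 "
    "<8b> 42 20 45 8b 5f 24 89 44 24 3c 4d 85 ff 74 20 41 8b 47 30 85 c0"
)
CR2_LINE = "CR2: 0000000000000020 CR3: 000000003377e000 CR4: 00000000000006e0"


def faulting_bytes(code_line, count=3):
    """Return `count` opcode bytes starting at the <xx>-marked byte."""
    tokens = code_line.split()[1:]  # drop the leading "Code:" label
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:start + count]]


def fault_address(cr2_line):
    """Return the faulting virtual address recorded in CR2."""
    return int(re.search(r"CR2: ([0-9a-f]+)", cr2_line).group(1), 16)


opcode, modrm, disp8 = faulting_bytes(CODE_LINE)
print(hex(opcode), hex(modrm), hex(disp8))  # 0x8b 0x42 0x20
print(hex(fault_address(CR2_LINE)))         # 0x20
```

A small non-zero fault address like this usually means a field was read through a NULL structure pointer, with the address being the field's offset.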

Po-Hsu Lin (cypressyew) on 2019-11-26
description: updated

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1853981

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Disco):
status: New → Incomplete
tags: added: disco
Po-Hsu Lin (cypressyew) wrote :

Current bisect result (not finished yet):
git bisect start
# bad: [e903d6a85368180af5f09bf5d0be0308b757694d] UBUNTU: upstream stable to v4.19.83, v5.3.10
git bisect bad e903d6a85368180af5f09bf5d0be0308b757694d
# good: [e421062fe0c6fba6d639c9ca57e3587171e7664f] UBUNTU: Ubuntu-5.0.0-37.40
git bisect good e421062fe0c6fba6d639c9ca57e3587171e7664f
# bad: [120233013313edb21f4003433af1661e8961739d] tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'
git bisect bad 120233013313edb21f4003433af1661e8961739d
# good: [482f7e77e9d7193bc165fab34bb54f142157d270] ARM: dts: Fix wrong clocks for dra7 mcasp
git bisect good 482f7e77e9d7193bc165fab34bb54f142157d270
# bad: [cc337c66ceb8c0bd641a821889fd06016106422f] dm snapshot: rework COW throttling to fix deadlock
git bisect bad cc337c66ceb8c0bd641a821889fd06016106422f
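The log above follows the usual halving strategy: each good/bad verdict discards roughly half of the remaining candidates. As a rough sketch of that idea only, the snippet below runs a binary search over a hypothetical linear history; real `git bisect` works on the commit graph and picks midpoints differently, and the commit numbers and the `first_bad` helper are invented for illustration.

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad commit in a linear history.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad], steps


# Hypothetical history of 1000 commits where commit 637 introduces the bug.
history = list(range(1000))
culprit, steps = first_bad(history, lambda commit: commit >= 637)
print(culprit, steps)
```

Each test here stands for one build-and-boot cycle, which is why a range of about a thousand commits needs only around ten kernels to be built.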

Po-Hsu Lin (cypressyew) wrote :

Bisect complete:
# first bad commit: [1078686a99eabb5f627cf0f1e4d9f4a05993917a] blk-mq: honor IO scheduler for multiqueue devices

The complete bisect log is in the attachment.

I have a test kernel with this patch reverted, and it boots fine.

Po-Hsu Lin (cypressyew) wrote :

The issue still exists with a kernel built from b99079a8045a084722a6a33cca3c396634e6d241 (UBUNTU: upstream stable to v4.19.86, v5.3.13).

A more complete dmesg output can be found in the attachment.

Stefan Bader (smb) wrote :

I think that "blk-mq: honor IO scheduler for multiqueue devices" is broken in disco because disco is missing the following commit:

commit 970d168de636ddac8221cbd4a11d7678943e7379
blk-mq: simplify blk_mq_make_request()

Move the blk_mq_bio_to_request() call in front of the if-statement.

Either this has to be picked before the mq change or there must be a call to blk_mq_bio_to_request(rq, bio) before calling blk_mq_sched_insert_request(rq, false, true, true):

                }

                blk_add_rq_to_plug(plug, rq);
+       } else if (q->elevator) {
+               blk_mq_bio_to_request(rq, bio);
+               blk_mq_sched_insert_request(rq, false, true, true);
        } else if (plug && !blk_queue_nomerges(q)) {
                blk_mq_bio_to_request(rq, bio);
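The ordering problem described above can be illustrated with a toy model. This is a sketch in Python, not kernel code: the classes and functions below (`Bio`, `Request`, `bio_to_request`, `insert_request`, `submit`) are simplified stand-ins invented for illustration, and it assumes, as the analysis suggests, that a request carries no bio until it has been filled from one. A merge attempt on such a request then reads through an empty pointer, which Python reports as an `AttributeError` where the kernel hits a NULL pointer dereference.

```python
class Bio:
    """Toy I/O descriptor: a start sector and a length in sectors."""

    def __init__(self, sector, nr_sectors):
        self.sector = sector
        self.nr_sectors = nr_sectors


class Request:
    """Toy request: its bio stays None until filled from a Bio."""

    def __init__(self):
        self.bio = None


def bio_to_request(rq, bio):
    """Stand-in for the initialization step: attach the bio to the request."""
    rq.bio = bio


def insert_request(queue, rq):
    """Stand-in for scheduler insertion: try a back merge, else queue."""
    for queued in queue:
        # Reads queued.bio and rq.bio; fails if either is still None.
        if queued.bio.sector + queued.bio.nr_sectors == rq.bio.sector:
            queued.bio.nr_sectors += rq.bio.nr_sectors
            return "merged"
    queue.append(rq)
    return "queued"


def submit(queue, bio, prepare):
    """Model the elevator branch with or without the initialization call."""
    rq = Request()
    if prepare:
        bio_to_request(rq, bio)
    return insert_request(queue, rq)


fixed = []
print(submit(fixed, Bio(0, 8), prepare=True))  # queued
print(submit(fixed, Bio(8, 8), prepare=True))  # merged

broken = []
submit(broken, Bio(0, 8), prepare=False)
try:
    submit(broken, Bio(8, 8), prepare=False)
except AttributeError as exc:
    print("crash:", exc)
```

In the model, the first unprepared request is queued without trouble because nothing inspects it yet; the failure only appears when a later insertion tries to merge against it.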

Changed in linux (Ubuntu):
status: Incomplete → Invalid
Changed in linux (Ubuntu Disco):
status: Incomplete → In Progress
Stefan Bader (smb) on 2019-12-02
Changed in linux (Ubuntu Disco):
importance: Undecided → High
assignee: nobody → Stefan Bader (smb)
Sean Feole (sfeole) wrote :

I boot-tested a kernel built by smb in the same test environment described in the original report, using these packages:

linux-buildinfo-5.0.0-37-generic_5.0.0-37.40+mnext1_amd64.deb
linux-cloud-tools-5.0.0-37-generic_5.0.0-37.40+mnext1_amd64.deb
linux-headers-5.0.0-37-generic_5.0.0-37.40+mnext1_amd64.deb
linux-image-unsigned-5.0.0-37-generic_5.0.0-37.40+mnext1_amd64.deb
linux-modules-5.0.0-37-generic_5.0.0-37.40+mnext1_amd64.deb
linux-modules-extra-5.0.0-37-generic_5.0.0-37.40+mnext1_amd64.deb
linux-tools-5.0.0-37-generic_5.0.0-37.40+mnext1_amd64.deb

dmesg is attached. I did not see the failure noted in the original description, nor any other failures.

Linux automation-vm1 5.0.0-37-generic #40+mnext1 SMP Mon Dec 2 13:37:11 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

The system booted successfully.

Stefan Bader (smb) on 2019-12-02
Changed in linux (Ubuntu Disco):
status: In Progress → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-disco' to 'verification-done-disco'. If the problem still exists, change the tag 'verification-needed-disco' to 'verification-failed-disco'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-disco
Khaled El Mously (kmously) wrote :

Set verification-done-disco based on Sean Feole's comment

tags: added: verification-done-disco
removed: verification-needed-disco
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.0.0-38.41

---------------
linux (5.0.0-38.41) disco; urgency=medium

  * disco/linux: 5.0.0-38.41 -proposed tracker (LP: #1854788)

  * [Regression] Failed to boot disco kernel built from master-next (kernel
    kernel NULL pointer dereference) (LP: #1853981)
    - SAUCE: blk-mq: Fix blk_mq_make_request for mq devices

  * CVE-2019-14901
    - SAUCE: mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame()

  * CVE-2019-14896 // CVE-2019-14897
    - SAUCE: libertas: Fix two buffer overflows at parsing bss descriptor

  * CVE-2019-14895
    - SAUCE: mwifiex: fix possible heap overflow in mwifiex_process_country_ie()

  * [CML] New device id's for CMP-H (LP: #1846335)
    - mmc: sdhci-pci: Add another Id for Intel CML
    - i2c: i801: Add support for Intel Comet Lake PCH-H
    - mtd: spi-nor: intel-spi: Add support for Intel Comet Lake-H SPI serial flash
    - mfd: intel-lpss: Add Intel Comet Lake PCH-H PCI IDs

  * Please add patch fixing RK818 ID detection (LP: #1853192)
    - SAUCE: mfd: rk808: Fix RK818 ID template

  * [SRU][B/OEM-B/OEM-OSP1/D] Enable new Elan touchpads which are not in current
    whitelist (LP: #1853246)
    - Input: elan_i2c - export the device id whitelist
    - HID: quirks: Refactor ELAN 400 and 401 handling

  * Lenovo dock MAC Address pass through doesn't work in Ubuntu (LP: #1827961)
    - r8152: Add macpassthru support for ThinkPad Thunderbolt 3 Dock Gen 2

  * [CML-S62] Need enable turbostat patch support for Comet lake- S 6+2
    (LP: #1847451)
    - SAUCE: tools/power turbostat: Add Cometlake support

  * External microphone can't work on some dell machines with the codec alc256
    or alc236 (LP: #1853791)
    - SAUCE: ALSA: hda/realtek - Move some alc256 pintbls to fallback table
    - SAUCE: ALSA: hda/realtek - Move some alc236 pintbls to fallback table

  * Memory leak in net/xfrm/xfrm_state.c - 8 pages per ipsec connection
    (LP: #1853197)
    - xfrm: Fix memleak on xfrm state destroy

  * CVE-2019-18660: patches for Ubuntu (LP: #1853142) // CVE-2019-18660
    - powerpc/64s: support nospectre_v2 cmdline option
    - powerpc/book3s64: Fix link stack flush on context switch
    - KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernel

  * Raydium Touchscreen on ThinkPad L390 does not work (LP: #1849721)
    - HID: i2c-hid: fix no irq after reset on raydium 3118

  * Make Goodix I2C touchpads work (LP: #1853842)
    - HID: i2c-hid: Remove runtime power management
    - HID: i2c-hid: Send power-on command after reset

  * Touchpad doesn't work on Dell Inspiron 7000 2-in-1 (LP: #1851901)
    - Revert "UBUNTU: SAUCE: mfd: intel-lpss: add quirk for Dell XPS 13 7390
      2-in-1"
    - lib: devres: add a helper function for ioremap_uc
    - mfd: intel-lpss: Use devm_ioremap_uc for MMIO

  * CVE-2019-19055
    - nl80211: fix memory leak in nl80211_get_ftm_responder_stats

  * [CML-S62] Need enable intel_rapl patch support for Comet lake- S 6+2
    (LP: #1847454)
    - powercap/intel_rapl: add support for CometLake Mobile
    - powercap/intel_rapl: add support for Cometlake desktop

  * [CML-S62] Need enable intel_pmc_core driver patch for Comet l...

Changed in linux (Ubuntu Disco):
status: Fix Committed → Fix Released