kernel oops when I undocked

Bug #1767452 reported by Frew Schmidt
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Incomplete
Importance: Low
Assigned to: Unassigned

Bug Description

Here's a snippet from the kernel log which I think will make the actual issue clear:

Apr 27 09:49:20 caliburn kernel: [ 4558.910412] pcieport 0000:0a:03.0: Refused to change power state, currently in D3
Apr 27 09:49:20 caliburn kernel: [ 4558.917717] xhci_hcd 0000:0d:00.0: remove, state 1
Apr 27 09:49:20 caliburn kernel: [ 4558.917724] usb usb6: USB disconnect, device number 1
Apr 27 09:49:20 caliburn kernel: [ 4558.917725] usb 6-1: USB disconnect, device number 2
Apr 27 09:49:20 caliburn kernel: [ 4558.917774] xhci_hcd 0000:0d:00.0: xHCI host controller not responding, assume dead
Apr 27 09:49:20 caliburn kernel: [ 4558.968688] xhci_hcd 0000:0d:00.0: USB bus 6 deregistered
Apr 27 09:49:20 caliburn kernel: [ 4558.968694] xhci_hcd 0000:0d:00.0: remove, state 1
Apr 27 09:49:20 caliburn kernel: [ 4558.968700] usb usb5: USB disconnect, device number 1
Apr 27 09:49:20 caliburn kernel: [ 4558.968701] usb 5-3: USB disconnect, device number 3
Apr 27 09:49:20 caliburn kernel: [ 4559.076020] usb 5-4: USB disconnect, device number 4
Apr 27 09:49:20 caliburn kernel: [ 4559.076023] usb 5-4.2: USB disconnect, device number 5
Apr 27 09:49:20 caliburn kernel: [ 4559.123365] usb 5-4.4: USB disconnect, device number 7
Apr 27 09:49:20 caliburn kernel: [ 4559.125340] xhci_hcd 0000:0d:00.0: Host halt failed, -19
Apr 27 09:49:20 caliburn kernel: [ 4559.125343] xhci_hcd 0000:0d:00.0: Host not accessible, reset failed.
Apr 27 09:49:20 caliburn kernel: [ 4559.125467] xhci_hcd 0000:0d:00.0: USB bus 5 deregistered
Apr 27 09:49:21 caliburn kernel: [ 4559.146170] pcieport 0000:0a:02.0: Refused to change power state, currently in D3
Apr 27 09:49:21 caliburn kernel: [ 4559.147809] pcieport 0000:0a:01.0: Refused to change power state, currently in D3
Apr 27 09:49:21 caliburn kernel: [ 4559.149445] xhci_hcd 0000:0b:00.0: remove, state 4
Apr 27 09:49:21 caliburn kernel: [ 4559.149451] usb usb4: USB disconnect, device number 1
Apr 27 09:49:21 caliburn kernel: [ 4559.149700] xhci_hcd 0000:0b:00.0: USB bus 4 deregistered
Apr 27 09:49:21 caliburn kernel: [ 4559.149706] xhci_hcd 0000:0b:00.0: xHCI host controller not responding, assume dead
Apr 27 09:49:21 caliburn kernel: [ 4559.149718] xhci_hcd 0000:0b:00.0: remove, state 1
Apr 27 09:49:21 caliburn kernel: [ 4559.149722] usb usb3: USB disconnect, device number 1
Apr 27 09:49:21 caliburn kernel: [ 4559.149724] usb 3-1: USB disconnect, device number 2
Apr 27 09:49:21 caliburn kernel: [ 4559.206497] usb 3-2: USB disconnect, device number 3
Apr 27 09:49:21 caliburn kernel: [ 4559.462604] usb 3-4: USB disconnect, device number 4
Apr 27 09:49:21 caliburn kernel: [ 4559.464185] BUG: unable to handle kernel NULL pointer dereference at 0000000000000034
Apr 27 09:49:21 caliburn kernel: [ 4559.464193] IP: tty_unregister_driver+0xd/0x70
Apr 27 09:49:21 caliburn kernel: [ 4559.464195] PGD 0 P4D 0
Apr 27 09:49:21 caliburn kernel: [ 4559.464197] Oops: 0000 [#1] SMP PTI
Apr 27 09:49:21 caliburn kernel: [ 4559.464199] Modules linked in: veth ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack libcrc32c br_netfilter bridge stp llc overlay hid_led hid_generic snd_usb_audio snd_usbmidi_lib usbhid cdc_ether usbnet r8152 mii bnep nls_iso8859_1 snd_soc_skl snd_soc_skl_ipc snd_seq_midi snd_hda_ext_core snd_seq_midi_event arc4 snd_soc_sst_dsp intel_rapl snd_soc_sst_ipc snd_rawmidi x86_pkg_temp_thermal snd_soc_acpi intel_powerclamp snd_hda_codec_hdmi coretemp snd_hda_codec_conexant snd_hda_codec_generic kvm_intel snd_soc_core snd_compress ac97_bus kvm snd_pcm_dmaengine snd_seq irqbypass intel_cstate intel_rapl_perf snd_hda_intel joydev uvcvideo input_leds
Apr 27 09:49:21 caliburn kernel: [ 4559.464233] serio_raw snd_hda_codec thinkpad_acpi videobuf2_vmalloc videobuf2_memops nvram videobuf2_v4l2 iwlmvm snd_hda_core snd_hwdep videobuf2_core wmi_bmof intel_wmi_thunderbolt snd_pcm mac80211 snd_seq_device snd_timer videodev btusb media btrtl btbcm btintel iwlwifi mei_me bluetooth cfg80211 rtsx_pci_ms memstick snd ecdh_generic mei ucsi_acpi typec_ucsi intel_pch_thermal typec shpchp soundcore tpm_crb mac_hid acpi_pad sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 algif_skcipher af_alg dm_crypt rtsx_pci_sdmmc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc i915 aesni_intel i2c_algo_bit aes_x86_64 drm_kms_helper crypto_simd e1000e glue_helper cryptd syscopyarea psmouse sysfillrect sysimgblt nvme ptp fb_sys_fops pps_core thunderbolt nvme_core drm rtsx_pci wmi i2c_hid
Apr 27 09:49:21 caliburn kernel: [ 4559.464273] video hid
Apr 27 09:49:21 caliburn kernel: [ 4559.464276] CPU: 1 PID: 35555 Comm: kworker/u8:1 Not tainted 4.15.0-20-generic #21-Ubuntu
Apr 27 09:49:21 caliburn kernel: [ 4559.464278] Hardware name: LENOVO 20HR000FUS/20HR000FUS, BIOS N1MET39W (1.24 ) 09/27/2017
Apr 27 09:49:21 caliburn kernel: [ 4559.464281] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
Apr 27 09:49:21 caliburn kernel: [ 4559.464285] RIP: 0010:tty_unregister_driver+0xd/0x70
Apr 27 09:49:21 caliburn kernel: [ 4559.464286] RSP: 0000:ffffb5964451faf0 EFLAGS: 00010246
Apr 27 09:49:21 caliburn kernel: [ 4559.464288] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Apr 27 09:49:21 caliburn kernel: [ 4559.464289] RDX: ffff8e754ba5de00 RSI: ffffee6c52323ac0 RDI: 0000000000000000
Apr 27 09:49:21 caliburn kernel: [ 4559.464290] RBP: ffffb5964451faf8 R08: ffff8e754c8eb908 R09: 00000001801e0017
Apr 27 09:49:21 caliburn kernel: [ 4559.464292] R10: ffffee6c5228a800 R11: 0000000000000000 R12: ffff8e754b060230
Apr 27 09:49:21 caliburn kernel: [ 4559.464293] R13: ffff8e754b06027c R14: ffff8e754b060390 R15: 0000000000000060
Apr 27 09:49:21 caliburn kernel: [ 4559.464294] FS: 0000000000000000(0000) GS:ffff8e7561480000(0000) knlGS:0000000000000000
Apr 27 09:49:21 caliburn kernel: [ 4559.464296] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 27 09:49:21 caliburn kernel: [ 4559.464297] CR2: 0000000000000034 CR3: 000000048280a005 CR4: 00000000003606e0
Apr 27 09:49:21 caliburn kernel: [ 4559.464299] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 27 09:49:21 caliburn kernel: [ 4559.464300] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr 27 09:49:21 caliburn kernel: [ 4559.464300] Call Trace:
Apr 27 09:49:21 caliburn kernel: [ 4559.464305] xhci_dbc_tty_unregister_driver+0x15/0x30
Apr 27 09:49:21 caliburn kernel: [ 4559.464307] xhci_dbc_exit+0x2e/0x50
Apr 27 09:49:21 caliburn kernel: [ 4559.464309] xhci_stop+0x5b/0x1e0
Apr 27 09:49:21 caliburn kernel: [ 4559.464312] usb_remove_hcd+0x105/0x250
Apr 27 09:49:21 caliburn kernel: [ 4559.464314] usb_hcd_pci_remove+0x74/0x130
Apr 27 09:49:21 caliburn kernel: [ 4559.464317] xhci_pci_remove+0x6b/0x70
Apr 27 09:49:21 caliburn kernel: [ 4559.464320] pci_device_remove+0x3e/0xb0
Apr 27 09:49:21 caliburn kernel: [ 4559.464323] device_release_driver_internal+0x15b/0x220
Apr 27 09:49:21 caliburn kernel: [ 4559.464325] device_release_driver+0x12/0x20
Apr 27 09:49:21 caliburn kernel: [ 4559.464328] pci_stop_bus_device+0x7f/0xa0
Apr 27 09:49:21 caliburn kernel: [ 4559.464330] pci_stop_bus_device+0x30/0xa0
Apr 27 09:49:21 caliburn kernel: [ 4559.464333] pci_stop_bus_device+0x41/0xa0
Apr 27 09:49:21 caliburn kernel: [ 4559.464335] pci_stop_and_remove_bus_device+0x12/0x20
Apr 27 09:49:21 caliburn kernel: [ 4559.464338] trim_stale_devices+0x11d/0x150
Apr 27 09:49:21 caliburn kernel: [ 4559.464340] trim_stale_devices+0xa9/0x150
Apr 27 09:49:21 caliburn kernel: [ 4559.464342] trim_stale_devices+0xbb/0x150
Apr 27 09:49:21 caliburn kernel: [ 4559.464343] ? get_slot_status+0xa3/0xe0
Apr 27 09:49:21 caliburn kernel: [ 4559.464346] acpiphp_check_bridge.part.7+0x100/0x140
Apr 27 09:49:21 caliburn kernel: [ 4559.464348] acpiphp_hotplug_notify+0x18e/0x220
Apr 27 09:49:21 caliburn kernel: [ 4559.464349] ? free_bridge+0x100/0x100
Apr 27 09:49:21 caliburn kernel: [ 4559.464351] acpi_device_hotplug+0xa4/0x4b0
Apr 27 09:49:21 caliburn kernel: [ 4559.464353] acpi_hotplug_work_fn+0x1e/0x30
Apr 27 09:49:21 caliburn kernel: [ 4559.464356] process_one_work+0x1de/0x410
Apr 27 09:49:21 caliburn kernel: [ 4559.464358] worker_thread+0x32/0x410
Apr 27 09:49:21 caliburn kernel: [ 4559.464360] kthread+0x121/0x140
Apr 27 09:49:21 caliburn kernel: [ 4559.464362] ? process_one_work+0x410/0x410
Apr 27 09:49:21 caliburn kernel: [ 4559.464364] ? kthread_create_worker_on_cpu+0x70/0x70
Apr 27 09:49:21 caliburn kernel: [ 4559.464367] ? do_syscall_64+0x73/0x130
Apr 27 09:49:21 caliburn kernel: [ 4559.464369] ? SyS_exit_group+0x14/0x20
Apr 27 09:49:21 caliburn kernel: [ 4559.464371] ret_from_fork+0x35/0x40
Apr 27 09:49:21 caliburn kernel: [ 4559.464372] Code: c2 bf 2c 54 b4 48 c7 c7 90 0d a4 b4 e8 ed 92 ee ff 48 89 df e8 85 c7 c6 ff 5b 5d c3 66 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb <8b> 77 34 8b 7f 2c c1 e7 14 0b 7b 30 e8 22 15 ca ff 48 c7 c7 e0
Apr 27 09:49:21 caliburn kernel: [ 4559.464396] RIP: tty_unregister_driver+0xd/0x70 RSP: ffffb5964451faf0
Apr 27 09:49:21 caliburn kernel: [ 4559.464396] CR2: 0000000000000034
Apr 27 09:49:21 caliburn kernel: [ 4559.464398] ---[ end trace eca0969987c11306 ]---
Apr 27 09:49:21 caliburn kernel: [ 4559.515107] thinkpad_acpi: EC reports that Thermal Table has changed
Apr 27 09:49:22 caliburn kernel: [ 4560.472897] audit: type=1107 audit(1524847762.333:50): pid=937 uid=103 auid=4294967295 ses=4294967295 msg='apparmor="DENIED" operation="dbus_signal" bus="system" path="/org/freedesktop/login1" interface="org.freedesktop.login1.Manager" member="PrepareForSleep" name=":1.10" mask="receive" pid=4254 label="snap.keepassxc.keepassxc" peer_pid=899 peer_label="unconfined"
Apr 27 09:49:22 caliburn kernel: [ 4560.472897] exe="/usr/bin/dbus-daemon" sauid=103 hostname=? addr=? terminal=?'
Apr 27 09:49:22 caliburn kernel: [ 4560.947925] PM: suspend entry (deep)
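
For triage, the useful fields in a trace like the one above are the faulting address, the instruction-pointer symbol, and the reliable call-trace entries. The following is a minimal Python sketch that extracts them. It is not an existing tool: `summarize_oops`, `PREFIX`, and `SYMBOL` are illustrative names, and the line format is assumed to match the syslog output quoted in this report.

```python
import re

# Strips the syslog prefix and kernel timestamp, e.g.
# "Apr 27 09:49:21 caliburn kernel: [ 4559.464193] "
PREFIX = re.compile(r"^.*?kernel: \[\s*\d+\.\d+\]\s*")
# A call-trace entry: optional "? " marker, then symbol+0xoff/0xlen
SYMBOL = re.compile(r"^(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+$")
BUG_MARK = "BUG: unable to handle kernel NULL pointer dereference at "


def summarize_oops(lines):
    """Collect fault address, IP symbol and reliable call-trace symbols."""
    summary = {"fault_address": None, "ip": None, "trace": []}
    in_trace = False
    for raw in lines:
        msg = PREFIX.sub("", raw.rstrip("\n"))
        if msg.startswith(BUG_MARK):
            summary["fault_address"] = int(msg[len(BUG_MARK):], 16)
        elif msg.startswith("IP: "):
            summary["ip"] = msg[4:].split("+")[0]
        elif msg == "Call Trace:":
            in_trace = True
        elif in_trace:
            match = SYMBOL.match(msg)
            if match is None:
                in_trace = False          # first non-symbol line ends the trace
            elif match.group(1) is None:  # skip unreliable "? symbol" entries
                summary["trace"].append(match.group(2))
    return summary


# A few lines copied from the log above (the Code: line is abbreviated).
SAMPLE = [
    "Apr 27 09:49:21 caliburn kernel: [ 4559.464185] BUG: unable to handle kernel NULL pointer dereference at 0000000000000034",
    "Apr 27 09:49:21 caliburn kernel: [ 4559.464193] IP: tty_unregister_driver+0xd/0x70",
    "Apr 27 09:49:21 caliburn kernel: [ 4559.464300] Call Trace:",
    "Apr 27 09:49:21 caliburn kernel: [ 4559.464305] xhci_dbc_tty_unregister_driver+0x15/0x30",
    "Apr 27 09:49:21 caliburn kernel: [ 4559.464307] xhci_dbc_exit+0x2e/0x50",
    "Apr 27 09:49:21 caliburn kernel: [ 4559.464343] ? get_slot_status+0xa3/0xe0",
    "Apr 27 09:49:21 caliburn kernel: [ 4559.464346] acpiphp_check_bridge.part.7+0x100/0x140",
    "Apr 27 09:49:21 caliburn kernel: [ 4559.464372] Code: c2 bf 2c 54 b4",
]

if __name__ == "__main__":
    info = summarize_oops(SAMPLE)
    print(hex(info["fault_address"]), info["ip"], " <- ".join(info["trace"]))
```

Run against the log above, this reports a fault at 0x34 inside tty_unregister_driver, reached through xhci_dbc_tty_unregister_driver. Together with RDI being zero and the faulting instruction bytes `8b 77 34` (a 32-bit load from offset 0x34 relative to RDI), that is consistent with tty_unregister_driver having been handed a NULL driver pointer.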

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-20-generic 4.15.0-20.21
ProcVersionSignature: Ubuntu 4.15.0-20.21-generic 4.15.17
Uname: Linux 4.15.0-20-generic x86_64
ApportVersion: 2.20.9-0ubuntu7
Architecture: amd64
Date: Fri Apr 27 10:31:58 2018
InstallationDate: Installed on 2018-04-27 (0 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Release amd64 (20180426)
SourcePackage: linux-signed
UpgradeStatus: No upgrade log present (probably fresh install)

Frew Schmidt (frooh) wrote :

Here's the above in an (unmunged) attachment

Anders Kvist (akv) wrote :

I assume you have a Thunderbolt dock? I see the same, happens each time after undocking and then the dock won't work anymore until a reboot.

It seems to be a problem with a double dereference in xhci dbgtty. Check this bug at redhat, which describes the problem and also has a fix:

https://bugzilla.redhat.com/show_bug.cgi?id=1565131
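
One plausible reading of the trace, not confirmed anywhere in this thread, is that the debug tty teardown runs once per removed xHCI controller while operating on a single shared driver handle. The log shows two controllers being removed (0000:0d:00.0 and 0000:0b:00.0), with the oops during the second. The Python model below only illustrates that failure pattern and the usual guard against it. It is not the kernel code or the actual patch, and `FakeTtyDriver` and `DebugTtyModule` are hypothetical names.

```python
class FakeTtyDriver:
    """Hypothetical stand-in for a registered tty driver object."""

    def __init__(self, name):
        self.name = name
        self.registered = True


class DebugTtyModule:
    """Models one module-wide driver handle shared by every controller."""

    def __init__(self):
        self.driver = FakeTtyDriver("dbc")  # set once at registration

    def unregister_unguarded(self):
        # Touches the handle unconditionally. On a second call the handle is
        # None, the Python analogue of dereferencing a NULL pointer.
        self.driver.registered = False
        self.driver = None

    def unregister_guarded(self):
        # Idempotent teardown: do nothing if the handle was already cleared.
        if self.driver is None:
            return
        self.driver.registered = False
        self.driver = None


if __name__ == "__main__":
    module = DebugTtyModule()
    module.unregister_guarded()
    module.unregister_guarded()  # second call is a harmless no-op

    module = DebugTtyModule()
    module.unregister_unguarded()
    try:
        module.unregister_unguarded()
    except AttributeError as exc:
        print("second unguarded teardown failed:", exc)
```

The guarded variant makes teardown safe to call any number of times, which is the property a hotplug path needs if more than one device can trigger the same cleanup.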

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-signed (Ubuntu):
status: New → Confirmed
Frew Schmidt (frooh) wrote : Re: [Bug 1767452] Re: kernel oops when I undocked

I do have a Thunderbolt dock, but it is not so trivially reproducible
for me. Most of the time I can undock and redock just fine. I
suspect I did something in an unusual order, like closed my laptop and
*then* undocked, instead of undocking first.

On Mon, Apr 30, 2018 at 04:49:52PM -0000, Anders Kvist wrote:
> I assume you have a Thunderbolt dock? I see the same, happens each time
> after undocking and then the dock won't work anymore until a reboot.
>
> It seems to be a problem with a double dereference in xhci dbgtty. Check
> this bug at redhat, which describes the problem and also has a fix:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1565131
>
> ** Bug watch added: Red Hat Bugzilla #1565131
> https://bugzilla.redhat.com/show_bug.cgi?id=1565131
>
> ** Attachment added: "dmesg - thunderbolt attach and deattach"
> https://bugs.launchpad.net/ubuntu/+source/linux-signed/+bug/1767452/+attachment/5130886/+files/thunderbolt_crash
>

--
fREW Schmidt
https://blog.afoolishmanifesto.com

Frew Schmidt (frooh) wrote :

I reproduced this this afternoon with the latest stable kernel (4.16.6). See the attached subset of the kernel log. I'll try with 4.17 tomorrow.

Anders Kvist (akv) wrote :

I haven't had a single undock that worked, but I only tried with the 18.04 live installer, so I guess that kernel may already be behind...

My hardware is a Lenovo X1 Carbon 5th generation and a Lenovo ThinkPad Thunderbolt 3 Dock.

Frew Schmidt (frooh) wrote :

I just realized why I thought this was not predictable. The system continues to function just fine after the Oops *until* I close my laptop. Once I close my laptop the whole thing freezes and does not recover. Attaching one last kernel log and then will leave it alone till someone asks for more detail.

Anders Kvist (akv) wrote :

Yeah, I cannot shut down either; it hangs on something Thunderbolt-related and I have to do a hard poweroff :/

Frew Schmidt (frooh) wrote :

Same.

Frew Schmidt (frooh) wrote :

Good news! The 4.17.0-rc3 kernel *fixed this issue*! You can get it here: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.17-rc3/ I have only tried undocking once so far, but I was able to undock and redock and there was no oops in dmesg. I will likely try again throughout the day for various meetings, and will update here if I have any problems.

Anders Kvist (akv) wrote :

Cool! I'll try this weekend :D

affects: linux-signed (Ubuntu) → linux (Ubuntu)
penalvch (penalvch) wrote :

Frew Schmidt, please address all of the following:

1) Could you please execute the following via a terminal to attach debugging logs:
apport-collect 1767452

2) The next step is to fully reverse commit bisect from kernel 4.16.6 to 4.17-rc3 in order to identify the last bad commit, followed immediately by the first good one. Once this good commit has been identified, it may be reviewed for backporting. Could you please do this following https://wiki.ubuntu.com/Kernel/KernelBisection ?

Please note, finding adjacent kernel versions, or providing a commit from a kernel version bisect is not fully commit bisecting.

Also, the kernel release names are irrelevant for the purposes of bisecting.

It is most helpful that after the fix commit (not kernel version) has been identified, you then mark this report Status Confirmed.

Thank you for your help.
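
The reverse bisect requested above is a binary search for the first commit where the bug is gone, rather than the first where it appears. The sketch below shows only that search logic in Python. `reverse_bisect` and `is_fixed` are hypothetical names, and in practice the "test" step means building and booting each candidate kernel and trying an undock.

```python
def reverse_bisect(commits, is_fixed):
    """Return (last_bad, first_good) for an oldest-to-newest commit list.

    Assumes at least two commits, that commits[0] still shows the bug,
    that commits[-1] does not, and that there is exactly one
    bad-to-good transition in between.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] bad, commits[hi] good
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid   # fix is at mid or earlier
        else:
            lo = mid   # fix is after mid
    return commits[lo], commits[hi]


if __name__ == "__main__":
    # Made-up commit labels; pretend the fix landed in "c6".
    history = [f"c{i}" for i in range(10)]
    last_bad, first_good = reverse_bisect(history, lambda c: int(c[1:]) >= 6)
    print(last_bad, first_good)
```

Each iteration halves the remaining range, so the number of kernels to build and test grows only logarithmically with the number of commits between the two versions.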

tags: added: kernel-fixed-upstream kernel-fixed-upstream-4.17-rc3 needs-reverse-bisect
Changed in linux (Ubuntu):
importance: Undecided → Low
status: Confirmed → Incomplete
Anders Kvist (akv) wrote :

Christopher, I believe the solution is in the redhat bug mentioned earlier.

https://bugzilla.redhat.com/show_bug.cgi?id=1565131

penalvch (penalvch) wrote :

Anders Kvist:
>"Christopher, I believe the solution is in the redhat bug mentioned earlier.
https://bugzilla.redhat.com/show_bug.cgi?id=1565131"

What is most helpful is if one confirms with their hardware via adding and reverting the potential fix commit that it does indeed address their issue.

Frew Schmidt (frooh) wrote :

I intend to do this, but I expect it to eat up a few hours of time so
just haven't gotten around to it.

On Wed, May 30, 2018 at 01:50:18PM -0000, Christopher M. Penalver wrote:
> Anders Kvist:
> >"Christopher, I believe the solution is in the redhat bug mentioned earlier.
> https://bugzilla.redhat.com/show_bug.cgi?id=1565131"
>
> What is most helpful is if one confirms with their hardware via adding
> and reverting the potential fix commit that it does indeed address their
> issue.
>

--
fREW Schmidt
https://blog.afoolishmanifesto.com

Kai-Heng Feng (kaihengfeng) wrote :

I guess this is a dupe of LP: #1768852.
