fwupd hangs after 19.04 upgrade from 18.10

Bug #1826691 reported by Jason Pritchard
This bug affects 1 person
Affects            Status         Importance   Assigned to   Milestone
Linux              Unknown        Low
fwupd (Ubuntu)     Fix Released   Low          Unassigned
  Disco            Won't Fix      Low          Unassigned
  Eoan             Fix Released   Low          Unassigned
linux (Ubuntu)     Fix Released   High         Unassigned
  Disco            Won't Fix      High         Unassigned
  Eoan             Won't Fix      High         Unassigned

Bug Description

1) The release of Ubuntu you are using, via 'lsb_release -rd' or System -> About Ubuntu
$ lsb_release -rd
Description: Ubuntu 19.04
Release: 19.04

Since fwupd is a firmware updater, the machine is probably relevant: Dell Precision 7730.

2) The version of the package you are using, via 'apt-cache policy pkgname' or by checking in Software Center
$ apt policy fwupd
fwupd:
  Installed: (none)
  Candidate: 1.2.5-1ubuntu1
  Version table:
     1.2.5-1ubuntu1 500
        500 http://us.archive.ubuntu.com/ubuntu disco/main amd64 Packages
        100 /var/lib/dpkg/status

3) What you expected to happen
I expected few/no issues after 19.04 upgrade. The fwupd process worked perfectly in 18.10. It's upgraded the firmware on my laptop twice since I installed 18.10.

4) What happened instead
After upgrading from 18.10 to 19.04, I had an issue: when I suspended my laptop using the same button press as before, the laptop would be hot and the screen flickering when I reopened it. I thought the upgrade had introduced power management issues, but I've traced it to fwupd not letting the PM suspend process complete (see dmesg below). I tried shutting down the fwupd daemon with systemctl, but the command hangs forever. All of the fwupdmgr commands time out.

Thinking something might have broken in the upgrade, I tried uninstalling (purging) the fwupd package and reinstalling it. Starting the service with systemctl hangs forever, just like stopping it. Trying to kill -9 the process does not work because it is in uninterruptible sleep (D).
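
The uninterruptible-sleep state mentioned above can be confirmed by listing every task whose state starts with D, together with the kernel function it is waiting in. This is a minimal sketch: the helper name `list_dstate` is hypothetical, and it does nothing more than filter the output of `ps`.

```shell
# Hypothetical helper: keep only tasks in uninterruptible sleep.
# Reads "PID STAT WCHAN COMMAND" lines on stdin and prints the ones whose
# STAT field begins with D. The ps header line is dropped because its
# second field is the literal word "STAT".
list_dstate() {
    awk '$2 ~ /^D/ { print }'
}

# Usage on a live system:
#   ps -eo pid,stat,wchan:32,comm | list_dstate
```

A task that stays in this list across several runs is blocked inside the kernel, which is consistent with signals such as SIGKILL having no visible effect.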

This crash is probably related. It happened a couple of hours before the last time I tried to put the laptop to sleep, before uninstalling fwupd.

Apr 23 19:29:14 texas kernel: [ 133.673290] [drm] REG_WAIT timeout 10us * 160 tries - submit_channel_request line:246
Apr 23 19:29:14 texas kernel: [ 133.673348] WARNING: CPU: 6 PID: 2467 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:249 generic_reg_wait.cold.3+0x25/0x2c [amdgpu]
Apr 23 19:29:14 texas kernel: [ 133.673349] Modules linked in: thunderbolt rfcomm xt_owner ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bridge xt_CHECKSUM xt_tcpudp stp llc iptable_filter iptable_mangle bpfilter ccm snd_hda_codec_realtek snd_hda_codec_generic pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) cmac vboxdrv(OE) bnep binfmt_misc dell_rbtn nls_iso8859_1 joydev arc4 snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core snd_compress ac97_bus intel_rapl snd_pcm_dmaengine x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_hdmi kvm_intel snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep crct10dif_pclmul i915 snd_pcm crc32_pclmul iwlmvm uvcvideo amdgpu snd_seq_midi ghash_clmulni_intel snd_seq_midi_event mac80211 videobuf2_vmalloc kvmgt videobuf2_memops vfio_mdev videobuf2_v4l2 snd_rawmidi dell_laptop mdev videobuf2_common
Apr 23 19:29:14 texas kernel: [ 133.673362] ledtrig_audio vfio_iommu_type1 videodev dell_smm_hwmon vfio snd_seq dell_wmi media kvm chash btusb snd_seq_device amd_iommu_v2 btrtl snd_timer btbcm dell_smbios gpu_sched irqbypass btintel dcdbas ttm aesni_intel iwlwifi bluetooth drm_kms_helper aes_x86_64 crypto_simd cryptd glue_helper rtsx_pci_ms input_leds snd drm ecdh_generic intel_cstate mei_me ucsi_acpi cfg80211 serio_raw dell_wmi_descriptor intel_wmi_thunderbolt wmi_bmof memstick i2c_algo_bit mei fb_sys_fops intel_rapl_perf idma64 syscopyarea hid_multitouch processor_thermal_device soundcore sysfillrect virt_dma typec_ucsi sysimgblt intel_soc_dts_iosf intel_pch_thermal typec int3403_thermal int340x_thermal_zone dell_smo8800 acpi_pad intel_hid int3400_thermal mac_hid acpi_thermal_rel sparse_keymap sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic rtsx_pci_sdmmc nvme e1000e i2c_i801 intel_lpss_pci rtsx_pci nvme_core intel_lpss i2c_hid wmi hid video pinctrl_cannonlake pinctrl_intel
Apr 23 19:29:14 texas kernel: [ 133.673381] CPU: 6 PID: 2467 Comm: fwupd Tainted: G OE 5.0.0-13-generic #14-Ubuntu
Apr 23 19:29:14 texas kernel: [ 133.673382] Hardware name: Dell Inc. Precision 7730/05W5TJ, BIOS 1.7.0 02/19/2019
Apr 23 19:29:14 texas kernel: [ 133.673417] RIP: 0010:generic_reg_wait.cold.3+0x25/0x2c [amdgpu]
Apr 23 19:29:14 texas kernel: [ 133.673418] Code: e9 37 7e fe ff 44 8b 45 20 48 8b 4d 18 48 c7 c7 40 34 17 c1 8b 55 10 8b 75 d4 e8 3b ce 82 e1 41 83 7d 20 01 0f 84 0c c3 fe ff <0f> 0b e9 05 c3 fe ff 55 48 89 e5 e8 5d de ec ff 48 c7 c7 00 a0 18
Apr 23 19:29:14 texas kernel: [ 133.673419] RSP: 0018:ffffbbaa0612fbb0 EFLAGS: 00010297
Apr 23 19:29:14 texas kernel: [ 133.673420] RAX: 0000000000000049 RBX: 00000000000000a1 RCX: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.673420] RDX: 0000000000000000 RSI: ffff94843c196448 RDI: ffff94843c196448
Apr 23 19:29:14 texas kernel: [ 133.673420] RBP: ffffbbaa0612fbf8 R08: 0000000000000001 R09: 00000000000004fa
Apr 23 19:29:14 texas kernel: [ 133.673421] R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000005c04
Apr 23 19:29:14 texas kernel: [ 133.673421] R13: ffff948431de5840 R14: 00000000ffffffff R15: ffff948431de5840
Apr 23 19:29:14 texas kernel: [ 133.673422] FS: 00007ff849b11b40(0000) GS:ffff94843c180000(0000) knlGS:0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.673422] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 23 19:29:14 texas kernel: [ 133.673423] CR2: 00007ff83400f6d8 CR3: 0000000847680006 CR4: 00000000003606e0
Apr 23 19:29:14 texas kernel: [ 133.673423] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.673424] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr 23 19:29:14 texas kernel: [ 133.673424] Call Trace:
Apr 23 19:29:14 texas kernel: [ 133.673461] submit_channel_request+0x3fd/0x780 [amdgpu]
Apr 23 19:29:14 texas kernel: [ 133.673492] dc_link_aux_transfer+0xc6/0x150 [amdgpu]
Apr 23 19:29:14 texas kernel: [ 133.673526] dm_dp_aux_transfer+0x61/0x130 [amdgpu]
Apr 23 19:29:14 texas kernel: [ 133.673531] drm_dp_dpcd_access+0x75/0x110 [drm_kms_helper]
Apr 23 19:29:14 texas kernel: [ 133.673533] drm_dp_dpcd_read+0x33/0xc0 [drm_kms_helper]
Apr 23 19:29:14 texas kernel: [ 133.673537] auxdev_read_iter+0xe6/0x1a0 [drm_kms_helper]
Apr 23 19:29:14 texas kernel: [ 133.673539] new_sync_read+0x109/0x170
Apr 23 19:29:14 texas kernel: [ 133.673541] __vfs_read+0x29/0x40
Apr 23 19:29:14 texas kernel: [ 133.673542] vfs_read+0x99/0x160
Apr 23 19:29:14 texas kernel: [ 133.673542] ksys_read+0x55/0xc0
Apr 23 19:29:14 texas kernel: [ 133.673543] __x64_sys_read+0x1a/0x20
Apr 23 19:29:14 texas kernel: [ 133.673545] do_syscall_64+0x5a/0x110
Apr 23 19:29:14 texas kernel: [ 133.673546] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 23 19:29:14 texas kernel: [ 133.673547] RIP: 0033:0x7ff84ccb4d94
Apr 23 19:29:14 texas kernel: [ 133.673548] Code: 84 00 00 00 00 00 41 54 49 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 5b fc ff ff 4c 89 e2 48 89 ee 89 df 41 89 c0 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 38 44 89 c7 48 89 44 24 08 e8 97 fc ff ff 48
Apr 23 19:29:14 texas kernel: [ 133.673548] RSP: 002b:00007ffe268d7130 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.673549] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007ff84ccb4d94
Apr 23 19:29:14 texas kernel: [ 133.673549] RDX: 0000000000000001 RSI: 00007ffe268d7194 RDI: 0000000000000013
Apr 23 19:29:14 texas kernel: [ 133.673550] RBP: 00007ffe268d7194 R08: 0000000000000000 R09: 00007ff84cca13d0
Apr 23 19:29:14 texas kernel: [ 133.673550] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
Apr 23 19:29:14 texas kernel: [ 133.673550] R13: 00007ffe268d7200 R14: 0000000000000001 R15: 000056469f7bd0e0
Apr 23 19:29:14 texas kernel: [ 133.673552] ---[ end trace b363bbe01edada49 ]---
Apr 23 19:29:14 texas kernel: [ 133.673574] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
Apr 23 19:29:14 texas kernel: [ 133.673576] #PF error: [normal kernel read fault]
Apr 23 19:29:14 texas kernel: [ 133.673577] PGD 0 P4D 0
Apr 23 19:29:14 texas kernel: [ 133.673578] Oops: 0000 [#1] SMP PTI
Apr 23 19:29:14 texas kernel: [ 133.673579] CPU: 6 PID: 2467 Comm: fwupd Tainted: G W OE 5.0.0-13-generic #14-Ubuntu
Apr 23 19:29:14 texas kernel: [ 133.673580] Hardware name: Dell Inc. Precision 7730/05W5TJ, BIOS 1.7.0 02/19/2019
Apr 23 19:29:14 texas kernel: [ 133.673614] RIP: 0010:dal_ddc_close+0xd/0x30 [amdgpu]
Apr 23 19:29:14 texas kernel: [ 133.673615] Code: e8 38 f5 ff ff 48 8b 55 f8 65 48 33 14 25 28 00 00 00 75 02 c9 c3 e8 02 01 84 e1 66 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb <48> 8b 7f 08 e8 0a f6 ff ff 48 8b 3b e8 02 f6 ff ff 5b 5d c3 66 2e
Apr 23 19:29:14 texas kernel: [ 133.673615] RSP: 0018:ffffbbaa0612fc28 EFLAGS: 00010246
Apr 23 19:29:14 texas kernel: [ 133.673616] RAX: ffffffffc1052ad0 RBX: 0000000000000000 RCX: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.673617] RDX: 00000000ffffffff RSI: 0000000000005c04 RDI: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.673617] RBP: ffffbbaa0612fc30 R08: 0000000000000001 R09: 000000000000000a
Apr 23 19:29:14 texas kernel: [ 133.673618] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.673618] R13: ffffbbaa0612fdc0 R14: 0000000000000000 R15: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.673619] FS: 00007ff849b11b40(0000) GS:ffff94843c180000(0000) knlGS:0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.673620] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 23 19:29:14 texas kernel: [ 133.673620] CR2: 0000000000000008 CR3: 0000000847680006 CR4: 00000000003606e0
Apr 23 19:29:14 texas kernel: [ 133.673621] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.673622] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr 23 19:29:14 texas kernel: [ 133.673622] Call Trace:
Apr 23 19:29:14 texas kernel: [ 133.673657] release_engine+0x1e/0xd0 [amdgpu]
Apr 23 19:29:14 texas kernel: [ 133.673687] dc_link_aux_transfer+0xfc/0x150 [amdgpu]
Apr 23 19:29:14 texas kernel: [ 133.673720] dm_dp_aux_transfer+0x61/0x130 [amdgpu]
Apr 23 19:29:14 texas kernel: [ 133.673723] drm_dp_dpcd_access+0x75/0x110 [drm_kms_helper]
Apr 23 19:29:14 texas kernel: [ 133.673726] drm_dp_dpcd_read+0x33/0xc0 [drm_kms_helper]
Apr 23 19:29:14 texas kernel: [ 133.673730] auxdev_read_iter+0xe6/0x1a0 [drm_kms_helper]
Apr 23 19:29:14 texas kernel: [ 133.673731] new_sync_read+0x109/0x170
Apr 23 19:29:14 texas kernel: [ 133.673733] __vfs_read+0x29/0x40
Apr 23 19:29:14 texas kernel: [ 133.673734] vfs_read+0x99/0x160
Apr 23 19:29:14 texas kernel: [ 133.673735] ksys_read+0x55/0xc0
Apr 23 19:29:14 texas kernel: [ 133.673736] __x64_sys_read+0x1a/0x20
Apr 23 19:29:14 texas kernel: [ 133.673737] do_syscall_64+0x5a/0x110
Apr 23 19:29:14 texas kernel: [ 133.673738] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 23 19:29:14 texas kernel: [ 133.673739] RIP: 0033:0x7ff84ccb4d94
Apr 23 19:29:14 texas kernel: [ 133.673740] Code: 84 00 00 00 00 00 41 54 49 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 5b fc ff ff 4c 89 e2 48 89 ee 89 df 41 89 c0 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 38 44 89 c7 48 89 44 24 08 e8 97 fc ff ff 48
Apr 23 19:29:14 texas kernel: [ 133.673740] RSP: 002b:00007ffe268d7130 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.673741] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007ff84ccb4d94
Apr 23 19:29:14 texas kernel: [ 133.673742] RDX: 0000000000000001 RSI: 00007ffe268d7194 RDI: 0000000000000013
Apr 23 19:29:14 texas kernel: [ 133.673742] RBP: 00007ffe268d7194 R08: 0000000000000000 R09: 00007ff84cca13d0
Apr 23 19:29:14 texas kernel: [ 133.673743] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
Apr 23 19:29:14 texas kernel: [ 133.673743] R13: 00007ffe268d7200 R14: 0000000000000001 R15: 000056469f7bd0e0
Apr 23 19:29:14 texas kernel: [ 133.673744] Modules linked in: thunderbolt rfcomm xt_owner ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bridge xt_CHECKSUM xt_tcpudp stp llc iptable_filter iptable_mangle bpfilter ccm snd_hda_codec_realtek snd_hda_codec_generic pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) cmac vboxdrv(OE) bnep binfmt_misc dell_rbtn nls_iso8859_1 joydev arc4 snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core snd_compress ac97_bus intel_rapl snd_pcm_dmaengine x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_hdmi kvm_intel snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep crct10dif_pclmul i915 snd_pcm crc32_pclmul iwlmvm uvcvideo amdgpu snd_seq_midi ghash_clmulni_intel snd_seq_midi_event mac80211 videobuf2_vmalloc kvmgt videobuf2_memops vfio_mdev videobuf2_v4l2 snd_rawmidi dell_laptop mdev videobuf2_common
Apr 23 19:29:14 texas kernel: [ 133.673754] ledtrig_audio vfio_iommu_type1 videodev dell_smm_hwmon vfio snd_seq dell_wmi media kvm chash btusb snd_seq_device amd_iommu_v2 btrtl snd_timer btbcm dell_smbios gpu_sched irqbypass btintel dcdbas ttm aesni_intel iwlwifi bluetooth drm_kms_helper aes_x86_64 crypto_simd cryptd glue_helper rtsx_pci_ms input_leds snd drm ecdh_generic intel_cstate mei_me ucsi_acpi cfg80211 serio_raw dell_wmi_descriptor intel_wmi_thunderbolt wmi_bmof memstick i2c_algo_bit mei fb_sys_fops intel_rapl_perf idma64 syscopyarea hid_multitouch processor_thermal_device soundcore sysfillrect virt_dma typec_ucsi sysimgblt intel_soc_dts_iosf intel_pch_thermal typec int3403_thermal int340x_thermal_zone dell_smo8800 acpi_pad intel_hid int3400_thermal mac_hid acpi_thermal_rel sparse_keymap sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic rtsx_pci_sdmmc nvme e1000e i2c_i801 intel_lpss_pci rtsx_pci nvme_core intel_lpss i2c_hid wmi hid video pinctrl_cannonlake pinctrl_intel
Apr 23 19:29:14 texas kernel: [ 133.673765] CR2: 0000000000000008
Apr 23 19:29:14 texas kernel: [ 133.673766] ---[ end trace b363bbe01edada4a ]---
Apr 23 19:29:14 texas kernel: [ 133.696801] RIP: 0010:dal_ddc_close+0xd/0x30 [amdgpu]
Apr 23 19:29:14 texas kernel: [ 133.696804] Code: e8 38 f5 ff ff 48 8b 55 f8 65 48 33 14 25 28 00 00 00 75 02 c9 c3 e8 02 01 84 e1 66 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb <48> 8b 7f 08 e8 0a f6 ff ff 48 8b 3b e8 02 f6 ff ff 5b 5d c3 66 2e
Apr 23 19:29:14 texas kernel: [ 133.696805] RSP: 0018:ffffbbaa0612fc28 EFLAGS: 00010246
Apr 23 19:29:14 texas kernel: [ 133.696806] RAX: ffffffffc1052ad0 RBX: 0000000000000000 RCX: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.696807] RDX: 00000000ffffffff RSI: 0000000000005c04 RDI: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.696808] RBP: ffffbbaa0612fc30 R08: 0000000000000001 R09: 000000000000000a
Apr 23 19:29:14 texas kernel: [ 133.696809] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.696809] R13: ffffbbaa0612fdc0 R14: 0000000000000000 R15: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.696810] FS: 00007ff849b11b40(0000) GS:ffff94843c180000(0000) knlGS:0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.696811] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 23 19:29:14 texas kernel: [ 133.696812] CR2: 0000000000000008 CR3: 0000000847680006 CR4: 00000000003606e0
Apr 23 19:29:14 texas kernel: [ 133.696813] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 23 19:29:14 texas kernel: [ 133.696813] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

This is where I tried to put the machine to sleep. This just repeats over and over for hours.

Apr 23 21:24:35 texas kernel: [ 7054.876888] PM: suspend entry (deep)
Apr 23 21:24:55 texas kernel: [ 7054.876891] PM: Syncing filesystems ... done.
Apr 23 21:24:55 texas kernel: [ 7054.882703] Freezing user space processes ...
Apr 23 21:24:55 texas kernel: [ 7074.885463] Freezing of tasks failed after 20.002 seconds (2 tasks refusing to freeze, wq_busy=0):
Apr 23 21:24:55 texas kernel: [ 7074.885566] fwupd D 0 3560 1 0x00000324
Apr 23 21:24:55 texas kernel: [ 7074.885572] Call Trace:
Apr 23 21:24:55 texas kernel: [ 7074.885586] __schedule+0x2d0/0x840
Apr 23 21:24:55 texas kernel: [ 7074.885593] schedule+0x2c/0x70
Apr 23 21:24:55 texas kernel: [ 7074.885599] schedule_preempt_disabled+0xe/0x10
Apr 23 21:24:55 texas kernel: [ 7074.885605] __mutex_lock.isra.10+0x2e4/0x4c0
Apr 23 21:24:55 texas kernel: [ 7074.885612] ? mntput+0x24/0x40
Apr 23 21:24:55 texas kernel: [ 7074.885618] __mutex_lock_slowpath+0x13/0x20
Apr 23 21:24:55 texas kernel: [ 7074.885624] mutex_lock+0x2c/0x30
Apr 23 21:24:55 texas kernel: [ 7074.885645] drm_dp_dpcd_access+0x62/0x110 [drm_kms_helper]
Apr 23 21:24:55 texas kernel: [ 7074.885660] drm_dp_dpcd_read+0x33/0xc0 [drm_kms_helper]
Apr 23 21:24:55 texas kernel: [ 7074.885677] auxdev_read_iter+0xe6/0x1a0 [drm_kms_helper]
Apr 23 21:24:55 texas kernel: [ 7074.885686] new_sync_read+0x109/0x170
Apr 23 21:24:55 texas kernel: [ 7074.885694] __vfs_read+0x29/0x40
Apr 23 21:24:55 texas kernel: [ 7074.885700] vfs_read+0x99/0x160
Apr 23 21:24:55 texas kernel: [ 7074.885704] ksys_read+0x55/0xc0
Apr 23 21:24:55 texas kernel: [ 7074.885709] __x64_sys_read+0x1a/0x20
Apr 23 21:24:55 texas kernel: [ 7074.885716] do_syscall_64+0x5a/0x110
Apr 23 21:24:55 texas kernel: [ 7074.885721] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 23 21:24:55 texas kernel: [ 7074.885725] RIP: 0033:0x7f6385956d94
Apr 23 21:24:55 texas kernel: [ 7074.885736] Code: Bad RIP value.
Apr 23 21:24:55 texas kernel: [ 7074.885740] RSP: 002b:00007ffdae1bf660 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Apr 23 21:24:55 texas kernel: [ 7074.885745] RAX: ffffffffffffffda RBX: 0000000000000012 RCX: 00007f6385956d94
Apr 23 21:24:55 texas kernel: [ 7074.885748] RDX: 0000000000000001 RSI: 00007ffdae1bf6c4 RDI: 0000000000000012
Apr 23 21:24:55 texas kernel: [ 7074.885751] RBP: 00007ffdae1bf6c4 R08: 0000000000000000 R09: 00007f63859433d0
Apr 23 21:24:55 texas kernel: [ 7074.885754] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
Apr 23 21:24:55 texas kernel: [ 7074.885757] R13: 00007ffdae1bf730 R14: 0000000000000001 R15: 000055d7942800e0
Apr 23 21:24:55 texas kernel: [ 7074.885888] fwupd D 0 15153 1 0x00000324
Apr 23 21:24:55 texas kernel: [ 7074.885893] Call Trace:
Apr 23 21:24:55 texas kernel: [ 7074.885900] __schedule+0x2d0/0x840
Apr 23 21:24:55 texas kernel: [ 7074.885906] schedule+0x2c/0x70
Apr 23 21:24:55 texas kernel: [ 7074.885912] schedule_preempt_disabled+0xe/0x10
Apr 23 21:24:55 texas kernel: [ 7074.885917] __mutex_lock.isra.10+0x2e4/0x4c0
Apr 23 21:24:55 texas kernel: [ 7074.885922] ? mntput+0x24/0x40
Apr 23 21:24:55 texas kernel: [ 7074.885928] __mutex_lock_slowpath+0x13/0x20
Apr 23 21:24:55 texas kernel: [ 7074.885934] mutex_lock+0x2c/0x30
Apr 23 21:24:55 texas kernel: [ 7074.885948] drm_dp_dpcd_access+0x62/0x110 [drm_kms_helper]
Apr 23 21:24:55 texas kernel: [ 7074.885963] drm_dp_dpcd_read+0x33/0xc0 [drm_kms_helper]
Apr 23 21:24:55 texas kernel: [ 7074.885980] auxdev_read_iter+0xe6/0x1a0 [drm_kms_helper]
Apr 23 21:24:55 texas kernel: [ 7074.885987] new_sync_read+0x109/0x170
Apr 23 21:24:55 texas kernel: [ 7074.885995] __vfs_read+0x29/0x40
Apr 23 21:24:55 texas kernel: [ 7074.886001] vfs_read+0x99/0x160
Apr 23 21:24:55 texas kernel: [ 7074.886006] ksys_read+0x55/0xc0
Apr 23 21:24:55 texas kernel: [ 7074.886010] __x64_sys_read+0x1a/0x20
Apr 23 21:24:55 texas kernel: [ 7074.886016] do_syscall_64+0x5a/0x110
Apr 23 21:24:55 texas kernel: [ 7074.886021] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 23 21:24:55 texas kernel: [ 7074.886024] RIP: 0033:0x7f2d10a13d94
Apr 23 21:24:55 texas kernel: [ 7074.886031] Code: Bad RIP value.
Apr 23 21:24:55 texas kernel: [ 7074.886033] RSP: 002b:00007ffea8d1fcc0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Apr 23 21:24:55 texas kernel: [ 7074.886037] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007f2d10a13d94
Apr 23 21:24:55 texas kernel: [ 7074.886040] RDX: 0000000000000001 RSI: 00007ffea8d1fd24 RDI: 0000000000000013
Apr 23 21:24:55 texas kernel: [ 7074.886043] RBP: 00007ffea8d1fd24 R08: 0000000000000000 R09: 00007f2d10a003d0
Apr 23 21:24:55 texas kernel: [ 7074.886045] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
Apr 23 21:24:55 texas kernel: [ 7074.886048] R13: 00007ffea8d1fd90 R14: 0000000000000001 R15: 0000559521ae10e0
Apr 23 21:24:55 texas kernel: [ 7074.886098] OOM killer enabled.
Apr 23 21:24:55 texas kernel: [ 7074.886100] Restarting tasks ... done.
Apr 23 21:24:55 texas kernel: [ 7074.970970] PM: suspend exit
Apr 23 21:24:55 texas kernel: [ 7074.971019] PM: suspend entry (s2idle)
Apr 23 21:25:15 texas kernel: [ 7074.971020] PM: Syncing filesystems ... done.
Apr 23 21:25:15 texas kernel: [ 7074.975574] Freezing user space processes ...
---
ProblemType: Bug
ApportVersion: 2.20.10-0ubuntu27
Architecture: amd64
CurrentDesktop: Budgie:GNOME
DistroRelease: Ubuntu 19.04
InstallationDate: Installed on 2019-02-04 (84 days ago)
InstallationMedia: Ubuntu-Budgie 18.10 "Cosmic Cuttlefish" - Release amd64 (20181017.2)
Package: linux
PackageArchitecture: amd64
ProcVersionSignature: Ubuntu 5.0.0-13.14-generic 5.0.6
Tags: disco
Uname: Linux 5.0.0-13-generic x86_64
UpgradeStatus: Upgraded to disco on 2019-04-19 (10 days ago)
UserGroups: adm cdrom dialout dip kismet lpadmin plugdev sambashare sudo tty uucp wireshark
_MarkForUpload: True
mtime.conffile..etc.fwupd.daemon.conf: 2019-04-28T08:55:40.597463

Revision history for this message
Mario Limonciello (superm1) wrote :

This is a bug in the amdgpu kernel module.
Accessing the DP aux channel causes system hangs.

At least in upstream fwupd there is a recently committed workaround: https://github.com/hughsie/fwupd/commit/57816a7907e5f56c2136af31a4f1ae5186d87565

For now you can do the following:
1. Blacklist synapticsmst plugin in fwupd in /etc/fwupd/daemon.conf
Or
2. Try a newer kernel (like 5.x)
Or
3. Backport that patch from fwupd
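
For option 1, a minimal sketch of the edit is shown here. It assumes that the fwupd 1.2.x `daemon.conf` already contains a `BlacklistPlugins=` line with a semicolon-separated list; that key name is an assumption and should be checked against the installed file. The helper name `blacklist_plugin` is hypothetical.

```shell
# Hypothetical helper: append a plugin name to the BlacklistPlugins line of a
# fwupd daemon.conf-style file, unless it is already listed.
# Assumes the file already has a line starting with "BlacklistPlugins=".
blacklist_plugin() {
    conf=$1
    plugin=$2
    # Nothing to do if the plugin is already one of the list entries.
    if grep -Eq "^BlacklistPlugins=(.*;)?${plugin}(;.*)?\$" "$conf"; then
        return 0
    fi
    # First expression: append ";plugin" to a non-empty list.
    # Second expression: set "plugin" when the list is empty.
    # A backup of the original file is kept next to it with a .bak suffix.
    sed -i.bak -E \
        -e "s/^(BlacklistPlugins=.+)\$/\1;${plugin}/" \
        -e "s/^BlacklistPlugins=\$/BlacklistPlugins=${plugin}/" \
        "$conf"
}

# Usage from a root shell:
#   blacklist_plugin /etc/fwupd/daemon.conf synapticsmst
```

The daemon most likely needs a restart before the change takes effect. If fwupd is already stuck in uninterruptible sleep, a reboot may be the only way to get a fresh daemon.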

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1826691

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: disco
Revision history for this message
Jason Pritchard (jasonpritchard) wrote :

Thank you very much. I think I can live with the first option. I'll look more into the implications, but it seems to be better than not having the process running at all.

Not sure if I need to do anything to close this.

> 1. Blacklist synapticsmst plugin in fwupd in /etc/fwupd/daemon.conf

This seems to help. I reinstalled the fwupd package and added synapticsmst to the blacklist. Now I get responses back from the fwupdmgr command. It also responds to systemctl start/stop.

> 2. Try a newer kernel (like 5.x)

19.04 is running 5.0.

$ uname -rmv
5.0.0-13-generic #14-Ubuntu SMP Mon Apr 15 14:59:14 UTC 2019 x86_64

> 3. Backport that patch from fwupd

I tried pulling the repo and building the latest 1.2.8 tag, but I didn't make it very far into the build. I'm not very familiar with ninja, so I'll have to see if I can figure out why it's failing. If I get it all the way through, I'll see if this works without the blacklisting.

Revision history for this message
Mario Limonciello (superm1) wrote :

This is definitely a kernel bug, but it's exacerbated by userspace.
So whether it should be worked around in userspace or fixed in the kernel space is debatable.

For now, I believe you can employ that workaround to keep your system stable, but I would ask the following.

1) Can you please follow the things that the bot said to add your logs to the bug.
2) Can you please try latest mainline (5.1rc7: https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.1-rc7/) and confirm it doesn't help? I don't expect it to.
3) If the fwupd workaround does end up being backported to Disco, please test it for the purposes of the SRU, you would see notification in this bug.

Changed in fwupd (Ubuntu):
status: New → Triaged
Changed in fwupd (Ubuntu Disco):
status: New → Triaged
Changed in linux (Ubuntu Disco):
status: New → Incomplete
Changed in linux (Ubuntu Eoan):
importance: Undecided → High
Changed in linux (Ubuntu Disco):
importance: Undecided → High
Changed in fwupd (Ubuntu Eoan):
importance: Undecided → Low
Changed in fwupd (Ubuntu Disco):
importance: Undecided → Low
Revision history for this message
Jason Pritchard (jasonpritchard) wrote : Dependencies.txt

apport information

tags: added: apport-collected
description: updated
Revision history for this message
Jason Pritchard (jasonpritchard) wrote : ProcCpuinfoMinimal.txt

apport information

Revision history for this message
Jason Pritchard (jasonpritchard) wrote : ProcEnviron.txt

apport information

Revision history for this message
Jason Pritchard (jasonpritchard) wrote : modified.conffile..etc.fwupd.daemon.conf.txt

apport information

Revision history for this message
You-Sheng Yang (vicamo) wrote :

This is a small C program doing exactly the same thing as fwupd does to trigger this issue. Usage: `sudo ./read-dpcd /dev/drm_dp_aux<N>`.

To find out what <N> is, change the ExecStart property in `/lib/systemd/system/fwupd.service` from `/usr/lib/fwupd/fwupd` to `/usr/bin/strace -D -tt -y -f -o /var/lib/fwupd/strace.log /usr/lib/fwupd/fwupd`. Then find the last /dev/drm_dp_aux* device opened in the log.
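
As an alternative to editing the packaged unit file in place, the same strace wrapping can be sketched as a systemd drop-in, which is easier to remove afterwards. The helper name `write_strace_dropin` and the file name `strace.conf` are hypothetical; the strace command line is the one given above.

```shell
# Hypothetical helper: write a systemd drop-in that runs fwupd under strace.
# $1 is the drop-in directory, e.g. /etc/systemd/system/fwupd.service.d
write_strace_dropin() {
    dir=$1
    mkdir -p "$dir" || return 1
    cat > "$dir/strace.conf" <<'EOF'
[Service]
# An empty ExecStart= clears the command inherited from the packaged unit.
ExecStart=
ExecStart=/usr/bin/strace -D -tt -y -f -o /var/lib/fwupd/strace.log /usr/lib/fwupd/fwupd
EOF
}

# Usage from a root shell:
#   write_strace_dropin /etc/systemd/system/fwupd.service.d
#   systemctl daemon-reload
```

Deleting the drop-in file and running `systemctl daemon-reload` again restores the original start command.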

This was first reported on an early (pre-production) hardware platform. It went away with a second DVT release, without any additional work in kernel space.

Revision history for this message
Jason Pritchard (jasonpritchard) wrote :

You're probably right about the driver. I've seen other complaints from the drm layer around amdgpu. I'll try to watch dmesg more often to catch more of these.

Just added the files from apport.

As for the rest:

> 1) Can you please follow the things that the bot said to add your logs to the bug.

Done

> 2) Can you please try latest mainline (5.1rc7: https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.1-rc7/) and confirm it doesn't help? I don't expect it to.

I'll look into this. May be the weekend before I get some more time.

> 3) If the fwupd workaround does end up being backported to Disco, please test it for the purposes of the SRU, you would see notification in this bug.

Will do.

FWIW, I finally got fwupd to compile fully. I was missing gio-cli. If I can figure out how to set a prefix in meson to something I can easily uninstall, I'll test an install of the 1.2.8 tag or a cherry pick of that patch on 1.2.5.

Revision history for this message
Jason Pritchard (jasonpritchard) wrote :

@You-Sheng

Sorry, our messages must have crossed paths.

It appears to be /dev/drm_dp_aux3, judging from the strace of the fwupd process in the hung state.

$ grep -i drm strace.log | tail -n 2
2897 20:01:01.099170 lseek(19</dev/drm_dp_aux3>, 1200, SEEK_SET) = 1200
2897 20:01:01.099193 read(19</dev/drm_dp_aux3>, <unfinished ...>) = ?

So as you suspected, when I ran your test program I got the same hang.

sudo ./read-dpcd /dev/drm_dp_aux3

Hangs forever. Does not respond to various kill signals. Must be the caps aux_read call because there are no prints before the hang.
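
For readers without the read-dpcd attachment, the access pattern seen in the strace output (seek to offset 1200, then read one byte) can be approximated with standard tools. This is only a sketch and not the attached program: the helper name `read_byte_at` is hypothetical, and it assumes that `dd` seeks on a seekable input instead of reading the skipped bytes, which is worth confirming with strace.

```shell
# Hypothetical helper: read one byte at a given offset from a file or device
# node and print it as two hex digits.
read_byte_at() {
    dev=$1
    offset=$2
    # bs=1 makes skip= count in bytes; od prints the byte in hex and tr
    # strips the padding that od adds around it.
    dd if="$dev" bs=1 skip="$offset" count=1 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# Usage from a root shell (device and offset taken from the strace lines above):
#   read_byte_at /dev/drm_dp_aux3 1200
```

On an affected kernel this is expected to hang in uninterruptible sleep just like read-dpcd, so it should only be run when a reboot is acceptable.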

Revision history for this message
Mario Limonciello (superm1) wrote :

So this issue needs to be reported upstream to AMD at FDO: https://bugs.freedesktop.org
It would be good to include the very simple reproducer too.

Revision history for this message
In , Jason Pritchard (jasonpritchard) wrote :

Created attachment 144154
Test file: read-dpcd.c

The Ubuntu maintainers of fwupd, whom I have been working with, have determined that I have an issue with the AMD driver on Ubuntu's 5.0 kernel in 19.04. In the sample program that they provided (see read-dpcd.c attached), the call to aux_read(fd, REG_RC_CAP, buf, 1) hangs on my machine. They recommended that I post the issue here.

Machine is a Dell 7730 with AMD WX4150 graphics.

See the original bug report here:
https://bugs.launchpad.net/ubuntu/+source/fwupd/+bug/1826691

Not sure if it's related, but here are dmesg warnings from the other ticket.

Apr 23 19:29:14 texas kernel: [ 133.673290] [drm] REG_WAIT timeout 10us * 160 tries - submit_channel_request line:246
Apr 23 19:29:14 texas kernel: [ 133.673348] WARNING: CPU: 6 PID: 2467 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:249 generic_reg_wait.cold.3+0x25/0x2c [amdgpu]
Apr 23 19:29:14 texas kernel: [ 133.673349] Modules linked in: thunderbolt rfcomm xt_owner ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bridge xt_CHECKSUM xt_tcpudp stp llc iptable_filter iptable_mangle bpfilter ccm snd_hda_codec_realtek snd_hda_codec_generic pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) cmac vboxdrv(OE) bnep binfmt_misc dell_rbtn nls_iso8859_1 joydev arc4 snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core snd_compress ac97_bus intel_rapl snd_pcm_dmaengine x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_hdmi kvm_intel snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep crct10dif_pclmul i915 snd_pcm crc32_pclmul iwlmvm uvcvideo amdgpu snd_seq_midi ghash_clmulni_intel snd_seq_midi_event mac80211 videobuf2_vmalloc kvmgt videobuf2_memops vfio_mdev videobuf2_v4l2 snd_rawmidi dell_laptop mdev videobuf2_common
Apr 23 19:29:14 texas kernel: [ 133.673362] ledtrig_audio vfio_iommu_type1 videodev dell_smm_hwmon vfio snd_seq dell_wmi media kvm chash btusb snd_seq_device amd_iommu_v2 btrtl snd_timer btbcm dell_smbios gpu_sched irqbypass btintel dcdbas ttm aesni_intel iwlwifi bluetooth drm_kms_helper aes_x86_64 crypto_simd cryptd glue_helper rtsx_pci_ms input_leds snd drm ecdh_generic intel_cstate mei_me ucsi_acpi cfg80211 serio_raw dell_wmi_descriptor intel_wmi_thunderbolt wmi_bmof memstick i2c_algo_bit mei fb_sys_fops intel_rapl_perf idma64 syscopyarea hid_multitouch processor_thermal_device soundcore sysfillrect virt_dma typec_ucsi sysimgblt intel_soc_dts_iosf intel_pch_thermal typec int3403_thermal int340x_thermal_zone dell_smo8800 acpi_pad intel_hid int3400_thermal mac_hid acpi_thermal_rel sparse_keymap sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic rtsx_pci_sdmmc nvme e1000e i2c_i801 intel_lpss_pci rtsx_pci nvme_core intel_lpss i2c_hid wmi hid video pinctrl_cannonlake pinctrl_intel
Apr 23 19:29:14 texas kernel: [ 133.673381] CPU: 6 PID: 2467 Comm: fwupd Tainted: G OE 5.0.0-13-generic #14-Ubuntu
Apr 23 19:29:14 texas kernel: [ 133.673382] Hardware name: Dell Inc. Precision 7730/05W5TJ, BIOS 1.7.0 02/19/2019
Apr 23 19:29:14 texas kernel: [ 133.67...

Revision history for this message
Jason Pritchard (jasonpritchard) wrote :
Changed in linux (Ubuntu Disco):
status: Incomplete → Triaged
Changed in linux (Ubuntu Eoan):
status: Incomplete → Triaged
Brad Figg (brad-figg)
tags: added: cscc
Revision history for this message
Mario Limonciello (superm1) wrote :

I'm marking eoan as fixed, as it has 1.2.10 which contains the workaround to avoid using synaptics mst on amdgpu devices.

The kernel task is still open, as this is still a real kernel problem.

Changed in fwupd (Ubuntu Eoan):
status: Triaged → Fix Released
Changed in fwupd (Ubuntu Disco):
status: Triaged → Won't Fix
Revision history for this message
In , Martin-peres-n (martin-peres-n) wrote :

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/760.

Changed in linux:
importance: Unknown → Low
Steve Langasek (vorlon)
Changed in linux (Ubuntu Disco):
status: Triaged → Won't Fix
Revision history for this message
Brian Murray (brian-murray) wrote :

eoan has reached end of life, so this bug will not be fixed for that release

Changed in linux (Ubuntu Eoan):
status: Triaged → Won't Fix
Revision history for this message
Mario Limonciello (superm1) wrote :

So I have reason to believe this is fixed in kernel 5.2 and later by this commit: https://github.com/torvalds/linux/commit/8ae5b1d78d4acbe9755570f26703962877f9108a

Would someone affected be able to check with a modern kernel whether it still reproduces? On some Renoir hardware I found that, as far back as Ubuntu's 5.8 kernel, things work properly with the test application. I can't go much older than that, though, because of when Renoir was introduced.
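
Before retesting, a quick way to see whether the running kernel is at or past the 5.2 release mentioned above is a version comparison with `sort -V`. This is a rough sketch with a hypothetical helper name, `version_at_least`. Distribution kernels can carry backported fixes, so the version number is only an indicator and not proof that the commit is present.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on sort -V (version sort, as provided by GNU coreutils): the
# smaller of the two versions sorts first.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   if version_at_least "$(uname -r)" 5.2; then
#       echo "kernel is 5.2 or newer; worth retesting read-dpcd"
#   fi
```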

Revision history for this message
Jason Pritchard (jasonpritchard) wrote :

I tried to reproduce using the same read-dpcd program from comment #11 above. I no longer see the hang using 5.4 on 20.04.

Thanks for following up.

Changed in linux (Ubuntu):
status: Triaged → Fix Released