the machine of lenovo M715 with the AMD GPU (Radeon Vega 8 Mobile, rev ca, 1002:15dd) often hangs randomly

Bug #1796789 reported by Hui Wang
This bug affects 1 person
Affects               Status         Importance   Assigned to   Milestone
HWE Next              Fix Released   Undecided    Unassigned
linux (Ubuntu)        Fix Released   Critical     Hui Wang
  Bionic              Fix Released   Undecided    Unassigned
linux-oem (Ubuntu)    Fix Released   Undecided    Unassigned
  Bionic              Fix Released   Undecided    Unassigned

Bug Description

[Impact]
The Lenovo M715 has an AMD GPU (1002:15dd rev ca). When the system switches
to amdgpufb, it hangs randomly: sometimes during boot, reboot or poweroff,
sometimes after a long time in standby.
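
To check whether a machine carries the same GPU, the vendor:device ID and revision can be matched against `lspci -nn` output. The following is only a sketch: the function name is hypothetical, the ID is the one given in the bug title (1002:15dd, rev ca), and the sample line is made up in the usual `lspci -nn` shape rather than copied from an M715.

```python
import re

# Vendor:device ID and revision as given in the title of this report.
AFFECTED_ID = "1002:15dd"
AFFECTED_REV = "ca"

def is_affected_gpu(lspci_line):
    """Return True if an `lspci -nn` line shows the ID and revision above."""
    line = lspci_line.lower()
    # Vendor:device pairs appear as [vvvv:dddd]; class codes like [0300]
    # have no colon and therefore do not match.
    ids = re.findall(r"\[([0-9a-f]{4}:[0-9a-f]{4})\]", line)
    rev = re.search(r"\(rev ([0-9a-f]{2})\)", line)
    return AFFECTED_ID in ids and rev is not None and rev.group(1) == AFFECTED_REV

# Illustrative input only: the bus address and device name are invented.
sample = "05:00.0 VGA compatible controller [0300]: Example GPU [1002:15dd] (rev ca)"
print(is_affected_gpu(sample))
```

A match only says the hardware is the same; it does not say whether the running kernel already contains the fix.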

[Fix]
Through bisecting, I found that the patch "drm/amd: Add missing fields in
atom_integrated_system_info_v1_11" fixes the problem. It looks like the
ATOM BIOS can't be parsed correctly without this patch.
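
Why a structure definition with missing fields leads to wrong parsing can be shown with a small sketch. The layout below is hypothetical and much smaller than the real atom_integrated_system_info_v1_11 table; the field names are invented for illustration. The point is only that every field after an undeclared one is read from the wrong offset.

```python
import struct

# Hypothetical firmware table, NOT the real atom_integrated_system_info_v1_11
# layout. The "BIOS" writes three little-endian fields:
#   u32 clock_khz, u16 new_field, u16 lane_count
blob = struct.pack("<IHH", 300000, 0xBEEF, 4)

# Stale definition: new_field is not declared, so the parser expects
# lane_count directly after clock_khz.
STALE = struct.Struct("<IH")
# Corrected definition: every field the firmware writes is declared.
FIXED = struct.Struct("<IHH")

def read_lane_count(layout, data):
    """Decode the table with the given layout and return its last field."""
    return layout.unpack_from(data, 0)[-1]

stale_lanes = read_lane_count(STALE, blob)  # actually returns new_field
fixed_lanes = read_lane_count(FIXED, blob)  # returns the real lane_count
print(hex(stale_lanes), fixed_lanes)
```

With the stale layout the parser returns a plausible-looking but wrong value instead of failing, which is consistent with a hang that shows up only later and at random.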

[Test Case]
Ran the "boot, reboot and poweroff" cycle 5 times; the system did not hang.
Left the system in standby overnight; the system did not hang.

[Regression Potential]
Very low. I tested this patch on at least 6 different Lenovo machines
with different AMD GPUs, and all of them worked as well as before.

Hui Wang (hui.wang)
Changed in linux (Ubuntu):
importance: Undecided → Critical
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1796789

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Hui Wang (hui.wang)
tags: added: originate-from-1789802 sutton
Hui Wang (hui.wang)
description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.18.0-10.11

---------------
linux (4.18.0-10.11) cosmic; urgency=medium

  * linux: 4.18.0-10.11 -proposed tracker (LP: #1797379)

  * the machine of lenovo M715 with the AMD GPU (Radeon Vega 8 Mobile, rev ca,
    1002:15dd) often hangs randomly (LP: #1796789)
    - drm/amd: Add missing fields in atom_integrated_system_info_v1_11

  * Miscellaneous Ubuntu changes
    - [Config] CONFIG_VBOXGUEST=n
    - ubuntu: vbox -- update to 5.2.18-dfsg-2
    - ubuntu: enable vbox build

 -- Seth Forshee <email address hidden> Thu, 11 Oct 2018 08:24:45 -0500

Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Changed in linux (Ubuntu Bionic):
status: New → In Progress
status: In Progress → Fix Committed
Timo Aaltonen (tjaalton)
Changed in linux-oem (Ubuntu Bionic):
status: New → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Hui Wang (hui.wang)
tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-39.42

---------------
linux (4.15.0-39.42) bionic; urgency=medium

  * linux: 4.15.0-39.42 -proposed tracker (LP: #1799411)

  * Linux: insufficient shootdown for paging-structure caches (LP: #1798897)
    - mm: move tlb_table_flush to tlb_flush_mmu_free
    - mm/tlb: Remove tlb_remove_table() non-concurrent condition
    - mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
    - [Config] CONFIG_HAVE_RCU_TABLE_INVALIDATE=y

  * Ubuntu18.04: GPU total memory is reduced (LP: #1792102)
    - Revert "powerpc/powernv: Increase memory block size to 1GB on radix"

  * arm64: snapdragon: reduce boot noise (LP: #1797154)
    - [Config] arm64: snapdragon: DRM_MSM=m
    - [Config] arm64: snapdragon: SND*=m
    - [Config] arm64: snapdragon: disable ARM_SDE_INTERFACE
    - [Config] arm64: snapdragon: disable DRM_I2C_ADV7511_CEC
    - [Config] arm64: snapdragon: disable VIDEO_ADV7511, VIDEO_COBALT

  * [Bionic] CPPC bug fixes (LP: #1796949)
    - ACPI / CPPC: Update all pr_(debug/err) messages to log the susbspace id
    - cpufreq: CPPC: Don't set transition_latency
    - ACPI / CPPC: Fix invalid PCC channel status errors

  * regression in 'ip --family bridge neigh' since linux v4.12 (LP: #1796748)
    - rtnetlink: fix rtnl_fdb_dump() for ndmsg header

  * screen displays abnormally on the lenovo M715 with the AMD GPU (Radeon Vega
    8 Mobile, rev ca, 1002:15dd) (LP: #1796786)
    - drm/amd/display: Fix takover from VGA mode
    - drm/amd/display: early return if not in vga mode in disable_vga
    - drm/amd/display: Refine disable VGA

  * arm64: snapdragon: WARNING: CPU: 0 PID: 1 arch/arm64/kernel/setup.c:271
    reserve_memblock_reserved_regions (LP: #1797139)
    - SAUCE: arm64: Fix /proc/iomem for reserved but not memory regions

  * The front MIC can't work on the Lenovo M715 (LP: #1797292)
    - ALSA: hda/realtek - Fix the problem of the front MIC on the Lenovo M715

  * Keyboard backlight sysfs sometimes is missing on Dell laptops (LP: #1797304)
    - platform/x86: dell-smbios: Correct some style warnings
    - platform/x86: dell-smbios: Rename dell-smbios source to dell-smbios-base
    - platform/x86: dell-smbios: Link all dell-smbios-* modules together
    - [Config] CONFIG_DELL_SMBIOS_SMM=y, CONFIG_DELL_SMBIOS_WMI=y

  * rpi3b+: ethernet not working (LP: #1797406)
    - lan78xx: Don't reset the interface on open

  * 87cdf3148b11 was never backported to 4.15 (LP: #1795653)
    - xfrm: Verify MAC header exists before overwriting eth_hdr(skb)->h_proto

  * [Ubuntu18.04][Power9][DD2.2]package installation segfaults inside debian
    chroot env in P9 KVM guest with HTM enabled (kvm) (LP: #1792501)
    - KVM: PPC: Book3S HV: Fix guest r11 corruption with POWER9 TM workarounds

  * Provide mode where all vCPUs on a core must be the same VM (LP: #1792957)
    - KVM: PPC: Book3S HV: Provide mode where all vCPUs on a core must be the same
      VM

  * fscache: bad refcounting in fscache_op_complete leads to OOPS (LP: #1797314)
    - SAUCE: fscache: Fix race in decrementing refcount of op->npages

  * CVE-2018-9363
    - Bluetooth: hidp: buffer overflow in hidp_process_report

  * CVE-20...


Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-oem - 4.15.0-1026.31

---------------
linux-oem (4.15.0-1026.31) bionic; urgency=medium

  * linux-oem: 4.15.0-1026.31 -proposed tracker (LP: #1800788)

  * Thunderbolt runtime D3 and PCIe D3 Cold support (LP: #1800770)
    - ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug
    - ACPI / hotplug / PCI: Mark stale PCI devices disconnected
    - ACPI / hotplug / PCI: Drop unnecessary parentheses
    - PCI: Account for all bridges on bus when distributing bus numbers
    - PCI: Move resource distribution for single bridge outside loop
    - PCI: Improve pci_scan_bridge() and pci_scan_bridge_extend() doc
    - ACPICA: Recognize the Windows 10 version 1607 and 1703 OSI strings
    - ACPICA: Recognize the _OSI string "Windows 2017.2"
    - PCI: Do not skip power-managed bridges in pci_enable_wake()
    - PCI / ACPI: Enable wake automatically for power managed bridges
    - PCI: pciehp: Fix use-after-free on unplug
    - PCI: hotplug: Drop checking of PCI_BRIDGE_CONTROL in *_unconfigure_device()
    - PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate
    - PCI: pciehp: Declare pciehp_unconfigure_device() void
    - PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on
      resume
    - PCI: pciehp: Document struct slot and struct controller
    - PCI: hotplug: Don't leak pci_slot on registration failure
    - PCI: pciehp: Fix unprotected list iteration in IRQ handler
    - PCI: pciehp: Drop unnecessary NULL pointer check
    - PCI: pciehp: Convert to threaded IRQ
    - PCI: pciehp: Convert to threaded polling
    - PCI: pciehp: Stop blinking on slot enable failure
    - PCI: pciehp: Handle events synchronously
    - PCI: pciehp: Drop slot workqueue
    - PCI/hotplug: ppc: correct a php_slot usage after free
    - PCI: hotplug: Demidlayer registration with the core
    - PCI: pciehp: Publish to user space last on probe
    - PCI: pciehp: Track enable/disable status
    - PCI: pciehp: Enable/disable exclusively from IRQ thread
    - PCI: pciehp: Drop enable/disable lock
    - PCI: pciehp: Declare pciehp_enable/disable_slot() static
    - PCI: pciehp: Tolerate initially unstable link
    - PCI: pciehp: Become resilient to missed events
    - PCI: pciehp: Always enable occupied slot on probe
    - PCI: pciehp: Avoid slot access during reset
    - PCI: portdrv: Deduplicate PM callback iterator
    - PCI/portdrv: Move pcieport_if.h to drivers/pci/pcie/
    - PCI/portdrv: Merge pcieport_if.h into portdrv.h
    - PCI/PM: Move pcie_clear_root_pme_status() to core
    - PCI/portdrv: Remove pcie_port_bus_type link order dependency
    - PCI/portdrv: Disable port driver in compat mode
    - PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC
    - PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter
    - PCI/portdrv: Remove unnecessary "pcie_ports=auto" parameter
    - PCI/portdrv: Encapsulate pcie_ports_auto inside the port driver
    - PCI: pciehp: Clear spurious events earlier on resume
    - PCI: pciehp: Obey compulsory command delay after resume
    - PCI: pciehp: Support interrupts sent from D3hot
    - PCI: pciehp: Resume to D0 on enable/disable
    - PCI: pciehp: Resum...


Changed in linux-oem (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux-oem (Ubuntu):
status: New → Fix Released
Changed in hwe-next:
status: New → Fix Released
Brad Figg (brad-figg)
tags: added: cscc
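
The three fixed package versions named in this report (linux 4.18.0-10.11, linux 4.15.0-39.42 and linux-oem 4.15.0-1026.31) can be compared against a running kernel release string. This is a rough sketch under several assumptions: the function name is hypothetical, the release string is assumed to have the `uname -r` shape `<base>-<abi>-<flavour>`, and only the `generic` and `oem` flavours of these two base versions are covered. Anything else is reported as unknown rather than guessed.

```python
import re

# First fixed ABI numbers, taken from the changelog entries in this report.
# Keys are (base version, flavour); other combinations are unknown here.
FIXED_ABI = {
    ("4.18.0", "generic"): 10,    # linux 4.18.0-10.11 (cosmic)
    ("4.15.0", "generic"): 39,    # linux 4.15.0-39.42 (bionic)
    ("4.15.0", "oem"): 1026,      # linux-oem 4.15.0-1026.31 (bionic)
}

def has_fix(release):
    """Return True/False for known kernels, None if the release is not covered."""
    m = re.fullmatch(r"(\d+\.\d+\.\d+)-(\d+)-(\w+)", release)
    if m is None:
        return None
    base, abi, flavour = m.group(1), int(m.group(2)), m.group(3)
    first_fixed = FIXED_ABI.get((base, flavour))
    if first_fixed is None:
        return None
    return abi >= first_fixed

for rel in ("4.15.0-38-generic", "4.15.0-39-generic", "4.15.0-1026-oem"):
    print(rel, has_fix(rel))
```

Returning None for unlisted flavours is deliberate: other kernel packages may well carry the patch, but this report does not say so.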