[18.04] GLK hang after a while

Bug #1760545 reported by quanxian
This bug affects 3 people
Affects               Status         Importance   Assigned to    Milestone
linux (Ubuntu)        In Progress    Medium       Seth Forshee
  Bionic              Fix Released   Medium       Seth Forshee
linux-oem (Ubuntu)    Fix Released   Undecided    Unassigned
  Bionic              Fix Released   Undecided    Unassigned

Bug Description

SRU Justification

[Impact]
i915 lacks information about the glk_dmc_ver1_04.bin firmware file in
modinfo, so it is not included in the initrd along with the i915 driver.
Thus the firmware does not get loaded. Loading the firmware is said to
prevent a hang.

In addition, the missing firmware causes GLK's HDMI audio codec to stop
working after S3.

[Fix]
Add a MODULE_FIRMWARE statement for the firmware.

This information is required so that initramfs-tools includes the
firmware.
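
One way to check the effect of the fix is to look for the firmware in the initrd listing. The sketch below is only illustrative: the function name initrd_has_glk_dmc is hypothetical, and it assumes the listing comes from lsinitramfs (shipped with initramfs-tools) and that the firmware is stored uncompressed under a firmware/i915/ directory.

```shell
#!/bin/sh
# Sketch: succeed if an initramfs file listing (read from stdin) contains
# the GLK DMC firmware. The function name is hypothetical.
initrd_has_glk_dmc() {
    # -F treats the pattern as a fixed string, so the dots are literal.
    grep -qF -e 'firmware/i915/glk_dmc_ver1_04.bin'
}

# Usage on a live system (assumes lsinitramfs is installed):
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | initrd_has_glk_dmc \
#       && echo "firmware is in the initrd" \
#       || echo "firmware is NOT in the initrd"
```

Reading the listing from stdin keeps the check usable against a saved listing as well as a live initrd.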

[Test Case]
Without the firmware there will be a "Direct firmware load for
i915/glk_dmc_ver1_04.bin failed with error -2" line in dmesg. With the
firmware there is no such message.
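
The test case can be scripted roughly as follows. This is a minimal sketch, not part of the SRU: the function name dmc_load_failed is hypothetical, and it only matches the exact failure text quoted above.

```shell
#!/bin/sh
# Sketch: succeed if the kernel log text on stdin contains the GLK DMC
# firmware load failure. The function name is hypothetical.
dmc_load_failed() {
    grep -qF -e 'Direct firmware load for i915/glk_dmc_ver1_04.bin failed with error -2'
}

# Usage on a live system (assumes permission to read the kernel log):
#   if dmesg | dmc_load_failed; then
#       echo "FAIL: firmware was not loaded"
#   else
#       echo "PASS: no firmware load failure logged"
#   fi
```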

I can confirm the GLK HDMI audio issue is gone when firmware is loaded.

[Regression Potential]
Minimal. Will only be loaded by i915 for specific hardware, and loading
the firmware is known to fix a hang.

---

Description:

Platform information:
Label: GLK02SDP
Processor: Silver N5000
Bios: GELKRVPA.X64.0083.B30.1801162142
OS: Ubuntu 18.04
Kernel: 4.15.0-10-generic

Details:
Power on and boot into the system.
After a while (sometimes about 20 minutes, sometimes 1 hour), the machine hangs.
The only way to recover is to power it off manually and power it on again.

quanxian (quanxian-wang) wrote :

This appears to be a regression. We do not see it on 17.10.

quanxian (quanxian-wang) wrote :

From the log file, we found that the glk_dmc firmware failed to load.

I tried installing the firmware, and the GLK machine seems to work well again.

Log:

test-NUC7PJYH:~ #dmesg | grep -i 'error\|fail\|unknown\|unsupport\|bad\|warning'
[ 1.985921] i915 0000:00:02.0: Direct firmware load for i915/glk_dmc_ver1_04.bin failed with error -2
[ 1.985924] i915 0000:00:02.0: Failed to load DMC firmware [https://01.org/linuxgraphics/downloads/firmware], disabling runtime power management.

You can download this dmc file from
https://01.org/sites/default/files/downloads/intelr-graphics-linux/glkdmcver104.tar_0.bz2

By the way, when you download this file, please use the install.sh in the package to install the firmware; it seems other files are needed.

Seth Forshee (sforshee) wrote :

We have the exact same glk_dmc file in our linux-firmware package, which we received from upstream linux-firmware. The additional files should not be needed. The -2 error code does indicate it couldn't find the file, but perhaps the linux-firmware package was not up to date?

Can you confirm that you have the latest linux-firmware installed (1.173 as of right now) and that /lib/firmware/i915/glk_dmc_ver1_04.bin is present? Then if you still see problems, please let me know.
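
A quick way to run both checks is sketched below. The function name glk_dmc_present is hypothetical, and the i915/ subdirectory is assumed from the path in the dmesg failure line.

```shell
#!/bin/sh
# Sketch: succeed if the GLK DMC firmware file exists under a firmware
# root directory (default /lib/firmware). The function name is hypothetical;
# the i915/ subdirectory is taken from the path the kernel log reports.
glk_dmc_present() {
    _fw_root="${1:-/lib/firmware}"
    [ -f "$_fw_root/i915/glk_dmc_ver1_04.bin" ]
}

# Usage on a live system:
#   dpkg-query -W -f='${Version}\n' linux-firmware   # installed package version
#   glk_dmc_present && echo "firmware file present" || echo "firmware file missing"
```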

Changed in intel:
status: New → Incomplete
Seth Forshee (sforshee) wrote :

Oh, or likely the issue is that the firmware doesn't get included in the initrd because there's no MODULE_FIRMWARE statement for it in the driver.

Seth Forshee (sforshee) wrote :

$ modinfo -F firmware /lib/modules/$(uname -r)/kernel/drivers/gpu/drm/i915/i915.ko
i915/bxt_dmc_ver1_07.bin
i915/skl_dmc_ver1_26.bin
i915/kbl_dmc_ver1_01.bin
i915/skl_guc_ver6_1.bin
i915/kbl_huc_ver02_00_1810.bin
i915/bxt_huc_ver01_07_1398.bin
i915/skl_huc_ver01_07_1398.bin

No glk_dmc file listed.

Seth Forshee (sforshee)
information type: Proprietary → Public
Changed in linux (Ubuntu Bionic):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → Medium
status: New → In Progress
Seth Forshee (sforshee)
description: updated
Linuxium (linuxium.com.au) wrote :

I've already submitted a patch to mainline Cc: stable (https://patchwork.kernel.org/patch/10335195) which will address this issue if accepted (note: the current discussion is about the 'acceptability and health' of using the firmware with recent kernels).

Linuxium (linuxium.com.au) wrote :

A mainline patch is available now in v4.17-rc4.

It is probably best to backport commit b607990c76ceda0a7a7ceacabab174cdc8b9beee to Ubuntu 4.15 and Ubuntu 4.16 (if released), as the mainline 'Cc: stable' tag is stalled.

The patch/commit fixes the bug as documented above:

ubuntu@ubuntu:~$ dmesg | egrep 'Linux version|DMI:|i915'
[ 0.000000] Linux version 4.17.0-041700rc4-generic (kernel@kathleen) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #201805070430 SMP Mon May 7 04:31:46 UTC 2018
[ 0.000000] DMI: Intel Corporation NUC7CJYS/NUC7JYB, BIOS JYGLKCPX.86A.0024.2017.1229.1454 12/29/2017
[ 2.489175] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 2.489705] [drm] Finished loading DMC firmware i915/glk_dmc_ver1_04.bin (v1.4)
[ 2.493790] [drm] Initialized i915 1.6.0 20180308 for 0000:00:02.0 on minor 0
[ 2.567821] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
ubuntu@ubuntu:~$

Linuxium (linuxium.com.au) wrote :

Attached is the patch (or commit b607990c76ceda0a7a7ceacabab174cdc8b9beee).

tags: added: patch
Changed in intel:
status: Incomplete → Triaged
no longer affects: intel
Chris Allen (callen92) wrote :

What is the timeline for having this fix backported to the 4.15 kernel?

description: updated
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Timo Aaltonen (tjaalton)
Changed in linux-oem (Ubuntu Bionic):
status: New → Fix Committed
Kai-Heng Feng (kaihengfeng) wrote :

Verified by Alex Tu.

tags: added: verification-done-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-oem - 4.15.0-1024.29

---------------
linux-oem (4.15.0-1024.29) bionic; urgency=medium

  * linux-oem: 4.15.0-1024.29 -proposed tracker (LP: #1797069)

  * Keyboard backlight sysfs sometimes is missing on Dell laptops (LP: #1797304)
    - platform/x86: dell-smbios: Correct some style warnings
    - platform/x86: dell-smbios: Rename dell-smbios source to dell-smbios-base
    - platform/x86: dell-smbios: Link all dell-smbios-* modules together
    - [Config] CONFIG_DELL_SMBIOS_SMM=y, CONFIG_DELL_SMBIOS_WMI=y
    - [Config] CONFIG_DELL_SMBIOS_SMM=y, CONFIG_DELL_SMBIOS_WMI=y

  [ Ubuntu: 4.15.0-38.41 ]

  * linux: 4.15.0-38.41 -proposed tracker (LP: #1797061)
  * Silent data corruption in Linux kernel 4.15 (LP: #1796542)
    - block: add a lower-level bio_add_page interface
    - block: bio_iov_iter_get_pages: fix size of last iovec
    - blkdev: __blkdev_direct_IO_simple: fix leak in error case
    - block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs

 -- Chia-Lin Kao (AceLan) <email address hidden> Tue, 16 Oct 2018 10:32:03 +0800

Changed in linux-oem (Ubuntu Bionic):
status: Fix Committed → Fix Released
N1nj4888 (n1nj4888) wrote :

Hi Guys,

This might be a very noob question but, if I'm running Ubuntu 18.04.1 LTS, how do I install this fix? I noted that the fix is done in linux-oem package but a "sudo apt list --installed" seems to suggest this isn't currently an installed package on my installation?

Is it simply a case of either (A) running "sudo apt-get install linux-oem" or (B) Waiting for this to be fixed in 18.04.1 LTS and available via the standard apt-get update / apt-get upgrade mechanism?

If this is still to be fixed/released in 18.04.1 LTS, is there a timeline for when the fix would be available?

Thanks!

Kai-Heng Feng (kaihengfeng) wrote :

The fix will be in next -generic kernel release.

N1nj4888 (n1nj4888) wrote :

Thanks! Is there a timeline for when this new linux-generic kernel may be released for 18.04 / 4.15.x?

Timo Aaltonen (tjaalton) wrote :

The kernel is in bionic-proposed, -39.42. For some reason this bug wasn't spammed about it.
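
To tell whether a running -generic kernel is at or past that release, the ABI number in the `uname -r` string can be compared against 39. This is a rough sketch only: the function name generic_kernel_has_fix is hypothetical, it handles nothing but the 4.15.0 -generic series, and it assumes the ABI number alone is enough to identify the fixed upload.

```shell
#!/bin/sh
# Sketch: given a `uname -r` string such as 4.15.0-39-generic, return
#   0 if the ABI number is 39 or later,
#   1 if it is earlier,
#   2 if the string is not a 4.15.0-N-generic release (not handled here).
# The function name is hypothetical.
generic_kernel_has_fix() {
    case "$1" in
        4.15.0-*-generic) ;;
        *) return 2 ;;
    esac
    _abi=${1#4.15.0-}       # strip the leading "4.15.0-"
    _abi=${_abi%-generic}   # strip the trailing "-generic"
    case "$_abi" in
        ''|*[!0-9]*) return 2 ;;   # reject anything that is not a plain number
    esac
    [ "$_abi" -ge 39 ]
}

# Usage on a live system:
#   generic_kernel_has_fix "$(uname -r)" && echo "fix should be present"
```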

Changed in linux-oem (Ubuntu):
status: New → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-39.42

---------------
linux (4.15.0-39.42) bionic; urgency=medium

  * linux: 4.15.0-39.42 -proposed tracker (LP: #1799411)

  * Linux: insufficient shootdown for paging-structure caches (LP: #1798897)
    - mm: move tlb_table_flush to tlb_flush_mmu_free
    - mm/tlb: Remove tlb_remove_table() non-concurrent condition
    - mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
    - [Config] CONFIG_HAVE_RCU_TABLE_INVALIDATE=y

  * Ubuntu18.04: GPU total memory is reduced (LP: #1792102)
    - Revert "powerpc/powernv: Increase memory block size to 1GB on radix"

  * arm64: snapdragon: reduce boot noise (LP: #1797154)
    - [Config] arm64: snapdragon: DRM_MSM=m
    - [Config] arm64: snapdragon: SND*=m
    - [Config] arm64: snapdragon: disable ARM_SDE_INTERFACE
    - [Config] arm64: snapdragon: disable DRM_I2C_ADV7511_CEC
    - [Config] arm64: snapdragon: disable VIDEO_ADV7511, VIDEO_COBALT

  * [Bionic] CPPC bug fixes (LP: #1796949)
    - ACPI / CPPC: Update all pr_(debug/err) messages to log the susbspace id
    - cpufreq: CPPC: Don't set transition_latency
    - ACPI / CPPC: Fix invalid PCC channel status errors

  * regression in 'ip --family bridge neigh' since linux v4.12 (LP: #1796748)
    - rtnetlink: fix rtnl_fdb_dump() for ndmsg header

  * screen displays abnormally on the lenovo M715 with the AMD GPU (Radeon Vega
    8 Mobile, rev ca, 1002:15dd) (LP: #1796786)
    - drm/amd/display: Fix takover from VGA mode
    - drm/amd/display: early return if not in vga mode in disable_vga
    - drm/amd/display: Refine disable VGA

  * arm64: snapdragon: WARNING: CPU: 0 PID: 1 arch/arm64/kernel/setup.c:271
    reserve_memblock_reserved_regions (LP: #1797139)
    - SAUCE: arm64: Fix /proc/iomem for reserved but not memory regions

  * The front MIC can't work on the Lenovo M715 (LP: #1797292)
    - ALSA: hda/realtek - Fix the problem of the front MIC on the Lenovo M715

  * Keyboard backlight sysfs sometimes is missing on Dell laptops (LP: #1797304)
    - platform/x86: dell-smbios: Correct some style warnings
    - platform/x86: dell-smbios: Rename dell-smbios source to dell-smbios-base
    - platform/x86: dell-smbios: Link all dell-smbios-* modules together
    - [Config] CONFIG_DELL_SMBIOS_SMM=y, CONFIG_DELL_SMBIOS_WMI=y

  * rpi3b+: ethernet not working (LP: #1797406)
    - lan78xx: Don't reset the interface on open

  * 87cdf3148b11 was never backported to 4.15 (LP: #1795653)
    - xfrm: Verify MAC header exists before overwriting eth_hdr(skb)->h_proto

  * [Ubuntu18.04][Power9][DD2.2]package installation segfaults inside debian
    chroot env in P9 KVM guest with HTM enabled (kvm) (LP: #1792501)
    - KVM: PPC: Book3S HV: Fix guest r11 corruption with POWER9 TM workarounds

  * Provide mode where all vCPUs on a core must be the same VM (LP: #1792957)
    - KVM: PPC: Book3S HV: Provide mode where all vCPUs on a core must be the same
      VM

  * fscache: bad refcounting in fscache_op_complete leads to OOPS (LP: #1797314)
    - SAUCE: fscache: Fix race in decrementing refcount of op->npages

  * CVE-2018-9363
    - Bluetooth: hidp: buffer overflow in hidp_process_report

  * CVE-20...


Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released