DINO2M - System hangs with a black screen during s4 stress test

Bug #1616781 reported by AceLan Kao
This bug affects 2 people
Affects Status Importance Assigned to Milestone
HWE Next
linux (Ubuntu)
AceLan Kao

Bug Description

System hangs with a black screen during s4 stress test

The keyboard backlight LED is still on and adjustable, but it is not possible to switch to tty1.
The system has to be powered off with a hard shutdown.

1. Install Ubuntu 16.04 and boot into the OS
2. Run the S4 stress test (30 cycles)
3. Check whether the test completes

Expected results: The test completes all 30 cycles

Actual results: System hangs with a black screen
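
The report does not say which tool drove the S4 stress test. As a rough sketch only, the steps above could be approximated with `rtcwake` from util-linux, which can hibernate the machine and program an RTC alarm to wake it again. The function names, the 60-second wake delay and the 30-second settle time are assumptions made for illustration; only the cycle count of 30 comes from the report.

```python
import subprocess
import time


def build_rtcwake_cmd(wake_after_s):
    # rtcwake (util-linux): "-m disk" requests suspend-to-disk (S4),
    # "-s N" sets the RTC alarm to wake the machine N seconds later.
    return ["rtcwake", "-m", "disk", "-s", str(wake_after_s)]


def run_cycles(cycles=30, wake_after_s=60, settle_s=30, dry_run=True):
    """Run hibernate/resume cycles and return how many completed.

    dry_run defaults to True so that nothing is hibernated by accident;
    in that mode the commands are only printed.
    """
    completed = 0
    for i in range(1, cycles + 1):
        cmd = build_rtcwake_cmd(wake_after_s)
        print(f"cycle {i}/{cycles}: {' '.join(cmd)}", flush=True)
        if not dry_run:
            # Requires root. If the machine hangs as described in this
            # report, this call never returns and the loop stops here.
            result = subprocess.run(cmd)
            if result.returncode != 0:
                break
            # Assumed pause so the system can settle after resume.
            time.sleep(settle_s)
        completed += 1
    return completed
```

Calling `run_cycles(dry_run=False)` as root would attempt the real cycles; a hang like the one reported would show up as the loop never reaching cycle 30.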

Additional information:
BIOS: 99.2.22

CPU: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz (4x)

GPU: 00:02.0 VGA compatible controller: Intel Corporation Device 5916 (rev 02)

AceLan Kao (acelankao) wrote :

This commit fixes this issue

commit 65c0554b73c920023cc8998802e508b798113b46
Author: Rafael J. Wysocki <email address hidden>
Date: Thu Jun 30 18:11:41 2016 +0200

    x86/power/64: Fix kernel text mapping corruption during image restoration

    Logan Gunthorpe reports that hibernation stopped working reliably for
    him after commit ab76f7b4ab23 (x86/mm: Set NX on gap between __ex_table
    and rodata).

    That turns out to be a consequence of a long-standing issue with the
    64-bit image restoration code on x86, which is that the temporary
    page tables set up by it to avoid page tables corruption when the
    last bits of the image kernel's memory contents are copied into
    their original page frames re-use the boot kernel's text mapping,
    but that mapping may very well get corrupted just like any other
    part of the page tables. Of course, if that happens, the final
    jump to the image kernel's entry point will go to nowhere.

    The exact reason why commit ab76f7b4ab23 matters here is that it
    sometimes causes a PMD of a large page to be split into PTEs
    that are allocated dynamically and get corrupted during image
    restoration as described above.

    To fix that issue note that the code copying the last bits of the
    image kernel's memory contents to the page frames occupied by them
    previously doesn't use the kernel text mapping, because it runs from
    a special page covered by the identity mapping set up for that code
    from scratch. Hence, the kernel text mapping is only needed before
    that code starts to run and then it will only be used just for the
    final jump to the image kernel's entry point.

    Accordingly, the temporary page tables set up in swsusp_arch_resume()
    on x86-64 need to contain the kernel text mapping too. That mapping
    is only going to be used for the final jump to the image kernel, so
    it only needs to cover the image kernel's entry point, because the
    first thing the image kernel does after getting control back is to
    switch over to its own original page tables. Moreover, the virtual
    address of the image kernel's entry point in that mapping has to be
    the same as the one mapped by the image kernel's page tables.

    With that in mind, modify the x86-64's arch_hibernation_header_save()
    and arch_hibernation_header_restore() routines to pass the physical
    address of the image kernel's entry point (in addition to its virtual
    address) to the boot kernel (a small piece of assembly code involved
    in passing the entry point's virtual address to the image kernel is
    not necessary any more after that, so drop it). Update RESTORE_MAGIC
    too to reflect the image header format change.

    Next, in set_up_temporary_mappings(), use the physical and virtual
    addresses of the image kernel's entry point passed in the image
    header to set up a minimum kernel text mapping (using memory pages
    that won't be overwritten by the image kernel's memory contents) that
    will map those addresses to each other as appropriate.

    This makes the concern about the possible corruption of the original...
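
The header change described in the commit message can be pictured with a toy model. This is not the kernel's code: the field layout, the magic values and every name below are hypothetical, and a 4 KiB page size is assumed. It only illustrates the two ideas in the text: the image header carries both the virtual and the physical address of the entry point, with a new magic so that old-format headers are rejected, and the temporary mapping only needs to cover the entry point.

```python
import struct

# Hypothetical magic values; the kernel's real constants are not shown here.
OLD_MAGIC = 0x1111
NEW_MAGIC = 0x2222

# Toy header layout: virtual entry address, physical entry address, magic.
HEADER = struct.Struct("<QQQ")


def save_header(entry_virt, entry_phys):
    # Image kernel side: record both addresses of the entry point.
    return HEADER.pack(entry_virt, entry_phys, NEW_MAGIC)


def restore_header(blob):
    # Boot kernel side: refuse headers written in a different format.
    entry_virt, entry_phys, magic = HEADER.unpack(blob)
    if magic != NEW_MAGIC:
        raise ValueError("image header format mismatch")
    return entry_virt, entry_phys


def minimal_text_mapping(entry_virt, entry_phys, page_size=4096):
    # Map only the page that contains the entry point, at the same
    # virtual address the image kernel's own page tables use.
    mask = ~(page_size - 1)
    return {entry_virt & mask: entry_phys & mask}
```

In this model the boot kernel would call `restore_header()` on the saved header and pass the two addresses to `minimal_text_mapping()` to obtain the single virtual-to-physical entry needed for the final jump.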


Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1616781

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
AceLan Kao (acelankao)
Changed in linux (Ubuntu):
status: Incomplete → In Progress
Changed in linux (Ubuntu Xenial):
status: New → Fix Committed
AceLan Kao (acelankao)
Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Tim Gardner (timg-tpi) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
AceLan Kao (acelankao) wrote :

Ran the S4 test 100 times on kernel 4.4.0-38 without any problems.
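
To check whether a Xenial machine is running a kernel at least as new as the one verified above, a small comparison on the release string can help. This is a sketch: the function names are made up for illustration, and the check is only meaningful for the 4.4.0 series, because other kernel series received the fix under different version numbers that this report does not list.

```python
import platform
import re

# Verified in this report: S4 passed 100 times on kernel 4.4.0-38.
FIXED = (4, 4, 0, 38)


def parse_release(release):
    # Turn e.g. "4.4.0-38-generic" into (4, 4, 0, 38).
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(g) for g in m.groups())


def has_fix(release):
    """Return True/False for 4.4.0 kernels, None for any other series."""
    version = parse_release(release)
    if version[:3] != FIXED[:3]:
        return None  # Not covered by this report.
    return version >= FIXED


def running_kernel_has_fix():
    # platform.release() returns the same string as `uname -r`.
    return has_fix(platform.release())
```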

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-38.57

linux (4.4.0-38.57) xenial; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1620658

  * CIFS client: access problems after updating to kernel 4.4.0-29-generic
    (LP: #1612135)
    - Revert "UBUNTU: SAUCE: (namespace) Bypass sget() capability check for nfs"
    - fs: Call d_automount with the filesystems creds

  * apt-key add fails in overlayfs (LP: #1618572)
    - SAUCE: overlayfs: fix regression in whiteout detection

linux (4.4.0-37.56) xenial; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1618040

  * [Feature] Instruction decoder support for new SKX instructions- AVX512
    (LP: #1591655)
    - x86/insn: perf tools: Fix vcvtph2ps instruction decoding
    - x86/insn: Add AVX-512 support to the instruction decoder
    - perf tools: Add AVX-512 support to the instruction decoder used by Intel PT
    - perf tools: Add AVX-512 instructions to the new instructions test

  * [Ubuntu 16.04] FCoE Lun not visible in OS with inbox driver - Issue with
    ioremap() call on 32bit kernel (LP: #1608652)
    - lpfc: Correct issue with ioremap() call on 32bit kernel

  * [Feature] turbostat support for Skylake-SP server (LP: #1591802)
    - tools/power turbostat: decode more CPUID fields
    - tools/power turbostat: CPUID(0x16) leaf shows base, max, and bus frequency
    - tools/power turbostat: decode HWP registers
    - tools/power turbostat: Decode MSR_MISC_PWR_MGMT
    - tools/power turbostat: allow sub-sec intervals
    - tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
    - tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
    - tools/power turbostat: re-name "%Busy" field to "Busy%"
    - tools/power turbostat: add --out option for saving output in a file
    - tools/power turbostat: fix compiler warnings
    - tools/power turbostat: make fewer systems calls
    - tools/power turbostat: show IRQs per CPU
    - tools/power turbostat: show GFXMHz
    - tools/power turbostat: show GFX%rc6
    - tools/power turbostat: detect and work around syscall jitter
    - tools/power turbostat: indicate SMX and SGX support
    - tools/power turbostat: call __cpuid() instead of __get_cpuid()
    - tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
    - tools/power turbostat: bugfix: TDP MSRs print bits fixing
    - tools/power turbostat: SGX state should print only if --debug
    - tools/power turbostat: print IRTL MSRs
    - tools/power turbostat: initial BXT support
    - tools/power turbostat: decode BXT TSC frequency via CPUID
    - tools/power turbostat: initial SKX support

  * [BYT] display hotplug doesn't work on console (LP: #1616894)
    - drm/i915/vlv: Make intel_crt_reset() per-encoder
    - drm/i915/vlv: Reset the ADPA in vlv_display_power_well_init()
    - drm/i915/vlv: Disable HPD in valleyview_crt_detect_hotplug()
    - drm/i915: Enable polling when we don't have hpd

  * [Feature]intel_idle enabling on Broxton-P (LP: #1520446)
    - intel_idle: add BXT support

  * [Feature] EDAC: Update driver for SKX-SP (LP: #1591815)
    - [Config] CONFIG_EDAC_SKX=m
    - EDAC, skx_edac: Ad...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
AceLan Kao (acelankao)
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Changed in hwe-next:
status: New → Fix Released