[Bug] SKL-H boot hang when C8+C9+C10 enabled by intel_idle driver

Bug #1559918 reported by XiongZhang
This bug affects 4 people
Affects         Status        Importance  Assigned to  Milestone
HWE Next        Fix Released  Undecided   Unassigned
intel           Fix Released  Undecided   Unassigned
linux (Ubuntu)  Fix Released  Undecided   Tim Gardner
Wily            Fix Released  Undecided   Tim Gardner
Xenial          Fix Released  Undecided   Tim Gardner

Bug Description

The system hangs immediately at boot with kernel 4.3 and above on an SKL-H GT4 SDV.
The workaround is the boot option intel_idle.max_cstate=6 or intel_idle.max_cstate=7.
With intel_idle.max_cstate=8 the system still hangs.
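
To check whether a given kernel command line carries the workaround, a small sketch like the one below may help. It is illustrative only: the function names and the default limit of 7 are assumptions drawn from this report (6 and 7 were reported to work, 8 to hang), and it assumes the last occurrence of the option on the command line is the one that takes effect.

```python
import re


def max_cstate_from_cmdline(cmdline: str):
    """Return the intel_idle.max_cstate value found in a kernel command line.

    Returns None when the option is absent. If the option appears more than
    once, the last occurrence is returned (an assumption, see the lead-in).
    """
    value = None
    for token in cmdline.split():
        match = re.fullmatch(r"intel_idle\.max_cstate=(\d+)", token)
        if match:
            value = int(match.group(1))
    return value


def workaround_active(cmdline: str, safe_limit: int = 7) -> bool:
    """True if max_cstate is set and does not exceed safe_limit.

    safe_limit=7 is taken from this report, not from any kernel documentation.
    """
    value = max_cstate_from_cmdline(cmdline)
    return value is not None and value <= safe_limit
```

On a running system the command line can be read from /proc/cmdline and passed to workaround_active() as a string.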

Bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=109081
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1550194

XiongZhang (xiong-y-zhang) wrote :

The patch has been sent to the mailing list and is under review:
http://www.spinics.net/lists/stable/msg124799.html

Please integrate this patch into the 16.04 kernel before the kernel freeze, even though it has not yet entered the mainline kernel.

Tim Gardner (timg-tpi) wrote :

April 7 is kernel freeze. I'd like to see this patch make it into linux-next before I apply it.

XiongZhang (xiong-y-zhang) wrote :

The patch is in v4.6-rc1
d70e28f intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled

XiongZhang (xiong-y-zhang) wrote :

Ubuntu 15.04 also has SKL-H intel_idle driver support, so this patch needs to be backported to 15.04 as well.

XiongZhang (xiong-y-zhang) wrote :

Correction to comment #4:
Sorry, it is 15.10, not 15.04.

Tim Gardner (timg-tpi)
information type: Proprietary → Public
Changed in linux (Ubuntu Xenial):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → Fix Committed
Changed in linux (Ubuntu Wily):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Phidias (phidias-chiang)
tags: added: originate-from-1550112 somerville
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-17.33

---------------
linux (4.4.0-17.33) xenial; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1563441

  * ISST-LTE: pVM:high cpus number need a high crashkernel value in kdump
    (LP: #1560552)
    - SAUCE: (noup) ppc64 boot: Wait for boot cpu to show up if nr_cpus limit is
      about to hit.

  * Predictable naming mechanism is leading to issues in DLPAR operations of
    NICs (LP: #1560514)
    - SAUCE: (noup) powerpc/pci: Assign fixed PHB number based on device-tree
      properties

  * ThunderX: support alternative phy implementations (LP: #1562968)
    - net: thunderx: Cleanup PHY probing code.
    - [Config] CONFIG_MDIO_CAVIUM=m
    - phy: mdio-octeon: Refactor into two files/modules
    - [Config] CONFIG_MDIO_THUNDER=m
    - phy: mdio-thunder: Add driver for Cavium Thunder SoC MDIO buses.
    - phy: mdio-cavium: Add missing MODULE_* annotations.
    - net: cavium: For Kconfig THUNDER_NIC_BGX, select MDIO_THUNDER.
    - phy: mdio-thunder: Fix some Kconfig typos
    - [d-i] Add phy drivers for Cavium ThunderX to nic-modules udeb

  * linux: exclude ZONE_DEVICE from GFP_ZONE_TABLE (LP: #1563293)
    - Revert "mm: CONFIG_NR_ZONES_EXTENDED"
    - mm: exclude ZONE_DEVICE from GFP_ZONE_TABLE

  * lots of printk to serial console can hang system for long time
    (LP: #1534216)
    - printk: set may_schedule for some of console_trylock() callers

  * [i915_bpo] Update i915 backport driver (LP: #1560395)
    - SAUCE: i915_bpo: Update to drm-intel-next-fixes-2016-03-16
    - PM / runtime: Add new helper for conditional usage count incrementation
    - drm/core: Add drm_for_each_encoder_mask, v2.
    - drm/atomic-helper: Implement subsystem-level suspend/resume

  * [Hyper-V] VM Sockets (LP: #1541585)
    - Drivers: hv: vmbus: Cleanup vmbus_set_event()
    - Drivers: hv: vmbus: Add vendor and device atttributes
    - Drivers: hv: vmbus: avoid infinite loop in init_vp_index()
    - Drivers: hv: vmbus: avoid scheduling in interrupt context in vmbus_initiate_unload()
    - Drivers: hv: vmbus: don't manipulate with clocksources on crash
    - Drivers: hv: vmbus: add a helper function to set a channel's pending send size
    - Drivers: hv: vmbus: define the new offer type for Hyper-V socket (hvsock)
    - Drivers: hv: vmbus: vmbus_sendpacket_ctl: hvsock: avoid unnecessary signaling
    - Drivers: hv: vmbus: define a new VMBus message type for hvsock
    - Drivers: hv: vmbus: add a hvsock flag in struct hv_driver
    - Drivers: hv: vmbus: add a per-channel rescind callback
    - Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()
    - Drivers: hv: vmbus: Eliminate the spin lock on the read path
    - Drivers: hv: vmbus: Give control over how the ring access is serialized
    - drivers/hv: Move VMBus hypercall codes into Hyper-V UAPI header
    - Drivers: hv: vmbus: don't loose HVMSG_TIMER_EXPIRED messages
    - Drivers: hv: vmbus: avoid wait_for_completion() on crash
    - Drivers: hv: vmbus: remove code duplication in message handling
    - Drivers: hv: vmbus: avoid unneeded compiler optimizations in vmbus_wait_for_unload()
    - Drivers: hv: util: Pass the chann...


Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Kamal Mostafa (kamalmostafa) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-wily' to 'verification-done-wily'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-wily
tags: added: verification-done-wily
removed: verification-needed-wily
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.2.0-36.41

---------------
linux (4.2.0-36.41) wily; urgency=low

  [ Kamal Mostafa ]

  * Release Tracking Bug
    - LP: #1571667

  [ Benjamin Tissoires ]

  * SAUCE: Input: synaptics - handle spurious release of trackstick
    buttons, again
    - LP: #1553811

  [ dann frazier ]

  * Revert "SAUCE: arm64, numa, dt: adding dt based numa support using dt
    node property arm, associativity"
    - LP: #1558828
  * Revert "SAUCE: Documentation: arm64/arm: dt bindings for numa."
    - LP: #1558828
  * Revert "SAUCE: arm64, numa: adding numa support for arm64 platforms."
    - LP: #1558828
  * Revert "[Config] Enable NUMA on ARM64"
    - LP: #1558828

  [ K. Y. Srinivasan ]

  * SAUCE: (noup): Drivers: hv: vmbus: Fix a bug in
    hv_need_to_signal_on_read()
    - LP: #1556264

  [ Kamal Mostafa ]

  * [debian] BugLink: close LP: bugs only for Launchpad urls
  * [Config] updateconfigs after v4.2.8-ckt7

  [ Upstream Kernel Changes ]

  * Revert "jffs2: Fix lock acquisition order bug in jffs2_write_begin"
    - LP: #1561677
  * tipc: fix connection abort during subscription cancel
    - LP: #1561677
  * tipc: fix nullptr crash during subscription cancel
    - LP: #1561677
  * s390/mm: four page table levels vs. fork
    - LP: #1561677
  * Input: aiptek - fix crash on detecting device without endpoints
    - LP: #1561677
  * wext: fix message delay/ordering
    - LP: #1561677
  * cfg80211/wext: fix message ordering
    - LP: #1561677
  * mac80211: fix use of uninitialised values in RX aggregation
    - LP: #1561677
  * mac80211: minstrel: Change expected throughput unit back to Kbps
    - LP: #1561677
  * libata: fix HDIO_GET_32BIT ioctl
    - LP: #1561677
  * iwlwifi: mvm: inc pending frames counter also when txing non-sta
    - LP: #1561677
  * [media] adv7604: fix tx 5v detect regression
    - LP: #1561677
  * ahci: add new Intel device IDs
    - LP: #1561677
  * ahci: Order SATA device IDs for codename Lewisburg
    - LP: #1561677
  * Adding Intel Lewisburg device IDs for SATA
    - LP: #1561677
  * ASoC: samsung: Use IRQ safe spin lock calls
    - LP: #1561677
  * mac80211: minstrel_ht: set default tx aggregation timeout to 0
    - LP: #1561677
  * usb: chipidea: otg: change workqueue ci_otg as freezable
    - LP: #1561677
  * jffs2: Fix page lock / f->sem deadlock
    - LP: #1561677
  * Fix directory hardlinks from deleted directories
    - LP: #1561677
  * iommu/amd: Fix boot warning when device 00:00.0 is not iommu covered
    - LP: #1561677
  * iommu/amd: Apply workaround for ATS write permission check
    - LP: #1561677
  * libata: Align ata_device's id on a cacheline
    - LP: #1561677
  * can: gs_usb: fixed disconnect bug by removing erroneous use of kfree()
    - LP: #1561677
  * fbcon: set a default value to blink interval
    - LP: #1561677
  * KVM: x86: fix root cause for missed hardware breakpoints
    - LP: #1561677
  * arm64: vmemmap: use virtual projection of linear region
    - LP: #1561677
  * vfio: fix ioctl error handling
    - LP: #1561677
  * ALSA: ctl: Fix ioctls for X32 ABI
    - LP: #1561677
  * ALSA: pcm: Fix ioctls for X32 ABI
    - LP: #1561677
  * ALSA: rawmidi: Fix ioct...

Changed in linux (Ubuntu Wily):
status: In Progress → Fix Released
Changed in intel:
status: New → Fix Released
Changed in hwe-next:
status: New → Fix Released
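
The two fixed package versions named in this report (linux 4.4.0-17.33 for Xenial and 4.2.0-36.41 for Wily) can be compared against an installed package version with a rough sketch like the following. The helper names and the dictionary keys are hypothetical, and the comparison only looks at the numeric components in order; it is not a full Debian version comparison.

```python
import re

# Fixed versions as stated in this report; the series keys are illustrative.
FIXED_IN = {
    "xenial": "4.4.0-17.33",
    "wily": "4.2.0-36.41",
}


def version_key(version: str):
    """Split a version such as '4.4.0-17.33' into a tuple of integers."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def has_fix(series: str, installed_version: str) -> bool:
    """True if installed_version is at or beyond the fixed version for series."""
    return version_key(installed_version) >= version_key(FIXED_IN[series])
```

For example, has_fix("xenial", "4.4.0-16.32") is False, while has_fix("xenial", "4.4.0-17.33") is True. The installed version string has to come from the package manager, since a kernel release string alone may not include the upload number.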