“[drm:pch_irq_handler] *ERROR* PCH poison interrupt” shows after S3 resume when connecting with an external monitor

Bug #1031630 reported by Anthony Wong
This bug affects 1 person
Affects           Status         Importance   Assigned to     Milestone
Linux             Fix Released   Medium
linux (Ubuntu)                   High         Unassigned
  Precise                        High         Anthony Wong
  Quantal                        High         Unassigned

Bug Description

SRU Justification:

Impact:
        - Without the patch, the screen continuously shows
        “[drm:pch_irq_handler] *ERROR* PCH poison interrupt”.
Fix:
        - With the patch applied, the screen no longer shows
        “[drm:pch_irq_handler] *ERROR* PCH poison interrupt”.

1. Connect the notebook to an external monitor via D-sub
2. Suspend the system, then resume
3. The screen continuously shows “[drm:pch_irq_handler] *ERROR* PCH poison interrupt”

Upstream bug: https://bugs.freedesktop.org/show_bug.cgi?id=35103

In , Bo-b-wang (bo-b-wang) wrote :

Created attachment 44218
dmesg.txt is the bug's dmesg information

System Environment:
--------------------------
Platform: Sugarbay
Libdrm: (master)2.4.24-6-g3b04c73650b5e9bbcb602fdb8cea0b16ad82d0c0
Mesa: (7.10)68cdea9fb2c33aba5459fd79f57faadf9800e5bb
Xserver: (master)xorg-server-1.10.0
Xf86_video_intel: (master)2.14.901
Cairo: (master)f1d313e042af89b2f5f5d09d3eb1703d0517ecd7
Kernel: (drm-intel-fixes)91355834646328e7edc6bd25176ae44bcd7386c7

Bug detailed description:
-------------------------
When I run “echo mem > /sys/power/state”, the system suspends to memory. When I then press the power button, the system wakes up normally. However, the error message “[drm:pch_irq_handler] *ERROR* PCH poison interrupt” appears in the terminal. This issue occurs on Sugarbay rev09 GT2 every time, but has never occurred on Sugarbay rev09 GT1.
It’s a kernel regression.
Reproduce steps:
----------------
1. Don’t start X
2. echo mem > /sys/power/state
3. Press the power button to wake up
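
When repeating the steps above, it can help to count how often the message shows up in a saved kernel log, for instance to compare a kernel with and without the fix. The following is a minimal Python sketch, not part of the original report. The function name and the sample log (including its timestamps) are made up for illustration; only the message text is taken from this bug.

```python
# Message text as quoted in this report.
MESSAGE = "[drm:pch_irq_handler] *ERROR* PCH poison interrupt"

# Hypothetical sample log: timestamps and the unrelated line are invented.
SAMPLE_LOG = """\
[  100.000001] [drm:pch_irq_handler] *ERROR* PCH poison interrupt
[  100.000002] some unrelated kernel message
[  100.000003] [drm:pch_irq_handler] *ERROR* PCH poison interrupt
"""


def count_poison_interrupts(log_text: str) -> int:
    """Return the number of log lines that contain the poison interrupt error."""
    return sum(1 for line in log_text.splitlines() if MESSAGE in line)


# Count the occurrences in the sample; a real run would pass saved dmesg output.
print(count_poison_interrupts(SAMPLE_LOG))
```

A count of zero after a suspend/resume cycle would suggest the message is no longer being logged, although it says nothing about why.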

In , Gordon Jin (gordon-jin) wrote :

The "bad" PCH for the "Sugarbay rev09 GT2" is the DH SDP board.
The "working" PCH for the "Sugarbay rev09 GT1" is the Intel DH67CL board.

In , Chris Wilson (ickle) wrote :

It's not a regression: since we weren't checking for the PCH interrupts earlier, we have no idea whether this was recently introduced. I've seen these on an i5-2500, which is a GT1 device.

In , Gordon Jin (gordon-jin) wrote :

Right, it should not be a regression. It's a new machine for us.

In , Gordon Jin (gordon-jin) wrote :

Chris, can you evaluate the impact of this bug?

In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Not sure what it means, but reading poisoned memory sounds like a serious issue... But it could also be a spurious interrupt from resume.

In , Mengmeng-meng (mengmeng-meng) wrote :

The problem also exists on our IvyBridge.
-----------------------------------------------------
Libdrm: (master)2.4.26
Mesa: (master)8875dd58719b978283e89acf04422a4eaf9b021d
Xserver: (server-1.10-branch)xorg-server-1.10.2-11-g9551f5041915fa00ca243a279efb55de2ff11a00
Xf86_video_intel: (master)2.15.0-96-ga1ee4b930846d4ba9274028c08800b882fc926f1
Cairo: (master)3b9c8744898823a4b09917f0540a324318fef726
Libva: (master)3c1b6875b589f3a40709a889da85b979e34db625
Kernel: (drm-intel-fixes)6a574b5b9b186e28abd3e571dfd1700c5220b510

In , Yi-sun (yi-sun) wrote :

After running testdisplay with HDMI and VGA on the IVB platform, we unplug the VGA monitor. The error message then appears.

In , Daniel-ffwll (daniel-ffwll) wrote :

Should be fixed with

commit 23e81d691a813839020f6e516b398d0f9369fe8b
Author: Adam Jackson <email address hidden>
Date: Wed Jun 6 15:45:44 2012 -0400

    drm/i915: pch_irq_handler -> {ibx, cpt}_irq_handler

in drm-intel-fixes. It's simply a matter of us complaining about an interrupt with a new (but harmless) meaning.
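
To illustrate the idea behind the commit title (one shared handler replaced by one handler per PCH generation), here is a rough Python sketch. It is not the driver code. The bit position, the event strings, and all function names are hypothetical and chosen only for illustration; the sole facts taken from this thread are the handler names "ibx" and "cpt" and the remark that the interrupt has a new, harmless meaning.

```python
from enum import Enum
from typing import List


class PchType(Enum):
    IBX = "ibx"
    CPT = "cpt"


# Hypothetical status bit; the real register layout is not given in this thread.
SHARED_BIT = 1 << 19


def shared_handler(status: int) -> List[str]:
    """Before the split: every PCH type is decoded with the older meaning."""
    return ["ERROR: PCH poison interrupt"] if status & SHARED_BIT else []


def ibx_handler(status: int) -> List[str]:
    """On the older PCH the bit keeps its original meaning."""
    return ["ERROR: PCH poison interrupt"] if status & SHARED_BIT else []


def cpt_handler(status: int) -> List[str]:
    """On the newer PCH the same bit is assumed to signal a harmless event."""
    return ["harmless event"] if status & SHARED_BIT else []


HANDLERS = {PchType.IBX: ibx_handler, PchType.CPT: cpt_handler}


def dispatch(pch: PchType, status: int) -> List[str]:
    """After the split: pick the decoder that matches the PCH type."""
    return HANDLERS[pch](status)


print(shared_handler(SHARED_BIT))         # the error is reported for any PCH
print(dispatch(PchType.CPT, SHARED_BIT))  # the newer PCH no longer reports it
```

Under these assumptions, the spurious error disappears on the newer PCH simply because the bit is no longer interpreted with the older meaning.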

tags: added: blocks-hwcert-enablement
summary: - “[drm:pch_irq_handler] *ERROR* PCH poison interrupt” after S3 resume
- when connecting with an external monitor
+ “[drm:pch_irq_handler] *ERROR* PCH poison interrupt” shows after S3
+ resume when connecting with an external monitor
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1031630

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Ming Lei (tom-leiming) wrote : Re: [Bug 1031630] [NEW] “[drm:pch_irq_handler] *ERROR* PCH poison interrupt” shows after S3 resume when connecting with an external monitor

Could you test the image at the link below, which integrates the upstream fix?

        http://kernel.ubuntu.com/~ming/bugs/1031630/

Thanks,

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
status: Confirmed → Incomplete
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Anthony Wong (anthonywong) wrote :

@Ming, it has been confirmed that your packages fix the problem. Could you go ahead with the SRU?

Changed in linux (Ubuntu):
importance: Undecided → High
Changed in linux:
importance: Unknown → Medium
status: Unknown → Fix Released
Ming Lei (tom-leiming) wrote : Re: [Bug 1031630] Re: “[drm:pch_irq_handler] *ERROR* PCH poison interrupt” shows after S3 resume when connecting with an external monitor

I have already sent the patch to the Ubuntu kernel mailing list.

description: updated
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel for Precise in -proposed solves the problem (3.2.0-30.47). Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-precise' to 'verification-done-precise'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!
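
Before testing, it is worth confirming that the kernel under test is at least the -proposed version named above (3.2.0-30.47). The following is a minimal Python sketch, assuming version strings of the form 3.2.0-30.47 as they appear in the package changelog in this thread. The function names are hypothetical, and treating any equal or newer version as containing the fix is an assumption made for illustration.

```python
import re
from typing import Tuple

# Matches versions such as "3.2.0-30.47", also when embedded in a longer string.
VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)-(\d+)\.(\d+)")


def parse_kernel_version(text: str) -> Tuple[int, ...]:
    """Extract the five numeric fields of the first version found in text."""
    match = VERSION_RE.search(text)
    if match is None:
        raise ValueError("no kernel package version found in: %r" % text)
    return tuple(int(part) for part in match.groups())


# Version named in the verification request above.
PROPOSED = parse_kernel_version("3.2.0-30.47")


def is_at_least_proposed(installed: str) -> bool:
    """Return True if the installed version is the proposed one or newer."""
    return parse_kernel_version(installed) >= PROPOSED


print(is_at_least_proposed("linux (3.2.0-30.48) precise-proposed"))
```

Comparing the fields as integers rather than as strings avoids ordering mistakes such as "9" sorting after "30".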

tags: added: verification-needed-precise
Luis Henriques (henrix) wrote :

Anthony, would it be possible for you to test the -proposed kernel as per comment #14?

Anthony Wong (anthonywong) wrote :

Luis, I don't have the hardware to verify but I'm now asking someone who can do that.

Chris Van Hoof (vanhoof)
Changed in linux (Ubuntu Quantal):
status: Confirmed → Fix Released
Changed in linux (Ubuntu Precise):
status: New → Fix Committed
importance: Undecided → High
Changed in linux (Ubuntu Quantal):
milestone: precise-updates → none
assignee: Ming Lei (tom-leiming) → nobody
Changed in linux (Ubuntu Precise):
milestone: none → precise-updates
assignee: nobody → Ming Lei (tom-leiming)
Luis Henriques (henrix) wrote :

Can someone with access to the hardware please test the -proposed kernel? We're already delayed on the SRU cycle and I'm afraid we'll need to revert this patch if it is not verified.

Chris Van Hoof (vanhoof) wrote :

@Anthony/@James -- Can we get this bug verified today so that it's not pulled from precise-proposed?

Changed in linux (Ubuntu Precise):
assignee: Ming Lei (tom-leiming) → Anthony Wong (anthonywong)
Luis Henriques (henrix) wrote :

Verification that this bug is fixed has not been completed by the deadline for the current stable kernel release cycle. The change will be reverted and this bug is being set to incomplete.

In order to have this fix considered for reapplication to the kernel, please follow the process documented here:

https://wiki.ubuntu.com/Kernel/StableReleaseCadence

Discussions about the new process tend to take place in #ubuntu-kernel on IRC, so please contribute to the discussion there if you would like.

Thank you!

Changed in linux (Ubuntu Precise):
status: Fix Committed → Incomplete
Luis Henriques (henrix)
tags: added: verification-reverted-precise
removed: verification-needed-precise
Luis Henriques (henrix) wrote :

After an IRC discussion, it has been agreed that the patch should be OK to include in the current SRU cycle, as it is mostly a debug patch. For this reason, I'm reverting the changes in the bug and tagging it as verified.

Changed in linux (Ubuntu Precise):
status: Incomplete → Fix Committed
tags: added: verification-done-precise
removed: verification-reverted-precise
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.2.0-30.48

---------------
linux (3.2.0-30.48) precise-proposed; urgency=low

  [Luis Henriques]

  * Release Tracking Bug
    - LP: #1041217

  [ Upstream Kernel Changes ]

  * mutex: Place lock in contended state after fastpath_lock failure
    - LP: #1041114

linux (3.2.0-30.47) precise-proposed; urgency=low

  [Luis Henriques]

  * Release Tracking Bug
    - LP: #1036581

  [ Andy Whitcroft ]

  * add support for generating binary device trees and install them in
    /lib/firmware
    - LP: #1030600
  * [Config] add dtb_file configuration for highbank
    - LP: #1030600

  [ Chris Van Hoof ]

  * SAUCE: dell-laptop: additional rfkill blacklist Dell XPS 13
    - LP: #1030957
  * [Config] Add cifs support to the nfs-modules list
    - LP: #1031398

  [ Daniel P. Berrange ]

  * SAUCE: (drop after 3.6) Forbid invocation of kexec_load() outside
    initial PID namespace
    - LP: #1034125

  [ Dann Frazier ]

  * [Config] Compile the rtc-pl031 driver builtin on the highbank kernel
    flavour
    - LP: #1035110

  [ Douglas Bagnall ]

  * SAUCE: Unlock the rc_dev lock when the raw device is missing
    - LP: #1015836

  [ Rob Herring ]

  * SAUCE: ARM: highbank: add soft power and reset key event handling
    - LP: #1033853
  * SAUCE: ARM: highbank: use writel_relaxed variant for pwr requests
    - LP: #1033853
  * SAUCE: ahci: un-staticize ahci_dev_classify
    - LP: #1033853
  * SAUCE: ahci_platform: add custom hard reset for Calxeda ahci ctrlr
    - LP: #1033853

  [ Stefan Bader ]

  * (pre-stable) KVM: VMX: Set CPU_BASED_RDPMC_EXITING for nested
    - LP: #1031090

  [ Tim Gardner ]

  * [Config] updateconfigs

  [ Upstream Kernel Changes ]

  * ideapad: generate valid key event only
    - LP: #1029834
  * mm: reduce the amount of work done when updating min_free_kbytes
    - LP: #1032640
  * mm: compaction: allow compaction to isolate dirty pages
    - LP: #1032640
  * mm: compaction: determine if dirty pages can be migrated without
    blocking within ->migratepage
    - LP: #1032640
  * mm: page allocator: do not call direct reclaim for THP allocations
    while compaction is deferred
    - LP: #1032640
  * mm: compaction: make isolate_lru_page() filter-aware again
    - LP: #1032640
  * mm: compaction: introduce sync-light migration for use by compaction
    - LP: #1032640
  * mm: vmscan: when reclaiming for compaction, ensure there are sufficient
    free pages available
    - LP: #1032640
  * mm: vmscan: do not OOM if aborting reclaim to start compaction
    - LP: #1032640
  * mm: vmscan: check if reclaim should really abort even if
    compaction_ready() is true for one zone
    - LP: #1032640
  * vmscan: promote shared file mapped pages
    - LP: #1032640
  * vmscan: activate executable pages after first usage
    - LP: #1032640
  * mm/vmscan.c: consider swap space when deciding whether to continue
    reclaim
    - LP: #1032640
  * mm: test PageSwapBacked in lumpy reclaim
    - LP: #1032640
  * mm: vmscan: convert global reclaim to per-memcg LRU lists
    - LP: #1032640
  * cpuset: mm: reduce large amounts of memory barrier related damage v3
    - LP: #1032640
  * mm/hugetlb: fix warni...

Changed in linux (Ubuntu Precise):
status: Fix Committed → Fix Released
Chris Halse Rogers (raof) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
