"Unknown unclaimed register" messages in haswell

Bug #1138787 reported by James M. Leddy
This bug affects 3 people
Affects            Status        Importance  Assigned to    Milestone
Linux              Fix Released  Low
linux (Ubuntu)     Fix Released  Low         Timo Aaltonen
  Quantal          Fix Released  Low         Timo Aaltonen
  Raring           Won't Fix     Undecided   Unassigned

Bug Description

[Impact]
A number of troubling "Unknown unclaimed register" messages are written to dmesg at the default log level on Haswell machines. Some of these are erroneous; for example, some only indicate state left over from an untidy BIOS.

This bug is basically a tracker bug for https://bugs.freedesktop.org/58897

The first round of patches is here:

http://lists.freedesktop.org/archives/intel-gfx/2013-February/thread.html#24866

However, instead of backporting those, it makes more sense to just silence the error messages, since they are harmless.

[Test case]
Just use the machine for a while. These messages appear, for instance, on suspend/resume when the VT is shown for a while.
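
To check for the messages after exercising the machine, saved dmesg output can be scanned for them. The following is a minimal sketch, assuming the message format quoted in this report; the names `UNCLAIMED_RE` and `find_unclaimed` are made up for illustration and are not part of any existing tool.

```python
import re

# Matches lines such as:
#   [ 5.032931] [drm:i915_write32] *ERROR* Unknown unclaimed register before writing to c5100
# The timestamp prefix is optional because some quoted lines omit it.
UNCLAIMED_RE = re.compile(
    r"^(?:\[\s*(?P<ts>\d+\.\d+)\]\s*)?"
    r"\[drm:(?P<func>\w+)\]\s+\*ERROR\*\s+"
    r"Unknown unclaimed register before writing to (?P<reg>[0-9a-fA-F]+)"
)


def find_unclaimed(lines):
    """Return (timestamp or None, function name, register offset) per match."""
    hits = []
    for line in lines:
        m = UNCLAIMED_RE.search(line.strip())
        if m:
            ts = float(m.group("ts")) if m.group("ts") else None
            hits.append((ts, m.group("func"), int(m.group("reg"), 16)))
    return hits
```

The input could be the lines of a file produced with `dmesg > dmesg.txt`. An empty result after a few suspend/resume cycles would suggest the messages are no longer shown.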

[Regression potential]
None really. The change only lowers the severity of the messages so that they are not shown by default.
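
The effect of lowering the severity can be illustrated with a toy model. This is only a sketch of the idea, not the kernel's logging code: the level names and the `visible_messages` helper are hypothetical, and the model simply assumes that error messages are always shown while debug messages appear only when debugging has been enabled explicitly.

```python
# Toy model only: ERROR messages are always emitted, DEBUG messages
# only when debugging has been switched on explicitly.
ERROR, DEBUG = "ERROR", "DEBUG"


def visible_messages(messages, debug_enabled=False):
    """Filter (level, text) pairs the way a default-quiet log would."""
    return [
        text
        for level, text in messages
        if level == ERROR or (level == DEBUG and debug_enabled)
    ]


TEXT = "Unknown unclaimed register before writing to c5100"
before_fix = [(ERROR, TEXT)]  # message logged as an error
after_fix = [(DEBUG, TEXT)]   # same message, demoted to debug
```

With the defaults, `visible_messages(before_fix)` returns the message and `visible_messages(after_fix)` returns an empty list. The message is still available when `debug_enabled=True`, which is why the change hides it without losing the information.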

In , Yangweix-shui (yangweix-shui) wrote :

Created attachment 72323
dmesg from Booting up

Environment:
---------------------
kernel commit: c0c36b941b6f0be6ac74f340040cbb29d6a0b06c
drm/i915: Return the real error code from intel_set_mode()

Description:
---------------------
After booting up, there's an error in dmesg like below:

[drm:i915_write32] *ERROR* Unknown unclaimed register before writing to c5100

In , Yangweix-shui (yangweix-shui) wrote :

Created attachment 72324
message from intel_reg_dumper

In , Yangweix-shui (yangweix-shui) wrote :

Created attachment 72326
lspci message

In , Gordon Jin (gordon-jin) wrote :

Is this a regression?

I'm decreasing severity as I don't see real bad impact to users.

In , Daniel-ffwll (daniel-ffwll) wrote :

The first one is likely garbage left over from the BIOS, I think we need to tune down the debug message there. The later ones sound like real bugs ...

In , Daniel-ffwll (daniel-ffwll) wrote :

Created attachment 72614
tune down debug message

In , Yangweix-shui (yangweix-shui) wrote :

(In reply to comment #5)
> Created attachment 72614
> tune down debug message

This patch worked: the error no longer appears in dmesg with the patch applied on the latest -next-queued commit.

In , Daniel-ffwll (daniel-ffwll) wrote :

Created attachment 72775
reset unclaimed register writes errors on takeover

New patch for you to test, with a different approach.

In , Yangweix-shui (yangweix-shui) wrote :

(In reply to comment #7)
> Created attachment 72775
> reset unclaimed register writes errors on takeover
>
> New patch for you to test, with a different approach.

With this patch applied on the top commit of the -next-queued branch, the following error appears in dmesg.

commit: 9c7a47e7ca7c694ff4f19b568ad4ce1dba64dbd0

[ 1.786664] [drm:intel_dsm_platform_mux_info] *ERROR* MUX INFO call failed

In , Yangweix-shui (yangweix-shui) wrote :

Created attachment 72840
dmesg

(In reply to comment #8)
> (In reply to comment #7)
> > Created attachment 72775
> > reset unclaimed register writes errors on takeover
> >
> > New patch for you to test, with a different approach.
>
> Apply this patch on top commit of branch -next-queued,
>
> this error information will appeared in dmesg.
>
> commit: 9c7a47e7ca7c694ff4f19b568ad4ce1dba64dbd0
>
> [ 1.786664] [drm:intel_dsm_platform_mux_info] *ERROR* MUX INFO call failed

Correction to the error message quoted above:
---------------
[ 5.032931] [drm:i915_write32] *ERROR* Unknown unclaimed register before writing to c5100

In , Daniel-ffwll (daniel-ffwll) wrote :

Hm, then there's something strange going on, I need to analyze this further ...

In , Chris Wilson (ickle) wrote :

*** Bug 59549 has been marked as a duplicate of this bug. ***

In , Paulo Zanoni (pzanoni) wrote :

Daniel's patch certainly fixes the problem for me... Are you really sure you tested a Kernel with Daniel's patch?

Anyway, I've proposed changing from GEN7_ERROR_INT to FPGA_DBG, so there's a new patch on the mailing list that will fix this bug after we convert to FPGA_DBG.

In , Yangweix-shui (yangweix-shui) wrote :

(In reply to comment #12)
> Daniel's patch certainly fixes the problem for me... Are you really sure you
> tested a Kernel with Daniel's patch?
>
I applied the patch on top of the latest -next-queued commit and retested on my HSW machine; the error still appears in dmesg.

Machine Information:
----------------------------
HSW desktop:
BIOS Version: HSWLPTU1.86C.0093.R00.1209242321
Stepping: C0

> Anyway, I've proposed changing from GEN7_ERROR_INT to FPGA_DBG, so there's a
> new patch on the mailing list that will fix this bug after we convert to
> FPGA_DBG.

Please give a link to the patch; I can't find a patch related to this bug on the mailing list. Thanks.

In , Paulo Zanoni (pzanoni) wrote :

(In reply to comment #13)
> (In reply to comment #12)
> > Daniel's patch certainly fixes the problem for me... Are you really sure you
> > tested a Kernel with Daniel's patch?
> >
> I applied the patch on top -next-queued commit, then retest it on my HSW
> machine, the error dmesg also exist.

Can you please attach dmesg?

>
> Machine Information:
> ----------------------------
> HSW desktop:
> BIOS Version: HSWLPTU1.86C.0093.R00.1209242321
> Stepping: C0
>
> > Anyway, I've proposed changing from GEN7_ERROR_INT to FPGA_DBG, so there's a
> > new patch on the mailing list that will fix this bug after we convert to
> > FPGA_DBG.
>
> Please give a link of the patch, I can't find a patch related to this bug on
> mailing list. Thanks...

http://lists.freedesktop.org/archives/intel-gfx/2013-January/024116.html
http://lists.freedesktop.org/archives/intel-gfx/2013-January/024117.html
http://lists.freedesktop.org/archives/intel-gfx/2013-January/024211.html

If this also doesn't help you, please attach dmesg.

In , Yangweix-shui (yangweix-shui) wrote :

> http://lists.freedesktop.org/archives/intel-gfx/2013-January/024116.html
> http://lists.freedesktop.org/archives/intel-gfx/2013-January/024117.html
> http://lists.freedesktop.org/archives/intel-gfx/2013-January/024211.html
>
> If this also doesn't help you, please attach dmesg.

I tested with the top -next-queued commit, 209d52110a32c2069b5d870504e73fdb0e30fc51, with the patches applied. The error still appears in dmesg. I have attached the dmesg; if you need any more information, I'm glad to provide it.

In , Yangweix-shui (yangweix-shui) wrote :

Created attachment 74031
error in dmesg after booting up

description: updated
Changed in linux:
importance: Unknown → Low
status: Unknown → Confirmed
Timo Aaltonen (tjaalton) wrote :

still happening with 3.9-rc6

Changed in linux (Ubuntu):
importance: Undecided → Low
In , Timo Aaltonen (tjaalton) wrote :

this seems to have been fixed in drm-intel-nightly which has those three commits

Timo Aaltonen (tjaalton) wrote :

but fixed in drm-intel-nightly, likely due to these three commits:

02bcca0d72a1491 drm/i915: clear the FPGA_DBG_RM_NOCLAIM bit at driver init
3f1e109a8be5670 drm/i915: use FPGA_DBG for the "unclaimed register" checks
115bc2de52af131 drm/i915: create functions for the "unclaimed register" checks
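
Going only by the commit titles, the approach appears to be: clear a sticky "no claim" flag once at driver init, then check and clear it around each register write. The following is a toy simulation of that idea, not the i915 code. `FakeHardware`, `Driver`, and the `NOCLAIM` bit position are hypothetical, and the flag is cleared by plain assignment, which may differ from how the real register is cleared.

```python
NOCLAIM = 1 << 31  # hypothetical position of the sticky "no claim" flag


class FakeHardware:
    """Toy device: writes to offsets outside `claimed` set a sticky flag."""

    def __init__(self, claimed, dbg=0):
        self.claimed = set(claimed)
        self.dbg = dbg  # may already hold a flag left behind by firmware
        self.regs = {}

    def write(self, offset, value):
        if offset in self.claimed:
            self.regs[offset] = value
        else:
            self.dbg |= NOCLAIM  # nobody claimed the access


class Driver:
    def __init__(self, hw):
        self.hw = hw
        self.log = []
        # Drop any flag inherited from before the driver took over.
        self.hw.dbg &= ~NOCLAIM

    def _check_and_clear(self, when, offset):
        if self.hw.dbg & NOCLAIM:
            self.log.append(f"unclaimed register {when} writing to {offset:x}")
            self.hw.dbg &= ~NOCLAIM

    def write32(self, offset, value):
        self._check_and_clear("before", offset)
        self.hw.write(offset, value)
        self._check_and_clear("after", offset)
```

In this model, a flag left set before the driver loads is cleared in `Driver.__init__`, so the first write to a valid register such as `0xc5100` logs nothing. Without that initial clear, the stale flag would be reported "before writing" to whichever register the driver happened to touch first, which matches the kind of message seen in this bug.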

Changed in linux:
status: Confirmed → Fix Released
In , Yangweix-shui (yangweix-shui) wrote :

(In reply to comment #17)
> this seems to have been fixed in drm-intel-nightly which has those three
> commits

Our HSW C0 has been updated to C2, and I tried the latest drm-intel-nightly and drm-intel-next-queued; the bug is gone. Verified here.

In , remcomeeder (r-meeder) wrote :

In which kernel version is the fix included? I run 3.8.0-27 on an Ubuntu 13.04 installation and this message is still displayed.

In , Chris Wilson (ickle) wrote :

3.8.0-27 is Ubuntu's kernel, almost 9 months behind the curve. Support queries should be addressed to them first.

Timo Aaltonen (tjaalton) wrote :

fixed in saucy

Changed in linux (Ubuntu):
status: Triaged → Fix Released
Timo Aaltonen (tjaalton) wrote :

instead of trying to backport a bunch of patches it's easier to just make the message appear on debug output only

Changed in linux (Ubuntu Quantal):
status: New → In Progress
Andy Whitcroft (apw)
Changed in linux (Ubuntu Quantal):
status: In Progress → Fix Committed
assignee: nobody → Timo Aaltonen (tjaalton)
importance: Undecided → Low
Timo Aaltonen (tjaalton)
description: updated
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-quantal' to 'verification-done-quantal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-quantal
Shengyao Xue (xueshengyao) wrote :

Hi Brad,

I tested the new kernel in -proposed (3.5.0-43), and the problem was solved.

Tag changed.

tags: added: verification-done-quantal
removed: verification-needed-quantal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.5.0-43.66

---------------
linux (3.5.0-43.66) quantal; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1242895

  [ Timo Aaltonen ]

  * SAUCE: ubuntu/i915: silence unclaimed register poking debug messages
    - LP: #1138787

  [ Upstream Kernel Changes ]

  * Revert "xfs: fix _xfs_buf_find oops on blocks beyond the filesystem
    end"
    - LP: #1236041
    - CVE-2013-1819 fix backport:
  * Revert "sctp: fix call to SCTP_CMD_PROCESS_SACK in
    sctp_cmd_interpreter()"
    - LP: #1241093
  * get rid of full-hash scan on detaching vfsmounts
    - LP: #1226726
  * Smack: Fix the bug smackcipso can't set CIPSO correctly
    - LP: #1236743
  * SAUCE: (no-up) Only let characters through when there are active
    readers.
    - LP: #1208740
  * usb: xhci: define port register names and use them instead of magic
    numbers
    - LP: #1229576
  * usb: xhci: add USB2 Link power management BESL support
    - LP: #1229576
  * iwl4965: fix rfkill set state regression
    - LP: #1241093
  * ath9k_htc: Restore skb headroom when returning skb to mac80211
    - LP: #1241093
  * ALSA: opti9xx: Fix conflicting driver object name
    - LP: #1241093
  * SUNRPC: Fix memory corruption issue on 32-bit highmem systems
    - LP: #1241093
  * drm/i915: ivb: fix edp voltage swing reg val
    - LP: #1241093
  * drm/vmwgfx: Split GMR2_REMAP commands if they are to large
    - LP: #1241093
  * ALSA: ak4xx-adda: info leak in ak4xxx_capture_source_info()
    - LP: #1241093
  * Bluetooth: Add support for Foxconn/Hon Hai [0489:e04d]
    - LP: #1241093
  * [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a
    signal
    - LP: #1241093
  * xen-gnt: prevent adding duplicate gnt callbacks
    - LP: #1241093
  * usb: config->desc.bLength may not exceed amount of data returned by the
    device
    - LP: #1241093
  * USB: cdc-wdm: fix race between interrupt handler and tasklet
    - LP: #1241093
  * xhci-plat: Don't enable legacy PCI interrupts.
    - LP: #1241093
  * ASoC: wm8960: Fix PLL register writes
    - LP: #1241093
  * rculist: list_first_or_null_rcu() should use list_entry_rcu()
    - LP: #1241093
  * USB: mos7720: use GFP_ATOMIC under spinlock
    - LP: #1241093
  * USB: mos7720: fix big-endian control requests
    - LP: #1241093
  * staging: comedi: dt282x: dt282x_ai_insn_read() always fails
    - LP: #1241093
  * usb: ehci-mxc: check for pdata before dereferencing
    - LP: #1241093
  * usb: xhci: Disable runtime PM suspend for quirky controllers
    - LP: #1241093
  * USB: OHCI: Allow runtime PM without system sleep
    - LP: #1241093
  * ACPI / EC: Add HP Folio 13 to ec_dmi_table in order to skip DSDT scan
    - LP: #1241093
  * ACPI / EC: Add ASUSTEK L4R to quirk list in order to validate ECDT
    - LP: #1241093
  * USB: fix build error when CONFIG_PM_SLEEP isn't enabled
    - LP: #1241093
  * ALSA: hda - hdmi: Fallback to ALSA allocation when selecting CA
    - LP: #1241093
  * regmap: silence GCC warning
    - LP: #1241093
  * target: Fix trailing ASCII space usage in INQUIRY vendor+model
    - LP: #1241093
  * iwlwifi: dvm: don't send BT_CONFIG on devices w/o Bluetooth
    - LP: #1...

Changed in linux (Ubuntu Quantal):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Raring):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote : Closing unsupported series nomination.

This bug was nominated against a series that is no longer supported, ie raring. The bug task representing the raring nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Raring):
status: Confirmed → Won't Fix