eDP powered off while attempting aux channel communication

Bug #1194934 reported by James M. Leddy
This bug affects 1 person
Affects         Status        Importance  Assigned to    Milestone
Linux           Fix Released  Undecided   Unassigned
intel           Fix Released  Undecided   Rodrigo-vivi
linux (Ubuntu)  Fix Released  Medium      Timo Aaltonen

Bug Description

May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835743] ------------[ cut here ]------------
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835747] WARNING: at /var/lib/dkms/intel-i915-backport-3.8-dkms/3.8.6.0/build/intel_dp.c:326 intel_dp_check_edp+0x6d/0xc0 [i915]()
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835748] Hardware name:
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835748] eDP powered off while attempting aux channel communication.
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835748] Modules linked in: rtbth(O) ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs reiserfs ext2 rt2800pci rt2800lib crc_ccitt rt2x00pci rt2x00lib mac80211 cfg80211 eeprom_93cx6 nls_iso8859_1 snd_hda_codec_realtek(O) snd_hda_intel(O) snd_hda_codec(O) rfcomm dm_multipath bnep scsi_dh bluetooth snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer parport_pc snd_seq_device coretemp ppdev uvcvideo videobuf2_core kvm videodev rtsx_pci_ms videobuf2_vmalloc memstick psmouse videobuf2_memops mei serio_raw ghash_clmulni_intel snd joydev aesni_intel hp_wmi soundcore snd_page_alloc cryptd aes_x86_64 sparse_keymap mac_hid lpc_ich microcode lp parport binfmt_misc dm_raid45 xor dm_mirror dm_region_hash dm_log btrfs zlib_deflate libcrc32c hid_generic usbhid hid usb_storage rtsx_pci_sdmmc wmi rtsx_pci ahci libahci i915(O) drm_kms_helper drm i2c_algo_bit video [last unloaded: r8169]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835768] Pid: 5296, comm: kworker/u:18 Tainted: G W O 3.5.0-30-generic #51~precise1-Ubuntu
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835768] Call Trace:
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835769] [<ffffffff81052c8f>] warn_slowpath_common+0x7f/0xc0
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835770] [<ffffffff81052d86>] warn_slowpath_fmt+0x46/0x50
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835772] [<ffffffffa00c335d>] intel_dp_check_edp+0x6d/0xc0 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835777] [<ffffffffa00c3e74>] intel_dp_aux_native_write+0x34/0xf0 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835782] [<ffffffffa00c4575>] intel_dp_set_link_train+0xb5/0x300 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835787] [<ffffffffa00c4347>] ? intel_get_adjust_train+0x47/0x1c0 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835792] [<ffffffffa00c63b3>] intel_dp_complete_link_train+0xe3/0x2b0 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835797] [<ffffffffa00c66f0>] intel_dp_check_link_status+0xe0/0x1a0 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835802] [<ffffffffa00c67c5>] intel_dp_hot_plug+0x15/0x20 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835807] [<ffffffffa008f5de>] i915_hotplug_work_func+0x6e/0xa0 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835811] [<ffffffff81071af7>] process_one_work+0x127/0x470
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835812] [<ffffffff81072d35>] worker_thread+0x165/0x370
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835813] [<ffffffff81072bd0>] ? manage_workers.isra.29+0x130/0x130
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835814] [<ffffffff81077cd3>] kthread+0x93/0xa0
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835815] [<ffffffff816a6524>] kernel_thread_helper+0x4/0x10
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835816] [<ffffffff81077c40>] ? flush_kthread_worker+0xb0/0xb0
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835817] [<ffffffff816a6520>] ? gs_change+0x13/0x13
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835818] ---[ end trace 6b87f9ae8a9261a4 ]---

As you might expect, intel-i915-backport-3.8-dkms is just a backport of the 3.8 driver code to the 3.5 kernel. We've experienced this problem on one late-model HP system. The failure rate is 50%.
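
For reading traces like the one above, a minimal Python sketch that pulls the symbol names out of the stack-frame lines may help. It assumes syslog-prefixed lines in the format shown in this report; the names `FRAME_RE` and `parse_frames` are made up for illustration and are not part of any kernel tooling.

```python
import re

# Matches one stack-frame line such as
#   "... [<ffffffffa00c335d>] intel_dp_check_edp+0x6d/0xc0 [i915]"
# A leading "? " marks a frame the kernel considers unreliable.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text, module=None, include_unreliable=False):
    """Return symbol names from a kernel call trace, innermost frame first."""
    symbols = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a stack-frame line
        if match.group("unreliable") and not include_unreliable:
            continue
        if module is not None and match.group("module") != module:
            continue
        symbols.append(match.group("symbol"))
    return symbols


# A few lines copied from the trace in this report.
SAMPLE = """\
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835772] [<ffffffffa00c335d>] intel_dp_check_edp+0x6d/0xc0 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835777] [<ffffffffa00c3e74>] intel_dp_aux_native_write+0x34/0xf0 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835787] [<ffffffffa00c4347>] ? intel_get_adjust_train+0x47/0x1c0 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835802] [<ffffffffa00c67c5>] intel_dp_hot_plug+0x15/0x20 [i915]
May 22 07:20:11 ubuntu-desktop kernel: [ 3687.835811] [<ffffffff81071af7>] process_one_work+0x127/0x470
"""

if __name__ == "__main__":
    # Print the i915 frames outermost-first so the path reads as a call chain.
    print(" -> ".join(reversed(parse_frames(SAMPLE, module="i915"))))
```

Run against the full trace, this makes it easier to see that the warning is reached from the hotplug work function through link training and an aux channel write.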

James M. Leddy (jm-leddy) wrote :

From the engineer debugging the case:

I noticed one interesting thing that may be useful for your 3.9.3 backporting:

The i915 Call Trace message appears when the system runs:
1. 3.5.0-32 (default) + 3.8.6 i915 backport dkms
2. 3.9.3

The i915 Call Trace message does not appear when the system runs:
1. 3.5.0-32 (default) without 3.8.6 i915 backport dkms
2. 3.10-rc3

Only "3.5.0-32 (default) + 3.8.6 i915 backport dkms" raised this bug,
so I think the Call Trace message in comment 2 may not directly cause this bug.

information type: Proprietary → Public
Changed in linux (Ubuntu):
assignee: nobody → Timo Aaltonen (tjaalton)
status: New → Confirmed
importance: Undecided → Medium
Changed in intel:
assignee: nobody → Rodrigo-vivi (rodrigo-vivi)
tags: added: blocks-hwcert-enablement
James M. Leddy (jm-leddy) wrote :

Test log attached.

At the beginning I tested with
sudo fwts s3 --s3-multiple=30 --s3-min-delay=10 --s3-max-delay=15
and got the following results:

3.5.0-32 (lts-quantal)         PASS (30/30)
3.8.0-25-generic (lts-raring)  FAIL (FAILED at 7/30)
3.8.0 (mainline ppa)           FAIL (ran twice, FAILED at 21/30, 16/30)
3.8.4 (mainline ppa)           FAIL (FAILED at 8/30)
3.8.6 (mainline ppa)           FAIL (FAILED at 21/30)
3.8.13 (mainline ppa)          FAIL (FAILED at 5/30)
3.9.3 (mainline ppa)           FAIL (FAILED at 9/30)
3.10.0 (mainline ppa)          PASS (30/30)

When the system fails, ALT+SYSRQ+B can't reboot it.

After that I retested with
sudo fwts s3 --s3-multiple=30 --s3-min-delay=20 --s3-max-delay=40
on the 3.9.3 kernel (since it passed by #31).
It did pass the 30-cycle stress test; however, after the last resume the display was gone.
This time ALT+SYSRQ+B could reboot the system, and the NumLock LED was still active.
I'll run more tests tomorrow.
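
To compare runs like the first set of results above, the result lines can be turned into structured data. This is only a rough Python sketch: the line format is inferred from the results listed in this comment, and `parse_result` and `summarize` are hypothetical helper names, not part of fwts.

```python
import re

# One result line: "<kernel> (<source>) PASS|FAIL (<detail>)".
LINE_RE = re.compile(
    r"^(?P<kernel>\S+)\s+\((?P<source>[^)]+)\)\s+"
    r"(?P<verdict>PASS|FAIL)\s+\((?P<detail>.*)\)$"
)
# Every "N/30" style pair inside the detail text.
CYCLE_RE = re.compile(r"(\d+)/(\d+)")


def parse_result(line):
    """Parse one result line into a dict; raise ValueError if it does not fit."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized result line: {line!r}")
    cycles = [int(reached) for reached, _total in CYCLE_RE.findall(match.group("detail"))]
    return {
        "kernel": match.group("kernel"),
        "source": match.group("source"),
        "passed": match.group("verdict") == "PASS",
        "cycles": cycles,  # cycle reached in each run
    }


def summarize(text):
    """Return (kernels that passed, earliest failing cycle or None)."""
    rows = [parse_result(line) for line in text.splitlines() if line.strip()]
    passing = [row["kernel"] for row in rows if row["passed"]]
    earliest = min(
        (cycle for row in rows if not row["passed"] for cycle in row["cycles"]),
        default=None,
    )
    return passing, earliest


# Result lines as reported in this comment.
RESULTS = """\
3.5.0-32 (lts-quantal) PASS (30/30)
3.8.0-25-generic (lts-raring) FAIL (FAILED at 7/30)
3.8.0 (mainline ppa) FAIL (did twice, FAILED at 21/30, 16/30)
3.8.4 (mainline ppa) FAIL (FAILED at 8/30)
3.8.6 (mainline ppa) FAIL (FAILED at 21/30)
3.8.13 (mainline ppa) FAIL (FAILED at 5/30)
3.9.3 (mainline ppa) FAIL (FAILED at 9/30)
3.10.0 (mainline ppa) PASS (30/30)
"""

if __name__ == "__main__":
    passing, earliest = summarize(RESULTS)
    print("passing kernels:", ", ".join(passing))
    print("earliest failing cycle:", earliest)
```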

James M. Leddy (jm-leddy) wrote :

This looks very similar to another bug that we've raised to intel: https://bugs.freedesktop.org/show_bug.cgi?id=61508

Rodrigo-vivi (rodrigo-vivi) wrote :

After the various hotplug fixes and rework that are landing in 3.11, this bug isn't reproducible with new kernels and production machines.
One thing worth trying is Takashi's fix:

"In i915_drm_freeze(), dev_priv->enable_hotplug_processing must be set to false
before calling intel_modeset_disable(). (And better to cancel the pending,
too.) Otherwise you'll get spurious hotplug events"
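
The ordering described in the quote can be illustrated with a toy Python model. This is not kernel code: `ToyDevice`, its fields other than `enable_hotplug_processing`, and the `fixed` switch are invented for illustration, and the assumption that disabling outputs raises a hotplug interrupt which later touches the powered-off panel is a guess at the mechanism, not something confirmed in this report.

```python
class ToyDevice:
    """Toy stand-in for the driver state discussed above; not kernel code."""

    def __init__(self):
        self.enable_hotplug_processing = True
        self.panel_powered = True
        self.pending_hotplug = False
        self.warnings = []

    def hotplug_irq(self):
        # An interrupt only queues work while processing is enabled.
        if self.enable_hotplug_processing:
            self.pending_hotplug = True

    def modeset_disable(self):
        # Assumption: turning outputs off powers down the panel and can
        # itself raise a (spurious) hotplug interrupt.
        self.panel_powered = False
        self.hotplug_irq()

    def run_hotplug_work(self):
        # Queued work talks to the panel; warn if it is already off.
        if not self.pending_hotplug:
            return
        self.pending_hotplug = False
        if not self.panel_powered:
            self.warnings.append(
                "eDP powered off while attempting aux channel communication."
            )


def freeze(dev, fixed):
    """Model the freeze path with (fixed=True) or without the suggested ordering."""
    if fixed:
        # Suggested order: stop hotplug processing and cancel pending work first.
        dev.enable_hotplug_processing = False
        dev.pending_hotplug = False
    dev.modeset_disable()
    if not fixed:
        # Too late: the interrupt raised during disable is already queued.
        dev.enable_hotplug_processing = False
    dev.run_hotplug_work()
    return dev.warnings


if __name__ == "__main__":
    print("old order:", freeze(ToyDevice(), fixed=False))
    print("new order:", freeze(ToyDevice(), fixed=True))
```

In this model only the old ordering leaves hotplug work queued against a powered-off panel; whether the real driver behaves the same way would need to be checked against the actual i915 code.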

James M. Leddy (jm-leddy) wrote :

Thanks for that Rodrigo. I will test the latest kernels to see if this is fixed.

James M. Leddy (jm-leddy) wrote :

In fact, we have not been able to reproduce this problem with a 3.10 kernel. Closing for now.

Leann Ogasawara (leannogasawara) wrote :

Closing per comment #6

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in intel:
status: New → Fix Released
Changed in linux:
status: New → Fix Released