Kernel 4.2 drm/i915 erratic error

Bug #1494903 reported by wicklow
This bug affects 3 people
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  Triaged  High        Unassigned
Wily            Triaged  High        Unassigned

Bug Description

I am encountering the following erratic error with the Wily kernel 4.2.
With kernel 4.1, I did not encounter such issues with drm/i915.

[ 5437.034758] ------------[ cut here ]------------
[ 5437.034784] WARNING: CPU: 2 PID: 0 at /build/linux-4dBub_/linux-4.2.0/drivers/gpu/drm/i915/intel_display.c:11098 intel_check_page_flip+0xfc/0x110 [i915]()
[ 5437.034785] Kicking stuck page flip: queued at 322707, now 322711
[ 5437.034786] Modules linked in: hid_generic snd_usb_audio usbhid snd_usbmidi_lib drbg ansi_cprng ctr ccm rfcomm binfmt_misc bnep nls_iso8859_1 joydev dcdbas dell_wmi arc4 hid_multitouch sparse_keymap intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_soc_rt286 snd_soc_rl6347a iwlmvm snd_soc_core crct10dif_pclmul mac80211 snd_compress ac97_bus crc32_pclmul snd_pcm_dmaengine snd_seq_midi ghash_clmulni_intel uvcvideo snd_seq_midi_event aesni_intel snd_rawmidi videobuf2_vmalloc aes_x86_64 videobuf2_memops lrw gf128mul glue_helper btusb iwlwifi ablk_helper cryptd input_leds videobuf2_core serio_raw v4l2_common dell_led btrtl videodev snd_seq snd_hda_codec_hdmi btbcm btintel snd_hda_codec_realtek media bluetooth snd_hda_codec_generic cfg80211 rtsx_pci_ms snd_hda_intel memstick snd_hda_codec
[ 5437.034821] snd_hda_core snd_hwdep snd_pcm lpc_ich snd_seq_device shpchp snd_timer snd soundcore soc_button_array int3403_thermal acpi_als kfifo_buf industrialio dw_dmac int3400_thermal spi_pxa2xx_platform processor_thermal_device dw_dmac_core acpi_thermal_rel snd_soc_sst_acpi intel_soc_dts_iosf 8250_dw i2c_designware_platform int3402_thermal tpm_crb iosf_mbi i2c_designware_core int340x_thermal_zone mac_hid acpi_pad parport_pc ppdev lp parport autofs4 rtsx_pci_sdmmc i915 i2c_algo_bit ahci drm_kms_helper libahci rtsx_pci drm wmi sdhci_acpi video sdhci i2c_hid hid
[ 5437.034846] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.2.0-7-generic #7-Ubuntu
[ 5437.034847] Hardware name: Dell Inc. XPS 13 9343/0TM99H, BIOS A05 07/14/2015
[ 5437.034849] ffffffffc02101e8 ffff88021f503d68 ffffffff817b0465 0000000000000000
[ 5437.034851] ffff88021f503db8 ffff88021f503da8 ffffffff81076536 ffff88021f503db8
[ 5437.034853] ffff880211a46800 ffff880211be3000 ffff880211a469a8 0000000000000000
[ 5437.034855] Call Trace:
[ 5437.034857] <IRQ> [<ffffffff817b0465>] dump_stack+0x45/0x57
[ 5437.034864] [<ffffffff81076536>] warn_slowpath_common+0x86/0xc0
[ 5437.034866] [<ffffffff810765b6>] warn_slowpath_fmt+0x46/0x50
[ 5437.034882] [<ffffffffc01b7d1c>] intel_check_page_flip+0xfc/0x110 [i915]
[ 5437.034894] [<ffffffffc0182749>] gen8_irq_handler+0x369/0x560 [i915]
[ 5437.034897] [<ffffffff810cca94>] handle_irq_event_percpu+0x74/0x180
[ 5437.034899] [<ffffffff810ccbe9>] handle_irq_event+0x49/0x70
[ 5437.034902] [<ffffffff810cfcd1>] handle_edge_irq+0x81/0x150
[ 5437.034904] [<ffffffff81016105>] handle_irq+0x25/0x40
[ 5437.034906] [<ffffffff817b98ef>] do_IRQ+0x4f/0xe0
[ 5437.034909] [<ffffffff817b786b>] common_interrupt+0x6b/0x6b
[ 5437.034910] <EOI> [<ffffffff810df5f4>] ? enqueue_hrtimer+0x44/0x80
[ 5437.034915] [<ffffffff81655f10>] ? cpuidle_enter_state+0x130/0x270
[ 5437.034917] [<ffffffff81655eeb>] ? cpuidle_enter_state+0x10b/0x270
[ 5437.034919] [<ffffffff81656087>] cpuidle_enter+0x17/0x20
[ 5437.034920] [<ffffffff810b6422>] call_cpuidle+0x32/0x60
[ 5437.034922] [<ffffffff81656063>] ? cpuidle_select+0x13/0x20
[ 5437.034923] [<ffffffff810b66a9>] cpu_startup_entry+0x259/0x320
[ 5437.034927] [<ffffffff81049f04>] start_secondary+0x174/0x1a0
[ 5437.034928] ---[ end trace 3d8fa70b3cf1a1ff ]---
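
To see how often this warning fires and how far behind the flip was each time, the "Kicking stuck page flip" lines can be pulled out of the dmesg output. Below is a minimal Python sketch; the function name `parse_stuck_flips` is made up for illustration, and the pattern is based only on the single message shown in this report.

```python
import re

# Matches lines such as:
# [ 5437.034785] Kicking stuck page flip: queued at 322707, now 322711
STUCK_FLIP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+Kicking stuck page flip: "
    r"queued at (?P<queued>\d+), now (?P<now>\d+)"
)

def parse_stuck_flips(dmesg_text):
    """Return (timestamp, queued, now, delta) for each stuck-flip warning."""
    events = []
    for line in dmesg_text.splitlines():
        m = STUCK_FLIP.search(line)
        if m:
            queued, now = int(m.group("queued")), int(m.group("now"))
            events.append((float(m.group("ts")), queued, now, now - queued))
    return events

sample = "[ 5437.034785] Kicking stuck page flip: queued at 322707, now 322711"
print(parse_stuck_flips(sample))
```

Feeding it the full text of `dmesg` should give one tuple per warning; for the message above the difference between the two counters is 4.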

ProblemType: Bug
DistroRelease: Ubuntu 15.10
Package: linux-image-4.2.0-7-generic 4.2.0-7.7
ProcVersionSignature: Ubuntu 4.2.0-7.7-generic 4.2.0
Uname: Linux 4.2.0-7-generic x86_64
ApportVersion: 2.18.1-0ubuntu1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: loic 1504 F.... pulseaudio
 /dev/snd/controlC1: loic 1504 F.... pulseaudio
CurrentDesktop: Unity
Date: Fri Sep 11 21:10:42 2015
HibernationDevice: RESUME=UUID=8c8fb0c1-2656-491a-9e7c-df539df7f606
InstallationDate: Installed on 2015-08-08 (34 days ago)
InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150808)
MachineType: Dell Inc. XPS 13 9343
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.2.0-7-generic.efi.signed root=/dev/mapper/vgroot-lvroot ro priority=low i915.enable_ips=0 pcie_aspm=force radeon.modeset=0 nouveau.modeset=0 ipv6.disable=1 cgroup_disable=memory nmi_watchdog=0
RelatedPackageVersions:
 linux-restricted-modules-4.2.0-7-generic N/A
 linux-backports-modules-4.2.0-7-generic N/A
 linux-firmware 1.147
SourcePackage: linux
UdevLog: Error: [Errno 2] No such file or directory: '/var/log/udev'
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/14/2015
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A05
dmi.board.name: 0TM99H
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 9
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA05:bd07/14/2015:svnDellInc.:pnXPS139343:pvr:rvnDellInc.:rn0TM99H:rvrA00:cvnDellInc.:ct9:cvr:
dmi.product.name: XPS 13 9343
dmi.sys.vendor: Dell Inc.
---
ApportVersion: 2.18.1-0ubuntu1
Architecture: amd64
CurrentDesktop: Unity
DistroRelease: Ubuntu 15.10
InstallationDate: Installed on 2015-08-08 (40 days ago)
InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150808)
Package: linux (not installed)
Tags: wily
Uname: Linux 4.3.0-040300rc1-generic x86_64
UnreportableReason: The running kernel is not an Ubuntu kernel
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: True

wicklow (lduruel) wrote :
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote : Re: Kernel 4.2 drm/i915 erractic error

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.3 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.
[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.3-rc1-unstable/

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key
tags: added: regression-update
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
penalvch (penalvch)
tags: added: latest-bios-a05
wicklow (lduruel) wrote :

Okay on my side. I am going to test with the latest mainline kernel (4.3-rc1) once it has landed in the repository. Currently, only the kernel headers are there.

wicklow (lduruel) wrote : JournalErrors.txt

apport information

tags: added: apport-collected
description: updated
wicklow (lduruel) wrote : ProcEnviron.txt

apport information

wicklow (lduruel) wrote : Re: Kernel 4.2 drm/i915 erractic error

I tested with the latest upstream kernel (v4.3-rc1-unstable) and the problem does not occur.

In summary:

linux-image-4.1.0-3-generic (4.1.0-3.3) amd64 : Ok
linux-image-4.2.0-7-generic (4.2.0-7.7) amd64 : Not Ok (kernel DRM stack trace mentioned in this bug report)
linux-image-4.2.0-10-generic (4.2.0-10.11) amd64 : Not Ok (kernel DRM stack trace mentioned in this bug report)
linux-image-4.3.0-040300rc1-generic (4.3.0-040300rc1.201509160642) amd64 : Ok (behaves like kernel 4.1, no DRM stack trace)
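
The summary above already brackets where the behaviour changes. As a rough illustration, the following Python sketch lists the adjacent version pairs where the result flips; the `find_transitions` helper is hypothetical, and the data is copied from the four test results listed here.

```python
# (package version, bug_present) in chronological order, from the summary above.
results = [
    ("4.1.0-3.3", False),
    ("4.2.0-7.7", True),
    ("4.2.0-10.11", True),
    ("4.3.0-040300rc1", False),
]

def find_transitions(results):
    """Return (older, newer, kind) for each adjacent pair whose status differs."""
    transitions = []
    for (old_ver, old_bug), (new_ver, new_bug) in zip(results, results[1:]):
        if old_bug != new_bug:
            kind = "regression" if new_bug else "fix"
            transitions.append((old_ver, new_ver, kind))
    return transitions

for older, newer, kind in find_transitions(results):
    print(f"{kind}: between {older} and {newer}")
```

This only narrows the problem to version ranges; it does not identify an individual commit.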

tags: added: kernel-fixed-upstream
Changed in linux (Ubuntu Wily):
status: Incomplete → Confirmed
penalvch (penalvch) wrote :

wicklow, to advise, you do not need to apport-collect further.

That said, the next step is to perform a full reverse commit bisect from kernel 4.2 to 4.3-rc1 in order to identify the last bad commit, followed immediately by the first good one. Once this commit has been identified, it may be reviewed as a candidate for backporting into your release. Could you please do this following https://wiki.ubuntu.com/Kernel/KernelBisection#How_do_I_reverse_bisect_the_upstream_kernel.3F ?

Please note, finding adjacent kernel versions is not fully commit bisecting.
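
For readers unfamiliar with the idea, a reverse bisect is a binary search for the commit where the bug disappears rather than where it appears. The Python sketch below illustrates the search on a simplified, strictly linear history; it is not how git itself walks a real commit graph, and the names `reverse_bisect`, `history`, and the "c5" fix point are all hypothetical.

```python
def reverse_bisect(commits, has_bug):
    """Find the first commit without the bug in a linear history.

    commits: ordered oldest to newest; commits[0] shows the bug,
             commits[-1] does not (assumed, not checked).
    has_bug: callable that tests one commit and returns True if the bug appears.
    Returns (last_buggy_commit, first_fixed_commit).
    """
    lo, hi = 0, len(commits) - 1   # lo: known buggy, hi: known fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if has_bug(commits[mid]):
            lo = mid               # bug still present: the fix is later
        else:
            hi = mid               # bug gone: the fix is here or earlier
    return commits[lo], commits[hi]

# Hypothetical history of 8 commits where the fix landed in "c5".
history = [f"c{i}" for i in range(8)]
last_buggy, first_fixed = reverse_bisect(history, lambda c: int(c[1:]) < 5)
print(last_buggy, first_fixed)
```

In practice each call to `has_bug` means building and booting a test kernel, which is why halving the range at every step matters.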

Thank you for your understanding.

Helpful bug reporting tips:
https://wiki.ubuntu.com/ReportingBugs

Changed in linux (Ubuntu Wily):
status: Confirmed → Incomplete
wicklow (lduruel) wrote :

Christopher, understood. Thanks for the advice.
Reverse commit bisecting? Easy, but it needs some mental contortion.

I started a "Reverse" bisect between v4.2 final and v4.3-rc1. The first test kernel is built up to the following commit:

[dd5cdb48edfd34401799056a9acf61078d773f90] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Result : Kernel stack trace as described in this bug report (Good)

I will follow up on this.

wicklow (lduruel)
summary: - Kernel 4.2 drm/i915 erractic error
+ Kernel 4.2 drm/i915 erratic error
wicklow (lduruel) wrote :

Bisect progress

*************************
Test kernel up to the following commit:

[f377ea88b862bf7151be96d276f4cb740f8e1c41] Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[abebcdfb64f1b39eeeb14282d9cd4aad1ed86f8d] Merge tag 'sound-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Result : Kernel stack trace as described in this bug report (Good)

*************************
Test kernel up to the following commit:

[bef2c7bd578e91c9c10983e0c15c4501127b77ca] Merge tag 'drm/tegra/for-4.3-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
Result : No Stack trace (Bad)

*************************

wicklow (lduruel) wrote :

*************************
Test kernel up to the following commit:

[97d3308ab245c51ae237b3444afa7ae87aa9bcd4] drm/i915: Add HAS_CORE_RING_FREQ macro
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[75289874e4484cd4702b3341b654b45b4a09b9d3] drm/i915: Update add_request() to take a request structure
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[c0165304e10f317672e20f2b40770d74c51e287f] drm/i915: Only enable cursor if it can be enabled.
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[b432e5cfd5e92127ad2dd83bfc3083f1dbce43fb] drm/i915: BDW clock change support
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[34edce2fea6960ce5855d6e09902f82822c374c5] drm/i915: Add cdclk extraction for g33, g965gm and g4x
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[318bd821d65d37fb12c5673607e2b013f7a86a01] drm/i915/skl: Propagate the error if we fail to find a suitable DPLL divider
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[5fcece80ecdac932a0acb71e3a239c39dd4af20f] drm/i915: group all hotplug related fields into a new struct in dev_priv
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[0d2e42970cfa8814ce5f73e329f61c94b7ec2dab] drm/i915: reduce indent in i9xx_hpd_irq_handler
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[b1b38278e12b04cf9a227f6af2c24651cf6e8a85] drm/i915: add a context parameter to {en, dis}able zero address mapping
Result : No Stack trace (Bad)

++++++++++++++

b1b38278e12b04cf9a227f6af2c24651cf6e8a85 is the first bad commit
commit b1b38278e12b04cf9a227f6af2c24651cf6e8a85
Author: David Weinehall <email address hidden>
Date: Wed May 20 17:00:13 2015 +0300

    drm/i915: add a context parameter to {en, dis}able zero address mapping

    Export a new context parameter that can be set/queried through the
    context_{get,set}param ioctls. This parameter is passed as a context
    flag and decides whether or not a GPU address mapping is allowed to
    be made at address zero. The default is to allow such mappings.

    Signed-off-by: David Weinehall <email address hidden>
    Acked-by: "Zou, Nanhai" <email address hidden>
    Signed-off-by: Daniel Vetter <email address hidden>

:040000 040000 248175326f478b7bb1ade676c27cd28f2721e5b6 6abdf6db9ac859a7742ba3d6f9c93eb5f928cf76 M drivers
:040000 040000 3789caada5be99563007fae6c089a4be77c5cabf 795fafadea3f1ff99b0e4358b3168b30949406ac M include

Bisect completed

wicklow (lduruel) wrote :

Christopher,

I have completed the reverse commit bisect. It pointed me to the following commit: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b1b38278e12b04cf9a227f6af2c24651cf6e8a85

Please find attached the "bisect log" file and the "bisect visualize" file.

wicklow (lduruel) wrote :
penalvch (penalvch)
tags: added: reverse-bisect-done
Changed in linux (Ubuntu Wily):
status: Incomplete → Triaged