Kernel 4.2 drm/i915 erratic error

Bug #1494903 reported by wicklow on 2015-09-11
This bug affects 3 people
Affects: linux (Ubuntu) | Importance: High | Assigned to: Unassigned
Affects: Wily | Importance: High | Assigned to: Unassigned

Bug Description

I am encountering the following erratic error with the Wily 4.2 kernel.
With kernel 4.1, I did not encounter such issues with drm/i915.

[ 5437.034758] ------------[ cut here ]------------
[ 5437.034784] WARNING: CPU: 2 PID: 0 at /build/linux-4dBub_/linux-4.2.0/drivers/gpu/drm/i915/intel_display.c:11098 intel_check_page_flip+0xfc/0x110 [i915]()
[ 5437.034785] Kicking stuck page flip: queued at 322707, now 322711
[ 5437.034786] Modules linked in: hid_generic snd_usb_audio usbhid snd_usbmidi_lib drbg ansi_cprng ctr ccm rfcomm binfmt_misc bnep nls_iso8859_1 joydev dcdbas dell_wmi arc4 hid_multitouch sparse_keymap intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_soc_rt286 snd_soc_rl6347a iwlmvm snd_soc_core crct10dif_pclmul mac80211 snd_compress ac97_bus crc32_pclmul snd_pcm_dmaengine snd_seq_midi ghash_clmulni_intel uvcvideo snd_seq_midi_event aesni_intel snd_rawmidi videobuf2_vmalloc aes_x86_64 videobuf2_memops lrw gf128mul glue_helper btusb iwlwifi ablk_helper cryptd input_leds videobuf2_core serio_raw v4l2_common dell_led btrtl videodev snd_seq snd_hda_codec_hdmi btbcm btintel snd_hda_codec_realtek media bluetooth snd_hda_codec_generic cfg80211 rtsx_pci_ms snd_hda_intel memstick snd_hda_codec
[ 5437.034821] snd_hda_core snd_hwdep snd_pcm lpc_ich snd_seq_device shpchp snd_timer snd soundcore soc_button_array int3403_thermal acpi_als kfifo_buf industrialio dw_dmac int3400_thermal spi_pxa2xx_platform processor_thermal_device dw_dmac_core acpi_thermal_rel snd_soc_sst_acpi intel_soc_dts_iosf 8250_dw i2c_designware_platform int3402_thermal tpm_crb iosf_mbi i2c_designware_core int340x_thermal_zone mac_hid acpi_pad parport_pc ppdev lp parport autofs4 rtsx_pci_sdmmc i915 i2c_algo_bit ahci drm_kms_helper libahci rtsx_pci drm wmi sdhci_acpi video sdhci i2c_hid hid
[ 5437.034846] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.2.0-7-generic #7-Ubuntu
[ 5437.034847] Hardware name: Dell Inc. XPS 13 9343/0TM99H, BIOS A05 07/14/2015
[ 5437.034849] ffffffffc02101e8 ffff88021f503d68 ffffffff817b0465 0000000000000000
[ 5437.034851] ffff88021f503db8 ffff88021f503da8 ffffffff81076536 ffff88021f503db8
[ 5437.034853] ffff880211a46800 ffff880211be3000 ffff880211a469a8 0000000000000000
[ 5437.034855] Call Trace:
[ 5437.034857] <IRQ> [<ffffffff817b0465>] dump_stack+0x45/0x57
[ 5437.034864] [<ffffffff81076536>] warn_slowpath_common+0x86/0xc0
[ 5437.034866] [<ffffffff810765b6>] warn_slowpath_fmt+0x46/0x50
[ 5437.034882] [<ffffffffc01b7d1c>] intel_check_page_flip+0xfc/0x110 [i915]
[ 5437.034894] [<ffffffffc0182749>] gen8_irq_handler+0x369/0x560 [i915]
[ 5437.034897] [<ffffffff810cca94>] handle_irq_event_percpu+0x74/0x180
[ 5437.034899] [<ffffffff810ccbe9>] handle_irq_event+0x49/0x70
[ 5437.034902] [<ffffffff810cfcd1>] handle_edge_irq+0x81/0x150
[ 5437.034904] [<ffffffff81016105>] handle_irq+0x25/0x40
[ 5437.034906] [<ffffffff817b98ef>] do_IRQ+0x4f/0xe0
[ 5437.034909] [<ffffffff817b786b>] common_interrupt+0x6b/0x6b
[ 5437.034910] <EOI> [<ffffffff810df5f4>] ? enqueue_hrtimer+0x44/0x80
[ 5437.034915] [<ffffffff81655f10>] ? cpuidle_enter_state+0x130/0x270
[ 5437.034917] [<ffffffff81655eeb>] ? cpuidle_enter_state+0x10b/0x270
[ 5437.034919] [<ffffffff81656087>] cpuidle_enter+0x17/0x20
[ 5437.034920] [<ffffffff810b6422>] call_cpuidle+0x32/0x60
[ 5437.034922] [<ffffffff81656063>] ? cpuidle_select+0x13/0x20
[ 5437.034923] [<ffffffff810b66a9>] cpu_startup_entry+0x259/0x320
[ 5437.034927] [<ffffffff81049f04>] start_secondary+0x174/0x1a0
[ 5437.034928] ---[ end trace 3d8fa70b3cf1a1ff ]---
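
The "Kicking stuck page flip" line in the trace above carries two counters. A minimal sketch for pulling them out of a dmesg line and computing their difference is shown below. It assumes the two numbers are vblank counts for the affected pipe (the count when the flip was queued and the current count), which is a reading of the i915 source rather than something stated in this report; the function name is hypothetical.

```python
import re

# Message text copied from the warning in this report.
LINE = "[ 5437.034785] Kicking stuck page flip: queued at 322707, now 322711"

# Matches the two counters printed by the warning.
PATTERN = re.compile(r"Kicking stuck page flip: queued at (\d+), now (\d+)")


def stuck_flip_delta(line):
    """Return (queued, now, now - queued), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    queued, now = int(match.group(1)), int(match.group(2))
    return queued, now, now - queued


print(stuck_flip_delta(LINE))  # (322707, 322711, 4)
```

Under that assumption, the flip had been pending for 4 vblank intervals when the driver gave up waiting and completed it forcibly.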

ProblemType: Bug
DistroRelease: Ubuntu 15.10
Package: linux-image-4.2.0-7-generic 4.2.0-7.7
ProcVersionSignature: Ubuntu 4.2.0-7.7-generic 4.2.0
Uname: Linux 4.2.0-7-generic x86_64
ApportVersion: 2.18.1-0ubuntu1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: loic 1504 F.... pulseaudio
 /dev/snd/controlC1: loic 1504 F.... pulseaudio
CurrentDesktop: Unity
Date: Fri Sep 11 21:10:42 2015
HibernationDevice: RESUME=UUID=8c8fb0c1-2656-491a-9e7c-df539df7f606
InstallationDate: Installed on 2015-08-08 (34 days ago)
InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150808)
MachineType: Dell Inc. XPS 13 9343
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.2.0-7-generic.efi.signed root=/dev/mapper/vgroot-lvroot ro priority=low i915.enable_ips=0 pcie_aspm=force radeon.modeset=0 nouveau.modeset=0 ipv6.disable=1 cgroup_disable=memory nmi_watchdog=0
RelatedPackageVersions:
 linux-restricted-modules-4.2.0-7-generic N/A
 linux-backports-modules-4.2.0-7-generic N/A
 linux-firmware 1.147
SourcePackage: linux
UdevLog: Error: [Errno 2] No such file or directory: '/var/log/udev'
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/14/2015
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A05
dmi.board.name: 0TM99H
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 9
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA05:bd07/14/2015:svnDellInc.:pnXPS139343:pvr:rvnDellInc.:rn0TM99H:rvrA00:cvnDellInc.:ct9:cvr:
dmi.product.name: XPS 13 9343
dmi.sys.vendor: Dell Inc.
---
ApportVersion: 2.18.1-0ubuntu1
Architecture: amd64
CurrentDesktop: Unity
DistroRelease: Ubuntu 15.10
InstallationDate: Installed on 2015-08-08 (40 days ago)
InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150808)
Package: linux (not installed)
Tags: wily
Uname: Linux 4.3.0-040300rc1-generic x86_64
UnreportableReason: The running kernel is not an Ubuntu kernel
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: True

wicklow (lduruel) wrote :

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.3 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.
[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.3-rc1-unstable/

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key
tags: added: regression-update
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: latest-bios-a05
wicklow (lduruel) wrote :

Okay on my side. I am going to test with the latest mainline kernel (4.3-rc1) once it has landed in the repository. Currently, only the kernel headers are there.

apport information

tags: added: apport-collected
description: updated

apport information

I tested with the latest upstream kernel (v4.3-rc1-unstable) and the problem does not occur.

In summary:

linux-image-4.1.0-3-generic (4.1.0-3.3) amd64: OK
linux-image-4.2.0-7-generic (4.2.0-7.7) amd64: Not OK (kernel DRM stack trace mentioned in this bug report)
linux-image-4.2.0-10-generic (4.2.0-10.11) amd64: Not OK (kernel DRM stack trace mentioned in this bug report)
linux-image-4.3.0-040300rc1-generic (4.3.0-040300rc1.201509160642) amd64: OK (behaves like kernel 4.1, no DRM stack trace)

tags: added: kernel-fixed-upstream
Changed in linux (Ubuntu Wily):
status: Incomplete → Confirmed

wicklow, for your information, you do not need to run apport-collect again.

The next step is to perform a full reverse commit bisect between kernel 4.2 and 4.3-rc1 in order to identify the last bad commit, followed immediately by the first good one. Once this commit has been identified, it may be reviewed as a candidate for backporting into your release. Could you please do this following https://wiki.ubuntu.com/Kernel/KernelBisection#How_do_I_reverse_bisect_the_upstream_kernel.3F ?

Please note, finding adjacent kernel versions is not fully commit bisecting.
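
To illustrate what a reverse bisect does, here is a minimal sketch of the search on a simplified linear history. It is not how git itself is implemented (git bisects over a commit graph, not a list), and `reverse_bisect` and `shows_bug` are hypothetical names. The point is the inverted labels: a commit that still shows the bug is marked "good", and a commit where the bug is gone is marked "bad", so the search ends on the first commit that contains the fix.

```python
def reverse_bisect(commits, shows_bug):
    """Find the first commit where the bug is gone.

    commits   -- list ordered oldest to newest; the first entry is known to
                 show the bug and the last entry is known to be fixed.
    shows_bug -- callable(commit) -> bool, standing in for "build the kernel,
                 boot it, and check dmesg for the stack trace".
    Returns (first_fixed_commit, number_of_kernels_tested).
    """
    lo, hi = 0, len(commits) - 1  # lo: still buggy ("good"), hi: fixed ("bad")
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1
        if shows_bug(commits[mid]):
            lo = mid  # reversed terms: bug still present, mark "good"
        else:
            hi = mid  # reversed terms: bug gone, mark "bad"
    return commits[hi], steps


# Toy history of 16 commits where the fix landed in commit c11.
history = [f"c{i}" for i in range(16)]
result = reverse_bisect(history, lambda c: int(c[1:]) < 11)
print(result)  # ('c11', 4)
```

Each tested kernel halves the remaining range, which is why a full bisect between two releases needs only a dozen or so builds.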

Thank you for your understanding.

Helpful bug reporting tips:
https://wiki.ubuntu.com/ReportingBugs

Changed in linux (Ubuntu Wily):
status: Confirmed → Incomplete
wicklow (lduruel) wrote :

Christopher, understood. Thanks for the advice.
Reverse commit bisecting? Easy, but it needs some mental contortion.

I started a "Reverse" bisect between v4.2 final and v4.3-rc1. The first test kernel is built up to the following commit:

[dd5cdb48edfd34401799056a9acf61078d773f90] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Result : Kernel stack trace as described in this bug report (Good)

I will follow up on this.

wicklow (lduruel) on 2015-09-19
summary: - Kernel 4.2 drm/i915 erractic error
+ Kernel 4.2 drm/i915 erratic error
wicklow (lduruel) wrote :

Bisect progress

*************************
Test kernel up to the following commit:

[f377ea88b862bf7151be96d276f4cb740f8e1c41] Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[abebcdfb64f1b39eeeb14282d9cd4aad1ed86f8d] Merge tag 'sound-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Result : Kernel stack trace as described in this bug report (Good)

*************************
Test kernel up to the following commit:

[bef2c7bd578e91c9c10983e0c15c4501127b77ca] Merge tag 'drm/tegra/for-4.3-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
Result : No Stack trace (Bad)

*************************

wicklow (lduruel) wrote :

*************************
Test kernel up to the following commit:

[97d3308ab245c51ae237b3444afa7ae87aa9bcd4] drm/i915: Add HAS_CORE_RING_FREQ macro
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[75289874e4484cd4702b3341b654b45b4a09b9d3] drm/i915: Update add_request() to take a request structure
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[c0165304e10f317672e20f2b40770d74c51e287f] drm/i915: Only enable cursor if it can be enabled.
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[b432e5cfd5e92127ad2dd83bfc3083f1dbce43fb] drm/i915: BDW clock change support
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[34edce2fea6960ce5855d6e09902f82822c374c5] drm/i915: Add cdclk extraction for g33, g965gm and g4x
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[318bd821d65d37fb12c5673607e2b013f7a86a01] drm/i915/skl: Propagate the error if we fail to find a suitable DPLL divider
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[5fcece80ecdac932a0acb71e3a239c39dd4af20f] drm/i915: group all hotplug related fields into a new struct in dev_priv
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[0d2e42970cfa8814ce5f73e329f61c94b7ec2dab] drm/i915: reduce indent in i9xx_hpd_irq_handler
Result : No Stack trace (Bad)

*************************
Test kernel up to the following commit:

[b1b38278e12b04cf9a227f6af2c24651cf6e8a85] drm/i915: add a context parameter to {en, dis}able zero address mapping
Result : No Stack trace (Bad)

++++++++++++++

b1b38278e12b04cf9a227f6af2c24651cf6e8a85 is the first bad commit
commit b1b38278e12b04cf9a227f6af2c24651cf6e8a85
Author: David Weinehall <email address hidden>
Date: Wed May 20 17:00:13 2015 +0300

    drm/i915: add a context parameter to {en, dis}able zero address mapping

    Export a new context parameter that can be set/queried through the
    context_{get,set}param ioctls. This parameter is passed as a context
    flag and decides whether or not a GPU address mapping is allowed to
    be made at address zero. The default is to allow such mappings.

    Signed-off-by: David Weinehall <email address hidden>
    Acked-by: "Zou, Nanhai" <email address hidden>
    Signed-off-by: Daniel Vetter <email address hidden>

:040000 040000 248175326f478b7bb1ade676c27cd28f2721e5b6 6abdf6db9ac859a7742ba3d6f9c93eb5f928cf76 M drivers
:040000 040000 3789caada5be99563007fae6c089a4be77c5cabf 795fafadea3f1ff99b0e4358b3168b30949406ac M include

Bisect completed
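
The two lines starting with `:040000` in the output above are in git's raw diff format: old mode, new mode, old object ID, new object ID, status letter, and path. A small sketch for reading them follows; the function name and the returned dictionary keys are hypothetical. Mode 040000 denotes a tree (directory) and status M means modified, so the output says the commit changed something under `drivers` and under `include`.

```python
# Raw diff lines copied from the bisect output in this report.
RAW_LINES = [
    ":040000 040000 248175326f478b7bb1ade676c27cd28f2721e5b6 "
    "6abdf6db9ac859a7742ba3d6f9c93eb5f928cf76 M drivers",
    ":040000 040000 3789caada5be99563007fae6c089a4be77c5cabf "
    "795fafadea3f1ff99b0e4358b3168b30949406ac M include",
]


def parse_raw_entry(line):
    """Split one raw diff line into its six fields."""
    # maxsplit=5 keeps a path containing spaces in one piece.
    old_mode, new_mode, old_id, new_id, status, path = (
        line.lstrip(":").split(None, 5)
    )
    return {
        "old_mode": old_mode,
        "new_mode": new_mode,
        "old_id": old_id,
        "new_id": new_id,
        "status": status,
        "path": path,
        "is_tree": new_mode == "040000",  # 040000 is the mode of a directory
    }


for raw in RAW_LINES:
    entry = parse_raw_entry(raw)
    print(entry["status"], entry["path"], entry["is_tree"])
```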

wicklow (lduruel) wrote :

Christopher,

I have completed the reverse commit bisect. It pointed me to the following commit: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b1b38278e12b04cf9a227f6af2c24651cf6e8a85

Please find attached the "bisect log" file and the "bisect visualize" file.

wicklow (lduruel) wrote :
tags: added: reverse-bisect-done
Changed in linux (Ubuntu Wily):
status: Incomplete → Triaged