[Lenovo ThinkPad T450s] Issues after docking and locking: [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

Bug #1727662 reported by Alexander Kops on 2017-10-26
This bug affects 14 people
Affects Status Importance Assigned to Milestone
Linux
Confirmed
Medium
linux (Ubuntu)
Medium
Unassigned

Bug Description

After I dock my laptop into a Lenovo ThinkPad Ultra Dock Type 40A2 20V, lock it by clicking the lock icon, and wait ~30 minutes, one of the following three things happens ~50% of the time when I come back to unlock it:
* Most of the time the computer is found shut down.
* Sometimes the notebook screen flickers weirdly and one of the two external monitors shows pixelation.
* The notebook screen shows the lock screen but is completely frozen. Occasionally unlocking works.

I have two external monitors attached to the docking station.

When I dock my laptop, the battery indicator does recognize that it's charging.

I've not seen this issue when:
1) With the laptop in the dock, while actively using the laptop.
2) With the laptop out of the dock and no external monitors present.

This bug started after upgrading from Ubuntu 17.04 to Ubuntu 17.10.

Using or not using the following kernel parameter didn't change anything:
i915.enable_rc6=0

Updated docking station firmware to latest 2.33.00.

PC temperatures are normal as per lm-sensors.

20171109 - Not reproducible testing drm-tip for 1.5 days.

ProblemType: Bug
DistroRelease: Ubuntu 17.10
Package: linux-image-4.13.0-16-lowlatency 4.13.0-16.19
ProcVersionSignature: Ubuntu 4.13.0-16.19-lowlatency 4.13.4
Uname: Linux 4.13.0-16-lowlatency x86_64
ApportVersion: 2.20.7-0ubuntu3
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: akops 3337 F.... pulseaudio
 /dev/snd/controlC0: akops 3337 F.... pulseaudio
CurrentDesktop: ubuntu:GNOME
Date: Thu Oct 26 10:44:11 2017
HibernationDevice: RESUME=/dev/mapper/system-swap_1
MachineType: LENOVO 20BWS2YL00
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.13.0-16-lowlatency root=/dev/mapper/system-root ro quiet splash i915.enable_rc6=0 crashkernel=384M-:128M vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-4.13.0-16-lowlatency N/A
 linux-backports-modules-4.13.0-16-lowlatency N/A
 linux-firmware 1.169
SourcePackage: linux
UpgradeStatus: Upgraded to artful on 2017-10-20 (5 days ago)
dmi.bios.date: 07/08/2015
dmi.bios.vendor: LENOVO
dmi.bios.version: JBET51WW (1.16 )
dmi.board.asset.tag: Not Available
dmi.board.name: 20BWS2YL00
dmi.board.vendor: LENOVO
dmi.board.version: SDK0E50510 WIN
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: None
dmi.modalias: dmi:bvnLENOVO:bvrJBET51WW(1.16):bd07/08/2015:svnLENOVO:pn20BWS2YL00:pvrThinkPadT450s:rvnLENOVO:rn20BWS2YL00:rvrSDK0E50510WIN:cvnLENOVO:ct10:cvrNone:
dmi.product.family: ThinkPad T450s
dmi.product.name: 20BWS2YL00
dmi.product.version: ThinkPad T450s
dmi.sys.vendor: LENOVO

Alexander Kops (alexkops) wrote :

One thing that I tried was disabling RC6 in grub, but that didn't seem to help:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash i915.enable_rc6=0"
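
To check that a parameter set this way actually reached the running kernel, one can look at the kernel command line after regenerating the grub config and rebooting. Below is a minimal Python sketch; the helper name parse_cmdline is made up for illustration, and the parsing is simplified (quoted values containing spaces are not handled).

```python
from typing import Dict, Optional


def parse_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {name: value}; bare flags map to None.

    Hypothetical helper for illustration only. Simplification: values that
    contain spaces inside quotes are not handled.
    """
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        # Split on the first "=" only, so values like "384M-:128M" stay intact.
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# Command line taken from the ProcKernelCmdLine field of this report.
sample = (
    "BOOT_IMAGE=/vmlinuz-4.13.0-16-lowlatency root=/dev/mapper/system-root "
    "ro quiet splash i915.enable_rc6=0 crashkernel=384M-:128M vt.handoff=7"
)
print(parse_cmdline(sample).get("i915.enable_rc6", "<not set>"))
```

On a live system the input would be the contents of /proc/cmdline instead of the sample string.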


Changed in linux (Ubuntu):
status: New → Confirmed

Another thing I found in /var/log/kern.log in between the FIFO underrun messages is this:

Oct 26 09:32:02 ako-notebook kernel: [58379.446230] ------------[ cut here ]------------
Oct 26 09:32:02 ako-notebook kernel: [58379.446267] WARNING: CPU: 2 PID: 2653 at /build/linux-XO_uEE/linux-4.13.0/drivers/gpu/drm/i915/intel_display.c:12273 intel_atomic_commit_tail+0xda3/0xf80 [i915]
Oct 26 09:32:02 ako-notebook kernel: [58379.446267] Modules linked in: ccm ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 br_netfilter bridge stp llc aufs bnep binfmt_misc cmdlinepart intel_spi_platform intel_spi spi_nor mtd btusb btrtl btbcm btintel bluetooth arc4 ecdh_generic uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media intel_rapl x86_pkg_temp_thermal intel_powerclamp iwlmvm coretemp kvm_intel mac80211 kvm irqbypass intel_cstate intel_rapl_perf iwlwifi snd_seq_midi joydev snd_seq_midi_event input_leds snd_rawmidi serio_raw wmi_bmof snd_hda_codec_hdmi thinkpad_acpi snd_hda_codec_realtek cfg80211 snd_seq intel_pch_thermal snd_hda_codec_generic nvram rtsx_pci_ms snd_hda_intel memstick snd_hda_codec snd_seq_device snd_hda_core snd_hwdep snd_pcm mei_me snd_timer
Oct 26 09:32:02 ako-notebook kernel: [58379.446292] mei lpc_ich shpchp snd soundcore mac_hid xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack libcrc32c iptable_filter parport_pc ppdev lp parport ip_tables x_tables autofs4 algif_skcipher af_alg dm_crypt hid_generic usbhid hid rtsx_pci_sdmmc i915 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd i2c_algo_bit drm_kms_helper psmouse syscopyarea sysfillrect sysimgblt fb_sys_fops ahci e1000e libahci drm rtsx_pci ptp pps_core wmi video
Oct 26 09:32:02 ako-notebook kernel: [58379.446324] CPU: 2 PID: 2653 Comm: gnome-shell Tainted: G U W 4.13.0-16-lowlatency #19-Ubuntu
Oct 26 09:32:02 ako-notebook kernel: [58379.446325] Hardware name: LENOVO 20BWS2YL00/20BWS2YL00, BIOS JBET51WW (1.16 ) 07/08/2015
Oct 26 09:32:02 ako-notebook kernel: [58379.446325] task: ffffa06e241d8000 task.stack: ffffae6b88e08000
Oct 26 09:32:02 ako-notebook kernel: [58379.446345] RIP: 0010:intel_atomic_commit_tail+0xda3/0xf80 [i915]
Oct 26 09:32:02 ako-notebook kernel: [58379.446346] RSP: 0018:ffffae6b88e0bb08 EFLAGS: 00010286
Oct 26 09:32:02 ako-notebook kernel: [58379.446347] RAX: 0000000000000019 RBX: ffffa06e1b1d8310 RCX: 0000000000000006
Oct 26 09:32:02 ako-notebook kernel: [58379.446347] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffffa06e3dc8dc70
Oct 26 09:32:02 ako-notebook kernel: [58379.446348] RBP: ffffae6b88e0bbc0 R08: 00000000000004df R09: 0000000000000004
Oct 26 09:32:02 ako-notebook kernel: [58379.446348] R10: 0000000000000001 R11: 0000000000000001 R12: ffffa06e22a1d000
Oct 26 09:32:02 ako-notebook kernel: [58379.446349] R13: ffffa06e251bd000 R14: ffffa06d809ee000 R15: ffffa06e1b1d8308
...


Alexander Kops, thank you for reporting this and helping make Ubuntu better.

In order to allow additional upstream developers to examine the issue, at your earliest convenience, could you please test the latest upstream kernel available from http://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D ? Please keep in mind the following:
1) The one to test is at the very top line at the top of the page (not the daily folder).
2) The release names are irrelevant.
3) The folder time stamps aren't indicative of when the kernel actually was released upstream.
4) Install instructions are available at https://wiki.ubuntu.com/Kernel/MainlineBuilds .

If testing on your main install would be inconvenient, one may:
1) Install Ubuntu to a different partition and then test this there.
2) Backup, or clone the primary install.

If the latest kernel did not allow you to test to the issue (ex. you couldn't boot into the OS) please make a comment in your report about this, and continue to test the next most recent kernel version until you can test to the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this issue is fixed in the mainline kernel, please add the following tags by clicking on the yellow circle with a black pencil icon, next to the word Tags, located at the bottom of the report description:
kernel-fixed-upstream
kernel-fixed-upstream-X.Y-rcZ

Where X, and Y are the first two numbers of the kernel version, and Z is the release candidate number if it exists.

If the mainline kernel does not fix the issue, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-X.Y-rcZ

Please note, an error to install the kernel does not fit the criteria of kernel-bug-exists-upstream.
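
The tag naming scheme above can be sketched in a few lines of Python. This is only an illustration: the function name upstream_tags is hypothetical, and it assumes the version is given in the plain "X.Y" or "X.Y-rcZ" form (optionally with a leading "v"), not as a full Ubuntu package version string.

```python
import re
from typing import List


def upstream_tags(kernel_version: str, fixed: bool) -> List[str]:
    """Return the two tags for a tested mainline kernel such as '4.14-rc8'.

    Hypothetical helper. Only 'X.Y', 'X.Y.Z' and 'X.Y-rcZ' style inputs
    (with an optional leading 'v') are recognised.
    """
    match = re.match(r"^v?(\d+)\.(\d+)(?:\.\d+)*(?:-rc(\d+))?$", kernel_version)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {kernel_version!r}")
    major, minor, rc = match.groups()
    base = "kernel-fixed-upstream" if fixed else "kernel-bug-exists-upstream"
    # X and Y are the first two version numbers; Z is added only for an rc.
    suffix = f"{major}.{minor}" + (f"-rc{rc}" if rc else "")
    return [base, f"{base}-{suffix}"]


print(upstream_tags("4.14-rc8", fixed=False))
```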

Also, you don't need to apport-collect further unless specifically requested to do so.

It is most helpful that after testing of the latest upstream kernel is complete, you mark this report Status Confirmed.

Lastly, to keep this issue relevant to upstream, please continue to test the latest mainline kernel as it becomes available.

Thank you for your help.

tags: added: bios-outdated-1.30 regression-release
description: updated
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Alexander Kops (alexkops) wrote :

So I tried to reproduce it with the kernel v4.14-rc6 (bb176f67090ca54869fc1262c913aa69d2ede070)
I tried to get that error for an hour with some lock/unlock, suspend and hibernate cycles but it didn't appear.
Unfortunately I can't test it for a whole working day, since the internet is not working with that kernel (network manager shows everything is connected, but typing "route" on the command line yields an empty result).

Alexander Kops, to clarify:
1) When using the Ubuntu kernel, is the issue consistently reproducible?

2) You mention in your Bug Description:
>"...one of the three things happen when I come back after some time and try to unlock (so it went to power save in-between)"
Are you stating that only the screen turned off but the computer is still on, or the computer is set to either suspend or hibernate after a certain amount of time?

3) Kernel 4.14-rc7 just came out, which might let you test with internet working.

Alexander Kops (alexkops) wrote :

Hi
1) When using the current kernels 4.13.0-16-generic or 4.13.0-16-lowlatency it is reproducible insofar as it happens most of the time (I'd say more than 50%).
But not every time, that's why I'm not confident to say the bug is fixed after one or two unlocks.

2) Yes, I think it only happens after suspend though. When I lock the screen and the computer goes into power save mode, then this would happen. If I only lock it and immediately unlock, this is not a problem.

3) I installed the rc7 and indeed the internet works with it. I'll be able to use it on a full work day on Wednesday. I'll report back at the end of the day Wednesday.

Alexander Kops (alexkops) wrote :

Tried it now with kernel 4.14.0-041400rc7-generic and it still appears. (Notebook sits in dock, locked the screen, let it sit for 45 minutes, came back to computer being shut down).

In /var/log/syslog I find

Nov 1 10:01:05 xxx kernel: [ 404.329288] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
Nov 1 10:02:30 xxx kernel: [ 489.669089] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
Nov 1 10:04:38 xxx kernel: [ 617.781995] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
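
For anyone who wants to find every such message in a full log rather than picking them out by eye, here is a rough Python sketch. The regular expression is written against the exact message format shown above and is an assumption for other kernel versions; find_underruns is a made-up helper name.

```python
import re
from typing import Iterable, List, Tuple

# Pattern matched against the message text exactly as it appears in this report.
UNDERRUN = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+"
    r"\[drm:intel_cpu_fifo_underrun_irq_handler \[i915\]\] "
    r"\*ERROR\* CPU pipe (?P<pipe>[A-Z]) FIFO underrun"
)


def find_underruns(lines: Iterable[str]) -> List[Tuple[float, str]]:
    """Return (kernel uptime in seconds, pipe letter) for each underrun line."""
    hits = []
    for line in lines:
        match = UNDERRUN.search(line)
        if match:
            hits.append((float(match.group("uptime")), match.group("pipe")))
    return hits


sample = [
    "Nov 1 10:01:05 xxx kernel: [ 404.329288] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun",
    "Nov 1 10:02:30 xxx kernel: [ 489.669089] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun",
    "Nov 1 10:04:38 xxx kernel: [ 617.781995] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun",
]
for uptime, pipe in find_underruns(sample):
    print(f"pipe {pipe} underrun at {uptime:.3f}s after boot")
```

To scan a real log, pass an open file object (for example /var/log/syslog) instead of the sample list.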

Alexander Kops, to advise, posting small snips of syslog, or any log for that matter, isn't helpful, as what you posted could be a result of some other issue earlier in the log. Instead, you would want to post the log in its entirety, and advise when in the log you performed an action correlated to the problem. In this case, the event most correlated to your computer improperly shutting down is the act of locking it.

With this in mind, I'd like to follow up on your comment:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1727662/comments/8
>"Yes, I think it only happens after suspend though."

Please confirm, by checking the GUI settings, exactly what your computer is set to do after you lock the screen: only turn off the monitor, only suspend, or turn off the monitor first and then suspend some time later, etc. Could you please advise which of these is precisely the case?

Also, with issues like these, folks typically update their buggy, outdated, and insecure BIOS prior to reporting, as BIOS bugs can manifest issues like this. As per your dmesg:
https://launchpadlibrarian.net/343008378/CurrentDmesg.txt
[ 0.000000] [Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0x25 (or later)
...
[ 0.120112] pnp 00:01: [Firmware Bug]: PNP resource [mem 0xfed10000-0xfed13fff] covers only part of 0000:00:00.0 Intel MCH; extending to [mem 0xfed10000-0xfed17fff]

In addition, there are numerous BIOS bugs fixed as identified by Lenovo:
https://pcsupport.lenovo.com/us/en/products/LAPTOPS-AND-NETBOOKS/THINKPAD-T-SERIES-LAPTOPS/THINKPAD-T450S/downloads/DS102110

Once updated, could you please post the result of the following terminal command:
sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date
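
As a side note, on most Linux systems the same two values can also be read without root from sysfs. The Python sketch below assumes the usual /sys/class/dmi/id layout; read_bios_info is a hypothetical helper, and the directory is a parameter so it can be pointed elsewhere.

```python
import os
from typing import Dict


def read_bios_info(dmi_dir: str = "/sys/class/dmi/id") -> Dict[str, str]:
    """Read BIOS version and release date from the DMI sysfs directory.

    Hypothetical helper; assumes the files bios_version and bios_date
    exist under dmi_dir, as they normally do on DMI-capable machines.
    """
    info = {}
    for key in ("bios_version", "bios_date"):
        with open(os.path.join(dmi_dir, key)) as fh:
            info[key] = fh.read().strip()
    return info
```

Calling read_bios_info() with no arguments on the affected machine should give the same strings that dmidecode prints.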

Alexander Kops (alexkops) wrote :

Oh, right now I'm running on the BIOS version
JBET51WW (1.16 )
07/08/2015
That's indeed a lot behind the current one. I'll try to update and report back.

Alexander Kops (alexkops) wrote :

I updated my BIOS to the current version
~ sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date
JBET66WW (1.30 )
09/13/2017

It didn't fix the [Firmware Bug] lines in the syslog though
Also the actual bug still appears. There seem to be some lines in the log that I haven't seen before though

Nov 3 12:37:05 xxx kernel: [ 8137.161988] [drm:pipe_config_err [i915]] *ERROR* mismatch in pixel_rate (expected 148500, found 296999)
Nov 3 12:37:05 xxx kernel: [ 8137.162024] [drm:pipe_config_err [i915]] *ERROR* mismatch in shared_dpll (expected ffff9818ca8b46e0, found ffff9818ca8b4778)
Nov 3 12:37:05 xxx kernel: [ 8137.162051] [drm:pipe_config_err [i915]] *ERROR* mismatch in base.adjusted_mode.crtc_clock (expected 148500, found 296999)
Nov 3 12:37:05 xxx kernel: [ 8137.162076] [drm:pipe_config_err [i915]] *ERROR* mismatch in port_clock (expected 270000, found 540000)
Nov 3 12:37:05 xxx kernel: [ 8137.162078] pipe state doesn't match!
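
To make the expected/found pairs easier to compare, the mismatch lines can be pulled apart with a short Python sketch like the one below. The pattern is based only on the message format shown here, parse_mismatches is a made-up name, and values that are not plain decimal numbers (such as the shared_dpll pointers) are simply skipped.

```python
import re
from typing import Dict, Iterable, Tuple

MISMATCH = re.compile(
    r"mismatch in (?P<field>\S+) \(expected (?P<expected>\w+), found (?P<found>\w+)\)"
)


def parse_mismatches(lines: Iterable[str]) -> Dict[str, Tuple[int, int]]:
    """Map field name -> (expected, found) for decimal-valued mismatches."""
    result = {}
    for line in lines:
        match = MISMATCH.search(line)
        if not match:
            continue
        try:
            pair = (int(match.group("expected")), int(match.group("found")))
        except ValueError:
            # Pointer-like hex values (e.g. shared_dpll) are not comparable.
            continue
        result[match.group("field")] = pair
    return result


sample = [
    "[drm:pipe_config_err [i915]] *ERROR* mismatch in pixel_rate (expected 148500, found 296999)",
    "[drm:pipe_config_err [i915]] *ERROR* mismatch in shared_dpll (expected ffff9818ca8b46e0, found ffff9818ca8b4778)",
    "[drm:pipe_config_err [i915]] *ERROR* mismatch in port_clock (expected 270000, found 540000)",
]
for field, (expected, found) in parse_mismatches(sample).items():
    print(f"{field}: found/expected = {found / expected:.2f}")
```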

Can you advise on how to narrow down the error to the actual state where it happens?
Can it be achieved through changing settings in /etc/systemd/logind.conf?
Because currently the only way to reproduce it is locking the screen while the computer is sitting in the docking station and come back after a longer period of time (e.g. 30 minutes)

Alexander Kops (alexkops) wrote :

Somebody else seems to have a similar problem:
https://bugzilla.redhat.com/show_bug.cgi?id=1506339

Alexander Kops, I'd like to circle back on some issues I raised that were unaddressed/unanswered in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1727662/comments/10:

1) Please comment on whether you have the GUI set up to suspend and/or hibernate after a set amount of time, and how long the timeout is.
2) Please provide a screenshot of the GUI and how you have it configured for inactivity regarding the above.
3) Please stop posting snips of logs. As I mentioned before, it is not helpful here. Instead, attach the log uncompressed/untarred in its entirety with no modifications.
4) It's great that you found a Fedora report with a similar error message. Unfortunately, there is so little information in that report that it's largely useless.

Alexander Kops (alexkops) wrote :

Automatic locking is set to on after 3 minutes.

Alexander Kops (alexkops) wrote :

Here is a screenshot of the energy settings (switched language to English). I'm not sure if there is any other setting where I could influence the behaviour.

Alexander Kops (alexkops) wrote :

Here are the screen lock settings, again in English.

Alexander Kops (alexkops) wrote :

Note that I usually lock the screen manually via the top right menu when I leave the computer.

tags: added: latest-bios-1.30
removed: bios-outdated-1.30

Alexander Kops:
1) Could you please remove the kernel parameter "i915.enable_rc6=0", and any other non-default kernel parameters and keep it that way going forward? It is best to keep this as close to the default configurations as possible.
2) Could you please reproduce the problem with the latest mainline kernel, and attach to this report as separate files syslog and kernel.log?
3) Please advise which of the three events happened as per the Bug Description.
4) To satisfy a curiosity, does your computer's fan run loudly, and/or does the computer run hot temperature-wise? Temperature monitoring information may be found at: https://help.ubuntu.com/community/SensorInstallHowto I would recommend polling only once every few seconds.

Alexander Kops (alexkops) wrote :

1) I forgot to mention that I removed "i915.enable_rc6=0" after I tried the first mainline kernel. So everything after that was already with RC6 enabled.
2) I'll try to reproduce it with the latest mainline kernel on Monday and attach the requested files.

4) I can't hear the fan and the temperatures seem to be ok right now (after a day of running), but I will keep monitoring it

sensors
iwlwifi-virtual-0
Adapter: Virtual device
temp1: +37.0°C

thinkpad-isa-0000
Adapter: ISA adapter
fan1: 0 RPM

acpitz-virtual-0
Adapter: Virtual device
temp1: +46.0°C (crit = +128.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +45.0°C (high = +105.0°C, crit = +105.0°C)
Core 0: +45.0°C (high = +105.0°C, crit = +105.0°C)
Core 1: +45.0°C (high = +105.0°C, crit = +105.0°C)

pch_wildcat_point-virtual-0
Adapter: Virtual device
temp1: +47.0°C

Kai-Heng Feng (kaihengfeng) wrote :

Please try kernel here http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/current/.

If the same issue still happens, file an upstream bug at: http://bugzilla.freedesktop.org
Product: DRI
Component: DRM/Intel

Alexander Kops (alexkops) wrote :
Attachment: syslog (496.7 KiB, application/octet-stream)

So I used the kernel at drm-intel-nightly/current/
It ran fine the whole day, but now I saw again that the notebook was shut down after locking the screen and coming back after half an hour.
But I'm not sure if it's the same bug, since I don't see any of the error messages previously reported in this bug report in the logs.

I'll attach syslog and kern.log. The problem happened between 17:00 and 17:30

Alexander Kops (alexkops) wrote :

The kern.log

Alexander Kops, to advise, if the root cause problem is drm-intel (I'll let Kai-Heng Feng speak to this as the root cause since he asked you to test this), and this is confirmed an issue in the latest mainline kernel (which is now 4.14-rc8) then testing drm-intel makes sense.

This is due to how Ubuntu sources patches from mainline (not drm-intel), and some patches, but not necessarily all, are sent from drm-intel to mainline. Also, drm-intel is an experimental, fast-changing tree, where some patches that don't and will never exist in mainline can be removed/added at any time.

Despite this, while you did note the kernel crash as per #4, and the errors as per the Bug Description, you didn't include the log in its entirety, which would allow one to best confirm this is the root cause.

I took a peek at the recent attachments, and while I'm not an expert, I didn't find better confirmation either way.

Kai-Heng Feng, could you please advise on your thoughts regarding the root cause here?

Would an SSH/netconsole dump be helpful here as per https://help.ubuntu.com/community/DebuggingSystemCrash, or do you think enough information is available?

Alexander Kops (alexkops) wrote :

So if I got you right, additionally you want me to install the latest mainline 4.14-rc8, reproduce the bug there and then attach kern.log and syslog here?

Alexander Kops, yes please. Also, it would be helpful to attach the information via SSH or netconsole via https://help.ubuntu.com/community/DebuggingSystemCrash .

In addition, to follow up:
1) Have the issues happened without having to lock the computer (i.e. during active daily use, with the PC in or not in the docking station)? No need for a test to this specifically, just as per your recollection?
2) When the computer is plugged into the docking station, does the OS recognize this by showing the battery icon in a charging/charged state?
3) Does this issue happen if you don't dock the computer in the docking station, and only have the power cord plugged into the computer directly (i.e. no external monitors plugged)?
4) Does the issue happen bypassing the docking station, and plugging the monitors and power directly into the laptop?
5) Does this issue happen if the PC has nothing connected (i.e. keep it on battery, and no external monitors)?
6) Regarding your question in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1727662/comments/12 :
>"Can you advise on how to narrow down the error to the actual state where it happens?"

What seems strange is how after locking the computer with it plugged into the docking station, presuming the power cord is plugged into the docking station, there are a few events in order of most to least interest:
* The PC is set to suspend on battery. I originally presumed that this is the issue trigger, but was thrown off as the GUI screenshots are set to do this only on battery.
* The screen dims when inactive. You could try rapidly toggling the brightness max up/down.
* The WiFi turning off. You could try toggling the WiFi hard switch/key combo repeatedly.
* The lock event itself. You could try locking and immediately unlocking multiple times in quick succession.

Alexander Kops (alexkops) wrote :

Today I installed 4.14.0-041400rc8-generic
I had a crash again, this time the computer did not power down but I came back to a completely frozen lock screen (only visible on the main monitor, not the secondary screens), so I had to hard reboot.
I'll attach kern.log and syslog

Alexander Kops (alexkops) wrote :

Syslog from today's crash with 4.14.0-041400rc8-generic

Alexander Kops (alexkops) wrote :

Regarding your other questions:
1) The issues have not happened without locking so far. Everything works smoothly when I don't lock (that's why my workaround is to not lock the screen when I'm only away for a short time, although it's against company policy...)
2) When docked in the docking station I can see the charging battery icon and "Fully charged"
3,4,5) I have to test this separately again, maybe on the weekend. So far I did not see the issues when the computer was undocked (and no external displays present).

For the points raised in #6, I will test this and answer back later.

tags: added: kernel-bug-exists-upstream kernel-bug-exists-upstream-4.14-rc8

Alexander Kops:
1) What is the manufacturer, model, and firmware version of your docking station?
2) Your recent logs much better confirm the root cause. Hence, could you please cross post this issue to Intel following their instructions for 1.1-DRM KERNEL via https://01.org/linuxgraphics/documentation/how-report-bugs ? Please provide a direct URL to this new report once made so that it may be tracked.

Changed in linux (Ubuntu):
status: Incomplete → Triaged
Alexander Kops (alexkops) wrote :

1) It's a Lenovo ThinkPad Ultra Dock Type 40A2 20V.
I didn't know docking stations had firmware until now. Is there a way to find out the version via Ubuntu? Otherwise I have to talk to my company helpdesk about it.
2) Ok, will report it there. Do I have to try with another kernel again? Since the /drm-intel-nightly/current/ I tried it with did not yield the same log outputs...

description: updated
summary: - Crashes after unlock: [drm:intel_cpu_fifo_underrun_irq_handler [i915]]
- *ERROR* CPU pipe A FIFO underrun
+ [Lenovo ThinkPad T450s] Issues after docking and locking:
+ [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO
+ underrun

Alexander Kops:
>"1) It's a lenovo ThinkPad Ultra Dock Type 40A2 20V I didn't know docking stations had firmwares up until now. Is there a way to find out via Ubuntu? Otherwise I have to talk to my company helpdesk about it."

While I'm not sure how to check via Ubuntu, the path of least resistance may be to pop a Windows laptop on the dock if available, and confirm via https://support.lenovo.com/us/en/solutions/migr-4tpjf4

>"2) Ok, will report it there. Do I have to try with another kernel again? Since the /drm-intel-nightly/current/ I tried it with did not yield the same log outputs..."

As per Intel's instructions, they want you to test drm-tip, and provide debugging information. However, you don't have to build it. Instead, pre-built builds are provided as a courtesy by Ubuntu maintainers via https://wiki.ubuntu.com/Kernel/MainlineBuilds

Also, don't worry about SSH/netconsole, as that is usually for when the logs aren't giving reasonable hints to root cause. The latest logs were much better.

description: updated
Alexander Kops (alexkops) wrote :

Ok, I'll install the drm-tip kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/ , try to reproduce it there, and report it to Intel.

Alexander Kops, to advise, if you can't reproduce in drm-tip, still report it as per their instructions as it's reproducible in mainline (i.e. upstream issue).

I reported this issue in the Ubuntu bug tracker:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1727662
I was advised to cross post it here.

It can be reproduced in the latest Mainline kernel 4.14-rc8
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc8/
but doesn't seem to appear (after 1.5 days of testing) with the current drm-tip build
http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/

Copying original bug description here:
"After I dock my laptop into a Lenovo ThinkPad Ultra Dock Type 40A2 20V, lock it by clicking the lock icon, and wait ~30 minutes, one of the following three things happens ~50% of the time when I come back to unlock it:
* Most of the time the computer is found shut down.
* Sometimes the notebook screen flickers weirdly and one of the two external monitors shows pixelation.
* The notebook screen shows the lock screen but is completely frozen. Occasionally unlocking works.

I have two external monitors attached to the docking station.

When I dock my laptop, the battery indicator does recognize that it's charging.

I've not seen this issue when:
1) With the laptop in the dock, while actively using the laptop.
2) With the laptop out of the dock and no external monitors present.

This bug started after upgrading from Ubuntu 17.04 to Ubuntu 17.10.

Using or not using the following kernel parameter didn't change anything:
i915.enable_rc6=0

PC temperatures are normal as per lm-sensors."

Alexander Kops (alexkops) wrote :

Some updates:
* I cross posted the issue to the freedesktop bugtracker here: https://bugs.freedesktop.org/show_bug.cgi?id=103643
* I got my docking station firmware updated, it is now at 2.33.00. Unfortunately I was not able to determine what the previous version was
* I'm running with the current drm-tip kernel for 1.5 days now and the issue was *not* reproducible so far

Alexander Kops, couple of issues:
1) To clarify, after the firmware update, is the issue still reproducible with latest mainline?
2) Your upstream post is missing all the information requested by Intel as documented in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1727662/comments/30 . It makes it most difficult for upstream to help if the information they request isn't provided. Also, upstream may ignore your post due to it not having required, publicly documented debugging information.

Alexander Kops (alexkops) wrote :

1) Currently trying to reproduce
2) I know, I'm trying to reproduce the issue with "drm.debug=0x1e log_buf_len=1M" added right now, and will add the information later
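
For reference, drm.debug is a bitmask in which each bit turns on one category of DRM debug messages. The Python sketch below decodes 0x1e using the category names as I understand them to be documented for the drm module's debug parameter in kernels of that time; treat the table as an assumption and check the output of "modinfo drm" on the kernel actually being tested.

```python
from typing import List

# Assumed bit-to-category table for drm.debug; verify with `modinfo drm`.
DRM_DEBUG_BITS = [
    (0x01, "CORE"),
    (0x02, "DRIVER"),
    (0x04, "KMS"),
    (0x08, "PRIME"),
    (0x10, "ATOMIC"),
    (0x20, "VBL"),
]


def decode_drm_debug(mask: int) -> List[str]:
    """Return the names of the debug categories enabled by the given mask."""
    return [name for bit, name in DRM_DEBUG_BITS if mask & bit]


print(decode_drm_debug(0x1E))
```

Under that assumption, 0x1e enables everything in the table except CORE and VBL.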

Alexander Kops:
"2) I know, I'm trying to reproduce the issue with "drm.debug=0x1e log_buf_len=1M" added right now, and will add the information later"

If it's not reproducible in drm-tip but is reproducible in mainline, then one could take the debug log from mainline. This will give them a hint as to what difference between mainline and drm-tip addressed the issue.

Hello Alexander,
Could you share a dmesg and/or kern.log with debug information from boot until the problem occurs, with drm.debug=0x1e log_buf_len=2M on the kernel command line in grub?

I'm currently running my computer with the current Mainline kernel (4.14-rc8) and these settings and will post the dmesg as soon as I'm able to reproduce it.

Created attachment 135371
kern.log - Crash seemed to happen at 13:31:38

I added a compressed kern.log (the computer shut down at Nov 10 13:31:38).
I reproduced the bug running the current Mainline kernel 4.14.0-041400rc8-generic

Alexander Kops (alexkops) wrote :

So I was just able to reproduce the shut-down issue with the kernel 4.14.0-041400rc8-generic and the debugging flags enabled.
I submitted a kern.log to the upstream bug report.

Changed in linux:
importance: Unknown → High
status: Unknown → Incomplete
description: updated

Created attachment 135436
kern.log with running drm-tip kernel from today - Computer shut down at 15:20:39

Today I was able to reproduce it with the drm-tip kernel found here:
http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/

I attach the kern.log, the computer shut itself down at 15:20:39
This time no messages about a fifo underrun are in the log.

(In reply to Alexander Kops from comment #4)
> Created attachment 135436 [details]
> kern.log with running drm-tip kernel from today - Computer shut down at
> 15:20:39
>
> Today I was able to reproduce it with the drm-tip kernel found here:
> http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/
>
> I attach the kern.log, the computer shut itself down at 15:20:39
> This time no messages about a fifo underrun are in the log.
Hello Alexander, it seems that the log still contains all the information you already shared in the first attachment. Could you reproduce with a clean kern.log? Since the machine shuts down, I guess a dmesg can't be obtained.

To clean kern.log:
# rm /var/log/kern.log
# reboot
The kern.log will regenerate after boot.

Also, I noticed you marked this bug as a regression. Could you please share the last known good kernel commit and the first known bad commit?

Created attachment 135438
kern.log with running drm-tip kernel from today - Computer shut down at 15:20:39

Oops, looks like I re-uploaded the kern.log from last time. This one is the correct one from today.

Also the regression tag was added by "Christopher M. Penalver" from the Ubuntu bug tracker. So I can't point to specific kernel commits.
I just noticed that it started appearing after using the Kernel shipping with Ubuntu 17.10 and it wasn't happening with the kernel from 17.04

(In reply to Alexander Kops from comment #7)
> ...
> Also the regression tag was added by "Christopher M. Penalver" from the
> Ubuntu bug tracker. So I can't point to specific kernel commits.
> I just noticed that it started appearing after using the Kernel shipping
> with Ubuntu 17.10 and it wasn't happening with the kernel from 17.04
That would be 4.9 and 4.13, I guess...

(In reply to Alexander Kops from comment #7)
> Created attachment 135438 [details]
> kern.log with running drm-tip kernel from today - Computer shut down at
> 15:20:39
>
> Oops, looks like I re-uploaded the kern.log from last time. This one is the
> correct one from today.

The logs contain multiple boots with multiple different kernels, so it's hard to say what's what. But this log doesn't seem to have any FIFO underruns. So am I to assume this is now fixed?

> But this log doesn't seem to have any FIFO underruns. So am I to assume this is now fixed?

Well, it is fixed in a sense that these FIFO underruns don't appear anymore with the drm-tip kernel. But the behaviour, that the computer will just turn itself off a lot of times after enabling lock screen is still there.

Created attachment 135451
kern.log with running drm-tip kernel from today - Computer froze at 16:17:45

I'll attach this current kern.log. This time the situation was a bit different, I didn't find the notebook turned off, but the power light was still on, but all three screens were black and it didn't react to anything. So I had to hard reboot it.

Maybe you can see something in the logs that would lead to a follow up bug report?

The last thing I see in the log before the crash are a bunch of

[drm:drm_mode_addfb2 [drm]] [FB:87]

lines.

(In reply to Alexander Kops from comment #11)
> Created attachment 135451 [details]
> kern.log with running drm-tip kernel from today - Computer froze at 16:17:45
>
> I'll attach this current kern.log. This time the situation was a bit
> different, I didn't find the notebook turned off, but the power light was
> still on, but all three screens were black and it didn't react to anything.
> So I had to hard reboot it.
>
> Maybe you can see something in the logs that would lead to a follow up bug
> report?
>
> The last thing I see in the log before the crash are a bunch of
>
> [drm:drm_mode_addfb2 [drm]] [FB:87]
>
> lines.

Nothing interesting there, unfortunately. So I guess we're dealing with some kind of hard system hang that doesn't manage to write anything useful to the logs. It's not even clear whether this has anything to do with i915 or is caused by something totally different. Maybe try netconsole or a serial console if the machine has an Ethernet or serial port. Or you may want to look into pstore to see if that might catch something when the machine dies.

Maybe also enable various debug features in the kernel config:
CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_PROVE_LOCKING=y
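To see which of these options the running kernel was already built with, the kernel config file can be checked with a few lines of Python. This is a rough sketch: the helper name `missing_options` is hypothetical, and it assumes the config is available as plain text (on Ubuntu it is normally installed as /boot/config-<kernel release>).

```python
# The debug options suggested above.
WANTED = (
    "CONFIG_LOCKUP_DETECTOR",
    "CONFIG_SOFTLOCKUP_DETECTOR",
    "CONFIG_HARDLOCKUP_DETECTOR",
    "CONFIG_DETECT_HUNG_TASK",
    "CONFIG_PROVE_LOCKING",
)


def missing_options(config_text, wanted=WANTED):
    """Return the wanted options that are not set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [name for name in wanted if name not in enabled]
```

For example, `missing_options(open("/boot/config-" + platform.release()).read())` (after `import platform`) would list whatever still has to be enabled in a custom build. An empty list means all five options are already on.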

PS. Your logs are huuuuge. Might want to trim away the unrelated boots from the logs.
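Trimming can be done mechanically by keeping only the lines from the last kernel banner onward. A minimal sketch, assuming each boot begins with a "Linux version" banner line; `last_boot` is an illustrative name, not an existing tool.

```python
def last_boot(lines, marker="] Linux version "):
    """Return the lines from the final boot banner to the end of the log.

    If no banner is found, the whole log is returned unchanged.
    """
    start = 0
    for index, line in enumerate(lines):
        if marker in line:
            # Remember the most recent banner seen so far.
            start = index
    return lines[start:]
```

Reading the file with `open("kern.log", errors="replace").readlines()`, passing the result to `last_boot`, and writing it out with `writelines` gives an attachment that covers only the boot that ended in the hang.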

IIRC, when Ubuntu puts the system to sleep, it first issues a series of addfb calls for the fade-to-black animation. That doesn't mean they are the cause of the issue, though it could very well be related to DPMS off.

Hi Alexander. Thanks for your bug report and all your effort to troubleshoot and fix this weird issue.
I'm not using any docking station, but I'm experiencing the same behaviour (screen flickering and being unable to lock my screen) on a Lenovo T450s with an Intel HD Graphics 5500 (Broadwell GT2) GPU.
It happens with both Ubuntu 17.10 and Fedora (26 and 27), so I suppose it's a bug in the Intel driver.
Did you find any workaround or fix?

I can confirm that this weird issue also happens without a docking station.
It happens both with an external screen and without it.

system-manufacturer: LENOVO
system-version: ThinkPad T450s
bios-version: JBET66WW (1.30 )
bios-release-date: 09/13/2017

(In reply to Angelo Lisco from comment #14)
> I can confirm that this weird issue also happens without a docking station.
> It happens both with an external screen and without it.

Alexander Kops, as the original reporter, can you confirm the same without a docking station or external screen?

I tried to reproduce it without the docking station once, but wasn't able to. However, I don't usually use the notebook without the docking station for long, so no thorough testing happened.

Alexander Kops (alexkops) wrote :

As reported upstream, when I used the drm-tip kernel I didn't see the FIFO error message in the logs again (but still had random power downs).
I'm running the mainline kernel 4.15.0-041500rc8-generic and also don't see any of the described issues anymore (although I still have 1-2 random shutdowns per week while the notebook sits in the dock unattended, but there is currently no hint that it's related to the Intel driver).

lanrat (lanrat) wrote :

I think I am having the same issue on my DELL XPS 13 with intel graphics (no dock).

Is this what you are all seeing? https://www.youtube.com/watch?v=2g2GRca3nik

lanrat, in order to address your issue, it will help immensely if you use the computer the problem is reproducible with, and file a new report with Ubuntu via a terminal:
ubuntu-bug linux

Please feel free to subscribe me to it.

lanrat (lanrat) wrote :

Christopher, the purpose of my question was because, after some research, I do not think that this is a device specific issue, and is actually the result of the kernel and graphics card. Hence why I posted the video to see if you are all seeing the same issue or if it is unrelated.

lanrat, nobody can root cause your issue simply by reviewing a video. However, posting a report allows developers to review debugging logs for root causing.

lanrat (lanrat) wrote :

Christopher, I apologize if I am miscommunicating.

All I am asking is if what you are observing is similar to the video I posted, nothing more.

lanrat, Launchpad is a development platform, for Ubuntu users to provide debugging information of problems for developers to review, and fix. Asking folks to review a video without providing debugging information is unhelpful here.

lanrat (lanrat) wrote :

Christopher, I am trying to provide debugging information for myself and the Ubuntu developers. I have NOT asked anybody to perform a root cause analysis, debug my problem, or perform any video review. I have only asked if the observed behavior in the bug in this thread is similar to the one I am observing because all of the other debug information listed in this report is identical to my situation.

At this point I'm giving up on asking here. Maybe you are right and this is the wrong forum; my thinking was that it would be helpful for you to know that this issue is not specific to the particular hardware listed here. I apologize if I have not been helpful.

First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but we are trying to understand whether this bug is still valid.
If investigation of the bug is still in progress, please ignore this, and I apologize!

If you think this bug is no longer valid, please comment that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), please do so to see whether the issue is still present there; if you cannot reproduce it, please comment on the bug.

Joel (jah1809) wrote :

I am also observing this issue on a Thinkpad T470s using a Thinkpad Ultra Type 40A2 dock.

Joel (jah1809) wrote :

Here is my syslog

Joel (jah1809) wrote :

I primarily observe this issue when manually turning on my monitor after unlock, after it fails to autowake. The problem appears to occur at Apr 11 09:08:41 in the attached kern.log.

Running on Ubuntu 16.04

/proc/version:
Linux version 4.13.0-36-generic (buildd@lgw01-amd64-033) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9)) #40~16.04.1-Ubuntu SMP Fri Feb 16 23:25:58 UTC 2018

/proc/cpuinfo attached

Closing; please re-open if this still occurs.

Changed in linux:
status: Incomplete → Invalid
joe black (foot3print) on 2018-10-26
Changed in linux (Ubuntu):
status: Triaged → Confirmed
joe black (foot3print) wrote :

Hi, I would like to confirm this, as it affects not only my client but also my colleagues' machines (ThinkPad T460s and T470s). The problem seems to occur randomly after a few minutes of inactivity. Some experience it 3-4 times a day, others 2-5 times a week.

Running on Ubuntu Bionic 18.04
Kernel Version: 4.15.0-36-generic

Kernel Log is attached.

Niklas Goerke (niklas974) wrote :

The same problem seems to occur on my T480s with a ThinkPad Ultra docking station and two external displays.

After booting and login, all displays work, but after executing:
  # xrandr --output DP2-3 --rotate left

Display DP2-3 does not display anything any more, even though xrandr claims it should:
  # xrandr
  Screen 0: minimum 320 x 200, current 5680 x 1920, maximum 8192 x 8192
  […]
  DP2-3 connected 1200x1920+4480+0 left (normal left inverted right x axis y axis) 518mm x 324mm
     1920x1200 59.95*+
  […]

I'm running fully updated Ubuntu 18.10:
4.18.0-12-generic #13-Ubuntu SMP Wed Nov 14 15:17:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

tags: added: bios-outdated-1.35
removed: latest-bios-1.30
Changed in linux (Ubuntu):
status: Confirmed → Triaged

This error still occurs on kernel 4.20.

My model is T470s and the behavior is consistent: It only happens when connected to an external display through the dock, not when using it disconnected from the dock. Can (most of the time) be triggered by changing display settings through xrandr.

Dock model is SD20F82750.

Dmesg error message:
[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun

GPU: Intel Corporation Skylake GT2 [HD Graphics 520] (rev 07)

Changed in linux:
status: Invalid → Confirmed

(In reply to Johan Thorén from comment #19)
> This error still occurs on kernel 4.20.
>
> My model is T470s and the behavior is consistent: It only happens when
> connected to an external display through the dock, not when using it
> disconnected from the dock. Can (most of the time) be triggered by changing
> display settings through xrandr.
>
> Dock model is SD20F82750.
>
> Dmesg error message:
> [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO
> underrun
>
> GPU: Intel Corporation Skylake GT2 [HD Graphics 520] (rev 07)

The original bug was reported on Broadwell, so your issue could be different from the one originally reported here.

Can you please attach the full dmesg log from boot with kernel parameters drm.debug=0x1e log_buf_len=4M?
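For reference, drm.debug is a bitmask of message categories, so 0x1e selects several of them at once. The sketch below decodes the mask and reads it back from a kernel command line. The bit names follow the description of the drm `debug` module parameter as I understand it, only the low six bits are listed, and both helper names are hypothetical.

```python
# Assumed bit meanings of drm.debug (low six bits only).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled in mask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


def drm_debug_from_cmdline(cmdline):
    """Return the drm.debug value from a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("drm.debug="):
            # Base 0 lets int() accept both "0x1e" and "30".
            return int(token.split("=", 1)[1], 0)
    return None


print(decode_drm_debug(0x1e))  # ['DRIVER', 'KMS', 'PRIME', 'ATOMIC']
```

After rebooting, `drm_debug_from_cmdline(open("/proc/cmdline").read())` should return 30 (0x1e) if the parameter was picked up by the boot loader.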

What is the impact of this issue other than the error in the log? Can you elaborate on the issue?

Changed in linux:
status: Confirmed → Incomplete

Created attachment 143607
dmesg

Here is my dmesg output with the requested parameters. I'm now running the 5.0.0 kernel with the same behavior. The trigger is sometimes an xrandr change, but almost always resuming from suspend. A reboot is necessary.

Created attachment 143608
Video showing the screen

Would appreciate feedback on the data given, if more is needed or if a separate bug report should be filed. Thanks.

Changed in linux:
status: Incomplete → Confirmed
Changed in linux:
importance: High → Medium