17.04, i915 freeze on VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller

Bug #1684010 reported by sles
This bug affects 7 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

Unity and/or X just freezes completely.
Nothing shows up in any logs, though.

16.10 worked just fine.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu):
status: New → Confirmed

sles (slesru) wrote :

Using the kernel from 16.10 with the 17.04 userspace solves the problem,
i.e. this is a kernel bug.

Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.11 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11-rc7

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: Confirmed → Incomplete

sles (slesru) wrote :

Yes, right after the upgrade from 16.10 to 17.04.
We'll test the mainline kernel and report back.
Thank you!

Vaclav Rehak (vaclav-n) wrote :

I'm running 16.04 with xenial-proposed enabled and have been experiencing random freezes like this a few times a day since a kernel upgrade about two weeks ago.

$ uname -a
Linux vaclav-ntb 4.10.0-19-generic #21~16.04.1-Ubuntu SMP Fri Apr 7 08:20:02 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

sles (slesru) wrote :

Tested the kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11-rc7
It runs fine, at least no freezes for 6 hours.

Thank you!

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-fixed-upstream

Joseph Salisbury (jsalisbury) wrote :

I'd like to perform a reverse bisect to figure out which commit upstream fixes this regression. It would be very helpful to know the last kernel that had this issue and the first kernel that did not.

Can you test the following kernel and report back?

 http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11-rc1

sles (slesru) wrote :

Yes, sure, but most probably next week, sorry.
If rc1 hangs then we'll try rc2, etc.
Thank you!

sles (slesru) wrote :

Hello!

rc1 looks good, no freezes for several hours.

Joseph Salisbury (jsalisbury) wrote :

Can you test 4.10 final? If it has the bug, we can bisect between v4.10 and v4.11-rc1.

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10/

tags: added: performing-bisect

Traumflug (mah-jump-ing) wrote :

A few additional debugging bits:

- Here this typically happens when watching video. As if some memory region gets exhausted after viewing some 100'000 frames.

- I've seen the mouse freezing once, typically it stays movable.

- Audio continues while video freezes.

- I've seen Firefox going <defunct> (in that short time between the first application freezing and everything else freezing, too).

- With everything frozen, no login via SSH possible. PC simply doesn't answer over Ethernet. Switching to another console (ctrl-alt-F1) isn't possible, either.

- Judging by CPU fan noise, one core (of two) typically gets 100% load.

I just installed the 4.10.0.19 kernel I found in the regular package repo and will report back how long it works without freeze. My freeze experience so far was with 4.10.0.20.

Traumflug (mah-jump-ing) wrote :

Kernel 4.10.0.19 from the regular 17.04 repo appears to work fine, working for some 12 hours now. It's slightly different than the one @Vaclav Rehak reported above as not working:

$ uname -a
Linux piccard 4.10.0-19-generic #21-Ubuntu SMP Thu Apr 6 17:04:57 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

If there are more kernels I can try for bisecting I'll happily do this.

sles (slesru) wrote :

Hello!

It may seem strange, but we can't reproduce the freeze using the kernel from
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10/

Thank you!

Traumflug (mah-jump-ing) wrote :

> Kernel 4.10.0.19 from the regular 17.04 repo appears to work fine

Too bad, this was a false positive. Today this kernel froze, too. Firefox became a zombie (<defunct>) twice.

@sles, finding a working kernel is certainly good help for the moment, but not really a solution. Think of all the people not showing up here and what these likely do. Continuing with the bisect is much appreciated!

sles (slesru) wrote :

>Continuing with the bisect is much appreciated!

This is what we are doing, right? :-)

Traumflug (mah-jump-ing) wrote :

4.10.0 from mainline works fine here. Running it for two days now without flaws.

$ uname -a
Linux piccard 4.10.0-041000-generic #201702191831 SMP Sun Feb 19 23:33:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Hmm. 4.10.0 from mainline works, 4.10.0(.19) from the regular repo does not. Now I'm not sure which one to test next, because I don't know how numbers of the regular repo are related to numbers of the mainline repo. Is 4.10-rc5 or 4.10.6 the next candidate?

Joseph Salisbury (jsalisbury) wrote :

If upstream 4.10 works, but Ubuntu 4.10.0-19 does not, then that indicates the bug was introduced by an Ubuntu specific SAUCE patch.

We should bisect the Ubuntu kernels. We need to identify the last good Ubuntu kernel and the first bad one. We know that 4.10.0-19 is bad, so we should test some earlier versions.

Can you next test Ubuntu 4.10.0-9? It can be downloaded from:
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/12033595

For this kernel, you need to install both the linux-image and linux-image-extra .deb packages.

The full list of Ubuntu 4.10 kernels is available here:
https://launchpad.net/ubuntu/zesty/+source/linux
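The mainline-versus-Ubuntu distinction discussed above can be read off the kernel release string. The following is a minimal sketch, not an official tool: it assumes, based on the `uname -a` outputs pasted in this thread, that mainline PPA builds carry a zero-padded copy of the upstream version in the ABI field (`4.10.0-041000-generic`), while Ubuntu archive kernels carry a small ABI counter (`4.10.0-19-generic`). The function names are hypothetical.

```python
import re

# Release strings look like "<major>.<minor>.<patch>-<abi>-<flavour>".
_RELEASE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-([0-9a-z]+)-(\w+)$")


def release_from_uname_a(line):
    """Return the kernel release (third field) from a `uname -a` line."""
    return line.split()[2]


def classify_release(release):
    """Classify a release string as 'mainline', 'ubuntu' or 'unknown'.

    Assumption: a mainline PPA build repeats the upstream version as a
    zero-padded ABI field (4.10.0 -> 041000); anything else that matches
    the pattern is treated as an Ubuntu archive kernel.
    """
    match = _RELEASE.match(release.strip())
    if match is None:
        return "unknown"
    major, minor, patch, abi, _flavour = match.groups()
    padded = f"{int(major):02d}{int(minor):02d}{int(patch):02d}"
    return "mainline" if abi.startswith(padded) else "ubuntu"
```

Applied to the outputs above, `4.10.0-041000-generic` would be classified as a mainline build and `4.10.0-19-generic` as an Ubuntu build, which is the pair that points towards an Ubuntu-specific patch.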

Traumflug (mah-jump-ing) wrote :

Thanks for the hint, 4.10.0-9 now running.

sles (slesru) wrote :

4.10.0-9 works without problems for several hours here.

Traumflug (mah-jump-ing) wrote :

I'll give -9 an additional day to be sure. If you want to continue, 4.10.0-14 is the next candidate.

https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/12139078
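Choosing the next kernel to test is ordinary bisection over the Ubuntu ABI numbers. A minimal sketch follows, under two assumptions: the list of ABI numbers contains only the versions mentioned in this thread (it is not the full list from the archive page), and a "good" verdict is provisional, because the freeze is intermittent and may simply not have happened yet. The helper name is hypothetical.

```python
def next_candidate(available, last_good, first_bad):
    """Pick the available ABI number closest to the midpoint.

    Returns None when no untested build lies strictly between the
    last known good and the first known bad ABI number.
    """
    between = sorted(n for n in available if last_good < n < first_bad)
    if not between:
        return None
    midpoint = (last_good + first_bad) / 2
    return min(between, key=lambda n: abs(n - midpoint))


# ABI numbers mentioned in this thread only (assumption: incomplete list).
mentioned = [9, 14, 17, 18, 19, 20, 21]
```

With -9 as last good and -19 as first bad, this picks -14; if -14 then also holds up, the next pick is -17. Since a kernel that ran fine for hours can still freeze later, each "good" result should be given a long soak before the range is narrowed.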

sles (slesru) wrote :

There is no problem with 4.10.0-14.
What should we test now?
Thank you!

Joseph Salisbury (jsalisbury) wrote :

sles (slesru) wrote :

4.10.0-17 should be OK too. My colleague tests all the kernels; we run the same hardware except for video (I have an nvidia card). I forgot to ask him about the results, but he didn't complain, so it should be fine.
Really, it looks very strange, because I don't see any video-related changes from -17 to -20 in the changelog.
Anyway, the next 4 days are holidays here, so we can only test after May 9.

Traumflug (mah-jump-ing) wrote :

I just booted 4.10.0-18 and graphics apparently fell back to software rendering. Only one screen size (1600x1200) available, animations run very slow, CPU fan spins up when moving windows, such things.

Will run this for two days to see whether it's at least not freezing.

Traumflug (mah-jump-ing) wrote :

Right. The missing video hardware acceleration (and missing audio and ACPI) was due to the kernel extras package not being installed. I corrected this and am now giving 4.10.0-18 another try.

Traumflug (mah-jump-ing) wrote :

Just had a kernel freeze after some 20 hours of operation with -18:

$ uname -a
Linux piccard 4.10.0-18-generic #20-Ubuntu SMP Wed Apr 5 17:18:34 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Will now try -17 to confirm sles' findings and also to have a working kernel until somebody has an idea on what to do next. Running some debugging tools or custom compiling a kernel is certainly possible as long as it can be done on a single PC. My other hardware is a rather old Macintosh which can't do much more than serving as a terminal.

Traumflug (mah-jump-ing) wrote :

Just had kind of a freeze with -17 after about two hours of operation. While other freezes so far manifested in the cursor vanishing and everything becoming unresponsive, this freeze resulted in single applications freezing on input attempts. Like freezing after opening a menu and selecting "Quit". Eventually all applications were frozen, starting new ones or switching console wasn't possible.

$ uname -a
Linux piccard 4.10.0-17-generic #19-Ubuntu SMP Tue Apr 4 16:17:04 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Watching YouTube videos appears to accelerate all this freezing.

Jonas Slivka (jonas-slivka) wrote :

I'm also experiencing freezes on Ubuntu Gnome 17.04 (x64). I'm on a Dell Latitude E5470 with the following specs:

Intel® Core™ i5-6440HQ CPU @ 2.60GHz × 4
Intel® HD Graphics 530 (Skylake GT2)

Tried the -20, -17 and -14 kernels; all of them froze completely. Now I've been on -9 for a day without any freezes.

Jonas Slivka (jonas-slivka) wrote :

Got a freeze on -9 as well:

➜ ~ uname -a
Linux dell 4.10.0-9-generic #11-Ubuntu SMP Mon Feb 20 13:47:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Attaching a portion of kern.log with call trace.

Traumflug (mah-jump-ing) wrote :

Congratulations on managing to get a backtrace, @Jonas Slivka!

Did you run a kernel with debug symbols, perhaps? My kern.log has zero backtraces, despite at least 5 freezes.

Jonas Slivka (jonas-slivka) wrote :

@Traumflug, no, I just installed the following two .debs from https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/12033595 and booted into that kernel:

- linux-image-4.10.0-9-generic_4.10.0-9.11_amd64.deb
- linux-image-extra-4.10.0-9-generic_4.10.0-9.11_amd64.deb

sles (slesru) wrote :

Well, it looks like the problem is in memory management, which can be related to video, because Intel graphics uses system RAM.
During the last freeze we logged in to my colleague's computer and saw khugepaged eating CPU.
There was no swap on this computer, so we added some, and today there were no freezes
on 4.10.0-20-generic ...
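For anyone comparing notes, whether a machine has swap configured can be read from `/proc/meminfo`. This is a minimal sketch for checking that precondition only; it does not diagnose the freeze. The function names are hypothetical, and the default path assumes a Linux system.

```python
def swap_total_kib(meminfo_text):
    """Return the SwapTotal value in KiB from /proc/meminfo text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # The line has the form "SwapTotal:  <number> kB".
            return int(line.split()[1])
    return None


def has_swap(path="/proc/meminfo"):
    """Return True/False for swap presence, or None if it cannot be read."""
    try:
        with open(path) as handle:
            total = swap_total_kib(handle.read())
    except OSError:
        return None
    return None if total is None else total > 0
```

A result of `False` from `has_swap()` corresponds to the no-swap setup described above.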

Traumflug (mah-jump-ing) wrote :

Had a freeze with -14, too, after 3 days of operation (hibernating at night).

That said, it doesn't exactly look like there's somebody keen on fixing this bug. Workarounds like adding swap just postpone the problem. And Ubuntu folks went silent.

Traumflug (mah-jump-ing) wrote :

Running kernel -21 from the general repo I managed to grab the attached output of 'top'. khugepaged going 100%, staying there forever. System load eventually went up to 8 or more. System didn't freeze entirely, just slowed down a lot.
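khugepaged is the kernel thread behind transparent hugepages, so it can help to record the current transparent hugepage mode alongside such a `top` capture. A minimal sketch follows, assuming the usual sysfs file `/sys/kernel/mm/transparent_hugepage/enabled`, whose content lists all modes with the active one in square brackets. It only reads the setting and is not a fix; the function names are hypothetical.

```python
import re

THP_ENABLED = "/sys/kernel/mm/transparent_hugepage/enabled"


def parse_thp_setting(text):
    """Return the bracketed (active) mode, e.g. 'madvise', or None."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def read_thp_setting(path=THP_ENABLED):
    """Read the active mode from sysfs; None if the file is unavailable."""
    try:
        with open(path) as handle:
            return parse_thp_setting(handle.read())
    except OSError:
        return None
```

Posting the value returned by `read_thp_setting()` together with the kernel version would make reports from different machines easier to compare.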

Traumflug (mah-jump-ing) wrote :

Here's a similar, older bug: https://bugzilla.redhat.com/show_bug.cgi?id=879801 , workarounds included there, see comments 13, 14, 15.

Jonas Slivka (jonas-slivka) wrote :

It's possible that this bug is related to (or a duplicate of) this bug:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1680904
