Whole machine freezes at startup using kernels after 4.8

Bug #1697450 reported by Alberto Donato on 2017-06-12
This bug affects 1 person
Affects: linux (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

I'm trying to install from the daily Artful desktop image (20170612).

After boot, the GNOME session hangs at startup, while the "zoom" animation is being shown.

I have an Intel card:

00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09) (prog-if 00 [VGA controller])
        Subsystem: Gigabyte Technology Co., Ltd Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller
        Flags: bus master, fast devsel, latency 0, IRQ 29
        Memory at f7800000 (64-bit, non-prefetchable) [size=4M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at f000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [a4] PCI Advanced Features
        Kernel driver in use: i915
        Kernel modules: i915

I'm not sure how to debug this further, but I had similar issues with the installer from Zesty as well (see https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1681846)
With Artful, at least it gets past the splash screen but then it hangs.

Note that the system is currently running Yakkety with no issue.
---
ApportVersion: 2.20.7-0ubuntu2
Architecture: amd64
CurrentDesktop: GNOME
DistroRelease: Ubuntu 17.10
InstallationDate: Installed on 2017-07-30 (71 days ago)
InstallationMedia: Ubuntu-Server 17.10 "Artful Aardvark" - Alpha amd64 (20170730)
NonfreeKernelModules: zfs zunicode zcommon znvpair zavl
Package: linux
PackageArchitecture: amd64
ProcVersionSignature: Ubuntu 4.8.0-59.64-generic 4.8.17
Tags: artful wayland-session
Uname: Linux 4.8.0-59-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip libvirt lpadmin lxd plugdev sambashare sudo
_MarkForUpload: True

Alberto Donato (ack) wrote :

I gave this another try with latest daily, with the same result.

I then installed ubuntu-server and manually installed ubuntu-desktop on top of it.

At boot, I get to the login screen but, as soon as I select my user from the gdm menu, everything freezes. I can't see any errors in the logs when the freeze happens.

Phillip Susi (psusi) on 2017-09-22
no longer affects: ubiquity (Ubuntu)
summary: - Installer freezes at session startup
+ Gnome Shell hangs at login, while the "zoom" animation is being shown
affects: gdm (Ubuntu) → gdm3 (Ubuntu)

I can confirm this is a gnome-shell bug.

I tried booting with multi-user.target and manually starting gnome-shell; it hangs right after showing the mouse pointer on the console screen.

I also tried starting weston manually, which works fine.

Alberto Donato (ack) wrote :

As additional info, running Artful on the 4.8.0-59-generic kernel from Yakkety seems to work fine.

Alberto Donato (ack) on 2017-10-08
summary: - Gnome Shell hangs at login, while the "zoom" animation is being shown
+ Gnome Shell hangs at startup
Alberto Donato (ack) on 2017-10-08
tags: added: gnome-17.10

The version of GNOME in Ubuntu 17.10 is now quite different to when this bug was reported.

Does the bug still occur with newer images from here? http://cdimages.ubuntu.com/daily-live/current/

Changed in gdm3 (Ubuntu):
status: New → Incomplete
Changed in gnome-shell (Ubuntu):
status: New → Incomplete
Daniel van Vugt (vanvugt) wrote :

Also, comment #3 suggests this is a kernel bug so I'll add a kernel task.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1697450

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

If I run gnome-shell from the terminal with the latest (4.13.0-12.13) kernel, the system freezes.

Running apport-collect while running the 4.8 kernel segfaults:

$ sudo apport-collect 1697450
Invalid MIT-MAGIC-COOKIE-1 keyUnable to init server: Could not connect: Connection refused
Invalid MIT-MAGIC-COOKIE-1 keyUnable to init server: Could not connect: Connection refused
Segmentation fault (core dumped)

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Alberto Donato (ack) wrote :

Is there any way I could debug the issue with 4.13?

Daniel van Vugt (vanvugt) wrote :

It looks like that error is caused by sudo. Please just run:
  apport-collect 1697450

Changed in linux (Ubuntu):
status: Confirmed → Incomplete

apport information

tags: added: apport-collected wayland-session
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed

FTR this still happens with gnome-shell 3.26.1-0ubuntu4 and kernel 4.13.0-16-generic

Daniel van Vugt (vanvugt) wrote :

OK, we're going to need to analyse the hung gnome-shell process. Please try to do this:

1. Install openssh-server:
   sudo apt install openssh-server
2. Using a second machine, log into the offending machine via SSH:
   ssh <USERNAME>@<HOSTNAME>
   where <USERNAME> is your username, and <HOSTNAME> is either the machine's hostname or its IP address.
3. Reproduce the hang.
4. From your ssh login, run:
   killall -TRAP -u $USER gnome-shell
5. Look in /var/crash for a newly created file, something like '_usr_bin_gnome-shell.1000.crash'
6. Upload the crash file using this command or similar:
   ubuntu-bug /var/crash/_usr_bin_gnome-shell.1000.crash
7. Tell us the new bug ID that was created.
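
Steps 5 and 6 above could be sketched as a small shell helper. This is only an illustration: the function name `newest_shell_crash` is hypothetical, and the file name pattern is taken from the example in step 5, so it assumes apport names the report that way on your system.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified gnome-shell crash
# report in a directory (default: /var/crash, as in step 5).
# The glob follows the example file name from step 5 and is an assumption.
newest_shell_crash() {
    crash_dir="${1:-/var/crash}"
    # ls -t lists newest first; errors are silenced when nothing matches,
    # in which case the function prints nothing.
    ls -t "$crash_dir"/_usr_bin_gnome-shell.*.crash 2>/dev/null | head -n 1
}
```

With that defined, step 6 would become something like `ubuntu-bug "$(newest_shell_crash)"`, after checking that the helper actually printed a file name.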

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Alberto Donato (ack) wrote :

To test the issue, I booted in text mode (multi-user.target), logged in and then started gnome-shell.

Nothing happens (the session doesn't start and the console seems to hang).

Via ssh, I ran the killall, saved the .crash file and uploaded it to the following bug:

https://bugs.launchpad.net/ubuntu/+source/gnome-shell/+bug/1726600

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in gnome-shell (Ubuntu):
status: Incomplete → Confirmed
Changed in gdm3 (Ubuntu):
status: Incomplete → Confirmed
Daniel van Vugt (vanvugt) wrote :

Great, thanks. Looks like a duplicate of bug 1505409.

Alberto Donato (ack) wrote :

I'm still seeing a similar issue with daily images of Bionic.
At boot, the machine hangs at the splash screen.
If I boot without "splash quiet", I can see the last message is about starting the login session.

I'm not sure this is actually a duplicate of #1505409.

How can I help fix this? It makes it impossible for me to run Ubuntu with any kernel newer than 4.8.

tags: added: bionic
Daniel van Vugt (vanvugt) wrote :

Ah, yes. Sorry but your bug 1726600 is a duplicate of bug 1505409. And I think this one is different to those two. It just means the wrong .crash file was sent...

Please try following the steps in #15 again, or repeatedly, and we'll see if we can find an instance of this one.

Changed in gdm3 (Ubuntu):
status: Confirmed → Incomplete
Changed in gnome-shell (Ubuntu):
status: Confirmed → Incomplete
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Alberto Donato (ack) wrote :

@Daniel I can't really follow those steps exactly, as the machine hangs as soon as either GDM or gnome-shell is started

Daniel van Vugt (vanvugt) wrote :

Yes, I am assuming "hangs" means that the user interface is hung, but not the internals of the system. So that's why the instructions in comment #15 require a second machine and for you to be logged in from that already before reproducing the graphical hang.

Alberto Donato (ack) wrote :

@Daniel no, the whole system freezes. Even if I ssh in before the UI starts (which requires exact timing), the machine still freezes.

Daniel van Vugt (vanvugt) wrote :

OK, sounds very likely to be a kernel bug, triggered by GPU usage upon login.

summary: - Gnome Shell hangs at startup
+ Whole machine freezes at startup using kernels after 4.8
Changed in linux (Ubuntu):
status: Incomplete → New
Changed in gdm3 (Ubuntu):
status: Incomplete → Invalid
no longer affects: gdm3 (Ubuntu)
no longer affects: gnome-shell (Ubuntu)

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1697450

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Alberto Donato (ack) wrote :

How can I collect logs?

I'm currently using kernel 4.8.0-59.64 from Yakkety; anything newer makes the machine freeze at the gdm login, or as soon as I press Enter at the login screen.
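
One possible way to collect logs after a full freeze, assuming the system logs through systemd-journald with persistent storage, is to boot back into the working 4.8 kernel and read the kernel messages from the previous (frozen) boot. The sketch below is not something suggested in this thread; the function names are hypothetical, and a hard freeze may well prevent the last messages from reaching the disk.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a persistent journal directory exists.
# Without one, journald keeps logs only in /run and they are lost on reboot.
journal_is_persistent() {
    journal_dir="${1:-/var/log/journal}"
    [ -d "$journal_dir" ]
}

# Hypothetical helper: show kernel messages (-k) from the previous boot (-b -1).
previous_boot_kernel_log() {
    if journal_is_persistent; then
        journalctl -k -b -1 --no-pager
    else
        echo "No persistent journal found; create /var/log/journal and reboot." >&2
        return 1
    fi
}
```

Running `previous_boot_kernel_log > frozen-boot.log` right after rebooting from a freeze would give a file that can be attached to the bug, if anything was saved.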

Alberto Donato (ack) on 2018-01-05
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Kai-Heng Feng (kaihengfeng) wrote :

Can you switch to a VT when this issue happens?

Alberto Donato (ack) wrote :

No, everything is frozen. If I ssh into the machine before starting the session, the ssh connection freezes too.

Kai-Heng Feng (kaihengfeng) wrote :

Zesty has this problem, right?

Then please try Linux kernel v4.9 in http://kernel.ubuntu.com/~kernel-ppa/mainline/

Alberto Donato (ack) wrote :

Yes, Zesty has the same issue, and apparently Bionic too.

FTR I've tried the v4.14.12 one from the mainline PPA and it behaves the same.
I'll give v4.9 a try too.

Alberto Donato (ack) wrote :

I tested the 4.9.0-040900-generic (4.9.0-040900.201612111631) kernel from the mainline PPA, same issue.

Looking at the boot output, the last message I see is about gdm.service starting.
The machine freezes at the boot output as soon as the mouse cursor appears.

Kai-Heng Feng (kaihengfeng) wrote :

Now please find the first v4.9-rc* that has this issue.
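
Finding the first bad release candidate is a binary search over the ordered list of kernels. A minimal sketch of the search order follows, assuming every kernel before the first bad one is good, every kernel after it is bad, and the last one in the list is already known to be bad (as v4.9 is here). The names `first_bad` and the test command passed to it are hypothetical. In practice each check means installing that kernel and rebooting, so the "test command" is really a person noting whether the machine froze.

```shell
#!/bin/sh
# first_bad TEST_CMD VERSION...
# Binary-search an ordered, non-empty list of versions for the first one
# where TEST_CMD fails. Assumes good versions come first and the last
# version in the list is bad.
first_bad() {
    test_cmd=$1
    shift
    lo=1
    hi=$#
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"
        if "$test_cmd" "$candidate"; then
            lo=$((mid + 1))   # candidate is good: the first bad one is later
        else
            hi=$mid           # candidate is bad: first bad is this or earlier
        fi
    done
    eval "printf '%s\n' \"\${$lo}\""
}
```

With nine kernels in the list, this needs at most four boots instead of nine.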

Alberto Donato (ack) wrote :

So, it seems that changing a BIOS setting for the (built-in) video card frequency to "Auto" fixed the issue. It was previously set to match the RAM frequency (as the video memory is shared).
I can now run 4.13 fine.
Maybe something changed in the Intel driver in 4.9 that makes it freeze when a fixed frequency is set?

> On 26 Jan 2018, at 7:55 PM, Alberto Donato <email address hidden> wrote:
>
> So, it seems that changing a bios setting for the (builtin) video card frequency to "Auto" fixed the issue. it was previously set to match the RAM freq (as the video memory is shared).
> I can now run 4.13 fine.
> Maybe something changed in the intel driver in 4.9 that makes it freeze when a fixed freq is set?

Probably. But only a bisection can confirm it.
The good thing is that it now works for you.
