No server console display after Grub screen until fully booted and pressing Ctrl-Alt-F1/8

Bug #1656605 reported by Michael Lueck
This bug affects 1 person
Affects                        Status      Importance  Assigned to  Milestone
linux (Ubuntu)                 Incomplete  High        Unassigned
  Xenial                       Incomplete  High        Unassigned
release-upgrader-apt (Ubuntu)  New         Undecided   Unassigned
  Xenial                       New         Undecided   Unassigned

Bug Description

Today I did our first production 14.04 to 16.04 LTS server upgrade. I had done so a couple of times successfully on test server machines. Upon booting up on the 16.04 system/kernel, after the Grub screen the server console screen is completely blank.

Someone suggested using Ctrl-Alt-F1/8, and indeed pressing those hotkeys does toggle through the tty screens.

There is no boot process logged to the server console screen.

I still had the last 14.04 kernel installed, so I selected that one from the Grub menu. With that 14.04 kernel booted, the server console display behaves normally.

So I suspected perhaps a bum install of the 16.04 kernel. I used the following commands while booted to the 14.04 kernel to reinstall the 16.04 kernel:

$ sudo dpkg -P linux-image-4.4.0-59-generic linux-image-extra-4.4.0-59-generic linux-image-generic
$ sudo apt-get install linux-image-generic

Same results booting the 16.04 kernel after those steps.

Hardware spec is an Intel Atom D945GCLF2D boxed motherboard with Intel Atom 330 Dual-Core processor and 2GB RAM. RAID is 3Ware SATA RAID.

Revision history for this message
Michael Lueck (mlueck) wrote :
Brad Figg (brad-figg)
affects: linux-meta (Ubuntu) → linux (Ubuntu)
Revision history for this message
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Michael Lueck (mlueck) wrote :

A friend advised me to check two more outputs between a 14.04 server and 16.04 impacted server. Attaching the following additional outputs:

mdlueck@ldslnx01:/srv/shares/data/Download/OpenSource/Ubuntu/Xenial/Bugs/1656605_ServerNoConsoleDisplay$ cat /proc/consoles > consoles.log
mdlueck@ldslnx01:/srv/shares/data/Download/OpenSource/Ubuntu/Xenial/Bugs/1656605_ServerNoConsoleDisplay$ cat /proc/cmdline > cmdline.log
mdlueck@ldslnx01:/srv/shares/data/Download/OpenSource/Ubuntu/Xenial/Bugs/1656605_ServerNoConsoleDisplay$ cat consoles.log
tty0 -WU (EC p ) 4:1
mdlueck@ldslnx01:/srv/shares/data/Download/OpenSource/Ubuntu/Xenial/Bugs/1656605_ServerNoConsoleDisplay$ cat cmdline.log
BOOT_IMAGE=/vmlinuz-4.4.0-59-generic root=UUID=87ad7999-94fd-4004-b898-d6a943e5895e ro quiet splash vt.handoff=7

BTW: Another server, same config still on 14.04 LTS shows output:

mdlueck@cirlnx01:~$ cat /proc/consoles
tty0 -WU (EC p ) 4:1

$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.13.0-107-generic root=UUID=b8d89557-1fb2-4748-8708-f6ca416b7a87 ro quiet splash vt.handoff=7

So the output looks quite similar on both servers. (I do not detect any red flags, at least.)
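The comparison above can also be scripted. This is only a minimal sketch: `cmdline_flags` is a hypothetical helper name, not part of any Ubuntu tool, and the choice of which flags count as display-related (`quiet`, `splash`, `vt.handoff=`) is an assumption taken from the two command lines shown above.

```shell
#!/bin/sh
# cmdline_flags CMDLINE
# Print the display-related boot flags found in a kernel command line.
# Hypothetical helper for illustration; the flag list is an assumption.
cmdline_flags() {
    found=""
    # Intentionally unquoted: split the command line into words.
    for word in $1; do
        case "$word" in
            quiet|splash|vt.handoff=*) found="$found $word" ;;
        esac
    done
    # Drop the leading space before printing.
    echo "${found# }"
}

# Usage on a running Linux system:
#   cmdline_flags "$(cat /proc/cmdline)"
```

Running it against the saved cmdline.log from each server gives a one-line summary that is easier to compare than the full command line.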

Revision history for this message
Michael Lueck (mlueck) wrote :

Ah, a notable difference between the hardware spec of the test servers and the real servers:

The test server is on a test machine which has a graphics board installed in it: EVGA e-GeForce 8400 GS Graphics Card - 512 MB RAM 512-P3-1301-KR

The real servers are using the on-board Intel graphics.

No option to move the Nvidia board to the impacted server machine as there is no PCI-e slot on the server boards. They have only one card slot which is in use for the RAID controller.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.10 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10-rc4

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key xenial
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Michael Lueck (mlueck) wrote :

Yes, working perfectly! Trying to remember how I set tags. Here is the output of applying the test kernel:

mdlueck@ldslnx01:/srv/shares/data/Download/OpenSource/Ubuntu/Xenial/Bugs/1656605_ServerNoConsoleDisplay/v4.10-rc4$ sudo dpkg -i linux-image-4.10.0-041000rc4-generic_4.10.0-041000rc4.201701152031_i386.deb
[sudo] password for mdlueck:
Selecting previously unselected package linux-image-4.10.0-041000rc4-generic.
(Reading database ... 57910 files and directories currently installed.)
Preparing to unpack linux-image-4.10.0-041000rc4-generic_4.10.0-041000rc4.201701152031_i386.deb ...
Done.
Unpacking linux-image-4.10.0-041000rc4-generic (4.10.0-041000rc4.201701152031) ...
Setting up linux-image-4.10.0-041000rc4-generic (4.10.0-041000rc4.201701152031) ...
Running depmod.
update-initramfs: deferring update (hook will be called later)
Examining /etc/kernel/postinst.d.
run-parts: executing /etc/kernel/postinst.d/apt-auto-removal 4.10.0-041000rc4-generic /boot/vmlinuz-4.10.0-041000rc4-generic
run-parts: executing /etc/kernel/postinst.d/initramfs-tools 4.10.0-041000rc4-generic /boot/vmlinuz-4.10.0-041000rc4-generic
update-initramfs: Generating /boot/initrd.img-4.10.0-041000rc4-generic
W: Possible missing firmware /lib/firmware/i915/kbl_dmc_ver1_01.bin for module i915
W: Possible missing firmware /lib/firmware/i915/kbl_guc_ver9_14.bin for module i915
W: Possible missing firmware /lib/firmware/i915/bxt_guc_ver8_7.bin for module i915
run-parts: executing /etc/kernel/postinst.d/update-notifier 4.10.0-041000rc4-generic /boot/vmlinuz-4.10.0-041000rc4-generic
run-parts: executing /etc/kernel/postinst.d/zz-update-grub 4.10.0-041000rc4-generic /boot/vmlinuz-4.10.0-041000rc4-generic
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.10.0-041000rc4-generic
Found initrd image: /boot/initrd.img-4.10.0-041000rc4-generic
Found linux image: /boot/vmlinuz-4.4.0-59-generic
Found initrd image: /boot/initrd.img-4.4.0-59-generic
Found linux image: /boot/vmlinuz-3.13.0-107-generic
Found initrd image: /boot/initrd.img-3.13.0-107-generic
Found memtest86+ image: /memtest86+.elf
Found memtest86+ image: /memtest86+.bin
done

tags: added: kernel-fixed-upstream
Michael Lueck (mlueck)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Michael Lueck (mlueck) wrote :

Question: So is there hope of getting this fixed in the default kernel version for the 16.04 release, or is it going to be addressed only with the "Ubuntu 16.04.2 LTS Point Release Coming On Feb 2 With Linux Kernel 4.8" newer kernel?

No... Linux Kernel 4.4 vs Linux Kernel 4.10, so that is not Linux Kernel 4.8 either.

Anyway, do you need any further input from me? And can this be resolved with the Linux Kernel 4.4 version? I would rather not have to apply a hardware compatibility kernel to be able to run Ubuntu 16.04 on server hardware from 2009.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Thanks for the update. We can perform a "Reverse" bisect to identify the commit that fixes the bug in v4.10-rc4. We first need to identify the last bad kernel and first good one.

Can you test the following kernels:

4.5 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.5-wily/
4.8-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8-rc1/
4.10-rc1: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10-rc1/

You don't have to test every kernel, just up to the first kernel that does not have this bug.

We can SRU the commit to all releases this bug affects once we identify it.
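The search for the first good kernel in an ordered list can be sketched as a binary search. This is an illustration only: `first_good` and `demo_check` are hypothetical names, the sketch assumes that every version before the first good one is bad, and the outcome baked into `demo_check` is made up. In the real workflow the check is a manual install, reboot, and look at the console.

```shell
#!/bin/sh
# first_good CHECK VERSION...
# Binary-search an ordered version list for the first one that passes CHECK.
# Assumes all versions before it fail and all versions from it onward pass.
first_good() {
    check=$1; shift
    lo=1; hi=$#
    result=""
    while [ "$lo" -le "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        # Fetch the mid-th positional parameter.
        eval "candidate=\${$mid}"
        if "$check" "$candidate"; then
            result=$candidate      # passes: look for an earlier pass
            hi=$((mid - 1))
        else
            lo=$((mid + 1))        # fails: first pass must be later
        fi
    done
    echo "$result"
}

# Simulated check with a made-up outcome: only "4.4" is treated as bad.
demo_check() {
    case "$1" in
        4.4) return 1 ;;
        *)   return 0 ;;
    esac
}

first_good demo_check 4.4 4.5 4.8-rc1 4.10-rc1
```

With a longer list this needs far fewer boots than walking the versions one at a time, which matters when each test is a reboot of a production server.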

Changed in linux (Ubuntu):
status: Confirmed → In Progress
Changed in linux (Ubuntu Xenial):
status: New → In Progress
importance: Undecided → High
Changed in linux (Ubuntu):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Joseph Salisbury (jsalisbury)
Revision history for this message
Michael Lueck (mlueck) wrote :

Excellent, first try lucky. I downloaded all three requested kernel packages. Started with the:

4.5 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.5-wily/

And this version already corrects the entire issue seen with the 4.4.0-59-generic official 16.04 version.

Here is the console output from applying it:

mdlueck@ldslnx01:/srv/shares/data/Download/OpenSource/Ubuntu/Xenial/Bugs/1656605_ServerNoConsoleDisplay/v4.5-wily$ sudo dpkg -i linux-image-4.5.0-040500-generic_4.5.0-040500.201603140130_i386.deb
Selecting previously unselected package linux-image-4.5.0-040500-generic.
(Reading database ... 59100 files and directories currently installed.)
Preparing to unpack linux-image-4.5.0-040500-generic_4.5.0-040500.201603140130_i386.deb ...
Done.
Unpacking linux-image-4.5.0-040500-generic (4.5.0-040500.201603140130) ...
Setting up linux-image-4.5.0-040500-generic (4.5.0-040500.201603140130) ...
Running depmod.
update-initramfs: deferring update (hook will be called later)
Examining /etc/kernel/postinst.d.
run-parts: executing /etc/kernel/postinst.d/apt-auto-removal 4.5.0-040500-generic /boot/vmlinuz-4.5.0-040500-generic
run-parts: executing /etc/kernel/postinst.d/initramfs-tools 4.5.0-040500-generic /boot/vmlinuz-4.5.0-040500-generic
update-initramfs: Generating /boot/initrd.img-4.5.0-040500-generic
run-parts: executing /etc/kernel/postinst.d/update-notifier 4.5.0-040500-generic /boot/vmlinuz-4.5.0-040500-generic
run-parts: executing /etc/kernel/postinst.d/zz-update-grub 4.5.0-040500-generic /boot/vmlinuz-4.5.0-040500-generic
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.10.0-041000rc4-generic
Found initrd image: /boot/initrd.img-4.10.0-041000rc4-generic
Found linux image: /boot/vmlinuz-4.5.0-040500-generic
Found initrd image: /boot/initrd.img-4.5.0-040500-generic
Found linux image: /boot/vmlinuz-4.4.0-59-generic
Found initrd image: /boot/initrd.img-4.4.0-59-generic
Found memtest86+ image: /memtest86+.elf
Found memtest86+ image: /memtest86+.bin
done

I will stay booted to this kernel version.

Please let me know if I may be of further assistance.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Can you also confirm the bug does indeed happen with the upstream 4.4 final kernel:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4-wily/

That will tell us if the bug was introduced in 4.4 or one of the 4.4 stable updates. It will also tell us that 4.4 final is the starting kernel for the reverse bisect.

We should also narrow down the 4.5 a bit further as well by testing some of the 4.5 release candidates. Can you test 4.5-rc1:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.5-rc1-wily/

Revision history for this message
Michael Lueck (mlueck) wrote :

OK, I got the first kernel (v4.4-wily) validated that it works properly. I IPL'ed back to the 4.4.0-59 to validate the problem had not suddenly vanished, still there.

So:
4.4.0-59 bad / has the issue
v4.4-wily working / no issue

Do you still need me to install / test / validate the v4.5-rc1-wily kernel, in this case?

I am thinking to stay booted to the v4.4-wily kernel for now.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

No need to test the 4.5-rc1 kernel. Because v4.4-wily is good, we cannot bisect between 4.4 and 4.5-rc1. It also means that the bug was introduced by one of the 4.4 stable updates, or by an Ubuntu-specific SAUCE patch.

We should bisect between the 4.4 Ubuntu kernels. However, it would also be good to know if this bug was introduced into Yakkety (16.10). Can you test the latest Yakkety kernel, which can be downloaded from:
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/11765356

With that kernel, you need to install both the linux-image and linux-image-extra .deb packages.

For the Xenial bisect, the kernels we need to test are all available from:
https://launchpad.net/ubuntu/xenial/+source/linux

What we would need to do is test the kernels and find the earliest kernel that does not have the bug, then the first that does. Then we would bisect between those two versions to find the offending commit.

Revision history for this message
Michael Lueck (mlueck) wrote :

Noted that you want me to test the Yakkety (16.10) kernel specified.

About the Xenial kernels... I am thinking to walk the version list back from 4.4.0-59 to land on one that does not have the error. Then post the last one that was working, and the first one with the noted defect. Does that sound like a plan to you, Joseph?

Revision history for this message
Michael Lueck (mlueck) wrote :

Wowsers... this bug is already present in linux-image-4.4.0-1-generic!

So was linux-image-4.4.0-1-generic the very next version after v4.4-wily? Or do I need to test some additional versions?

The URL you posted for the Yakkety (16.10) kernel leads directly to an amd64-specific build... this system is running the i386 kernel.

Revision history for this message
Michael Lueck (mlueck) wrote :

... and I was not supposed to be booting from a lower entry than the top entry for the particular kernel build I am testing, was I? The top entry has no suffix text, the next one I believe was labeled "Upstart", and the third was "Recovery".

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Yes, the top entry in GRUB is the one to boot.

Sorry about posting the amd64 link for Yakkety. The i386 link is:
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/11923303

Now that we know the bug is in the earliest 4.4 Ubuntu-based kernel, it would be good to know if it is in any of the 4.3-based kernels. The last Xenial-based one is here:
https://launchpad.net/ubuntu/+source/linux/4.3.0-7.18

Revision history for this message
Michael Lueck (mlueck) wrote :

Same defect with both the https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/11923303 and https://launchpad.net/ubuntu/+source/linux/4.3.0-7.18 kernel builds.

I am back booted to linux-image-4.4.0-59-generic and all other kernels have been purged back off the server.

Please advise what you need me to test next.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Michael Lueck (mlueck) wrote :

Tested 4.2.0-19.23 kernel, same issue. Purging it off.

So, quick question... between "v4.4-wily" which worked and "linux-image-4.4.0-1-generic", is the difference the Ubuntu customizations applied to the official kernel build? If so, is it something in how Ubuntu customizes the kernel source that is causing this issue?

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Correct, the v4.4-wily kernel is an upstream version of the kernel with no Ubuntu customization. That is the reason we need to go back in time and test the older Ubuntu kernels to find the specific version when the bug was introduced. That will allow us to bisect and find the exact commit that caused the regression.

Can you test the 4.2.0-16.19 kernel:
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/8099558

That is the earliest pre-built Xenial kernel, so if that kernel has the bug, I'll have to build older versions.

Revision history for this message
Michael Lueck (mlueck) wrote :

Sorry... the defect still persists with the 4.2.0-16.19 kernel build. I purged it back off our server. Booted back to 4.4.0-59.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Michael Lueck (mlueck) wrote :

The 3.19.0-80 kernel build still has the defect. I purged it back off our server. Booted back to 4.4.0-59.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

It looks like I'll have to build some kernels in between 3.13 and 3.19. Just as a confirmation, can you ensure the latest Trusty kernel does not have the bug:

https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/11845243

Revision history for this message
Michael Lueck (mlueck) wrote :

Yes, that v3.13.0-108.155 kernel build boots up with the expected "Ubuntu ...." splash screen, though it would not boot up all the way cleanly... it went into busybox.

Received v4.4.0-62 via Xenial updates, so purging off both the v3.13.0-108.155 and v4.4.0-59 kernel versions.

So yes, somewhere between 3.13 and 3.19 appears to be where the damage was done. Perhaps do a build of 3.16ish to begin a binary search method?

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :
Revision history for this message
Michael Lueck (mlueck) wrote :

Build 3.16.0-23 has the reported defect. Purged back off.

Next build to evaluate, please...

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I built a 3.16.0-0.1 kernel. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1656605/

Can you give this kernel a test?

Revision history for this message
Michael Lueck (mlueck) wrote :

There does not appear to be i386 files in that directory.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I built a 64 bit kernel. The 32 bit one is there now.

Revision history for this message
Michael Lueck (mlueck) wrote :

Build 3.16.0-0.1 has the reported defect. Purged back off.

Next build to evaluate, please...

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I built a 3.15.0-0.1 kernel. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1656605/

Can you give this kernel a test?

Revision history for this message
Michael Lueck (mlueck) wrote :

Looks like build 3.15.0-0.1 still has the issue. It also would not boot correctly... landed the server at a busybox type interface.

Purged back off.

So, into the 3.14's next?

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

There was never a 3.14 based Ubuntu kernel released, so we may be at the two bisectable versions.

Can you confirm 3.13.0-9.29 is good:
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/5589423

If 3.13.0-9.29 is good, I can bisect between 3.13.0-9.29 and 3.15.0-0.1 and start building test kernels.

Revision history for this message
Michael Lueck (mlueck) wrote :

Correct, 3.13.0-9.29 boots with appropriate "Ubuntu ...." text, then lands at a busybox type interface.

So it looks like something bad happened in between there.

Purged back off. Ready for the next test.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

This may have been introduced by commit 4ae5bd9. However, I'd first like to see if this is expected and not a bug.

Can you perform the following:

edit /etc/default/grub and remove quiet and splash from the line:
GRUB_CMDLINE_LINUX_DEFAULT

So the line will be:
GRUB_CMDLINE_LINUX_DEFAULT=""

After changing that line, run the following:
sudo update-grub and reboot.
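The edit described above can be sketched as a small script. This is a cautious sketch, not an official procedure: `clear_grub_default_cmdline` is a hypothetical helper name, and it is meant to be tried on a scratch copy of the file first. It only rewrites the active `GRUB_CMDLINE_LINUX_DEFAULT` line and leaves commented-out lines alone.

```shell
#!/bin/sh
# clear_grub_default_cmdline FILE
# Set the active GRUB_CMDLINE_LINUX_DEFAULT line in FILE to an empty value,
# keeping the previous contents in FILE.bak. Hypothetical helper name.
clear_grub_default_cmdline() {
    file=$1
    cp "$file" "$file.bak" || return 1
    # Only lines that start with the variable name are rewritten, so
    # commented-out copies and GRUB_CMDLINE_LINUX= are left untouched.
    sed 's/^GRUB_CMDLINE_LINUX_DEFAULT=.*/GRUB_CMDLINE_LINUX_DEFAULT=""/' \
        "$file.bak" > "$file"
}

# Example on a scratch copy:
#   cp /etc/default/grub /tmp/grub.test
#   clear_grub_default_cmdline /tmp/grub.test
#   diff /tmp/grub.test.bak /tmp/grub.test
# On the real file, follow with "sudo update-grub" and a reboot.
```

Keeping the `.bak` copy makes it easy to restore the previous value if the console output turns out not to be what is wanted.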

Revision history for this message
Michael Lueck (mlueck) wrote :

Excellent find, Joseph!

Here are the lines of interest from our /etc/default/grub files:

GRUB_DEFAULT=0
#GRUB_HIDDEN_TIMEOUT=0
GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
#GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX=""

So we had value: "quiet splash"

I tried your suggested value: "" That worked, not quite the old user experience, however.

I tried value: "splash" to see if that would bring back the "Ubuntu ...." start-up splash screen. Nada.

I next moved to value: "quiet" That at least shows IPL progress, and successfully lands at TTY1 login screen. We will run with this new setting for now.

So question, was it intentional to no longer have "Ubuntu ...." IPL 'splash screen' during which Esc may be pressed to see the actual IPL steps on the server console?

Revision history for this message
Michael Lueck (mlueck) wrote :

We have now built brand new servers with Ubuntu Server 16.04 x64. The initial state of the /etc/default/grub file is:

GRUB_CMDLINE_LINUX_DEFAULT=""

It must be that this edit was missed in the LTS upgrade from 14.04 to 16.04.

I have modified our LTS upgraded 16.04 servers to this setting.

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Changed in linux (Ubuntu Xenial):
status: In Progress → Incomplete
Revision history for this message
Michael Lueck (mlueck) wrote :

@Joseph, Do you mean incomplete based on this bug report needing to morph into an issue against the 14.04 to 16.04 LTS upgrade process?

How may I assist?

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Correct, it doesn't sound like this is a kernel bug, so I added the upgrade package.

Changed in linux (Ubuntu Xenial):
assignee: Joseph Salisbury (jsalisbury) → nobody
Changed in linux (Ubuntu):
assignee: Joseph Salisbury (jsalisbury) → nobody
Revision history for this message
Michael Lueck (mlueck) wrote :

I just tested a Xubuntu 14.04 x64 LTS upgrade to Xubuntu 16.04. The value:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
is correct for Xubuntu. That is what causes the boot process to have the GUI screen rather than text console messages.

So note to the upgrade package maintainers... it needs to be smart enough to differentiate between Ubuntu Server and Ubuntu desktop (e.g. Xubuntu) LTS upgrades and select the correct value.
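One way such a check might look, purely as a sketch: `choose_cmdline_default` is a hypothetical helper, and using the presence of a desktop metapackage as the signal is an assumption on my part, not how the release upgrader actually decides. The two values it prints are the ones observed in this report (empty on a fresh Server 16.04 install, "quiet splash" on Xubuntu).

```shell
#!/bin/sh
# choose_cmdline_default PKG...
# Given a list of installed package names, print "quiet splash" if a desktop
# metapackage is among them, otherwise print an empty line.
# Hypothetical heuristic for illustration only.
choose_cmdline_default() {
    for pkg in "$@"; do
        case "$pkg" in
            ubuntu-desktop|xubuntu-desktop|kubuntu-desktop|lubuntu-desktop)
                echo 'quiet splash'
                return 0
                ;;
        esac
    done
    echo ''
}

# Possible usage on a Debian/Ubuntu system. Note that this dpkg-query call
# also lists removed packages whose config files remain, so a stricter
# filter on package status may be needed:
#   choose_cmdline_default $(dpkg-query -W -f='${Package}\n')
```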
