Ubuntu
linux package

3.13.0-155.205 Kernel Panic - divide by zero

Bug #1787258 reported by Ryan Smith on 2018-08-15

This bug affects 6 people

Affects            Status         Importance   Assigned to   Milestone
linux (Ubuntu)     Invalid        Critical     Unassigned
Trusty             Fix Released   Critical     Tyler Hicks

Bug Description

[Impact]

Booting the 3.13.0-155.205 generic kernel on an m3 AWS EC2 instance results in a kernel panic during boot.

[Test Case]

Boot with the 3.13.0-155.205 kernel on an m3 instance and verify that it panics on boot.

Boot a patched kernel on an m3 instance and verify that it boots, without a panic, and that the following warning is present in the kernel logs:

smpboot: x86_max_cores == zero !?!?
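
For anyone scripting the second half of this test case, here is a minimal sketch that scans captured kernel log text for the warning above and for the divide error. The function names are hypothetical, and it assumes the log has already been saved as text (for example from dmesg or the EC2 console output).

```python
import re

# Warning text quoted in the test case; any leading timestamp is ignored.
WARNING = re.compile(r"smpboot: x86_max_cores == zero")


def has_max_cores_warning(log_text: str) -> bool:
    """Return True if any line of the kernel log contains the warning."""
    return any(WARNING.search(line) for line in log_text.splitlines())


def has_divide_error(log_text: str) -> bool:
    """Return True if the log shows the 'divide error' oops header."""
    return any("divide error:" in line for line in log_text.splitlines())
```

A patched boot would be expected to satisfy has_max_cores_warning(log) and not has_divide_error(log); an unpatched boot on an affected instance would show the opposite.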

[Regression Potential]

The only potential for regressions is in systems that panic while booting.

[Original Report]

We have updated our 14.04 AWS EC2 instances from 3.13.0-153.204 to 3.13.0-155.205, and upon reboot they all kernel panic. Full log attached.

[ 0.064081] FEATURE SPEC_CTRL Not Present
[ 0.068730] mce: CPU supports 2 MCE banks
[ 0.072027] Last level iTLB entries: 4KB 512, 2MB 0, 4MB 0
[ 0.072027] Last level dTLB entries: 4KB 512, 2MB 0, 4MB 0
[ 0.080004] Spectre V2 mitigation: Mitigation: Full generic retpoline
[ 0.084004] Spectre V2 mitigation: Speculation control IBPB not-supported IBRS not-supported
[ 0.088005] Speculative Store Bypass: Vulnerable
[ 0.092402] Freeing SMP alternatives memory: 32K (ffffffff81e7a000 - ffffffff81e82000)
[ 0.104581] ACPI: Core revision 20131115
[ 0.111088] ACPI: All ACPI Tables successfully acquired
[ 0.114991] ftrace: allocating 28746 entries in 113 pages
[ 0.160066] divide error: 0000 [#1] SMP
[ 0.163922] Modules linked in:
[ 0.164000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-155-generic #205-Ubuntu
[ 0.164000] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
[ 0.164000] task: ffff8801e4828000 ti: ffff8801e4826000 task.ti: ffff8801e4826000
[ 0.164000] RIP: 0010:[<ffffffff81d4b9f2>] [<ffffffff81d4b9f2>] smp_store_boot_cpu_info+0x58/0x191
[ 0.164000] RSP: 0000:ffff8801e4827e98 EFLAGS: 00010286
[ 0.164000] RAX: 000000000000000e RBX: ffffffff81d18980 RCX: 0000000000000000
[ 0.164000] RDX: 0000000000000000 RSI: 00000000000000d0 RDI: ffff8801efc13380
[ 0.164000] RBP: ffff8801e4827ec0 R08: ffffffff81d18988 R09: 0000000000000004
[ 0.164000] R10: ffffffff8180b6c0 R11: 0001f8ecf7bca282 R12: 0000000000013280
[ 0.164000] R13: 00000000ffffffff R14: 0000000000000100 R15: 000000000000d088
[ 0.164000] FS: 0000000000000000(0000) GS:ffff8801efc00000(0000) knlGS:0000000000000000
[ 0.164000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.164000] CR2: ffff8801effff000 CR3: 0000000001c0e000 CR4: 0000000000160670
[ 0.164000] Stack:
[ 0.164000] ffffffff81d18980 0000000000013280 0000000000000246 0000000000000100
[ 0.164000] 0000000000000000 ffff8801e4827ef0 ffffffff81d4bb82 ffffffff81e5df18
[ 0.164000] ffff8801e4828650 0000000000000246 0000000000000001 ffff8801e4827f00
[ 0.164000] Call Trace:
[ 0.164000] [<ffffffff81d4bb82>] native_smp_prepare_cpus+0x57/0x3e0
[ 0.164000] [<ffffffff81d404e1>] xen_hvm_smp_prepare_cpus+0x9/0x2e
[ 0.164000] [<ffffffff81d3a01b>] kernel_init_freeable+0xa7/0x1eb
[ 0.164000] [<ffffffff81727500>] ? rest_init+0x80/0x80
[ 0.164000] [<ffffffff8172750e>] kernel_init+0xe/0x130
[ 0.164000] [<ffffffff8174a88e>] ret_from_fork+0x6e/0xa0
[ 0.164000] [<ffffffff81727500>] ? rest_init+0x80/0x80
[ 0.164000] Code: 48 89 c7 41 83 cd ff 41 54 53 f3 a5 66 c7 80 da 00 00 00 00 00 be d0 00 00 00 0f b7 0d 20 a9 fc ff 8b 05 e2 c5 28 00 8d 44 01 ff <f7> f1 31 d2 89 05 c8 c2 fc ff 8d 81 ff 7f 00 00 f7 f1 89 c3 89
[ 0.164000] RIP [<ffffffff81d4b9f2>] smp_store_boot_cpu_info+0x58/0x191
[ 0.164000] RSP <ffff8801e4827e98>
[ 0.324006] ---[ end trace 8671c9f8a4dc811d ]---
[ 0.328017] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 0.328017]
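
As a reading aid for the trace above, this sketch pulls out the faulting symbol, the RCX value, and the trapping instruction. The byte marked with angle brackets on the Code: line is where the trap occurred. The decoding assumes no instruction prefix and handles only the one encoding seen here (opcode F7 with ModRM reg field 6, an unsigned DIV with a register divisor). The helper name summarize_oops is hypothetical.

```python
import re

# 32-bit register names indexed by the low three ModRM bits.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def summarize_oops(log_text):
    """Extract the faulting symbol, RCX, and trapping instruction from an oops."""
    sym = re.search(r"RIP: .*?(\w+\+0x[0-9a-f]+/0x[0-9a-f]+)", log_text)
    rcx = re.search(r"RCX: ([0-9a-f]{16})", log_text)
    code = re.search(r"Code: .*?<([0-9a-f]{2})> ([0-9a-f]{2})", log_text)
    summary = {
        "symbol": sym.group(1) if sym else None,
        "rcx": int(rcx.group(1), 16) if rcx else None,
        "insn": None,
    }
    if code:
        opcode = int(code.group(1), 16)
        modrm = int(code.group(2), 16)
        # F7 /6 is unsigned DIV; mod == 3 means the divisor is a register.
        if opcode == 0xF7 and (modrm >> 3) & 7 == 6 and modrm >> 6 == 3:
            summary["insn"] = "div %" + REG32[modrm & 7]
    return summary
```

Run against the log in this report, the marked bytes f7 f1 decode to a division by ECX, and the register dump shows RCX as zero, which matches the "divide error" header.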

Ryan Smith (homebrewsky) wrote on 2018-08-15:

system.log.txt (63.1 KiB, text/plain)

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote on 2018-08-15: Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1787258

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status:	New → Incomplete
tags:	added: trusty

Ryan Smith (homebrewsky) wrote on 2018-08-15:

I am unable to run 'apport-collect' due to the kernel panic.

Changed in linux (Ubuntu):
status:	Incomplete → Confirmed

Ryan Smith (homebrewsky) on 2018-08-15

description:	updated

Matt Wilson (msw-amazon) wrote on 2018-08-15:

What instance type saw this kernel panic?

Dave Compton (sircompo) wrote on 2018-08-15:

One of my servers died with a Kernel Panic on reboot after the update to 3.13.0-155-generic.
It was an m3.large on AMI ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-20140927 (ami-1711732d).

Joseph Salisbury (jsalisbury) on 2018-08-15

Changed in linux (Ubuntu):
importance:	Undecided → Critical
Changed in linux (Ubuntu Trusty):
importance:	Undecided → Critical
status:	New → Confirmed

Ryan Smith (homebrewsky) wrote on 2018-08-15:

m3.large for me as well.

Tyler Hicks (tyhicks) on 2018-08-15

Changed in linux (Ubuntu Trusty):
assignee:	nobody → Tyler Hicks (tyhicks)

Tyler Hicks (tyhicks) wrote on 2018-08-15:

In comparing the Xenial and Trusty backports for L1TF, I noticed that Trusty is missing this patch:

https://git.kernel.org/linus/56402d63eefe22179f7311a51ff2094731420406

I've cherry-picked the commit and built a test kernel:

https://people.canonical.com/~tyhicks/lp1787258.1/

Please give it a shot (I will myself, shortly) and report back results. Thanks!
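
To illustrate the kind of guard the test case implies (a warning instead of a divide by zero), here is a Python model. It is not the kernel's C code. The function names are hypothetical, and falling back to one core when the reported count is zero is an assumption made for this sketch. Only the warning text comes from this report.

```python
import warnings


def div_round_up(n: int, d: int) -> int:
    """Ceiling division for non-negative n and positive d."""
    return (n + d - 1) // d


def max_logical_packages(total_cpus: int, max_cores: int) -> int:
    """Model of a package-count computation guarded against a zero divisor."""
    if max_cores == 0:
        # Assumed fallback: warn (same text as in the test case) and use 1.
        warnings.warn("x86_max_cores == zero !?!?")
        max_cores = 1
    return div_round_up(total_cpus, max_cores)
```

Without the guard, div_round_up(total_cpus, 0) raises ZeroDivisionError in Python, which is the analogue of the divide error trap in the trace.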

Dave Compton (sircompo) wrote on 2018-08-15:

Boots OK after changing instance type to m4.large.
Poorly tested Spectre fix?

Ryan Smith (homebrewsky) wrote on 2018-08-15:

I feel this bug is not a duplicate of bug #1787127. Bug #1787127 is not a kernel panic due to division by zero.

Tyler Hicks (tyhicks) wrote on 2018-08-15:

#10

Agreed, it is not a dupe of bug #1787127.

Robert C Jennings (rcj) wrote on 2018-08-15:

#11

@tyhicks, I've tested your kernel from comment #7.

1. Launch 2 VMs in us-west-2 with ami-4218403a (20180722, the serial prior to the latest)
2. Upgrade the first VM to the kernel in -updates, reboot, and observe the panic in the console log
3. On the 2nd VM, install the linux-image and linux-headers packages from the link in comment #7 and reboot: SUCCESS
   * Observed "[ 0.156060] smpboot: x86_max_cores == zero !?!?" in dmesg
   * I rebooted a few times just to satisfy myself.
4. Confirm that this VM does panic without the fix: remove tyhicks' kernel, upgrade to the stock kernel, and reboot. The VM console shows the panic.

Tyler Hicks (tyhicks) on 2018-08-15

description:	updated
description:	updated

mig5 (mig5) wrote on 2018-08-16:

#12

This affected me on 3 t2.medium EC2 instances in eu-west-1.

A fourth machine, updated the previous day to the same kernel (3.13.0-155.205) and also running Ubuntu 14.04, was fine and somehow unaffected.

Stefan Bader (smb) on 2018-08-16

Changed in linux (Ubuntu Trusty):
status:	Confirmed → Fix Committed

Ryan Smith (homebrewsky) wrote on 2018-08-16:

#13

Any ETA on releasing the fix into the wild?

Pascal Ouellet (pas.ouellet) on 2018-08-16

information type:	Public → Public Security

Pascal Ouellet (pas.ouellet) on 2018-08-16

information type:	Public Security → Public

Brad Figg (brad-figg) wrote on 2018-08-16:

#14

We're actively working on this problem. I believe a fixed kernel will be out tomorrow.

Ryan Smith (homebrewsky) wrote on 2018-08-17:

#15

Great! Thanks for the quick fix!

Valentin DARRE (valkiller) on 2018-08-17

Changed in linux (Ubuntu Trusty):
status:	Fix Committed → Fix Released

Ryan Smith (homebrewsky) wrote on 2018-08-17:

#16

I checked https://packages.ubuntu.com/trusty-updates/kernel/ and don't see a new kernel image released yet. Any ETA on it being available?

Brad Figg (brad-figg) wrote on 2018-08-17:

#17

We have just released a Trusty kernel (3.13.0-156.206) which should address this issue.
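
For fleets that need to check which instances still carry the bad kernel, a small sketch of a version comparison follows. It assumes the version string is in the package form used in this report (for example 3.13.0-155.205), and it only makes a claim about the Trusty 3.13.0 series. The function names are hypothetical.

```python
AFFECTED = (3, 13, 0, 155, 205)  # panics on boot per this report
FIXED = (3, 13, 0, 156, 206)     # release announced as addressing the issue


def parse_kernel_version(version: str) -> tuple:
    """Turn '3.13.0-155.205' into (3, 13, 0, 155, 205)."""
    upstream, _, abi_upload = version.partition("-")
    return tuple(int(p) for p in upstream.split(".") + abi_upload.split("."))


def is_affected(version: str) -> bool:
    """True only for the exact kernel reported as panicking."""
    return parse_kernel_version(version) == AFFECTED


def has_fix(version: str) -> bool:
    """True for 3.13.0 series kernels at or after the fixed release."""
    parsed = parse_kernel_version(version)
    return parsed[:3] == FIXED[:3] and parsed >= FIXED
```

Note that 3.13.0-153.204 is neither affected nor fixed by this definition, which matches the original report: it booted fine before the upgrade.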

Pascal Ouellet (pas.ouellet) wrote on 2018-08-17:

#18

Tested on a few AWS instances that wouldn't boot on 3.13.0-155.205 and they are now back to normal after upgrading to 3.13.0-156.206.

Thanks for the quick fix!

Tyler Hicks (tyhicks) on 2018-09-13

Changed in linux (Ubuntu):
status:	Confirmed → Invalid

Brad Figg (brad-figg) on 2019-07-24

tags:	added: cscc
