Kernel panic - not syncing: Fatal exception in interrupt

Bug #1702164 reported by Khaled Massad
This bug affects 5 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

We updated the kernel on Ubuntu 16.04 from 4.4.0-81-generic to 4.4.0-83-generic (package version 4.4.0-83.106). Since then we have had multiple reboots, and the following appeared during boot:

[231489.162816] Kernel panic - not syncing: Fatal exception in interrupt
[231490.197513] Shutting down cpus with NMI
[231490.210772] Kernel Offset: disabled
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 4.4.0-83-generic (buildd@lgw01-29) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017 (Ubuntu 4.4.0-83.106-generic 4.4.70)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.4.0-83-generic root=UUID=e675b47c-3493-4ef8-8b6f-44acb3d1adb9 ro net.ifnames=0 biosdevname=0 cgroup_enable=memory swapaccount=1 console=tty1 console=ttyS0
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Centaur CentaurHauls
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers'
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[ 0.000000] x86/fpu: Using 'eager' FPU context switches.
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007fffffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000f47fffffff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.4 present.
[ 0.000000] Hypervisor detected: Xen
[ 0.000000] Xen version 4.2.
[ 0.000000] Netfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated NICs.
[ 0.000000] Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated disks.
[ 0.000000] You might have to change the root device
[ 0.000000] from /dev/hd[a-d] to /dev/xvd[a-d]
[ 0.000000] in your root= kernel command line option
[ 0.000000] e820: last_pfn = 0xf480000 max_arch_pfn = 0x400000000
[ 0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT
[ 0.000000] e820: last_pfn = 0x80000 max_arch_pfn = 0x400000000
[ 0.000000] found SMP MP-table at [mem 0x000fbc20-0x000fbc2f] mapped at [ffff8800000fbc20]
[ 0.000000] Scanning 1 areas for low memory corruption
[ 0.000000] Using GB pages for direct mapping
[ 0.000000] RAMDISK: [mem 0x3612a000-0x3708cfff]
[ 0.000000] ACPI: Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 0x00000000000EA020 000024 (v02 Xen )
[ 0.000000] ACPI: XSDT 0x00000000FC014140 00005C (v01 Xen HVM 00000000 HVML 00000000)
[ 0.000000] ACPI: FACP 0x00000000FC0131A0 0000F4 (v04 Xen HVM 00000000 HVML 00000000)
[ 0.000000] ACPI: DSDT 0x00000000FC001CE0 011438 (v02 Xen HVM 00000000 INTL 20090123)
[ 0.000000] ACPI: FACS 0x00000000FC001CA0 000040
[ 0.000000] ACPI: FACS 0x00000000FC001CA0 000040
[ 0.000000] ACPI: APIC 0x00000000FC0132A0 000460 (v02 Xen HVM 00000000 HVML 00000000)
[ 0.000000] ACPI: SRAT 0x00000000FC013780 0008A8 (v01 Xen HVM 00000000 HVML 00000000)
[ 0.000000] ACPI: HPET 0x00000000FC014050 000038 (v01 Xen HVM 00000000 HVML 00000000)
[ 0.000000] ACPI: WAET 0x00000000FC014090 000028 (v01 Xen HVM 00000000 HVML 00000000)
[ 0.000000] ACPI: SSDT 0x00000000FC0140C0 000031 (v02 Xen HVM 00000000 INTL 20090123)
[ 0.000000] ACPI: SSDT 0x00000000FC014100 000031 (v02 Xen HVM 00000000 INTL 20090123)
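Note that the timestamps in the excerpt above drop from about 231490 seconds back to 0, which suggests the panic was logged after roughly 2.7 days of uptime and that the remaining lines belong to the reboot that followed. A minimal Python sketch for splitting such a capture into separate boots, assuming the usual "[seconds.microseconds]" prefix (the function names split_boots and uptime_days are hypothetical, chosen only for illustration):

```python
import re

# Matches the "[seconds.micros]" prefix the kernel puts on each log line,
# including the padded form "[ 0.000000]".
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")

def split_boots(lines):
    """Group log lines into boots; a timestamp that goes backwards starts a new boot."""
    boots, current, last = [], [], None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match is None:
            # Lines without a timestamp stay with the boot being collected.
            current.append(line)
            continue
        stamp = float(match.group(1))
        if last is not None and stamp < last:
            boots.append(current)
            current = []
        current.append(line)
        last = stamp
    if current:
        boots.append(current)
    return boots

def uptime_days(line):
    """Convert the timestamp of one log line from seconds to days."""
    match = TIMESTAMP.match(line)
    if match is None:
        raise ValueError("line has no kernel timestamp")
    return float(match.group(1)) / 86400.0

# Lines copied (some shortened) from the report above.
sample = [
    "[231489.162816] Kernel panic - not syncing: Fatal exception in interrupt",
    "[231490.197513] Shutting down cpus with NMI",
    "[231490.210772] Kernel Offset: disabled",
    "[ 0.000000] Initializing cgroup subsys cpuset",
    "[ 0.000000] Linux version 4.4.0-83-generic",
]
boots = split_boots(sample)
```

Here boots[0] holds the three panic lines and boots[1] the start of the next boot. This only separates the capture; it says nothing about what caused the panic.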
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jul 3 08:14 seq
 crw-rw---- 1 root audio 116, 33 Jul 3 08:14 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
DistroRelease: Ubuntu 16.04
Ec2AMI: ami-e28530f4
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1c
Ec2InstanceType: x1.16xlarge
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: Xen HVM domU
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-83-generic root=UUID=e675b47c-3493-4ef8-8b6f-44acb3d1adb9 ro net.ifnames=0 biosdevname=0 cgroup_enable=memory swapaccount=1 console=tty1 console=ttyS0
ProcVersionSignature: Ubuntu 4.4.0-83.106-generic 4.4.70
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-83-generic N/A
 linux-backports-modules-4.4.0-83-generic N/A
 linux-firmware 1.157.11
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial ec2-images
Uname: Linux 4.4.0-83-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 02/16/2017
dmi.bios.vendor: Xen
dmi.bios.version: 4.2.amazon
dmi.chassis.type: 1
dmi.chassis.vendor: Xen
dmi.modalias: dmi:bvnXen:bvr4.2.amazon:bd02/16/2017:svnXen:pnHVMdomU:pvr4.2.amazon:cvnXen:ct1:cvr:
dmi.product.name: HVM domU
dmi.product.version: 4.2.amazon
dmi.sys.vendor: Xen

Joseph Salisbury (jsalisbury) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1702164

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: xenial
information type: Public → Public Security
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Khaled Massad (khaledmassad) wrote : AudioDevicesInUse.txt

apport information

tags: added: apport-collected ec2-images
description: updated
Khaled Massad (khaledmassad) wrote : CurrentDmesg.txt

apport information

Khaled Massad (khaledmassad) wrote : JournalErrors.txt

apport information

Khaled Massad (khaledmassad) wrote : Lspci.txt

apport information

Khaled Massad (khaledmassad) wrote : ProcCpuinfo.txt

apport information

Khaled Massad (khaledmassad) wrote : ProcCpuinfoMinimal.txt

apport information

Khaled Massad (khaledmassad) wrote : ProcInterrupts.txt

apport information

Khaled Massad (khaledmassad) wrote : ProcModules.txt

apport information

Khaled Massad (khaledmassad) wrote : UdevDb.txt

apport information

Khaled Massad (khaledmassad) wrote : WifiSyslog.txt

apport information

Joseph Salisbury (jsalisbury) wrote :

Does the panic go away if you boot back into the prior kernel version?

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key
Khaled Massad (khaledmassad) wrote :

Yes, I restored the previous kernel on another VM, and it worked fine.
But on the affected server the panic went away by itself after the kernel crashed and the server rebooted by itself 3 times.
it's up since 2 days with crash.

Khaled Massad (khaledmassad) wrote :

sorry:
it has been up for 2 days without crashes.

Joao Luis (jpluis) wrote :

Sorry if this is not the proper way to ask, but I think I may have the same problem (with a Clevo P150SM laptop), and I asked for help at https://askubuntu.com/questions/933015/ubuntu-16-04-kernel-4-4-0-83-panic-fatal-exception-in-interrupt

(For me, it started crashing again today after one more set of package updates).

If you think I may have relevant information, please ask, and I will upload/post.

Seth Arnold (seth-arnold) wrote :

Joao, it's probably best to run ubuntu-bug linux to file a new report; that should automatically attach what it can from the logs. Your photo may be useful too. (I know it just reports a "warning" but that might be useful all the same.)

Thanks

Joao Luis (jpluis) wrote :

Today, kernel 4.4.0-83 is booting fine (it doesn't crash or hang on boot).

(And yesterday, I did many reboots using the laptop power-on button, and it crashed every time, after the package upgrades).

I ran ubuntu-bug and chose the "distribution upgrade" category (what else?), but it gave back no bug report number that I could use to add an explanation of what is being reported.

Next time it happens, I will try to follow
https://wiki.ubuntu.com/Kernel/CrashdumpRecipe#Crash_kernel_fails_to_load:_Hang
(but it seems a bit complicated).

(and I will run a memory check first...)

Thank you for the followup.

Joao Luis (jpluis) wrote :

Ooops! My first contact with ubuntu-bug was a bit messy. I seem to have reported #1703091

Sorry for the mess.

John Fanjoy (john.fanjoy.viawest) wrote :

We are also seeing relatively frequent kernel panics using 4.4.0-83. We run an Apache Mesos cluster using the docker containerizer. The description of LP #1687512 appears to match quite nicely with our configuration (cgroups enabled with small allocations). That issue appears to have been addressed in 4.4.0-79, but perhaps this is related.
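Whether a host is already past the release where that fix reportedly landed can be checked by comparing kernel release strings. The following is a rough Python sketch, assuming Ubuntu-style release names such as 4.4.0-83-generic; parse_release and includes_fix are hypothetical helper names, and the comparison is only meaningful within the same kernel series:

```python
import re

# Ubuntu kernel releases look like "<major>.<minor>.<patch>-<abi>-<flavour>".
RELEASE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(\d+)")

def parse_release(release):
    """Turn '4.4.0-83-generic' into the comparable tuple (4, 4, 0, 83)."""
    match = RELEASE.match(release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())

def includes_fix(running, fixed_in):
    """True if the running release is at or after the release a fix landed in."""
    return parse_release(running) >= parse_release(fixed_in)

# Release numbers taken from the comments in this report.
# On a live system, platform.release() returns the running release string.
already_fixed = includes_fix("4.4.0-83-generic", "4.4.0-79")
older_kernel = includes_fix("4.4.0-81-generic", "4.4.0-83")
```

A True result only says the kernel is newer than the release named for the fix; it does not show that the fix is effective or that this panic is the same issue.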

Jim Duffield (jduffield7) wrote :

We are experiencing these intermittent kernel panics with a very similar setup, using Apache Mesos and Docker. For security reasons we cannot go back to a previous kernel version, because 4.4.0-83 includes the fixes for CVE-2017-1000364.

Jim Duffield (jduffield7) wrote :

Below is a stack trace from kern.log for one of the hosts. For security reasons, I have redacted the hostname from the output. Stack traces for other machines also mention /build/linux-0uniEn/linux-4.4.0/net/ipv6/addrconf_core.c:160 in6_dev_finish_destroy. Hopefully, this may prove useful:

Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301243] ------------[ cut here ]------------
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301252] WARNING: CPU: 22 PID: 118193 at /build/linux-0uniEn/linux-4.4.0/net/ipv6/addrconf_core.c:159 in6_dev_finish_destroy+0x6b/0xc0()
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301253] Modules linked in: veth binfmt_misc xt_nat xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack br_netfilter bridge stp llc mpt3sas raid_class scsi_transport_sas mptctl mptbase dell_rbu zram lz4_compress zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) ipmi_ssif ipmi_devintf input_leds dcdbas intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp joydev kvm_intel kvm irqbypass sb_edac edac_core mei_me lpc_ich mei ipmi_si ipmi_msghandler shpchp 8250_fintek mac_hid acpi_power_meter ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 xfs btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear raid1 ses enclosure crct10dif_pclmul crc32_pclmul ghash_clmulni_intel hid_generic usbhid hid mxm_wmi aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd tg3 ptp ahci pps_core libahci megaraid_sas fjes wmi
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301326] CPU: 22 PID: 118193 Comm: kworker/u384:0 Tainted: P O 4.4.0-83-generic #106-Ubuntu
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301327] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.2.5 09/08/2016
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301332] Workqueue: netns cleanup_net
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301334] 0000000000000286 00000000631d084a ffff880c7680bba0 ffffffff813f9513
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301337] 0000000000000000 ffffffff81d75940 ffff880c7680bbd8 ffffffff81081322
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301339] ffff881f2bb2fc00 ffff881f2bb2e000 0000000000000006 ffff880c7680bca8
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301341] Call Trace:
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301347] [<ffffffff813f9513>] dump_stack+0x63/0x90
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301352] [<ffffffff81081322>] warn_slowpath_common+0x82/0xc0
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301354] [<ffffffff8108146a>] warn_slowpath_null+0x1a/0x20
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301356] [<ffffffff8181ba0b>] in6_dev_finish_destroy+0x6b/0xc0
Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: [508525.301359] [<ffffffff817f17e6>] ip6_route_dev_notify+0x116/0x130
Jul 24 06:15:01 <HOS...
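For comparing traces like the one above across several hosts, it can help to reduce each trace to its list of symbol names. A minimal Python sketch, assuming the "[<address>] symbol+0xoffset/0xsize" frame format seen in this 4.4 kernel log (call_trace_symbols is a hypothetical helper name, not part of any tool):

```python
import re

# One stack frame as printed in the log above: "[<address>] symbol+0xoffset/0xsize".
FRAME = re.compile(r"\[<([0-9a-f]+)>\]\s+([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")

def call_trace_symbols(lines):
    """Return the symbol names of all frames that follow a 'Call Trace:' line."""
    symbols = []
    in_trace = False
    for line in lines:
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME.search(line)
        if match:
            symbols.append(match.group(2))
    return symbols

# Lines copied from the trace above, including the truncated last line.
prefix = "Jul 24 06:15:01 <HOSTNAME REDACTED> kernel: "
sample = [
    prefix + "[508525.301332] Workqueue: netns cleanup_net",
    prefix + "[508525.301341] Call Trace:",
    prefix + "[508525.301347] [<ffffffff813f9513>] dump_stack+0x63/0x90",
    prefix + "[508525.301352] [<ffffffff81081322>] warn_slowpath_common+0x82/0xc0",
    prefix + "[508525.301354] [<ffffffff8108146a>] warn_slowpath_null+0x1a/0x20",
    prefix + "[508525.301356] [<ffffffff8181ba0b>] in6_dev_finish_destroy+0x6b/0xc0",
    prefix + "[508525.301359] [<ffffffff817f17e6>] ip6_route_dev_notify+0x116/0x130",
    "Jul 24 06:15:01 <HOS...",
]
symbols = call_trace_symbols(sample)
```

The truncated final line has no frame and is simply skipped, so the result covers only the frames visible in this excerpt.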
