Kernel panic causing reboot of EC2 instance

Bug #1596033 reported by Ivan Marcak
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

Dear Sirs,

We use the following AMI for nine identical d2.2xlarge instances deployed in us-east-1:
ubuntu/images/hvm/ubuntu-trusty-14.04-amd64-server-20160314-2016-03-29_1459261902 (ami-93cfd9f9)

All of them run fine, but last week we experienced two reboots with no evident cause.
We asked Amazon support for help, and they provided the following kernel panic messages, which caused the restarts:

-------------------------------------------------------------------------------------------------------
[4777272.889262] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[4777272.893249] IP: [<ffffffff81370d61>] rb_next+0x1/0x50
[4777272.893249] PGD a1d20b067 PUD 14dfb6067 PMD 0
[4777272.893249] Oops: 0000 [#1] SMP
[4777272.893249] Modules linked in: btrfs raid6_pq xor ufs msdos xfs libcrc32c iptable_filter ip_tables x_tables dm_crypt syscopyarea sysfillrect sysimgblt fb_sys_fops serio_raw isofs crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse ixgbevf floppy
[4777272.893249] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 3.13.0-85-generic #129-Ubuntu
[4777272.893249] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/07/2015
[4777272.893249] task: ffff880efc478000 ti: ffff880efc46e000 task.ti: ffff880efc46e000
[4777272.893249] RIP: 0010:[<ffffffff81370d61>] [<ffffffff81370d61>] rb_next+0x1/0x50
[4777272.893249] RSP: 0018:ffff880efc46fe20 EFLAGS: 00010046
[4777272.893249] RAX: 0000000000000000 RBX: ffff880efae26a00 RCX: 0000000000003d8a
[4777272.893249] RDX: 00000000148106c1 RSI: ffff880efae26e00 RDI: 0000000000000010
[4777272.893249] RBP: ffff880efc46fe68 R08: 000000000091c6d0 R09: 0000000000000000
[4777272.893249] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[4777272.893249] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffff4b48ce
[4777272.893249] FS: 0000000000000000(0000) GS:ffff880f4fc20000(0000) knlGS:0000000000000000
[4777272.893249] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[4777272.893249] CR2: 0000000000000010 CR3: 000000014dad5000 CR4: 00000000001406e0
[4777272.893249] Stack:
[4777272.893249] ffff880efc46fe68 ffffffff810a2f52 000000000000d160 ffff880f4fc33180
[4777272.893249] ffff880efc478430 ffff880f4fc33180 0000000000000001 0000000000000000
[4777272.893249] ffff880efc46ffd8 ffff880efc46fec8 ffffffff8172f48f ffff880efc478000
[4777272.893249] Call Trace:
[4777272.893249] [<ffffffff810a2f52>] ? pick_next_task_fair+0x102/0x1b0
[4777272.893249] [<ffffffff8172f48f>] __schedule+0x13f/0x7f0
[4777272.893249] [<ffffffff81730089>] schedule_preempt_disabled+0x29/0x70
[4777272.893249] [<ffffffff810c2058>] cpu_startup_entry+0x268/0x2b0
[4777272.893249] [<ffffffff8104288d>] start_secondary+0x21d/0x2d0
[4777272.893249] Code: e5 48 85 c0 75 07 eb 19 66 90 48 89 d0 48 8b 50 10 48 85 d2 75 f4 48 8b 50 08 48 85 d2 75 eb 5d c3 31 c0 5d c3 0f 1f 44 00 00 55 <48> 8b 17 48 89 e5 48 39 d7 74 3b 48 8b 47 08 48 85 c0 75 0e eb
[4777272.893249] RIP [<ffffffff81370d61>] rb_next+0x1/0x50
[4777272.893249] RSP <ffff880efc46fe20>
[4777272.893249] CR2: 0000000000000010
[4777272.893249] ---[ end trace c9f72935ef221890 ]---
[4777272.893249] Kernel panic - not syncing: Attempted to kill the idle task!
[4777272.893249] Shutting down cpus with NMI
-------------------------------------------------------------------------------------------------------
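
The Code: line of the trace marks the instruction at the faulting RIP with angle brackets. A minimal sketch, assuming Python and the hypothetical helper names faulting_bytes and preceding_byte, pulls those bytes out so they can be checked against the register dump. The decoding given in the comments is a manual reading of the x86-64 encoding, not disassembler output.

```python
# Code: line copied verbatim from the trace above.
CODE_LINE = (
    "Code: e5 48 85 c0 75 07 eb 19 66 90 48 89 d0 48 8b 50 10 48 85 d2 75 f4 "
    "48 8b 50 08 48 85 d2 75 eb 5d c3 31 c0 5d c3 0f 1f 44 00 00 55 <48> 8b 17 "
    "48 89 e5 48 39 d7 74 3b 48 8b 47 08 48 85 c0 75 0e eb"
)


def _tokens_and_fault_index(code_line):
    """Split the hex dump and locate the byte wrapped in <...> (the faulting RIP)."""
    tokens = code_line.split()[1:]  # drop the leading "Code:" label
    index = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return tokens, index


def faulting_bytes(code_line, count=3):
    """Return the first `count` bytes starting at the faulting instruction."""
    tokens, index = _tokens_and_fault_index(code_line)
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[index:index + count])


def preceding_byte(code_line):
    """Return the single byte just before the faulting instruction."""
    tokens, index = _tokens_and_fault_index(code_line)
    return int(tokens[index - 1], 16)


if __name__ == "__main__":
    # 0x55 is `push %rbp`, one byte long, which fits a fault at rb_next+0x1.
    print(hex(preceding_byte(CODE_LINE)))
    # 48 8b 17 is `mov (%rdi),%rdx`: a load from the address held in RDI.
    print(faulting_bytes(CODE_LINE).hex(" "))
```

Read this way, the faulting instruction loads from the address in RDI. The register dump shows RDI = 0000000000000010, which matches the CR2 value and the address in the "NULL pointer dereference" line.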

[72906.872064] IP: [<ffffffff81370721>] rb_next+0x1/0x50
[72906.872064] PGD 99f26f067 PUD ea4964067 PMD 0
[72906.872064] Oops: 0000 [#1] SMP
[72906.872064] Modules linked in: iptable_filter ip_tables x_tables dm_crypt syscopyarea sysfillrect sysimgblt fb_sys_fops serio_raw isofs xfs libcrc32c crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse floppy ixgbevf
[72906.872064] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.13.0-86-generic #131-Ubuntu
[72906.872064] Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/12/2016
[72906.872064] task: ffff880efc47c800 ti: ffff880efc484000 task.ti: ffff880efc484000
[72906.872064] RIP: 0010:[<ffffffff81370721>] [<ffffffff81370721>] rb_next+0x1/0x50
[72906.872064] RSP: 0018:ffff880efc485e20 EFLAGS: 00010046
[72906.872064] RAX: 0000000000000000 RBX: ffff880eefdb8600 RCX: 0000000000005bfc
[72906.872064] RDX: 000000000309e7aa RSI: ffff880eefdb9600 RDI: 0000000000000010
[72906.872064] RBP: ffff880efc485e68 R08: 000000000002398c R09: 0000000000000000
[72906.872064] R10: 0000000000000004 R11: 0000000000000005 R12: 0000000000000000
[72906.872064] R13: 0000000000000000 R14: 0000000000000000 R15: fffffffffff02262
[72906.872064] FS: 0000000000000000(0000) GS:ffff880f4fc80000(0000) knlGS:0000000000000000
[72906.872064] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[72906.872064] CR2: 0000000000000010 CR3: 0000000b0488a000 CR4: 00000000001406e0
[72906.872064] Stack:
[72906.872064] ffff880efc485e68 ffffffff810a2f32 000000000000d160 ffff880f4fc93180
[72906.872064] ffff880efc47cc30 ffff880f4fc93180 0000000000000004 0000000000000000
[72906.872064] ffff880efc485fd8 ffff880efc485ec8 ffffffff8172e17f ffff880efc47c800
[72906.872064] Call Trace:
[72906.872064] [<ffffffff810a2f32>] ? pick_next_task_fair+0x102/0x1b0
[72906.872064] [<ffffffff8172e17f>] __schedule+0x13f/0x7f0
[72906.872064] [<ffffffff8172ed79>] schedule_preempt_disabled+0x29/0x70
[72906.872064] [<ffffffff810c2008>] cpu_startup_entry+0x268/0x2b0
[72906.872064] [<ffffffff810428bd>] start_secondary+0x21d/0x2d0
[72906.872064] Code: e5 48 85 c0 75 07 eb 19 66 90 48 89 d0 48 8b 50 10 48 85 d2 75 f4 48 8b 50 08 48 85 d2 75 eb 5d c3 31 c0 5d c3 0f 1f 44 00 00 55 <48> 8b 17 48 89 e5 48 39 d7 74 3b 48 8b 47 08 48 85 c0 75 0e eb
[72906.872064] RIP [<ffffffff81370721>] rb_next+0x1/0x50
[72906.872064] RSP <ffff880efc485e20>
[72906.872064] CR2: 0000000000000010
[72906.872064] ---[ end trace e35916ef54f0c31a ]---
[72906.872064] Kernel panic - not syncing: Attempted to kill the idle task!
[72906.872064] Shutting down cpus with NMI
-------------------------------------------------------------------------------------------------------
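
To compare the two traces without reading raw addresses, the following sketch extracts a small "signature" from each one. It assumes Python, and the function name oops_signature and the choice of fields are illustrative, not an established tool. The excerpts contain only the lines the parser needs, copied from the traces above.

```python
import re

# Excerpt of the first trace (kernel 3.13.0-85).
OOPS_1 = """\
[4777272.893249] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 3.13.0-85-generic #129-Ubuntu
[4777272.893249] RIP: 0010:[<ffffffff81370d61>] [<ffffffff81370d61>] rb_next+0x1/0x50
[4777272.893249] RDX: 00000000148106c1 RSI: ffff880efae26e00 RDI: 0000000000000010
[4777272.893249] CR2: 0000000000000010 CR3: 000000014dad5000 CR4: 00000000001406e0
[4777272.893249] [<ffffffff810a2f52>] ? pick_next_task_fair+0x102/0x1b0
"""

# Excerpt of the second trace (kernel 3.13.0-86).
OOPS_2 = """\
[72906.872064] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.13.0-86-generic #131-Ubuntu
[72906.872064] RIP: 0010:[<ffffffff81370721>] [<ffffffff81370721>] rb_next+0x1/0x50
[72906.872064] RDX: 000000000309e7aa RSI: ffff880eefdb9600 RDI: 0000000000000010
[72906.872064] CR2: 0000000000000010 CR3: 0000000b0488a000 CR4: 00000000001406e0
[72906.872064] [<ffffffff810a2f32>] ? pick_next_task_fair+0x102/0x1b0
"""


def oops_signature(text):
    """Extract symbolic fields that identify a crash site, ignoring raw addresses."""
    def first(pattern):
        match = re.search(pattern, text)
        return match.group(1) if match else None

    return {
        "kernel": first(r"(\d+\.\d+\.\d+-\d+-\w+) #"),
        "rip": first(r"RIP: \S+ \[<\w+>\] (\S+)"),
        "rdi": first(r"RDI: (\w+)"),
        "cr2": first(r"CR2: (\w+)"),
        "caller": first(r"\[<\w+>\] \? (\S+)"),
    }


def same_crash_site(a, b):
    """True when two signatures agree on everything except the kernel build."""
    keys = ("rip", "rdi", "cr2", "caller")
    return all(a[k] == b[k] for k in keys)


if __name__ == "__main__":
    sig1, sig2 = oops_signature(OOPS_1), oops_signature(OOPS_2)
    print(sig1["kernel"], sig2["kernel"])
    print(same_crash_site(sig1, sig2))
```

Under these assumptions, both traces fault at rb_next+0x1/0x50 with RDI and CR2 equal to 0x10 and pick_next_task_fair on the stack, while the kernel builds differ (3.13.0-85 and 3.13.0-86).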

The instances run the AMI mentioned above, with packages updated via aptitude safe-upgrade.

Thank you for any help,
Ivan
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jun 20 20:28 seq
 crw-rw---- 1 root audio 116, 33 Jun 20 20:28 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.21
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 14.04
Ec2AMI: ami-93cfd9f9
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1c
Ec2InstanceType: d2.2xlarge
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory
Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
MachineType: Xen HVM domU
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=screen
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-86-generic root=UUID=8cc7fa93-96a2-46c2-ae6d-3504a724f19a ro console=tty1 console=ttyS0
ProcVersionSignature: Ubuntu 3.13.0-86.131-generic 3.13.11-ckt39
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-86-generic N/A
 linux-backports-modules-3.13.0-86-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty ec2-images
Uname: Linux 3.13.0-86-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 05/12/2016
dmi.bios.vendor: Xen
dmi.bios.version: 4.2.amazon
dmi.chassis.type: 1
dmi.chassis.vendor: Xen
dmi.modalias: dmi:bvnXen:bvr4.2.amazon:bd05/12/2016:svnXen:pnHVMdomU:pvr4.2.amazon:cvnXen:ct1:cvr:
dmi.product.name: HVM domU
dmi.product.version: 4.2.amazon
dmi.sys.vendor: Xen

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1596033

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: trusty
Ivan Marcak (ivan-marcak) wrote : BootDmesg.txt

apport information

tags: added: apport-collected ec2-images
description: updated
Ivan Marcak (ivan-marcak) wrote : CurrentDmesg.txt

apport information

Ivan Marcak (ivan-marcak) wrote : Lspci.txt

apport information

Ivan Marcak (ivan-marcak) wrote : ProcCpuinfo.txt

apport information

Ivan Marcak (ivan-marcak) wrote : ProcInterrupts.txt

apport information

Ivan Marcak (ivan-marcak) wrote : ProcModules.txt

apport information

Ivan Marcak (ivan-marcak) wrote : UdevDb.txt

apport information

Ivan Marcak (ivan-marcak) wrote : UdevLog.txt

apport information

Ivan Marcak (ivan-marcak) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Mathew Hodson (mhodson)
affects: ubuntu-on-ec2 → ubuntu
no longer affects: ubuntu
Changed in linux (Ubuntu):
importance: Undecided → High