KVM + Virtio kernel panic

Bug #1744169 reported by Matthew Wynne
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: High
Assigned to: Unassigned

Bug Description

I received the following kernel panic on a KVM VM which is acting as an OpenStack compute node. I'm unable to consistently reproduce this, but I do consistently get various kernel panics (this is only one of them). Is anyone able to shed any light on what might be happening here?

Ubuntu Server 16.04 cloud image.
Kernel Version: 4.4.0-98-generic

[19927.255474] general protection fault: 0000 [#1] SMP
[19927.256033] Modules linked in: vhost_net vhost macvtap macvlan xt_REDIRECT nf_nat_redirect xt_mark ip6table_mangle iptable_nat nf_nat_ipv4 nf_nat xt_connmark iptable_mangle ip6table_raw nf_conntrack_ipv6 xt_CT xt_mac xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_physdev xt_set xt_conntrack ip_set_hash_net ip_set nfnetlink iptable_raw veth ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables br_netfilter bridge stp llc vport_gre ip_gre ip_tunnel gre vport_vxlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_defrag_ipv6 nf_conntrack ppdev kvm_amd kvm input_leds serio_raw joydev irqbypass parport_pc parport ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse floppy
[19927.256033] CPU: 3 PID: 2801 Comm: vna_agent Not tainted 4.4.0-98-generic #121-Ubuntu
[19927.256033] Hardware name: RDO OpenStack Compute, BIOS 1.9.1-5.el7_3.3 04/01/2014
[19927.256033] task: ffff88023551c600 ti: ffff8800bb9ac000 task.ti: ffff8800bb9ac000
[19927.256033] RIP: 0010:[<ffffffff811ef274>] [<ffffffff811ef274>] cmpxchg_double_slab.isra.54+0x24/0xe0
[19927.256033] RSP: 0000:ffff88023fd83a40 EFLAGS: 00010206
[19927.256033] RAX: 009a660a158b487e RBX: ffff880233955a00 RCX: c01902a30f48c088
[19927.256033] RDX: c01902a30f48c089 RSI: ffffffff8106b846 RDI: 0000000040000000
[19927.256033] RBP: ffff88023fd83a88 R08: ffff880233955a00 R09: c01902a30f48c088
[19927.256033] R10: 009a660a158b487e R11: ffff8801cf5bfc00 R12: 009a660a158b487e
[19927.256033] R13: ffffffff8106b846 R14: 0000000000000000 R15: ffff880237001700
[19927.256033] FS: 00007fbf85e38740(0000) GS:ffff88023fd80000(0000) knlGS:0000000000000000
[19927.256033] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[19927.256033] CR2: 00007ffcaa2ca468 CR3: 00000000ba760000 CR4: 00000000003406e0
[19927.256033] Stack:
[19927.256033] 0000000000000000 ffff880233d4f008 0000000000000000 ffff880233cabc80
[19927.256033] ffffc90000e48004 0000000000000001 ffff88023fd83ac0 ffffffff8175ff24
[19927.256033] 000000000f48c088 ffff88023fd83b50 ffffffff811ef3fb ffff880233d4f008
[19927.256033] Call Trace:
[19927.256033] <IRQ>
[19927.256033] [<ffffffff8175ff24>] ? sch_direct_xmit+0x74/0x220
[19927.256033] [<ffffffff811ef3fb>] __slab_free+0xcb/0x2c0
[19927.256033] [<ffffffff8173b086>] ? __dev_queue_xmit+0x286/0x590
[19927.256033] [<ffffffff811efae4>] kmem_cache_free+0x1d4/0x1e0
[19927.256033] [<ffffffff81722d89>] kfree_skbmem+0x59/0x60
[19927.256033] [<ffffffff81723fb4>] consume_skb+0x34/0x90
[19927.256033] [<ffffffff817ab440>] arp_process+0x80/0x750
[19927.256033] [<ffffffff817abc63>] arp_rcv+0x133/0x1c0
[19927.256033] [<ffffffff818285a4>] ? packet_rcv+0x44/0x440
[19927.256033] [<ffffffff81738884>] __netif_receive_skb_core+0x704/0xa60
[19927.256033] [<ffffffff817298ba>] ? __build_skb+0x2a/0xe0
[19927.256033] [<ffffffff81738bf8>] __netif_receive_skb+0x18/0x60
[19927.256033] [<ffffffff81738c72>] netif_receive_skb_internal+0x32/0xa0
[19927.256033] [<ffffffff81739703>] napi_gro_receive+0xc3/0xf0
[19927.256033] [<ffffffff8160b6b6>] virtnet_receive+0x4a6/0x8f0
[19927.256033] [<ffffffff8160bb1d>] virtnet_poll+0x1d/0x80
[19927.256033] [<ffffffff8173913e>] net_rx_action+0x21e/0x360
[19927.256033] [<ffffffff81608ef3>] ? skb_recv_done+0x43/0x50
[19927.256033] [<ffffffff81085dc1>] __do_softirq+0x101/0x290
[19927.256033] [<ffffffff810860c3>] irq_exit+0xa3/0xb0
[19927.256033] [<ffffffff818470a4>] do_IRQ+0x54/0xd0
[19927.256033] [<ffffffff81845182>] common_interrupt+0x82/0x82
[19927.256033] <EOI>
[19927.256033] [<ffffffff81064636>] ? native_safe_halt+0x6/0x10
[19927.256033] [<ffffffff81063b82>] kvm_wait+0x52/0x60
[19927.256033] [<ffffffff810cb516>] __pv_queued_spin_lock_slowpath+0x1d6/0x210
[19927.256033] [<ffffffff81844421>] _raw_spin_lock+0x21/0x30
[19927.256033] [<ffffffff811c204b>] handle_mm_fault+0x60b/0x1820
[19927.256033] [<ffffffff810efbe9>] ? hrtimer_try_to_cancel+0x29/0x130
[19927.256033] [<ffffffff81843886>] ? do_nanosleep+0x96/0xf0
[19927.256033] [<ffffffff810f0b1c>] ? hrtimer_nanosleep+0xdc/0x210
[19927.256033] [<ffffffff8106b577>] __do_page_fault+0x197/0x400
[19927.256033] [<ffffffff8106b847>] trace_do_page_fault+0x37/0xe0
[19927.256033] [<ffffffff81063f29>] do_async_page_fault+0x19/0x70
[19927.256033] [<ffffffff81846868>] async_page_fault+0x28/0x30
[19927.256033] Code: 0f 1f 80 00 00 00 00 55 49 89 d2 48 89 e5 53 48 83 e4 f0 48 83 ec 40 f7 c7 00 00 00 40 74 2c 48 89 ca 4c 89 d0 4c 89 c3 4c 89 c9 <f0> 48 0f c7 4e 10 0f 94 c0 41 89 c0 b8 01 00 00 00 45 84 c0 75
[19927.256033] RIP [<ffffffff811ef274>] cmpxchg_double_slab.isra.54+0x24/0xe0
[19927.256033] RSP <ffff88023fd83a40>
[19927.296033] general protection fault: 0000 [#2] [19927.405901] ---[ end trace f7ced269874d3811 ]---
[19927.405903] Kernel panic - not syncing: Fatal exception in interrupt
[19927.296033] SMP
[19927.296033] Modules linked in: vhost_net vhost macvtap macvlan xt_REDIRECT nf_nat_redirect xt_mark ip6table_mangle iptable_nat nf_nat_ipv4 nf_nat xt_connmark iptable_mangle ip6table_raw nf_conntrack_ipv6 xt_CT xt_mac xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_physdev xt_set xt_conntrack ip_set_hash_net ip_set nfnetlink iptable_raw veth ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables br_netfilter bridge stp llc vport_gre ip_gre ip_tunnel gre vport_vxlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_defrag_ipv6 nf_conntrack ppdev kvm_amd kvm input_leds serio_raw joydev irqbypass parport_pc parport ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse floppy
[19927.296033] CPU: 1 PID: 11742 Comm: CPU 0/KVM Tainted: G D 4.4.0-98-generic #121-Ubuntu
[19927.296033] Hardware name: RDO OpenStack Compute, BIOS 1.9.1-5.el7_3.3 04/01/2014
[19927.296033] task: ffff880233e52a00 ti: ffff8800820c8000 task.ti: ffff8800820c8000
[19927.296033] RIP: 0010:[<ffffffff811ef667>] [<ffffffff811ef667>] kfree+0x77/0x150
[19927.296033] RSP: 0018:ffff8800820cb9a0 EFLAGS: 00010282
[19927.296033] RAX: 5e415d415c415bff RBX: ffff8801d48b0000 RCX: 0000000000000000
[19927.296033] RDX: 0000000000000000 RSI: ffff880082155000 RDI: 49087e8b49068b49
[19927.296033] RBP: ffff8800820cb9b8 R08: ffff88023fc9a1e0 R09: ffffffff81005f2c
[19927.296033] R10: ffffffff8106b846 R11: 00000000ffffffff R12: 0000000000000000
[19927.296033] R13: ffffffff81005f2c R14: ffff880082155000 R15: ffffffff81e14620
[19927.296033] FS: 00000000016b9880(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000
[19927.296033] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[19927.296033] CR2: 00000000004895e0 CR3: 00000001d4986000 CR4: 00000000003406e0
[19927.296033] Stack:
[19927.296033] ffff880082155000 0000000000000000 ffff8801d48b0000 ffff8800820cb9f0
[19927.296033] ffffffff81005f2c ffff880082155000 ffffffff81e14620 ffffffffffffffea
[19927.296033] 0000000000000000 ffff880233e52a00 ffff8800820cba18 ffffffff8117d426
[19927.296033] Call Trace:
[19927.296033] [<ffffffff81005f2c>] x86_pmu_event_init+0x16c/0x1f0
[19927.296033] [<ffffffff8117d426>] perf_try_init_event+0x76/0x90
[19927.296033] [<ffffffff81181110>] perf_event_alloc+0x580/0x7d0
[19927.296033] [<ffffffffc0384620>] ? kvm_perf_overflow+0x40/0x40 [kvm]
[19927.296033] [<ffffffff8118307e>] perf_event_create_kernel_counter+0x2e/0x140
[19927.296033] [<ffffffffc038475f>] pmc_reprogram_counter+0xdf/0x140 [kvm]
[19927.296033] [<ffffffffc03848e6>] reprogram_gp_counter+0x126/0x180 [kvm]
[19927.296033] [<ffffffffc06001b7>] amd_pmu_set_msr+0x157/0x170 [kvm_amd]
[19927.296033] [<ffffffffc0384d1a>] kvm_pmu_set_msr+0x1a/0x20 [kvm]
[19927.296033] [<ffffffffc0357710>] kvm_set_msr_common+0x830/0xa80 [kvm]
[19927.296033] [<ffffffff81006674>] ? x86_pmu_enable_all+0xb4/0xd0
[19927.296033] [<ffffffffc05fac90>] svm_set_msr+0x60/0x3a0 [kvm_amd]
[19927.296033] [<ffffffffc034f711>] kvm_set_msr+0x41/0x70 [kvm]
[19927.296033] [<ffffffffc05fbe29>] msr_interception+0x1f9/0x3a0 [kvm_amd]
[19927.296033] [<ffffffffc05fd1c9>] handle_exit+0x129/0xa90 [kvm_amd]
[19927.296033] [<ffffffffc035c5fd>] vcpu_enter_guest+0x86d/0x1130 [kvm]
[19927.296033] [<ffffffffc0362caf>] kvm_arch_vcpu_ioctl_run+0xdf/0x400 [kvm]
[19927.296033] [<ffffffffc034954d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
[19927.296033] [<ffffffff810a9c66>] ? finish_task_switch+0x76/0x220
[19927.296033] [<ffffffff812245cf>] do_vfs_ioctl+0x29f/0x490
[19927.296033] [<ffffffff81840585>] ? schedule+0x35/0x80
[19927.296033] [<ffffffff81224839>] SyS_ioctl+0x79/0x90
[19927.296033] [<ffffffff818446b2>] entry_SYSCALL_64_fastpath+0x16/0x71
[19927.296033] Code: c1 ef 0c 48 c1 e7 06 4c 01 d7 48 8b 47 20 4c 8d 50 ff a8 01 4c 0f 44 d7 49 8b 02 a8 80 0f 84 97 00 00 00 4c 8b 4d 08 49 8b 7a 30 <48> 8b 37 65 48 8b 56 08 65 48 03 35 71 ab e1 7e 4c 39 56 10 0f
[19927.296033] RIP [<ffffffff811ef667>] kfree+0x77/0x150
[19927.296033] RSP <ffff8800820cb9a0>
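
For triage, each oops above can be reduced to its faulting function plus the reliable call-trace frames (those not marked with "?"). The following is a minimal Python sketch, assuming the 4.4-era oops layout shown in this report; summarize_oops and both regular expressions are illustrative names of my own, not part of any kernel or Ubuntu tool.

```python
import re

# Final summary line of an oops, e.g.
# "[19927.256033] RIP [<ffffffff811ef274>] cmpxchg_double_slab.isra.54+0x24/0xe0"
# (the earlier "RIP: 0010:[<...>]" register line is deliberately not matched).
RIP_RE = re.compile(r"RIP\s+\[<[0-9a-f]+>\]\s+(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Call-trace frame; an optional "?" marks a frame the unwinder is unsure about.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(log_text):
    """Return one (faulting_symbol, reliable_frames) tuple per oops in log_text."""
    results = []
    frames = None  # None means we are not currently inside a call trace
    for line in log_text.splitlines():
        rip = RIP_RE.search(line)
        if rip:
            # The summary RIP line closes the current oops.
            results.append((rip.group(1), frames or []))
            frames = None
            continue
        if "Call Trace:" in line:
            frames = []
            continue
        if frames is not None:
            frame = FRAME_RE.search(line)
            if frame and not frame.group(1):  # skip "?" frames
                frames.append(frame.group(2))
    return results


# Excerpt of the first oops in this report, shortened for illustration.
SAMPLE = """\
[19927.256033] RIP: 0010:[<ffffffff811ef274>] [<ffffffff811ef274>] cmpxchg_double_slab.isra.54+0x24/0xe0
[19927.256033] Call Trace:
[19927.256033] <IRQ>
[19927.256033] [<ffffffff8175ff24>] ? sch_direct_xmit+0x74/0x220
[19927.256033] [<ffffffff811ef3fb>] __slab_free+0xcb/0x2c0
[19927.256033] [<ffffffff811efae4>] kmem_cache_free+0x1d4/0x1e0
[19927.256033] RIP [<ffffffff811ef274>] cmpxchg_double_slab.isra.54+0x24/0xe0
"""

for symbol, frames in summarize_oops(SAMPLE):
    print(symbol, "<-", " <- ".join(frames))
```

Run against the full log above, this should report cmpxchg_double_slab.isra.54 (reached via __slab_free) for the first fault and kfree for the second. Both are in the slab allocator's free path, which may be worth noting when comparing this panic with the others seen on these VMs.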
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jan 16 03:19 seq
 crw-rw---- 1 root audio 116, 33 Jan 16 03:19 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.10
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
DistroRelease: Ubuntu 16.04
Ec2AMI: ami-0000000b
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: m1.compute
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory
Lsusb:
 Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd
 Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: RDO OpenStack Compute
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-98-generic root=UUID=a857a24a-1868-4c99-8a07-8e21a52a42ae ro console=tty1 console=ttyS0
ProcVersionSignature: Ubuntu 4.4.0-98.121-generic 4.4.90
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-98-generic N/A
 linux-backports-modules-4.4.0-98-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial ec2-images xenial ec2-images
Uname: Linux 4.4.0-98-generic x86_64
UnreportableReason: The report belongs to a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: False
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: 1.9.1-5.el7_3.3
dmi.chassis.type: 1
dmi.chassis.vendor: Red Hat
dmi.chassis.version: RHEL 7.3.0 PC (i440FX + PIIX, 1996)
dmi.modalias: dmi:bvnSeaBIOS:bvr1.9.1-5.el7_3.3:bd04/01/2014:svnRDO:pnOpenStackCompute:pvr14.0.7-1.el7:cvnRedHat:ct1:cvrRHEL7.3.0PC(i440FX+PIIX,1996):
dmi.product.name: OpenStack Compute
dmi.product.version: 14.0.7-1.el7
dmi.sys.vendor: RDO

Matthew Wynne (mwynne)
description: updated
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed against a specific source package, but rather against Ubuntu in general. It is important that bug reports be filed against source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs IRC channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1744169/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Matthew Wynne (mwynne)
affects: ubuntu → kernel-package (Ubuntu)
affects: kernel-package (Ubuntu) → linux (Ubuntu)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1744169

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: xenial
Matthew Wynne (mwynne) wrote : CurrentDmesg.txt

apport information

tags: added: apport-collected ec2-images
description: updated
Matthew Wynne (mwynne) wrote : JournalErrors.txt

apport information

Matthew Wynne (mwynne) wrote : Lspci.txt

apport information

Matthew Wynne (mwynne) wrote : ProcCpuinfoMinimal.txt

apport information

Matthew Wynne (mwynne) wrote : ProcInterrupts.txt

apport information

Matthew Wynne (mwynne) wrote : ProcModules.txt

apport information

Matthew Wynne (mwynne) wrote : UdevDb.txt

apport information

Matthew Wynne (mwynne) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the proposed kernel and post back if it resolves this bug?
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.

Thank you in advance!

Changed in linux (Ubuntu):
importance: Undecided → High
status: Confirmed → Incomplete
tags: added: kernel-da-key
Matthew Wynne (mwynne) wrote :

Hi Joseph. It's possible, but not ideal in my situation. I'm running Ubuntu 16.04 server VMs in KVM (via OpenStack) on a CentOS hypervisor, and I'm unable to easily reproduce the issue as it can occur on any of the 50+ VMs I have running. I'd prefer not to change the kernel on every VM if I can avoid it. Is there anything else I can do?

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired