Bug #1413540 “Trusty soft lockup issues with nested KVM” : Bugs : linux package : Ubuntu

Gema Gomez (gema) on 2015-01-22

tags:	added: cts
description:	updated

Adam Collard (adam-collard) on 2015-01-22

affects:

ubuntu → qemu-kvm (Ubuntu)

Revision history for this message

Gema Gomez (gema) wrote on 2015-01-22:

#1

apport.qemu-kvm.3i3ndao3.apport Edit (658 bytes, text/plain)

Revision history for this message

Gema Gomez (gema) wrote on 2015-01-22:

#2

ProblemType: Bug
ApportVersion: 2.14.1-0ubuntu3.6
Architecture: amd64
Date: Thu Jan 22 10:37:18 2015
DistroRelease: Ubuntu 14.04
Ec2AMI: ami-0000000f
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: m1.medium
Ec2Kernel: aki-00000002
Ec2Ramdisk: ari-00000002
Package: qemu-kvm (not installed)
ProcEnviron:
TERM=xterm-256color
SHELL=/bin/bash
PATH=(custom, no user)
LANG=en_US.UTF-8
XDG_RUNTIME_DIR=<set>
ProcVersionSignature: User Name 3.13.0-44.73-generic 3.13.11-ckt12
SourcePackage: qemu
Tags: trusty ec2-images
Uname: Linux 3.13.0-44-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
_MarkForUpload: True

Revision history for this message

Gema Gomez (gema) wrote on 2015-01-22:

#3

apport.qemu-system-x86.hhh4e6d9.apport Edit (6.5 KiB, text/plain)

Please, bear in mind I have already modified the KSM value for this node not to crash, that is the only change made to this system after the issue, other than rebooting it.

Revision history for this message

Serge Hallyn (serge-hallyn) wrote on 2015-01-22:

#4

Moving to qemu; qemu-kvm is the source package only up to precise.

affects:

qemu-kvm (Ubuntu) → qemu (Ubuntu)

Revision history for this message

Serge Hallyn (serge-hallyn) wrote on 2015-01-22:

#5

A few comments here.

First, some of upstream qemu's position is that nested KVM is not for enterprise installs in the first place. (Others feel that if there are bugs, they should be worked out)

Second, the bug appears to be a soft luckup. It seems to me the bug should be about that. If you've opened a separate bug about the lockup itself, please mark it both against qemu and linux. If not, then let's mark this bug as also affecting linux.

Thirdly, there's the question of how to (assuming we decide to) implement this. Should /etc/init/qemu-kvm do something like

grep -qi qemu /proc/cpuinfo && echo 0 > /sys/kernel/mm/ksm/run

?

Note that you can currently do this yourself by having your VM install (say at cloud-init or in juju install section) set KSM_ENABLED=0 in /etc/default/qemu-kvm

I'm not sure how to mark the priroity of this bug. If the bug is about the soft lockup, that's high priority (against linux). if it's about disabling ksm, that's low priority (against qemu, since there's a workaround)

Changed in qemu (Ubuntu):
importance:	Undecided → Low
status:	New → Confirmed

Revision history for this message

Serge Hallyn (serge-hallyn) wrote on 2015-01-22:

#6

@gema,

please re-do the 'apport-collect 1413540' (with the new, corrected package names)

Revision history for this message

Brad Figg (brad-figg) wrote on 2015-01-22: Missing required logs.

#7

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1413540

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status:	New → Incomplete
tags:	added: trusty

Revision history for this message

Gema Gomez (gema) wrote on 2015-01-22: Re: qemu-kvm package enables KSM on VMs

#8

apport.linux-image-3.13.0-44-generic.61de1tqv.apport Edit (781.2 KiB, text/plain)

apport file for linux.

Revision history for this message

Gema Gomez (gema) wrote on 2015-01-22:

#9

apport.linux-image-3.13.0-44-generic.61de1tqv.apport Edit (781.2 KiB, text/plain)

apport file for linux.

Revision history for this message

Gema Gomez (gema) wrote on 2015-01-22:

#10

apport.qemu.pnfp6lff.apport Edit (654 bytes, text/plain)

apport for qemu

Revision history for this message

Gema Gomez (gema) wrote on 2015-01-22:

#11

I can reproduce this issue and hand a VM over to whoever is going to triage in a hung state.

Chris J Arges (arges) on 2015-01-22

Changed in linux (Ubuntu):
assignee:	nobody → Chris J Arges (arges)
importance:	Undecided → High

Revision history for this message

Chris J Arges (arges) wrote on 2015-01-22:

#12

@serge-hallyn:

First step is to debug this issue and see if its a reasonable fix. If that's possible then we should leave qemu alone.

Second if we want to disable KSM for nested VMs relying on a 'QEMU' cpu isn't sufficient if someone uses a different -cpu flag. We could do:
grep -q "Booting paravirtualized kernel on KVM" /var/log/kern.log && echo 0 > /sys/kernel/mm/ksm/run
Or use tools like virt-what.

Regardless I'll work on step 1.

Chris J Arges (arges) on 2015-01-22

summary:

- qemu-kvm package enables KSM on VMs
+ issues with KSM enabled for nested KVM VMs

Revision history for this message

Ryan Harper (raharper) wrote on 2015-01-22: Re: issues with KSM enabled for nested KVM VMs

#13

I think you'll find that nested kvm and ksm will be a worst-case scenario w.r.t memory swap-out. KSM actively scans the hosts memory for pages it can unmap , which means when a guest (level 1) or nested guest (level 2) needs to write to that memory, then you incur a page table walk (Guest virtual to host physical) which is quite costly. This can be further affected by overcommit of memory, which ends up having the hypervisor swap memory to disk to handle the working set size. KSM will not have any view into the the page tables, or swapping activity of the guest (l1, or l2) meaning that overtime it's increasingly likely that KSM will have swapped out memory needed for either the L1 or L2 guest. Swapping L1 memory utilized to run the L2 guest is likely to the the most painful since there are two levels of swap-in occuring (host to l1, and the l1 to l2). Swap in/out will be IO intensive, and blocking on io is also more likely to trigger soft-lockups in either l1 or l2 guests.

I suggest looking at the openstack environment and disabling/turning down the amount of memory overcommit to reduce the memory pressure on the host. Given that KSM isn't optimized at all for nested KVM, it's certainly worth disabling KSM when running nested guests (certainly in the L1 guest, possibly the host as well) unless one wants to invest in tuning KSM as well as close monitoring of the host memory pressure any any L1 guest memory pressure if the L1 guest also runs an L2 guest.

Revision history for this message

Gema Gomez (gema) wrote on 2015-01-22:

#14

soft-lockup-different-node.log Edit (98.9 KiB, text/plain)

I have a different VM that has crashed (also nested nova compute), this one had ksm disabled. See log attached.

Changed in linux (Ubuntu):
status:	Incomplete → Confirmed

Revision history for this message

Ryan Harper (raharper) wrote on 2015-01-22:

#15

Any information about the physical host running the L1 guest?

Revision history for this message

Chris J Arges (arges) wrote on 2015-01-23: Re: soft lockup issues with nested KVM VMs running tempest

#16

Let's concentrate on the hang without KSM on this bug. I've split the KSM enabled in nested virt issue in bug 1414153.

summary:	- issues with KSM enabled for nested KVM VMs + soft lockup issues with nested KVM VMs running tempest
no longer affects:	qemu (Ubuntu)

Chris J Arges (arges) on 2015-01-23

description:

updated

Chris J Arges (arges) on 2015-02-05

description:

updated

Revision history for this message

Chris J Arges (arges) wrote on 2015-02-11:

#17

Upstream discussion about this bug:
https://lkml.org/lkml/2015/2/11/247

Revision history for this message

Ryan Beisner (1chb1n) wrote on 2015-03-07:

#18

We have begun to see a noticeable percentage of this in our Openstack-on-Openstack automated testing. I've not yet dug too deeply, but the symptom is: a deployment completes and tests successfully, then as we're collecting logs from machines, one or more of the compute nodes falls over with: http://paste.ubuntu.com/10556479/

Let me know if there's anything you'd like for us to try. This is on our private cloud "serverstack" environment.

ubuntu@chakora:~$ uname -a
Linux chakora 3.13.0-46-generic #77-Ubuntu SMP Mon Mar 2 18:23:39 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

ubuntu@chakora:~$ dpkg-query --show linux-generic libvirt-bin qemu-system nova-compute-kvm nova-compute-libvirt
libvirt-bin 1.2.2-0ubuntu13.1.9
linux-generic 3.13.0.46.53
nova-compute-kvm 1:2014.1.3-0ubuntu2
nova-compute-libvirt 1:2014.1.3-0ubuntu2
qemu-system 2.0.0+dfsg-2ubuntu1.10

Revision history for this message

Ryan Beisner (1chb1n) wrote on 2015-03-07:

#19

Also FWIW, RAM overcommit does not appear to be a factor on the affected compute node: http://paste.ubuntu.com/10556588/

Revision history for this message

Rafael David Tinoco (rafaeldtinoco) wrote on 2015-03-08:

#20

Ryan,

Could you enable kdump (http://www.inaddy.org/mini-howtos/dumps/using-ubuntu-crash-dump-with-kdump), for these hosts and provide me a way of downloading (dropbox, public url) the /var/crash/XXX/ directory (dump from the kernel panic) ? I would like to make sure we are facing the same problem (upstream discussion with Linus Torvalds -> https://lkml.org/lkml/2015/2/23/460). Your kernel dump might be helpful.

Thank you...

Rafael Tinoco

Revision history for this message

Rafael David Tinoco (rafaeldtinoco) wrote on 2015-03-08:

#21

Or, to be faster, provide me the stack trace from your panic (So I can make sure you are facing the same issue as we are interested on).

Ryan Beisner (1chb1n) on 2015-03-23

summary:

- soft lockup issues with nested KVM VMs running tempest
+ Trusty soft lockup issues with nested KVM

Revision history for this message

Ryan Beisner (1chb1n) wrote on 2015-03-23:

#22

This does not appear to be specific to OpenStack, nor tempest. I've reproduced with Trusty on Trusty on Trusty, vanilla qemu/kvm.

Simplified reproducer, with an existing MAAS cluster:

@L0 baremetal:
- Create a Trusty bare metal host from daily images.
- sudo apt-get update -y && sudo apt-get -y install uvtool
- sudo uvt-simplestreams-libvirt sync release=trusty arch=amd64
- sudo uvt-simplestreams-libvirt query
- ssh-keygen
- sudo uvt-kvm create --memory 2048 trusty-vm release=trusty
- sudo virsh shutdown trusty-vm
- # edit the /etc/libvirt/qemu/trusty-vm.xml to enable serial console dump to file:
    <serial type='file'>
      <source path='/tmp/trusty-vm-console.log'/>
      <target port='0'/>
    </serial>
    <console type='file'>
      <source path='/tmp/trusty-vm-console.log'/>
      <target type='serial' port='0'/>
    </console>
- sudo virsh define /etc/libvirt/qemu/trusty-vm.xml
- sudo virsh start trusty-vm
- # confirm console output:
- sudo tailf /tmp/trusty-vm-console.log
- # take note of the VM's IP:
- sudo uvt-kvm ip trusty-vm
- # ssh into the new vm.

@L1 "trusty-vm":
- sudo apt-get update -y && sudo apt-get -y install uvtool
- sudo uvt-simplestreams-libvirt sync release=trusty arch=amd64
- sudo uvt-simplestreams-libvirt query
- ssh-keygen
- # change .122. to .123. in /etc/libvirt/qemu/networks/default.xml
- # make sure default.xml is static linked inside /etc/libvirt/qemu/networks
- sudo reboot # for good measure
- sudo uvt-kvm create --memory 768 trusty-nest release=trusty
- # take note of the nested VM's IP
- sudo uvt-kvm ip trusty-vm
- # ssh into the new vm.

@L2 "trusty-nest":
- sudo apt-get update && sudo apt-get install stress
- stress -c 1 -i 1 -m 1 -d 1 -t 600

Now watch the "trusty-vm" console for: [ 496.076004] BUG: soft lockup - CPU#0 stuck for 23s! [ksmd:36]. It happens to me within a couple of minutes. Then, both L1 and L2 become unreachable indefinitely, with two cores on L0 stuck at 100%.

This does not appear to be specific to OpenStack, nor tempest.  I've reproduced with Trusty on Trusty on Trusty, vanilla qemu/kvm.

Simplified reproducer, with an existing MAAS cluster:

@L0 baremetal:
 - Create a Trusty bare metal host from daily images.
 - sudo apt-get update -y && sudo apt-get -y install uvtool
 - sudo uvt-simplestreams-libvirt sync release=trusty arch=amd64
 - sudo uvt-simplestreams-libvirt query
 - ssh-keygen
 - sudo uvt-kvm create --memory 2048 trusty-vm release=trusty
 - sudo virsh shutdown trusty-vm
 - # edit the /etc/libvirt/qemu/trusty-vm.xml to enable serial console dump to file:
    <serial type='file'>
      <source path='/tmp/trusty-vm-console.log'/>
      <target port='0'/>
    </serial>
    <console type='file'>
      <source path='/tmp/trusty-vm-console.log'/>
      <target type='serial' port='0'/>
    </console>
 - sudo virsh define /etc/libvirt/qemu/trusty-vm.xml
 - sudo virsh start trusty-vm
 - # confirm console output:
 - sudo tailf /tmp/trusty-vm-console.log
 - # take note of the VM's IP:
 - sudo uvt-kvm ip trusty-vm
 - # ssh into the new vm.

@L1 "trusty-vm":
 - sudo apt-get update -y && sudo apt-get -y install uvtool
 - sudo uvt-simplestreams-libvirt sync release=trusty arch=amd64
 - sudo uvt-simplestreams-libvirt query
 - ssh-keygen
 - # change .122. to .123. in /etc/libvirt/qemu/networks/default.xml
 - # make sure default.xml is static linked inside /etc/libvirt/qemu/networks
 - sudo reboot  # for good measure
 - sudo uvt-kvm create --memory 768 trusty-nest release=trusty
 - # take note of the nested VM's IP
 - sudo uvt-kvm ip trusty-vm
 - # ssh into the new vm.

@L2 "trusty-nest":
 - sudo apt-get update && sudo apt-get install stress
 - stress -c 1 -i 1 -m 1 -d 1 -t 600

Now watch the "trusty-vm" console for:  [  496.076004] BUG: soft lockup - CPU#0 stuck for 23s! [ksmd:36].  It happens to me within a couple of minutes.  Then, both L1 and L2 become unreachable indefinitely, with two cores on L0 stuck at 100%.

Revision history for this message

Ryan Beisner (1chb1n) wrote on 2015-03-23:

#23

Also FYI: I was not able to reproduce this issue when using Vivid as the bare metal L0.

Revision history for this message

Ryan Beisner (1chb1n) wrote on 2015-03-24:

#24

s/static/sym/ ;-)

Revision history for this message

Ryan Beisner (1chb1n) wrote on 2015-03-24:

#25

L0-baremetal-cpu-pegged.png Edit (443.5 KiB, image/png)

Revision history for this message

Ryan Beisner (1chb1n) wrote on 2015-03-24:

#26

L1-console-log-soft-lockup.png Edit (346.9 KiB, image/png)

Revision history for this message

Ryan Beisner (1chb1n) wrote on 2015-03-24:

#27

A few hrs later, those two L0 bare metal host CPUs are still maxed. In scenarios where L0 is hosting many VMs, such as in a cloud, this bug can be expected to cause significant performance, consistency and capacity issues on the host and in the cloud as a whole.

Revision history for this message

Ryan Beisner (1chb1n) wrote on 2015-03-24:

#28

I've collected crash dumps, and have stored them on an internal Canonical server as they are 2gb+. Feel free to ping me for access.

Revision history for this message

Stefan Bader (smb) wrote on 2015-03-25:

#29

Hm, following your instructions I rather run into a situation where the l2 guest gets paused. Likely because l1 runs out of disk space. The default of uvtool is 7G which I would say the l2 stress run fills as it grows the l2 qcow image on l1 which has to stuff all the initial cloud-image and the snapshot of that for l2 into its 7G.

Revision history for this message

Ryan Beisner (1chb1n) wrote on 2015-03-25:

#30

@smb - after repeating the test a few times, I too ran out of space with the default 8GB VM disk size, resulting in a paused VM. You'll have to re-create the VMs a little bit differently (--disk <GB>).

ex:
@L0:
sudo uvt-kvm destroy trusty-vm
sudo uvt-kvm create --memory 2048 --disk 40 trusty-vm release=trusty

@L1:
#repeat original repro

ref:
http://manpages.ubuntu.com/manpages/trusty/man1/uvt-kvm.1.html

Revision history for this message

Stefan Bader (smb) wrote on 2015-03-25:

#31

Yeah, will do. Just got distracted and wanted to ensure that the repro was not accidentally another form of failure path to the out of space issue.

Revision history for this message

Stefan Bader (smb) wrote on 2015-03-25:

#32

Hrmn... When I repeated the setup I seem to have triggered some kind of lockup even while bringing up l2. Of course hard to say without details of Ryan's dump. However mine seems to have backtraces in the log which remind me an awful lot of an issue related to punching holes into ext4 based qcow images. Chris had been working on something like this before... He is on a sprint this week. Anyway, my strace in the log:

[ 1200.288031] INFO: task qemu-system-x86:4545 blocked for more than 120 seconds.
[ 1200.288712] Not tainted 3.13.0-46-generic #77-Ubuntu
[ 1200.289204] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1200.289892] qemu-system-x86 D ffff88007fc134c0 0 4545 1 0x00000000
[ 1200.289895] ffff88007a9c5d28 0000000000000082 ffff88007bbd3000 ffff88007a9c5fd8
[ 1200.289897] 00000000000134c0 00000000000134c0 ffff88007bbd3000 ffff88007fc13d58
[ 1200.289898] ffff88007ffcdee8 0000000000000002 ffffffff8114eef0 ffff88007a9c5da0
[ 1200.289900] Call Trace:
[ 1200.289906] [<ffffffff8114eef0>] ? wait_on_page_read+0x60/0x60
[ 1200.289909] [<ffffffff817259fd>] io_schedule+0x9d/0x140
[ 1200.289910] [<ffffffff8114eefe>] sleep_on_page+0xe/0x20
[ 1200.289912] [<ffffffff81725e82>] __wait_on_bit+0x62/0x90
[ 1200.289914] [<ffffffff8114ecbf>] wait_on_page_bit+0x7f/0x90
[ 1200.289917] [<ffffffff810ab140>] ? autoremove_wake_function+0x40/0x40
[ 1200.289919] [<ffffffff8115c4a1>] ? pagevec_lookup_tag+0x21/0x30
[ 1200.289921] [<ffffffff8114edc9>] filemap_fdatawait_range+0xf9/0x190
[ 1200.289923] [<ffffffff8115066f>] filemap_write_and_wait_range+0x3f/0x70
[ 1200.289927] [<ffffffff8123bc4a>] ext4_sync_file+0xba/0x320
[ 1200.289930] [<ffffffff811ede21>] do_fsync+0x51/0x80
[ 1200.289931] [<ffffffff811ee0d3>] SyS_fdatasync+0x13/0x20
[ 1200.289933] [<ffffffff81731d7d>] system_call_fastpath+0x1a/0x1f

Revision history for this message

Chris J Arges (arges) wrote on 2015-03-25:

#33

Stefan,
This looks like a separate bug (as we discussed). Please file another bug for this when you have time.

description:

updated

Revision history for this message

Chris J Arges (arges) wrote on 2015-03-25:

#34

I've added instructions for a workaround. The code paths I've seen in crashes has been the following:

kvm_sched_in
-> kvm_arch_vcpu_load
  -> vmx_vcpu_load
   -> loaded_vmcs_clear
    -> smp_call_function_single

pmdp_clear_flush
-> flush_tlb_mm_range
-> native_flush_tlb_others
-> smp_call_function_many

Generally this has been caused by workloads that use nested VMs, and stress L2/L1 vms (causing non-local CPU TLB flushing or VMCS clearing).

The hang is in csd_lock_wait waiting for CSD_FLAG_LOCK bit to be cleared, which can only be triggered with non-local smp_call_function_* calls.

Another data point is that this can happen with x2apic as well as flat apic (as tested with nox2apic).

Revision history for this message

Chris J Arges (arges) wrote on 2015-03-25:

#35

Ideas going forward:

1) Instrument kernel for debugging csd_lock
2) Determine which CPUs exhibit this issue
3) Examine pinning more in depth pin 0-0 1-2 for example
4) Test older kernels , newer kernels to verify issue

Revision history for this message

Tom Fifield (fifieldt) wrote on 2015-03-27:

#36

Download full text (6.1 KiB)

Hi,

Just wanted to chime in that this bug also affected me - running OpenStack Juno w/KVM inside a KVM hypervisor.

CPU on the host machine is:
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz

running 14.04 with the latest packages applied as of today (2015-03-27) for both the host and the guest.

Lockup appeared to happen with one host-guest VM after I altered the number of CPUs allocated to another VM (yet to reboot that VM for changes to take affect), though I had also recently booted a new host-guest-guest VM.

ar 27 15:12:43 compute ntpd[1775]: peers refreshed
Mar 27 15:12:43 compute ntpd[1775]: new interface(s) found: waking up resolver
Mar 27 15:12:48 compute dnsmasq-dhcp[2044]: DHCPDISCOVER(br100) fa:16:3e:c3:81:22
Mar 27 15:12:48 compute dnsmasq-dhcp[2044]: DHCPOFFER(br100) 203.0.113.27 fa:16:3e:c3:81:22
Mar 27 15:12:48 compute dnsmasq-dhcp[2044]: DHCPREQUEST(br100) 203.0.113.27 fa:16:3e:c3:81:22
Mar 27 15:12:48 compute dnsmasq-dhcp[2044]: DHCPACK(br100) 203.0.113.27 fa:16:3e:c3:81:22 test03
Mar 27 15:15:40 compute kernel: [ 436.100002] BUG: soft lockup - CPU#5 stuck for 23s! [ksmd:68]
Mar 27 15:15:40 compute kernel: [ 436.100002] Modules linked in: vhost_net vhost macvtap macvlan xt_CHECKSUM ebt_ip ebt_arp ebtable_filter br
idge stp llc xt_conntrack xt_nat xt_tcpudp iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip6tabl
e_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables nbd ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_
tcp libiscsi_tcp libiscsi scsi_transport_iscsi snd_hda_intel cirrus snd_hda_codec ttm snd_hwdep drm_kms_helper snd_pcm drm snd_page_alloc snd_
timer syscopyarea snd sysfillrect soundcore sysimgblt dm_multipath i2c_piix4 kvm_intel scsi_dh serio_raw kvm mac_hid lp parport 8139too psmous
e 8139cp mii floppy pata_acpi
Mar 27 15:15:40 compute kernel: [ 436.100002] CPU: 5 PID: 68 Comm: ksmd Not tainted 3.13.0-46-generic #79-Ubuntu
Mar 27 15:15:40 compute kernel: [ 436.100002] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Mar 27 15:15:40 compute kernel: [ 436.100002] task: ffff8802306db000 ti: ffff8802306e4000 task.ti: ffff8802306e4000
Mar 27 15:15:40 compute kernel: [ 436.100002] RIP: 0010:[<ffffffff810dbf56>] [<ffffffff810dbf56>] generic_exec_single+0x86/0xb0
Mar 27 15:15:40 compute kernel: [ 436.100002] RSP: 0018:ffff8802306e5c00 EFLAGS: 00000202
Mar 27 15:15:40 compute kernel: [ 436.100002] RAX: 0000000000000006 RBX: ffff8802306e5bd0 RCX: 0000000000000005
Mar 27 15:15:40 compute kernel: [ 436.100002] RDX: ffffffff8180ade0 RSI: 0000000000000000 RDI: 0000000000000286
Mar 27 15:15:40 compute kernel: [ 436.100002] RBP: ffff8802306e5c30 R08: ffffffff8180adc8 R09: ffff880232989b48
Mar 27 15:15:40 compute kernel: [ 436.100002] R10: 0000000000000867 R11: 0000000000000000 R12: 0000000000000000
Mar 27 15:15:40 compute kernel: [ 436.100002] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Mar 27 15:15:40 compute kernel: [ 436.100002] FS: 0000000000000000(0000) GS:ffff88023fd40000(0000) knlGS:0000000000000000
Mar 27 15:15:4...

Hi,

Just wanted to chime in that this bug also affected me - running OpenStack Juno w/KVM inside a KVM hypervisor.

CPU on the host machine is:
vendor_id	: GenuineIntel
cpu family	: 6
model		: 58
model name	: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz

running 14.04 with the latest packages applied as of today (2015-03-27) for both the host and the guest.

Lockup appeared to happen with one host-guest VM after I altered the number of CPUs allocated to another VM (yet to reboot that VM for changes to take affect), though I had also recently booted a new host-guest-guest VM.

ar 27 15:12:43 compute ntpd[1775]: peers refreshed
Mar 27 15:12:43 compute ntpd[1775]: new interface(s) found: waking up resolver
Mar 27 15:12:48 compute dnsmasq-dhcp[2044]: DHCPDISCOVER(br100) fa:16:3e:c3:81:22 
Mar 27 15:12:48 compute dnsmasq-dhcp[2044]: DHCPOFFER(br100) 203.0.113.27 fa:16:3e:c3:81:22 
Mar 27 15:12:48 compute dnsmasq-dhcp[2044]: DHCPREQUEST(br100) 203.0.113.27 fa:16:3e:c3:81:22 
Mar 27 15:12:48 compute dnsmasq-dhcp[2044]: DHCPACK(br100) 203.0.113.27 fa:16:3e:c3:81:22 test03
Mar 27 15:15:40 compute kernel: [  436.100002] BUG: soft lockup - CPU#5 stuck for 23s! [ksmd:68]
Mar 27 15:15:40 compute kernel: [  436.100002] Modules linked in: vhost_net vhost macvtap macvlan xt_CHECKSUM ebt_ip ebt_arp ebtable_filter br
idge stp llc xt_conntrack xt_nat xt_tcpudp iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip6tabl
e_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables nbd ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_
tcp libiscsi_tcp libiscsi scsi_transport_iscsi snd_hda_intel cirrus snd_hda_codec ttm snd_hwdep drm_kms_helper snd_pcm drm snd_page_alloc snd_
timer syscopyarea snd sysfillrect soundcore sysimgblt dm_multipath i2c_piix4 kvm_intel scsi_dh serio_raw kvm mac_hid lp parport 8139too psmous
e 8139cp mii floppy pata_acpi
Mar 27 15:15:40 compute kernel: [  436.100002] CPU: 5 PID: 68 Comm: ksmd Not tainted 3.13.0-46-generic #79-Ubuntu
Mar 27 15:15:40 compute kernel: [  436.100002] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Mar 27 15:15:40 compute kernel: [  436.100002] task: ffff8802306db000 ti: ffff8802306e4000 task.ti: ffff8802306e4000
Mar 27 15:15:40 compute kernel: [  436.100002] RIP: 0010:[<ffffffff810dbf56>]  [<ffffffff810dbf56>] generic_exec_single+0x86/0xb0
Mar 27 15:15:40 compute kernel: [  436.100002] RSP: 0018:ffff8802306e5c00  EFLAGS: 00000202
Mar 27 15:15:40 compute kernel: [  436.100002] RAX: 0000000000000006 RBX: ffff8802306e5bd0 RCX: 0000000000000005
Mar 27 15:15:40 compute kernel: [  436.100002] RDX: ffffffff8180ade0 RSI: 0000000000000000 RDI: 0000000000000286
Mar 27 15:15:40 compute kernel: [  436.100002] RBP: ffff8802306e5c30 R08: ffffffff8180adc8 R09: ffff880232989b48
Mar 27 15:15:40 compute kernel: [  436.100002] R10: 0000000000000867 R11: 0000000000000000 R12: 0000000000000000
Mar 27 15:15:40 compute kernel: [  436.100002] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Mar 27 15:15:40 compute kernel: [  436.100002] FS:  0000000000000000(0000) GS:ffff88023fd40000(0000) knlGS:0000000000000000
Mar 27 15:15:40 compute kernel: [  436.100002] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 27 15:15:40 compute kernel: [  436.100002] CR2: 00007fb0557bf000 CR3: 0000000036b7d000 CR4: 00000000000026e0
Mar 27 15:15:40 compute kernel: [  436.100002] Stack:
Mar 27 15:15:40 compute kernel: [  436.100002]  ffff88023fd13f80 0000000000000004 0000000000000005 ffffffff81d14300
Mar 27 15:15:40 compute kernel: [  436.100002]  ffffffff8105c7a0 ffff88023212c380 ffff8802306e5ca8 ffffffff810dc065
Mar 27 15:15:40 compute kernel: [  436.100002]  00000000000134c0 00000000000134c0 ffff88023fd13f80 ffff88023fd13f80
Mar 27 15:15:40 compute kernel: [  436.100002] Call Trace:
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8105c7a0>] ? leave_mm+0x80/0x80
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff810dc065>] smp_call_function_single+0xe5/0x190
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8105c7a0>] ? leave_mm+0x80/0x80
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffffa008c21a>] ? kvm_handle_hva_range+0x11a/0x180 [kvm]
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffffa008f300>] ? rmap_write_protect+0x80/0x80 [kvm]
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff810dc496>] smp_call_function_many+0x286/0x2d0
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8105c7a0>] ? leave_mm+0x80/0x80
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8105c8f7>] native_flush_tlb_others+0x37/0x40
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8105cbf6>] flush_tlb_page+0x56/0xa0
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8118a4c8>] ptep_clear_flush+0x48/0x60
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8119cd9f>] try_to_merge_with_ksm_page+0x14f/0x650
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8119de36>] ksm_do_scan+0xb96/0xdb0
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8119e0cf>] ksm_scan_thread+0x7f/0x200
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff810ab100>] ? prepare_to_wait_event+0x100/0x100
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8119e050>] ? ksm_do_scan+0xdb0/0xdb0
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8108b592>] kthread+0xd2/0xf0
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8108b4c0>] ? kthread_create_on_node+0x1c0/0x1c0
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff81731ccc>] ret_from_fork+0x7c/0xb0
Mar 27 15:15:40 compute kernel: [  436.100002]  [<ffffffff8108b4c0>] ? kthread_create_on_node+0x1c0/0x1c0
Mar 27 15:15:40 compute kernel: [  436.100002] Code: 4c 89 23 48 89 4b 08 48 89 19 48 89 55 d0 e8 f2 d1 64 00 4c 3b 65 d0 74 23 45 85 ed 75 09
 eb 0d 0f 1f 44 00 00 f3 90 f6 43 20 01 <75> f8 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 44 89 f7 ff 
Mar 27 15:16:08 compute kernel: [  464.100002] BUG: soft lockup - CPU#5 stuck for 23s! [ksmd:68]

I've got to keep working to get this environment up for a test, but if any specific logs or tests can be helpful, let me know and I'll see what I can provide.

Revision history for this message

Chris J Arges (arges) wrote on 2015-03-27:

#37

@fifieldt

Hi, that is the same bug. Things to reduce the hangs right now are:
- Disabling KSM in L1 guest
- Using 3.16 kernel on the L0 host
- Pinning L1 vCPUs to L0 host CPU

Note this doesn't fix the issue, it only decreases (potentially) the frequency of these lockups.
--chris

Revision history for this message

Aaron Rosen (arosen) wrote on 2015-04-01:

#38

I am also hitting this issue in my CI a lot. Here is the trace I'm getting in syslog: http://logs2.aaronorosen.com/85/169585/1/check/dsvm-tempest-full-congress-nodepool/94f8441/logs/syslog.txt.gz#_Apr__1_02_43_44

Is there a work around for this?

Revision history for this message

Chris J Arges (arges) wrote on 2015-04-01:

#39

@arosen,
This looks like a different softlockup, and also the machine seems to recover from it. Please file a new bug and be sure to attach logs to the bug. Describe in detail how to reproduce this as well, what kind of host machine do you have? what VM definition are you using? Etc etc.

description:

updated

Revision history for this message

Aaron Rosen (arosen) wrote on 2015-04-01:

#40

@chris: done https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1439394

Chris J Arges (arges) on 2015-04-06

Changed in linux (Ubuntu Trusty):
assignee:	nobody → Chris J Arges (arges)
importance:	Undecided → High
status:	New → In Progress
description:	updated

Chris J Arges (arges) on 2015-04-06

description:

updated

Revision history for this message

Chris J Arges (arges) wrote on 2015-04-06:

#41

With a revert of b6b8a145 ('Rework interception of IRQs and NMIs'), the issue does not occur readily with the test case. I was able to run for 1+ hour. Generally I can reproduce within 15m.

With 9242b5b6 ('KVM: x86: Check for nested events if there is an injectable interrupt') applied, I can run for 1+ hour without issue.

Current 3.13.0 patchlevel is in between those two commits which allows for this bug to reproduce easily.

Revision history for this message

Ramy Asselin (ramy-asselin) wrote on 2015-04-06:

#42

Ran into this bug too on 3.13.0-48. My workaround is to run QEMU on top of KVM (instead of kvm on top of KVM)

devstack local.conf:
[[post-config|$NOVA_CONF]]
[libvirt]
virt_type = qemu

nova.conf
[libvirt]
virt_type = qemu

Andy Whitcroft (apw) on 2015-04-08

Changed in linux (Ubuntu Trusty):
status:	In Progress → Fix Committed

Revision history for this message

Dr. Jens Harbott (j-harbott) wrote on 2015-04-09:

#43

@Andy: So 3.16.0-34 is the kernel with the fix? Any chance that it will also be backported to the 3.13 series?

Revision history for this message

Chris J Arges (arges) wrote on 2015-04-10:

#44

@j-rosenboom-j:

I've backported patch 9242b5b6 to the 3.13 series that should greatly mitigate this issue from happening. This should be available in the next update.

Revision history for this message

Brad Figg (brad-figg) wrote on 2015-04-17:

#45

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags:

added: verification-needed-trusty

Revision history for this message

Chris J Arges (arges) wrote on 2015-04-21:

#46

Verified on my reproducers. I'm marking the development task as fixed for this bug. I'll move the upstream investigation to another bug.

Changed in linux (Ubuntu):
assignee:	Chris J Arges (arges) → nobody
status:	Confirmed → Fix Released
importance:	High → Undecided
tags:	added: verification-done-trusty removed: verification-needed-trusty

Revision history for this message

Gema Gomez (gema) wrote on 2015-04-22:

#47

I have been trying to verify this kernel and I haven't seen exactly the soft lockup crash, but this other one, which may or may not be related but wanted to make a note of it:

[ 2406.041444] Kernel panic - not syncing: hung_task: blocked tasks
[ 2406.043163] CPU: 1 PID: 35 Comm: khungtaskd Not tainted 3.13.0-51-generic #84-Ubuntu
[ 2406.044223] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011
[ 2406.044223] 00000000003fffd1 ffff88080ec7fdf0 ffffffff817225ce ffffffff81a62a65
[ 2406.044223] ffff88080ec7fe68 ffffffff8171b46d ffffffff00000008 ffff88080ec7fe78
[ 2406.044223] ffff88080ec7fe18 ffff88080ec7fe40 0000000000000100 0000000000000004
[ 2406.044223] Call Trace:
[ 2406.044223] [<ffffffff817225ce>] dump_stack+0x45/0x56
[ 2406.044223] [<ffffffff8171b46d>] panic+0xc8/0x1d7
[ 2406.044223] [<ffffffff8110d7b6>] watchdog+0x296/0x2e0
[ 2406.044223] [<ffffffff8110d520>] ? reset_hung_task_detector+0x20/0x20
[ 2406.044223] [<ffffffff8108b5d2>] kthread+0xd2/0xf0
[ 2406.044223] [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0
[ 2406.044223] [<ffffffff8173300c>] ret_from_fork+0x7c/0xb0
[ 2406.044223] [<ffffffff8108b500>] ? kthread_create_on_node+0x1c0/0x1c0

I have the crashdump for it, let me know how you want to proceed.

Revision history for this message

Chris J Arges (arges) wrote on 2015-04-22:

#48

After speaking to Gema, she will re-test with this kernel installed in L0 in addition to L1.
NOTE: This fix needs to be present for L0/L1 kernels.

Revision history for this message

B. (b-deactivatedaccount-deactivatedaccount) wrote on 2015-04-22:

#49

Download full text (5.1 KiB)

I still have the same issue with kernel 3.16.0-36-generic or 3.13.0-51-generic (proposed-updates)

# KVM HOST (3.16.0-36-generic)
sudo apt-get install linux-signed-generic-lts-utopic/trusty-proposed

# KVM GUEST (3.16.0-36-generic)
sudo apt-get install linux-virtual-lts-utopic/trusty-proposed
apt-get install cloud-installer
cloud-install

[ 1196.920613] kvm: vmptrld (null)/780000000000 failed
[ 1196.920953] vmwrite error: reg 401e value 31 (err 1)
[ 1196.921243] CPU: 23 PID: 5240 Comm: qemu-system-x86 Not tainted 3.16.0-36-generic #48~14.04.1-Ubuntu
[ 1196.921244] Hardware name: HP ProLiant DL380 Gen9, BIOS P89 11/03/2014
[ 1196.921245] 0000000000000000 ffff88202018fb58 ffffffff81764a5f ffff880fe4960000
[ 1196.921248] ffff88202018fb68 ffffffffc0a9320d ffff88202018fb78 ffffffffc0a878bf
[ 1196.921250] ffff88202018fba8 ffffffffc0a8e1cf ffff880fe4960000 0000000000000000
[ 1196.921252] Call Trace:
[ 1196.921262] [<ffffffff81764a5f>] dump_stack+0x45/0x56
[ 1196.921277] [<ffffffffc0a9320d>] vmwrite_error+0x2c/0x2e [kvm_intel]
[ 1196.921280] [<ffffffffc0a878bf>] vmcs_writel+0x1f/0x30 [kvm_intel]
[ 1196.921283] [<ffffffffc0a8e1cf>] free_nested.part.73+0x5f/0x170 [kvm_intel]
[ 1196.921286] [<ffffffffc0a8e363>] vmx_free_vcpu+0x33/0x70 [kvm_intel]
[ 1196.921305] [<ffffffffc03fd774>] kvm_arch_vcpu_free+0x44/0x50 [kvm]
[ 1196.921312] [<ffffffffc03fe3e2>] kvm_arch_destroy_vm+0xf2/0x1f0 [kvm]
[ 1196.921318] [<ffffffff810d28ed>] ? synchronize_srcu+0x1d/0x20
[ 1196.921323] [<ffffffffc03e635e>] kvm_put_kvm+0x10e/0x220 [kvm]
[ 1196.921328] [<ffffffffc03e64a8>] kvm_vcpu_release+0x18/0x20 [kvm]
[ 1196.921331] [<ffffffff811d5e64>] __fput+0xe4/0x220
[ 1196.921333] [<ffffffff811d5fee>] ____fput+0xe/0x10
[ 1196.921337] [<ffffffff8108e284>] task_work_run+0xc4/0xe0
[ 1196.921342] [<ffffffff810702c8>] do_exit+0x2b8/0xa60
[ 1196.921345] [<ffffffff810e3792>] ? __unqueue_futex+0x32/0x70
[ 1196.921347] [<ffffffff810e47b6>] ? futex_wait+0x126/0x290
[ 1196.921349] [<ffffffff8109eb35>] ? check_preempt_curr+0x85/0xa0
[ 1196.921351] [<ffffffff81070aef>] do_group_exit+0x3f/0xa0
[ 1196.921353] [<ffffffff81080530>] get_signal_to_deliver+0x1d0/0x6f0
[ 1196.921357] [<ffffffff81012548>] do_signal+0x48/0xad0
[ 1196.921359] [<ffffffff81011627>] ? __switch_to+0x167/0x590
[ 1196.921361] [<ffffffff81013039>] do_notify_resume+0x69/0xb0
[ 1196.921364] [<ffffffff8176d44a>] int_signal+0x12/0x17
[ 1196.921365] vmwrite error: reg 2800 value ffffffffffffffff (err -255)
[ 1196.921733] CPU: 23 PID: 5240 Comm: qemu-system-x86 Not tainted 3.16.0-36-generic #48~14.04.1-Ubuntu
[ 1196.921734] Hardware name: HP ProLiant DL380 Gen9, BIOS P89 11/03/2014
[ 1196.921735] 0000000000000000 ffff88202018fb58 ffffffff81764a5f ffff880fe4960000
[ 1196.921736] ffff88202018fb68 ffffffffc0a9320d ffff88202018fb78 ffffffffc0a878bf
[ 1196.921737] ffff88202018fba8 ffffffffc0a8e1e0 ffff880fe4960000 0000000000000000
[ 1196.921739] Call Trace:
[ 1196.921741] [<ffffffff81764a5f>] dump_stack+0x45/0x56
[ 1196.921744] [<ffffffffc0a9320d>] vmwrite_error+0x2c/0x2e [kvm_intel]
[ 1196.921746] [<ffffffffc0a878bf>] vmcs_writel+0x1f/0x30 [kvm_intel]
[ 1196.921748] [<ffffffffc0a8e1e0>] free_nest...

I still have the same issue with kernel 3.16.0-36-generic or 3.13.0-51-generic (proposed-updates)

# KVM HOST (3.16.0-36-generic)
sudo apt-get install linux-signed-generic-lts-utopic/trusty-proposed

# KVM GUEST (3.16.0-36-generic)
sudo apt-get install linux-virtual-lts-utopic/trusty-proposed
apt-get install cloud-installer
cloud-install

[ 1196.920613] kvm: vmptrld           (null)/780000000000 failed
[ 1196.920953] vmwrite error: reg 401e value 31 (err 1)
[ 1196.921243] CPU: 23 PID: 5240 Comm: qemu-system-x86 Not tainted 3.16.0-36-generic #48~14.04.1-Ubuntu
[ 1196.921244] Hardware name: HP ProLiant DL380 Gen9, BIOS P89 11/03/2014
[ 1196.921245]  0000000000000000 ffff88202018fb58 ffffffff81764a5f ffff880fe4960000
[ 1196.921248]  ffff88202018fb68 ffffffffc0a9320d ffff88202018fb78 ffffffffc0a878bf
[ 1196.921250]  ffff88202018fba8 ffffffffc0a8e1cf ffff880fe4960000 0000000000000000
[ 1196.921252] Call Trace:
[ 1196.921262]  [<ffffffff81764a5f>] dump_stack+0x45/0x56
[ 1196.921277]  [<ffffffffc0a9320d>] vmwrite_error+0x2c/0x2e [kvm_intel]
[ 1196.921280]  [<ffffffffc0a878bf>] vmcs_writel+0x1f/0x30 [kvm_intel]
[ 1196.921283]  [<ffffffffc0a8e1cf>] free_nested.part.73+0x5f/0x170 [kvm_intel]
[ 1196.921286]  [<ffffffffc0a8e363>] vmx_free_vcpu+0x33/0x70 [kvm_intel]
[ 1196.921305]  [<ffffffffc03fd774>] kvm_arch_vcpu_free+0x44/0x50 [kvm]
[ 1196.921312]  [<ffffffffc03fe3e2>] kvm_arch_destroy_vm+0xf2/0x1f0 [kvm]
[ 1196.921318]  [<ffffffff810d28ed>] ? synchronize_srcu+0x1d/0x20
[ 1196.921323]  [<ffffffffc03e635e>] kvm_put_kvm+0x10e/0x220 [kvm]
[ 1196.921328]  [<ffffffffc03e64a8>] kvm_vcpu_release+0x18/0x20 [kvm]
[ 1196.921331]  [<ffffffff811d5e64>] __fput+0xe4/0x220
[ 1196.921333]  [<ffffffff811d5fee>] ____fput+0xe/0x10
[ 1196.921337]  [<ffffffff8108e284>] task_work_run+0xc4/0xe0
[ 1196.921342]  [<ffffffff810702c8>] do_exit+0x2b8/0xa60
[ 1196.921345]  [<ffffffff810e3792>] ? __unqueue_futex+0x32/0x70
[ 1196.921347]  [<ffffffff810e47b6>] ? futex_wait+0x126/0x290
[ 1196.921349]  [<ffffffff8109eb35>] ? check_preempt_curr+0x85/0xa0
[ 1196.921351]  [<ffffffff81070aef>] do_group_exit+0x3f/0xa0
[ 1196.921353]  [<ffffffff81080530>] get_signal_to_deliver+0x1d0/0x6f0
[ 1196.921357]  [<ffffffff81012548>] do_signal+0x48/0xad0
[ 1196.921359]  [<ffffffff81011627>] ? __switch_to+0x167/0x590
[ 1196.921361]  [<ffffffff81013039>] do_notify_resume+0x69/0xb0
[ 1196.921364]  [<ffffffff8176d44a>] int_signal+0x12/0x17
[ 1196.921365] vmwrite error: reg 2800 value ffffffffffffffff (err -255)
[ 1196.921733] CPU: 23 PID: 5240 Comm: qemu-system-x86 Not tainted 3.16.0-36-generic #48~14.04.1-Ubuntu
[ 1196.921734] Hardware name: HP ProLiant DL380 Gen9, BIOS P89 11/03/2014
[ 1196.921735]  0000000000000000 ffff88202018fb58 ffffffff81764a5f ffff880fe4960000
[ 1196.921736]  ffff88202018fb68 ffffffffc0a9320d ffff88202018fb78 ffffffffc0a878bf
[ 1196.921737]  ffff88202018fba8 ffffffffc0a8e1e0 ffff880fe4960000 0000000000000000
[ 1196.921739] Call Trace:
[ 1196.921741]  [<ffffffff81764a5f>] dump_stack+0x45/0x56
[ 1196.921744]  [<ffffffffc0a9320d>] vmwrite_error+0x2c/0x2e [kvm_intel]
[ 1196.921746]  [<ffffffffc0a878bf>] vmcs_writel+0x1f/0x30 [kvm_intel]
[ 1196.921748]  [<ffffffffc0a8e1e0>] free_nested.part.73+0x70/0x170 [kvm_intel]
[ 1196.921751]  [<ffffffffc0a8e363>] vmx_free_vcpu+0x33/0x70 [kvm_intel]
[ 1196.921757]  [<ffffffffc03fd774>] kvm_arch_vcpu_free+0x44/0x50 [kvm]
[ 1196.921763]  [<ffffffffc03fe3e2>] kvm_arch_destroy_vm+0xf2/0x1f0 [kvm]
[ 1196.921765]  [<ffffffff810d28ed>] ? synchronize_srcu+0x1d/0x20
[ 1196.921770]  [<ffffffffc03e635e>] kvm_put_kvm+0x10e/0x220 [kvm]
[ 1196.921774]  [<ffffffffc03e64a8>] kvm_vcpu_release+0x18/0x20 [kvm]
[ 1196.921775]  [<ffffffff811d5e64>] __fput+0xe4/0x220
[ 1196.921777]  [<ffffffff811d5fee>] ____fput+0xe/0x10
[ 1196.921778]  [<ffffffff8108e284>] task_work_run+0xc4/0xe0
[ 1196.921780]  [<ffffffff810702c8>] do_exit+0x2b8/0xa60
[ 1196.921782]  [<ffffffff810e3792>] ? __unqueue_futex+0x32/0x70
[ 1196.921783]  [<ffffffff810e47b6>] ? futex_wait+0x126/0x290
[ 1196.921784]  [<ffffffff8109eb35>] ? check_preempt_curr+0x85/0xa0
[ 1196.921786]  [<ffffffff81070aef>] do_group_exit+0x3f/0xa0
[ 1196.921788]  [<ffffffff81080530>] get_signal_to_deliver+0x1d0/0x6f0
[ 1196.921790]  [<ffffffff81012548>] do_signal+0x48/0xad0
[ 1196.921791]  [<ffffffff81011627>] ? __switch_to+0x167/0x590
[ 1196.921793]  [<ffffffff81013039>] do_notify_resume+0x69/0xb0
[ 1196.921795]  [<ffffffff8176d44a>] int_signal+0x12/0x17
[ 1270.766540] device vnet3 entered promiscuous mode
[ 1270.865885] device vnet4 entered promiscuous mode
[ 1273.824576] kvm: zapping shadow pages for mmio generation wraparound
[ 1447.725335] kvm [6152]: vcpu0 unhandled rdmsr: 0x606

uvt-kvm create \
--memory 16384 \
--disk 100 \
--cpu 2 \
--ssh-public-key-file uvt-authorized_keys \
--template uvt-template.xml \
test release=trusty arch=amd64

....
  <vcpu placement='static'>2</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='23'/>
    <vcpupin vcpu='1' cpuset='31'/>
    <emulatorpin cpuset='23,31'/>
  </cputune>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>SandyBridge</model>
    <vendor>Intel</vendor>
    <topology sockets='1' cores='2' threads='1'/>
    <feature policy='require' name='vmx'/>
....

Revision history for this message

Chris J Arges (arges) wrote on 2015-04-22:

#50

@baco-1

These backtraces look a bit different than the original bug. Can you file a new bug with how you are reproducing this and gather complete logs?

--chris

Revision history for this message

B. (b-deactivatedaccount-deactivatedaccount) wrote on 2015-04-23:

#51

@arges

For me it's related at least part of it...

If I don't update the kernel to proposed-updates I have the following messages :
If I use one CPU instead of two, I don't have those messages.

BUG: soft lockup CPU#1 stuck for 22s! [qemu-system-x86:6889]
INFO: rcu_sched detected stalls on CPUs/tasks: { 0} (detected by 1, t=15002 jiffies, g=5324, c=5323, q=0)
BUG: soft lockup CPU#1 stuck for 22s! [qemu-system-x86:6889]
...

My stress test is installing openstack with cloud-install on a single VM (nested KVM)
generated with uvt-kvm and two vcpus.

The backtrace I posted is the result of a manual "force off" (using virt-manager) of the test openstack VM.
shortly after cloud-install tries to launch a VM in VM, the CPU reach 100% and all the shells (also console) get stuck

The last kernel message of the VM before I loose control is
"[ 942.295014] IPv6: ADDRCONF(NETDEV_UP): virbr0: link is not ready"

Revision history for this message

Chris J Arges (arges) wrote on 2015-04-23:

#52

@baco-1

1) What kind of hardware are you running on L0? ('ubuntu-bug linux' and filing a bug would collect the necessary info)
2) What kind of load are you seeing in L0, L1?
3) Can you give me the output of 'tail /sys/module/kvm_intel/parameters/*' ?
4) You could setup crashdump to dump on a hang (if we think its the right one), or just have a full backtrace on a softlockup by adding the following to the kernel cmdline:
softlockup_all_cpu_backtrace=1

Having a single vCPU could either be reducing the load, or avoiding a race; it would be hard to tell without a proper backtrace of the hang itsself.

This seems like a pretty simple testcase, I will put it on my list of things to try and reproduce.

Revision history for this message

Gema Gomez (gema) wrote on 2015-04-24:

#53

My deployment is still running strong after over 36 hours. No crashes. I will leave it running for a few more days to see if it happens after a few days... and will report back.

@arges, thanks for this fix!

Revision history for this message

Launchpad Janitor (janitor) wrote on 2015-04-29:

#54

Download full text (13.4 KiB)

This bug was fixed in the package linux - 3.13.0-51.84

---------------
linux (3.13.0-51.84) trusty; urgency=low

[ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1444141
  * Merged back Ubuntu-3.13.0-49.83 security release

linux (3.13.0-50.82) trusty; urgency=low

[ Brad Figg ]

* Release Tracking Bug
- LP: #1442285

[ Andy Whitcroft ]

* [Config] CONFIG_DEFAULT_MMAP_MIN_ADDR needs to match on armhf and arm64
- LP: #1418140

[ Chris J Arges ]

* [Config] CONFIG_PCIEASPM_DEBUG=y
- LP: #1398544

[ Upstream Kernel Changes ]

  * KEYS: request_key() should reget expired keys rather than give
    EKEYEXPIRED
    - LP: #1124250
  * audit: correctly record file names with different path name types
    - LP: #1439441
  * KVM: x86: Check for nested events if there is an injectable interrupt
    - LP: #1413540
  * be2iscsi: fix memory leak in error path
    - LP: #1440156
  * block: remove old blk_iopoll_enabled variable
    - LP: #1440156
  * be2iscsi: Fix handling timed out MBX completion from FW
    - LP: #1440156
  * be2iscsi: Fix doorbell format for EQ/CQ/RQ s per SLI spec.
    - LP: #1440156
  * be2iscsi: Fix the session cleanup when reboot/shutdown happens
    - LP: #1440156
  * be2iscsi: Fix scsi_cmnd leakage in driver.
    - LP: #1440156
  * be2iscsi : Fix DMA Out of SW-IOMMU space error
    - LP: #1440156
  * be2iscsi: Fix retrieving MCCQ_WRB in non-embedded Mbox path
    - LP: #1440156
  * be2iscsi: Fix exposing Host in sysfs after adapter initialization is
    complete
    - LP: #1440156
  * be2iscsi: Fix interrupt Coalescing mechanism.
    - LP: #1440156
  * be2iscsi: Fix TCP parameters while connection offloading.
    - LP: #1440156
  * be2iscsi: Fix memory corruption in MBX path
    - LP: #1440156
  * be2iscsi: Fix destroy MCC-CQ before MCC-EQ is destroyed
    - LP: #1440156
  * be2iscsi: add an missing goto in error path
    - LP: #1440156
  * be2iscsi: remove potential junk pointer free
    - LP: #1440156
  * be2iscsi: Fix memory leak in mgmt_set_ip()
    - LP: #1440156
  * be2iscsi: Fix the sparse warning introduced in previous submission
    - LP: #1440156
  * be2iscsi: Fix updating the boot enteries in sysfs
    - LP: #1440156
  * be2iscsi: Fix processing CQE before connection resources are freed
    - LP: #1440156
  * be2iscsi : Fix kernel panic during reboot/shutdown
    - LP: #1440156
  * fixed invalid assignment of 64bit mask to host dma_boundary for scatter
    gather segment boundary limit.
    - LP: #1440156
  * quota: Store maximum space limit in bytes
    - LP: #1441284
  * ip: zero sockaddr returned on error queue
    - LP: #1441284
  * net: rps: fix cpu unplug
    - LP: #1441284
  * ipv6: stop sending PTB packets for MTU < 1280
    - LP: #1441284
  * netxen: fix netxen_nic_poll() logic
    - LP: #1441284
  * udp_diag: Fix socket skipping within chain
    - LP: #1441284
  * ping: Fix race in free in receive path
    - LP: #1441284
  * bnx2x: fix napi poll return value for repoll
    - LP: #1441284
  * net: don't OOPS on socket aio
    - LP: #1441284
  * bridge: dont send notification when skb->len == 0 in rtnl_bridge_notify
    - LP: #1441284
  * ipv4: tcp: get rid of ugly unicast_sock
...

This bug was fixed in the package linux - 3.13.0-51.84

---------------
linux (3.13.0-51.84) trusty; urgency=low

[ Luis Henriques ]

* Release Tracking Bug
    - LP: #1444141
  * Merged back Ubuntu-3.13.0-49.83 security release

linux (3.13.0-50.82) trusty; urgency=low

[ Brad Figg ]

* Release Tracking Bug
    - LP: #1442285

[ Andy Whitcroft ]

* [Config] CONFIG_DEFAULT_MMAP_MIN_ADDR needs to match on armhf and arm64
    - LP: #1418140

[ Chris J Arges ]

* [Config] CONFIG_PCIEASPM_DEBUG=y
    - LP: #1398544

[ Upstream Kernel Changes ]

* KEYS: request_key() should reget expired keys rather than give
    EKEYEXPIRED
    - LP: #1124250
  * audit: correctly record file names with different path name types
    - LP: #1439441
  * KVM: x86: Check for nested events if there is an injectable interrupt
    - LP: #1413540
  * be2iscsi: fix memory leak in error path
    - LP: #1440156
  * block: remove old blk_iopoll_enabled variable
    - LP: #1440156
  * be2iscsi: Fix handling timed out MBX completion from FW
    - LP: #1440156
  * be2iscsi: Fix doorbell format for EQ/CQ/RQ s per SLI spec.
    - LP: #1440156
  * be2iscsi: Fix the session cleanup when reboot/shutdown happens
    - LP: #1440156
  * be2iscsi: Fix scsi_cmnd leakage in driver.
    - LP: #1440156
  * be2iscsi : Fix DMA Out of SW-IOMMU space error
    - LP: #1440156
  * be2iscsi: Fix retrieving MCCQ_WRB in non-embedded Mbox path
    - LP: #1440156
  * be2iscsi: Fix exposing Host in sysfs after adapter initialization is
    complete
    - LP: #1440156
  * be2iscsi: Fix interrupt Coalescing mechanism.
    - LP: #1440156
  * be2iscsi: Fix TCP parameters while connection offloading.
    - LP: #1440156
  * be2iscsi: Fix memory corruption in MBX path
    - LP: #1440156
  * be2iscsi: Fix destroy MCC-CQ before MCC-EQ is destroyed
    - LP: #1440156
  * be2iscsi: add an missing goto in error path
    - LP: #1440156
  * be2iscsi: remove potential junk pointer free
    - LP: #1440156
  * be2iscsi: Fix memory leak in mgmt_set_ip()
    - LP: #1440156
  * be2iscsi: Fix the sparse warning introduced in previous submission
    - LP: #1440156
  * be2iscsi: Fix updating the boot enteries in sysfs
    - LP: #1440156
  * be2iscsi: Fix processing CQE before connection resources are freed
    - LP: #1440156
  * be2iscsi : Fix kernel panic during reboot/shutdown
    - LP: #1440156
  * fixed invalid assignment of 64bit mask to host dma_boundary for scatter
    gather segment boundary limit.
    - LP: #1440156
  * quota: Store maximum space limit in bytes
    - LP: #1441284
  * ip: zero sockaddr returned on error queue
    - LP: #1441284
  * net: rps: fix cpu unplug
    - LP: #1441284
  * ipv6: stop sending PTB packets for MTU < 1280
    - LP: #1441284
  * netxen: fix netxen_nic_poll() logic
    - LP: #1441284
  * udp_diag: Fix socket skipping within chain
    - LP: #1441284
  * ping: Fix race in free in receive path
    - LP: #1441284
  * bnx2x: fix napi poll return value for repoll
    - LP: #1441284
  * net: don't OOPS on socket aio
    - LP: #1441284
  * bridge: dont send notification when skb->len == 0 in rtnl_bridge_notify
    - LP: #1441284
  * ipv4: tcp: get rid of ugly unicast_sock
    - LP: #1441284
  * ppp: deflate: never return len larger than output buffer
    - LP: #1441284
  * net: sctp: fix passing wrong parameter header to param_type2af in
    sctp_process_param
    - LP: #1441284
  * ARM: pxa: add regulator_has_full_constraints to corgi board file
    - LP: #1441284
  * ARM: pxa: add regulator_has_full_constraints to poodle board file
    - LP: #1441284
  * ARM: pxa: add regulator_has_full_constraints to spitz board file
    - LP: #1441284
  * hx4700: regulator: declare full constraints
    - LP: #1441284
  * HID: input: fix confusion on conflicting mappings
    - LP: #1441284
  * HID: fixup the conflicting keyboard mappings quirk
    - LP: #1441284
  * megaraid_sas: disable interrupt_mask before enabling hardware
    interrupts
    - LP: #1441284
  * PCI: Generate uppercase hex for modalias var in uevent
    - LP: #1441284
  * usb: core: buffer: smallest buffer should start at ARCH_DMA_MINALIGN
    - LP: #1441284
  * tty/serial: at91: enable peripheral clock before accessing I/O
    registers
    - LP: #1441284
  * tty/serial: at91: fix error handling in atmel_serial_probe()
    - LP: #1441284
  * axonram: Fix bug in direct_access
    - LP: #1441284
  * ksoftirqd: Enable IRQs and call cond_resched() before poking RCU
    - LP: #1441284
  * TPM: Add new TPMs to the tail of the list to prevent inadvertent change
    of dev
    - LP: #1441284
  * char: tpm: Add missing error check for devm_kzalloc
    - LP: #1441284
  * tpm_tis: verify interrupt during init
    - LP: #1441284
  * tpm: Fix NULL return in tpm_ibmvtpm_get_desired_dma
    - LP: #1441284
  * tpm/tpm_i2c_stm_st33: Fix potential bug in tpm_stm_i2c_send
    - LP: #1441284
  * tpm/tpm_i2c_stm_st33: Add status check when reading data on the FIFO
    - LP: #1441284
  * mmc: sdhci-pxav3: fix unbalanced clock issues during probe
    - LP: #1441284
  * iwlwifi: mvm: validate tid and sta_id in ba_notif
    - LP: #1441284
  * power: bq24190: Fix ignored supplicants
    - LP: #1441284
  * ARM: DRA7: hwmod: Fix boot crash with DEBUG_LL enabled on UART3
    - LP: #1441284
  * Bluetooth: ath3k: Add support of AR3012 bluetooth 13d3:3423 device
    - LP: #1411193, #1441284
  * cfq-iosched: fix incorrect filing of rt async cfqq
    - LP: #1441284
  * smack: fix possible use after frees in task_security() callers
    - LP: #1441284
  * xfs: ensure buffer types are set correctly
    - LP: #1441284
  * xfs: inode unlink does not set AGI buffer type
    - LP: #1441284
  * xfs: set buf types when converting extent formats
    - LP: #1441284
  * xfs: set superblock buffer type correctly
    - LP: #1441284
  * btrfs: set proper message level for skinny metadata
    - LP: #1441284
  * KVM: s390: base hrtimer on a monotonic clock
    - LP: #1441284
  * PCI: Fix infinite loop with ROM image of size 0
    - LP: #1441284
  * USB: cp210x: add ID for RUGGEDCOM USB Serial Console
    - LP: #1441284
  * clk: zynq: Force CPU_2X clock to be ungated
    - LP: #1441284
  * mmc: sdhci-pxav3: Remove checks for mandatory host clock
    - LP: #1441284
  * mmc: sdhci-pxav3: fix race between runtime pm and irq
    - LP: #1441284
  * power_supply: 88pm860x: Fix leaked power supply on probe fail
    - LP: #1441284
  * staging: comedi: comedi_compat32.c: fix COMEDI_CMD copy back
    - LP: #1441284
  * mmc: sdhci-pxav3: fix setting of pdata->clk_delay_cycles
    - LP: #1441284
  * ARM: 8284/1: sa1100: clear RCSR_SMR on resume
    - LP: #1441284
  * usb: musb: omap2plus bus glue needs USB host support
    - LP: #1441284
  * USB: add flag for HCDs that can't receive wakeup requests (isp1760-hcd)
    - LP: #1441284
  * USB: fix use-after-free bug in usb_hcd_unlink_urb()
    - LP: #1441284
  * iwlwifi: mvm: always use mac color zero
    - LP: #1441284
  * iwlwifi: pcie: disable the SCD_BASE_ADDR when we resume from WoWLAN
    - LP: #1441284
  * vt: provide notifications on selection changes
    - LP: #1441284
  * tty: Prevent untrappable signals from malicious program
    - LP: #1441284
  * cpufreq: Set cpufreq_cpu_data to NULL before putting kobject
    - LP: #1441284
  * lmedm04: Fix usb_submit_urb BOGUS urb xfer, pipe 1 != type 3 in
    interrupt urb
    - LP: #1441284
  * mei: mask interrupt set bit on clean reset bit
    - LP: #1441284
  * mei: me: release hw from reset only during the reset flow
    - LP: #1441284
  * MIPS: KVM: Deliver guest interrupts after local_irq_disable()
    - LP: #1441284
  * KVM: MIPS: Don't leak FPU/DSP to guest
    - LP: #1441284
  * ALSA: hda - Add the pin fixup for HP Envy TS bass speaker
    - LP: #1441284
  * ALSA: hda - Set up GPIO for Toshiba Satellite S50D
    - LP: #1441284
  * xen/manage: Fix USB interaction issues when resuming
    - LP: #1441284
  * drm/i915: Correct the IOSF Dev_FN field for IOSF transfers
    - LP: #1441284
  * cfq-iosched: handle failure of cfq group allocation
    - LP: #1441284
  * tracing: Fix unmapping loop in tracing_mark_write
    - LP: #1441284
  * fsnotify: fix handling of renames in audit
    - LP: #1441284
  * drm/radeon: workaround for CP HW bug on CIK
    - LP: #1441284
  * drm/radeon: only enable kv/kb dpm interrupts once v3
    - LP: #1441284
  * NFSv4.1: Fix a kfree() of uninitialised pointers in
    decode_cb_sequence_args
    - LP: #1441284
  * cpufreq: speedstep-smi: enable interrupts when waiting
    - LP: #1441284
  * mm/hugetlb: pmd_huge() returns true for non-present hugepage
    - LP: #1441284
  * mm: cleanup follow_page_mask()
    - LP: #1441284
  * mm/hugetlb: take page table lock in follow_huge_pmd()
    - LP: #1441284
  * mm/hugetlb: fix getting refcount 0 page in hugetlb_fault()
    - LP: #1441284
  * mm/hugetlb: add migration/hwpoisoned entry check in
    hugetlb_change_protection
    - LP: #1441284
  * mm/hugetlb: add migration entry check in __unmap_hugepage_range
    - LP: #1441284
  * mm: softdirty: unmapped addresses between VMAs are clean
    - LP: #1441284
  * proc/pagemap: walk page tables under pte lock
    - LP: #1441284
  * mm: when stealing freepages, also take pages created by splitting buddy
    page
    - LP: #1441284
  * mm/mmap.c: fix arithmetic overflow in __vm_enough_memory()
    - LP: #1441284
  * mm/nommu.c: fix arithmetic overflow in __vm_enough_memory()
    - LP: #1441284
  * iscsi-target: Drop problematic active_ts_list usage
    - LP: #1441284
  * target: Fix PR_APTPL_BUF_LEN buffer size limitation
    - LP: #1441284
  * mm/compaction: fix wrong order check in compact_finished()
    - LP: #1441284
  * mm/memory.c: actually remap enough memory
    - LP: #1441284
  * mm: hwpoison: drop lru_add_drain_all() in __soft_offline_page()
    - LP: #1441284
  * ARC: fix page address calculation if PAGE_OFFSET != LINUX_LINK_BASE
    - LP: #1441284
  * drm/radeon/dp: Set EDP_CONFIGURATION_SET for bridge chips if necessary
    - LP: #1441284
  * drm/radeon: fix voltage setup on hawaii
    - LP: #1441284
  * ALSA: hdspm - Constrain periods to 2 on older cards
    - LP: #1441284
  * jffs2: fix handling of corrupted summary length
    - LP: #1441284
  * dm mirror: do not degrade the mirror on discard error
    - LP: #1441284
  * dm io: reject unsupported DISCARD requests with EOPNOTSUPP
    - LP: #1441284
  * target: Add missing WRITE_SAME end-of-device sanity check
    - LP: #1441284
  * target: Check for LBA + sectors wrap-around in sbc_parse_cdb
    - LP: #1441284
  * Btrfs: fix fsync data loss after adding hard link to inode
    - LP: #1441284
  * Added Little Endian support to vtpm module
    - LP: #1441284
  * sg: fix read() error reporting
    - LP: #1441284
  * IB/qib: Do not write EEPROM
    - LP: #1441284
  * md/raid5: Fix livelock when array is both resyncing and degraded.
    - LP: #1441284
  * dm: fix a race condition in dm_get_md
    - LP: #1441284
  * dm snapshot: fix a possible invalid memory access on unload
    - LP: #1441284
  * cpufreq: s3c: remove incorrect __init annotations
    - LP: #1441284
  * libceph: assert both regular and lingering lists in __remove_osd()
    - LP: #1441284
  * libceph: change from BUG to WARN for __remove_osd() asserts
    - LP: #1441284
  * libceph: fix double __remove_osd() problem
    - LP: #1441284
  * MIPS: Export FP functions used by lose_fpu(1) for KVM
    - LP: #1441284
  * kdb: fix incorrect counts in KDB summary command output
    - LP: #1441284
  * blk-throttle: check stats_cpu before reading it from sysfs
    - LP: #1441284
  * procfs: fix race between symlink removals and traversals
    - LP: #1441284
  * autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for
    allocation
    - LP: #1441284
  * pktgen: fix UDP checksum computation
    - LP: #1441284
  * ipv6: fix ipv6_cow_metrics for non DST_HOST case
    - LP: #1441284
  * clk-gate: fix bit # check in clk_register_gate()
    - LP: #1441284
  * ALSA: off by one bug in snd_riptide_joystick_probe()
    - LP: #1441284
  * ath5k: fix spontaneus AR5312 freezes
    - LP: #1441284
  * pinctrl: pinctrl-imx: don't use invalid value of conf_reg
    - LP: #1441284
  * ALSA: hda - Add one more node in the EAPD supporting candidate list
    - LP: #1436745, #1441284
  * ALSA: hda - Add pin configs for ASUS mobo with IDT 92HD73XX codec
    - LP: #1441284
  * drm/i915/bdw: PCI IDs ending in 0xb are ULT.
    - LP: #1441284
  * xfs: Fix quota type in quota structures when reusing quota file
    - LP: #1441284
  * gpiolib: of: allow of_gpiochip_find_and_xlate to find more than one
    chip per node
    - LP: #1441284
  * gpio: tps65912: fix wrong container_of arguments
    - LP: #1441284
  * ALSA: pcm: Don't leave PREPARED state after draining
    - LP: #1441284
  * metag: Fix KSTK_EIP() and KSTK_ESP() macros
    - LP: #1441284
  * md/raid1: fix read balance when a drive is write-mostly.
    - LP: #1441284
  * drm/radeon: use drm_mode_vrefresh() rather than mode->vrefresh
    - LP: #1441284
  * drm/radeon: fix 1 RB harvest config setup for TN/RL
    - LP: #1441284
  * arm64: compat Fix siginfo_t -> compat_siginfo_t conversion on big
    endian
    - LP: #1441284
  * nilfs2: fix potential memory overrun on inode
    - LP: #1441284
  * HID: i2c-hid: Limit reads to wMaxInputLength bytes for input events
    - LP: #1441284
  * Linux 3.13.11-ckt18
    - LP: #1441284
  * ipv6: Don't reduce hop limit for an interface
    - LP: #1441103
    - CVE-2015-2922
  * x86/microcode/intel: Guard against stack overflow in the loader
    - LP: #1438504
    - CVE-2015-2666
 -- Luis Henriques <luis.henriques@canonical.com>   Tue, 14 Apr 2015 21:38:57 +0100

Changed in linux (Ubuntu Trusty):
status:	Fix Committed → Fix Released

Ubuntu
linux package

Trusty soft lockup issues with nested KVM

Bug Description

CVE References

Other bug subscribers

Bug attachments

Remote bug watches

Affects		Status	Importance	Assigned to	Milestone
	linux (Ubuntu)	Fix Released	Undecided	Unassigned
	Trusty	Fix Released	High	Chris J Arges

Ubuntulinux package

Trusty soft lockup issues with nested KVM

Bug Description

CVE References

Other bug subscribers

Bug attachments

Remote bug watches

Ubuntu
linux package