Ubuntu
linux package

[Xenial] KVM trusty guest 3.13.0-68 raid6-pq panic in raid6_avx21_gen_syndrome() while probing grub devices [was: Xenial KVM: updating Trusty guest from 3.13.0-68 to 3.13.0-71 causes kernel exception]

Bug #1524069 reported by Mike Pontillo on 2015-12-08

This bug affects 10 people

Affects		Status	Importance	Assigned to	Milestone
	linux (Ubuntu)	Confirmed	High	Unassigned

Bug Description

The symptom I saw was this (note the segfault, and apt-get upgrade hangs after this):

Setting up linux-image-3.13.0-71-generic (3.13.0-71.114) ...
Running depmod.
update-initramfs: deferring update (hook will be called later)
Examining /etc/kernel/postinst.d.
run-parts: executing /etc/kernel/postinst.d/apt-auto-removal 3.13.0-71-generic /boot/vmlinuz-3.13.0-71-generic
run-parts: executing /etc/kernel/postinst.d/initramfs-tools 3.13.0-71-generic /boot/vmlinuz-3.13.0-71-generic
update-initramfs: Generating /boot/initrd.img-3.13.0-71-generic
run-parts: executing /etc/kernel/postinst.d/pm-utils 3.13.0-71-generic /boot/vmlinuz-3.13.0-71-generic
run-parts: executing /etc/kernel/postinst.d/zz-update-grub 3.13.0-71-generic /boot/vmlinuz-3.13.0-71-generic
Generating grub configuration file ...
Warning: Setting GRUB_TIMEOUT to a non-zero value when GRUB_HIDDEN_TIMEOUT is set is no longer supported.
Found linux image: /boot/vmlinuz-3.13.0-71-generic
Found initrd image: /boot/initrd.img-3.13.0-71-generic
Found linux image: /boot/vmlinuz-3.13.0-68-generic
Found initrd image: /boot/initrd.img-3.13.0-68-generic
Segmentation fault
done
Setting up linux-firmware (1.127.19) ...
Setting up linux-image-extra-3.13.0-71-generic (3.13.0-71.114) ...
run-parts: executing /etc/kernel/postinst.d/apt-auto-removal 3.13.0-71-generic /boot/vmlinuz-3.13.0-71-generic
run-parts: executing /etc/kernel/postinst.d/initramfs-tools 3.13.0-71-generic /boot/vmlinuz-3.13.0-71-generic
update-initramfs: Generating /boot/initrd.img-3.13.0-71-generic
run-parts: executing /etc/kernel/postinst.d/pm-utils 3.13.0-71-generic /boot/vmlinuz-3.13.0-71-generic
run-parts: executing /etc/kernel/postinst.d/zz-update-grub 3.13.0-71-generic /boot/vmlinuz-3.13.0-71-generic
Generating grub configuration file ...
Warning: Setting GRUB_TIMEOUT to a non-zero value when GRUB_HIDDEN_TIMEOUT is set is no longer supported.
Found linux image: /boot/vmlinuz-3.13.0-71-generic
Found initrd image: /boot/initrd.img-3.13.0-71-generic
Found linux image: /boot/vmlinuz-3.13.0-68-generic
Found initrd image: /boot/initrd.img-3.13.0-68-generic

In dmesg, I saw a corresponding kernel stack trace:

[ 522.649091] SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
[ 522.654031] JFS: nTxBlock = 8192, nTxLock = 65536
[ 522.660515] NTFS driver 2.1.30 [Flags: R/O MODULE].
[ 522.672519] QNX4 filesystem 0.2.3 registered.
[ 522.677257] xor: measuring software checksum speed
[ 522.715613] prefetch64-sse: 17306.000 MB/sec
[ 522.755589] generic_sse: 16039.000 MB/sec
[ 522.755590] xor: using function: prefetch64-sse (17306.000 MB/sec)
[ 522.823619] raid6: sse2x1 10481 MB/s
[ 522.891614] raid6: sse2x2 13303 MB/s
[ 522.959616] raid6: sse2x4 15209 MB/s
[ 522.963634] invalid opcode: 0000 [#1] SMP
[ 522.963645] Modules linked in: raid6_pq(+) xor ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables snd_hda_intel snd_hda_codec snd_hwdep qxl snd_pcm kvm_intel ttm snd_page_alloc kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drm_kms_helper snd_timer aesni_intel snd aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd drm soundcore lp parport serio_raw i2c_piix4 mac_hid pata_acpi floppy psmouse
[ 522.963746] CPU: 2 PID: 11288 Comm: modprobe Not tainted 3.13.0-68-generic #111-Ubuntu
[ 522.963751] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 522.963755] task: ffff880059363000 ti: ffff88005dec6000 task.ti: ffff88005dec6000
[ 522.963759] RIP: 0010:[<ffffffffa049dd0a>] [<ffffffffa049dd0a>] raid6_avx21_gen_syndrome+0x4a/0x160 [raid6_pq]
[ 522.963767] RSP: 0018:ffff88005dec7c40 EFLAGS: 00010246
[ 522.963771] RAX: 0000000000000000 RBX: ffff88005dec7c88 RCX: ffff880059363000
[ 522.963774] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 0000000000000012
[ 522.963778] RBP: ffff88005dec7c70 R08: 0000000000000086 R09: 000000000000025f
[ 522.963781] R10: 0000000000000000 R11: ffff88005dec79ae R12: 0000000000001000
[ 522.963785] R13: ffff880043a42000 R14: ffff880043a43000 R15: 0000000000000012
[ 522.963789] FS: 00007fa330823740(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[ 522.963793] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 522.963796] CR2: 00007fa330623000 CR3: 0000000036acf000 CR4: 00000000001006e0
[ 522.963801] Stack:
[ 522.963803] 0000000000000080 ffffffffa049e238 ffffffffa04b0720 ffff880043a42000
[ 522.963810] 0000000000003cd7 000000010000d992 ffff88005dec7d40 ffffffffa00d20fb
[ 522.963817] 0000000000000000 ffffffffa04a0600 ffffffffa04a1600 ffffffffa04a2600
[ 522.963824] Call Trace:
[ 522.963838] [<ffffffffa00d20fb>] init_module+0xfb/0x1000 [raid6_pq]
[ 522.963843] [<ffffffffa00d2000>] ? 0xffffffffa00d1fff
[ 522.963849] [<ffffffff8100214a>] do_one_initcall+0xfa/0x1b0
[ 522.963854] [<ffffffff81059903>] ? set_memory_nx+0x43/0x50
[ 522.963859] [<ffffffff810e29bd>] load_module+0x12dd/0x1b40
[ 522.963863] [<ffffffff810de440>] ? store_uevent+0x40/0x40
[ 522.963868] [<ffffffff810e3396>] SyS_finit_module+0x86/0xb0
[ 522.963873] [<ffffffff81734cdd>] system_call_fastpath+0x1a/0x1f
[ 522.963876] Code: 00 00 00 00 53 48 89 d3 48 83 ec 08 48 89 75 d0 4c 8b 2c c2 4c 8b 74 32 08 e8 13 f9 b7 e0 84 c0 0f 84 f1 00 00 00 e8 c6 f9 b7 e0 <c5> fd 6f 05 ee 2a 01 00 c5 e5 ef db 4d 85 e4 0f 84 c0 00 00 00
[ 522.963940] RIP [<ffffffffa049dd0a>] raid6_avx21_gen_syndrome+0x4a/0x160 [raid6_pq]
[ 522.963946] RSP <ffff88005dec7c40>
[ 522.963949] ---[ end trace 7324d498bc862f81 ]---

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-68-generic 3.13.0-68.111
ProcVersionSignature: Ubuntu 3.13.0-68.111-generic 3.13.11-ckt27
Uname: Linux 3.13.0-68-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version k3.13.0-68-generic.
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3.19
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
Card0.Amixer.info: Error: [Errno 2] No such file or directory: 'amixer'
Card0.Amixer.values: Error: [Errno 2] No such file or directory: 'amixer'
Date: Tue Dec 8 13:08:13 2015
HibernationDevice: RESUME=UUID=69fc2e53-278d-40ca-8109-64f826a073e7
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
ProcEnviron:
TERM=xterm-256color
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 qxldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-68-generic root=UUID=897d5dd8-599e-42f9-a081-96f73cef59d9 ro splash quiet vt.handoff=7
RelatedPackageVersions:
linux-restricted-modules-3.13.0-68-generic N/A
linux-backports-modules-3.13.0-68-generic N/A
linux-firmware 1.127.19
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: Ubuntu-1.8.2-1ubuntu1
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-i440fx-wily
dmi.modalias: dmi:bvnSeaBIOS:bvrUbuntu-1.8.2-1ubuntu1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-wily:cvnQEMU:ct1:cvrpc-i440fx-wily:
dmi.product.name: Standard PC (i440FX + PIIX, 1996)
dmi.product.version: pc-i440fx-wily
dmi.sys.vendor: QEMU

Tags:

Mike Pontillo (mpontillo) wrote on 2015-12-08:

AlsaDevices.txt (375 bytes, text/plain; charset="utf-8")
BootDmesg.txt (35.8 KiB, text/plain; charset="utf-8")
Card0.Codecs.codec.0.txt (2.1 KiB, text/plain; charset="utf-8")
CurrentDmesg.txt (4.6 KiB, text/plain; charset="utf-8")
Dependencies.txt (3.3 KiB, text/plain; charset="utf-8")
Lspci.txt (7.4 KiB, text/plain; charset="utf-8")
PciMultimedia.txt (600 bytes, text/plain; charset="utf-8")
ProcCpuinfo.txt (3.3 KiB, text/plain; charset="utf-8")
ProcInterrupts.txt (2.5 KiB, text/plain; charset="utf-8")
ProcModules.txt (3.4 KiB, text/plain; charset="utf-8")
UdevDb.txt (72.0 KiB, text/plain; charset="utf-8")
UdevLog.txt (175.1 KiB, text/plain; charset="utf-8")
WifiSyslog.txt (127.8 KiB, text/plain; charset="utf-8")

summary:

- Xenial KVM: updating guest from 3.13.0-68 to 3.13.0-71 causes kernel
- exception
+ Xenial KVM: updating Trusty guest from 3.13.0-68 to 3.13.0-71 causes
+ kernel exception

Chris J Arges (arges) on 2015-12-08

tags:

added: kernel-key

Mike Pontillo (mpontillo) wrote on 2015-12-08: Re: Xenial KVM: updating Trusty guest from 3.13.0-68 to 3.13.0-71 causes kernel exception

For what it's worth, unchecking "[ ] Copy host CPU configuration" in virt-manager and selecting "Hypervisor default" for the CPU is a workaround. (/proc/cpuinfo reports "QEMU Virtual CPU version 2.4.0")

Mike Pontillo (mpontillo) wrote on 2015-12-08:

Also, the hypervisor's /proc/cpuinfo reports the following CPU configuration (8 cores of the same):

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4710MQ CPU @ 2.50GHz
stepping : 3
microcode : 0x1e
cpu MHz : 2500.097
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
bugs :
bogomips : 4988.49
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

And the hypervisor's kernel is version 4.2.0-19-generic.

Brad Figg (brad-figg) wrote on 2015-12-08: Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status:	New → Confirmed

Mike Pontillo (mpontillo) wrote on 2015-12-08: Re: Xenial KVM: updating Trusty guest from 3.13.0-68 to 3.13.0-71 causes kernel exception

I just tried again with two logical CPUs and the core set to "Westmere" (I think that was the default), and didn't see the bug. So this seems related specifically to "[x] Copy host CPU configuration" being checked. Here's the /proc/cpuinfo from the "virtual Westmere" where it works:

http://paste.ubuntu.com/13836572/

Joseph Salisbury (jsalisbury) on 2015-12-08

Changed in linux (Ubuntu):
importance:	Undecided → High

Andy Whitcroft (apw) wrote on 2015-12-09: Re: [trusty] 3.13.0-68 raid6-pq panic in raid6_avx21_gen_syndrome() while probing grub devices (was: Xenial KVM: updating Trusty guest from 3.13.0-68 to 3.13.0-71 causes kernel exception)

So the host exposing some of the physical CPU's characteristics is triggering use of the hardware-specific syndrome generator raid6_avx21_gen_syndrome(), which makes sense.

summary:

- Xenial KVM: updating Trusty guest from 3.13.0-68 to 3.13.0-71 causes
- kernel exception
+ [trusty] 3.13.0-68 raid6-pq panic in raid6_avx21_gen_syndrome() while
+ probing grub devices (was: Xenial KVM: updating Trusty guest from
+ 3.13.0-68 to 3.13.0-71 causes kernel exception)

Andy Whitcroft (apw) wrote on 2015-12-09:

I will note that this has thrown an illegal instruction trap while doing this:

[ 522.963634] invalid opcode: 0000 [#1] SMP

So it is likely that, despite offering up the avx2 flag to the guest, the host is not actually providing the instructions?
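
The "Code:" line of the oops supports this reading. The sketch below is illustrative only and was not posted in the thread: it pulls out the byte the kernel marks with <..> and decodes it as a two-byte VEX prefix. The helper names (faulting_bytes, decode_vex2) are made up for this example. The marked byte 0xc5 is the two-byte VEX prefix, and the payload byte 0xfd selects a 256-bit operand with an implied 0x66 prefix, which is consistent with a ymm-register vector load rather than a corrupted instruction stream.

```python
import re

def faulting_bytes(code_line, count=4):
    """Return `count` bytes starting at the byte the oops marks with <..>."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        m = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if m:
            rest = [m.group(1)] + tokens[i + 1 : i + count]
            return [int(b, 16) for b in rest]
    raise ValueError("no <xx> marker found in Code: line")

def decode_vex2(b0, b1):
    """Decode the payload byte of a two-byte VEX prefix (0xC5 + one byte)."""
    if b0 != 0xC5:
        raise ValueError("not a two-byte VEX prefix")
    return {
        # VEX.L (bit 2): 0 = 128-bit, 1 = 256-bit vector length
        "vector_bits": 256 if b1 & 0x04 else 128,
        # VEX.pp (bits 1:0): implied legacy prefix
        "implied_prefix": {0: None, 1: 0x66, 2: 0xF3, 3: 0xF2}[b1 & 0x03],
    }

# Tail of the Code: line from the trace in the bug description.
CODE = "e8 c6 f9 b7 e0 <c5> fd 6f 05 ee 2a 01 00 c5 e5 ef db"
b = faulting_bytes(CODE)
info = decode_vex2(b[0], b[1])
print([hex(x) for x in b], info)
```

An "invalid opcode" on such an instruction is what one would expect if the CPU (or the virtual CPU) does not have AVX state enabled, even though the encoding itself is valid.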

Andy Whitcroft (apw) on 2015-12-09

summary:

- [trusty] 3.13.0-68 raid6-pq panic in raid6_avx21_gen_syndrome() while
- probing grub devices (was: Xenial KVM: updating Trusty guest from
- 3.13.0-68 to 3.13.0-71 causes kernel exception)
+ [Xenial] KVM trusty guest 3.13.0-68 raid6-pq panic in
+ raid6_avx21_gen_syndrome() while probing grub devices [was: Xenial KVM:
+ updating Trusty guest from 3.13.0-68 to 3.13.0-71 causes kernel
+ exception]

Chris J Arges (arges) wrote on 2015-12-09:

Ok I attempted to reproduce without success. What I tried:
- Xenial on Xenial on machine/guest with avx2
- Trusty 3.13.0-{68,71} on Xenial on machine/guest with avx2

In both of these instances I modprobed raid6_pq. In the trusty instance I also triggered an upgrade from 68 to 71. Neither of these caused a segfault.

The following information would be useful in debugging this:

1) The differences between the guest's /proc/cpuinfo when it reproduces and when it does not. This can verify whether the avx2 bit is to blame.
2) Can you simply run 'sudo modprobe -r raid6_pq && sudo modprobe raid6_pq' to trigger this issue?
3) Can you test with a xenial guest and a similar configuration to see if this still triggers the issue?
4) If you have the ability, testing with different versions of the hypervisor (trusty vs xenial) might also be useful.

Thanks,
--chris j arges
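
For item 1, a small script can do the comparison. This is a minimal sketch, not something posted in the thread: the function names are invented here, and the two sample strings are made-up stand-ins for the real /proc/cpuinfo contents of a failing and a working guest.

```python
def cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()

def diff_flags(failing_text, working_text):
    """Return (flags only in the failing guest, flags only in the working guest)."""
    failing = cpu_flags(failing_text)
    working = cpu_flags(working_text)
    return sorted(failing - working), sorted(working - failing)

# Hypothetical sample data; replace with the saved contents of each guest's
# /proc/cpuinfo.
failing = "model name : example\nflags : fpu sse2 xsave avx avx2\n"
working = "model name : example\nflags : fpu sse2 x2apic\n"

only_failing, only_working = diff_flags(failing, working)
print("only in failing guest:", only_failing)
print("only in working guest:", only_working)
```

If avx2 shows up only on the failing side, that points at the AVX2 code path in raid6_pq.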

Changed in linux (Ubuntu):
assignee:	nobody → Chris J Arges (arges)

Mike Pontillo (mpontillo) wrote on 2015-12-09:

On my Trusty VM with "[x] Copy host CPU configuration" checked in virt-manager, running 'sudo modprobe -r raid6_pq && sudo modprobe raid6_pq' is enough to reproduce the kernel exception in dmesg (though I didn't see it print "Segmentation Fault", so that may be somewhat of a red herring).

Here's /proc/cpuinfo on the guest:
http://paste.ubuntu.com/13868715/

And here's /proc/cpuinfo on the hypervisor host:
http://paste.ubuntu.com/13868744/

I have a separate issue (which I have not filed, because I haven't triaged, and am not sure I can reproduce yet) that was blocking me from creating a Xenial guest, but I can try again shortly.

Launchpad Janitor (janitor) wrote on 2015-12-10:

#10

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in kvm (Ubuntu):
status:	New → Confirmed

Serge Victor (ser) wrote on 2015-12-10:

#11

It affects me too: trying to bootstrap a xenial VM on xenial, I get an OOPS like the one above.

# virt-install --name $SHORT \
        --ram $MEM \
        --vcpus $CPUS \
        --location=http://hk.archive.ubuntu.com/ubuntu/dists/xenial/main/installer-amd64/ \
        --os-type=linux --os-variant=virtio26 \
        --bridge=siec,model=virtio,mac=$MAC \
        --disk path=$DISK \
        --cpuset=0,1,2 --cpu=host \
        --autostart --nographics \
        --extra-args="auto=true priority=critical locale=pl_PL keymap=pl debian-installer/allow_unauthenticated_ssl=true console-setup/ask_detect=false netcfg/hostname=$HOST url=https://domain.com/ubuntu.pre4seed.cfg console=tty0 console=ttyS0,115200n8 serial"

Serge Victor (ser) wrote on 2015-12-16:

#12

To be honest, it's a pretty annoying bug, as I am not able to install any xenial VM :-(

Mike Pontillo (mpontillo) wrote on 2015-12-16:

#13

@Serge, A workaround for me is to use "hypervisor default" in virt-manager. I'm not sure what the equivalent is in virt-install, but maybe using --cpu=host-model-only would be a workaround?

From the man page:

               Expose the host CPUs configuration to the guest. This enables the guest to take advantage of many
               of the host CPUs features (better performance), but may cause issues if migrating the guest to a
               host without an identical CPU.

           --cpu host-model-only
               Expose the nearest host CPU model configuration to the guest. It is the best CPU which can be
               used for a guest on any of the hosts.

Use --cpu=? to see a list of all available sub options. Complete details at
<http://libvirt.org/formatdomain.html#elementsCPU>

Serge Victor (ser) wrote on 2016-01-03:

#14

The bug persists in kernel 4.3.0-2-generic #11-Ubuntu.

@Mike - thanks for a workaround, it works, indeed :-)

--cpu=host-model-only

for virt-install.

William Grant (wgrant) wrote on 2016-01-06:

#15

If you need avx2 support, --cpu Haswell-noTSX,-x2apic works on Haswell desktop/laptop chips.

The most confusing problem is that qemu's definition of "Haswell" is actually Haswell-E, -EP and -EX; Haswell itself lacks x2apic, which qemu's Haswell requires. x2apic dates back to Nehalem, but qemu's CPU definitions only include it back to Sandy Bridge, so a standard desktop or laptop Haswell CPU falls all the way back to Westmere and then adds flags including avx, avx2 and xsave on top of that[0].

Advertising support for AVX and AVX2 is just a matter of setting CPUID.1:ECX.AVX and CPUID.7:EBX.AVX2, but the instructions won't actually work unless XCR0.AVX is set, and kvm_load_guest_xcr0 only sets XCR0 from the guest if the guest's CR4.OSXSAVE is set. The guest's fpu__init_cpu_xstate only sets CR4.OSXSAVE when xfeatures_mask is non-zero, and xfeatures_mask is calculated by fpu__init_system_xstate from CPUID.(EAX=0DH,ECX=0), which doesn't exist on qemu's Westmere (level=0xb), so XCR0.AVX remains unset and AVX2 instructions #UD.

The bug is probably that raid6_have_avx2 only checks that AVX2 is supported in CPUID, not that it's enabled. Checking for X86_FEATURE_OSXSAVE might work, though I'm not sure if the value checked by boot_cpu_has is stored too early for that.
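
The chain above can be condensed into a toy model. This is only a sketch of the reasoning in this comment, not kernel or KVM code: the function name and dictionary keys are invented for illustration, and it assumes that AVX state is enabled in XCR0 whenever the guest manages to set CR4.OSXSAVE.

```python
XSTATE_CPUID_LEAF = 0x0D  # CPUID leaf that enumerates the XSAVE state components

def guest_state(cpuid_level, advertises_xsave, advertises_avx2):
    """Toy model of the guest/KVM interaction described above."""
    # The guest can only compute a non-zero xfeatures mask if leaf 0xD exists.
    has_xstate_leaf = advertises_xsave and cpuid_level >= XSTATE_CPUID_LEAF
    # The guest sets CR4.OSXSAVE only when the xfeatures mask is non-zero.
    cr4_osxsave = has_xstate_leaf
    # KVM loads the guest's XCR0 only if the guest's CR4.OSXSAVE is set
    # (assumption: the guest then enables AVX state in XCR0).
    xcr0_avx = cr4_osxsave
    return {
        # A CPUID-only check, as raid6_have_avx2 is suspected to do.
        "raid6_picks_avx2": advertises_avx2,
        # Executing the instructions additionally needs XCR0.AVX.
        "avx2_executes": advertises_avx2 and xcr0_avx,
    }

# qemu's Westmere (level=0xb) with avx2 and xsave added on top.
westmere_plus = guest_state(0x0B, advertises_xsave=True, advertises_avx2=True)
# A guest that sees cpuid level 13, as reported for the hypervisor CPU above.
full_level = guest_state(13, advertises_xsave=True, advertises_avx2=True)
print(westmere_plus)
print(full_level)
```

In the first case the module selects the AVX2 routine but the instructions cannot execute, which matches the invalid opcode in the trace; in the second case both conditions hold.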

I am also a little suspicious of kvm_load_guest_xcr0's CR4.OSXSAVE guard. The Intel manuals state that XSAVE, XRSTOR, XGETBV and XSETBV require the flag to be set, but KVM won't restore a guest's existing XCR0 unless OSXSAVE is still set.

[0] "-cpu Westmere,+invpcid,+erms,+bmi2,+smep,+avx2,+bmi1,+fsgsbase,+abm,+rdtscp,+pdpe1gb,+rdrand,+f16c,+avx,+osxsave,+xsave,+tsc-deadline,+movbe,+pcid,+pdcm,+xtpr,+fma,+tm2,+est,+vmx,+ds_cpl,+monitor,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme"

William Grant (wgrant) wrote on 2016-01-06:

#16

One could argue that libvirt should exclude x2apic from the host-model checks, as it's emulated by qemu whether or not the host supports it.

Mathew Hodson (mhodson) on 2016-01-17

no longer affects:

kvm (Ubuntu)

Joseph Salisbury (jsalisbury) on 2016-02-02

tags:

added: kernel-da-key
removed: kernel-key

Dr. Jens Harbott (j-harbott) wrote on 2016-02-26:

#17

I am seeing the same issue on some of my OpenStack compute nodes, interestingly those which seem to have a newer CPU than others.

Affected CPU: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
Mapped in guest to: Intel Core i7 9xx (Nehalem Class Core i7)

Unaffected Host CPU: Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
Mapped in guest to: Intel Xeon E312xx (Sandy Bridge)

Booting a Xenial guest on the affected host crashes during boot, see http://paste.ubuntu.com/15204908/. A Wily guest runs fine at first, but generates a similar traceback as soon as I modprobe raid6_pq.

Both hosts are running nova-compute from Wily. Please let me know if you need any further details.

Blake Rouse (blake-rouse) wrote on 2016-05-10:

#18

Just hit this same issue with nova-compute on Xenial and creating a Xenial instance in nova. I worked around the issue with:

juju set-config nova-compute cpu-mode=host-passthrough

Dmitry Sutyagin (dsutyagin) wrote on 2016-05-28:

#19

Hit this issue when booting Ubuntu Trusty VM on an Ubuntu host, worked around via editing VM XML by adding this in <cpu> section:

This + destroy->start the VM and it booted fine.

Chris J Arges (arges) on 2016-05-31

Changed in linux (Ubuntu):
assignee:	Chris J Arges (arges) → nobody

shane (sygibson) wrote on 2016-08-29:

#20

I've hit this bug as well, OpenStack Nova Compute 2:12.0.4-0ubuntu1~cloud1, KVM 1:2.3+dfsg-5ubuntu9.4~cloud1. Hypervisor is running Ubuntu 14.04.4 with CPU flags:

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid

(NOTE "avx2")

CPU model is Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz

Guest running Ubuntu Xenial (16.04.1) - started up with the following arguments (note again "avx2"):

libvirt+ 7221 1 5 14:22 ? 00:01:14 qemu-system-x86_64 -enable-kvm -name instance-00000002 -S -machine pc-i440fx-vivid,accel=kvm,usb=off -cpu Nehalem,+invpcid,+erms,+bmi2,+smep,+avx2,+bmi1,+fsgsbase,+abm,+rdtscp,+pdpe1gb,+rdrand,+f16c,+avx,+osxsave,+xsave,+tsc-deadline,+movbe,+x2apic,+dca,+pcid,+pdcm,+xtpr,+fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme -m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid d1d07811-e82c-4fc0-9df0-31b7992b098b -smbios type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=12.0.4,serial=00000000-0000-0000-0000-0cc47a4e5b5a,uuid=d1d07811-e82c-4fc0-9df0-31b7992b098b,family=Virtual Machine -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-00000002.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/dev/disk/by-path/ip-10.101.104.1:3260-iscsi-iqn.2010-10.org.openstack:volume-7ad6788f-fdb0-4e50-a470-525bf36a8f1e-lun-1,if=none,id=drive-virtio-disk0,format=raw,serial=7ad6788f-fdb0-4e50-a470-525bf36a8f1e,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=28,id=hostnet0,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:d5:97:04,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/home/zerostack/data/openstack/nova/instances/d1d07811-e82c-4fc0-9df0-31b7992b098b/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 -vnc 0.0.0.0:1 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on

Symptom: VM Crashes on boot up with similar stack trace as documented here.
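
To spot this pattern in a running qemu command line, one could parse the -cpu value. The following is a rough sketch under stated assumptions: the function names are hypothetical, and the set of "low-level" base models contains only the two fallback models reported in this thread (Westmere and Nehalem), so it is not a complete list.

```python
def parse_cpu_arg(arg):
    """Split a qemu '-cpu' value into (base_model, added_flags, removed_flags)."""
    base, *mods = arg.split(",")
    added = {m[1:] for m in mods if m.startswith("+")}
    removed = {m[1:] for m in mods if m.startswith("-")}
    return base, added, removed

# Assumption: fallback base models seen in this bug report, not exhaustive.
LOW_LEVEL_BASES = {"Westmere", "Nehalem"}

def looks_affected(arg):
    """True if avx2 is bolted onto one of the older base models listed above."""
    base, added, _removed = parse_cpu_arg(arg)
    return base in LOW_LEVEL_BASES and "avx2" in added

# Shortened form of the -cpu value from the command line above.
nehalem = "Nehalem,+invpcid,+erms,+bmi2,+smep,+avx2,+avx,+osxsave,+xsave"
# The combination suggested in comment #15.
haswell = "Haswell-noTSX,-x2apic"

print(looks_affected(nehalem), looks_affected(haswell))
```

A match only indicates that the guest configuration resembles the failing ones in this report; it does not prove the guest will crash.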

