Couldn't emulate instruction 0x7813427c

Bug #1634129 reported by bugproxy on 2016-10-17
This bug affects 1 person
Affects         Status  Importance  Assigned to  Milestone
linux (Ubuntu)          Undecided   Unassigned
Xenial                  Undecided   Tim Gardner
Zesty                   Undecided   Unassigned

Bug Description

Couldn't emulate instruction 0x7813427c
-------------------------------------------------------

Nested VMs cannot boot in Xenial or Yakkety with kvm accel.
This worked up to Vivid (in spite of not being possible on x86).
TCG mode works fine, but is very slow.

TCG full emulation is the mode used in the x86 world for nested virt. On Power, however, we have been using kvm accel (native virtualization) in OpenStack CI to speed up second-level VMs. It worked until Vivid.

Is it the case that kvm accel is no longer possible for nested virt (in line with x86 KVM)? Is TCG full emulation the only possible mode in newer kernels?

qemu-system-ppc64le -machine pseries,accel=kvm,usb=off -m 1G -enable-kvm -cpu POWER8E -display none -nographic cirros-d161007-ppc64le-disk.img
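
For comparison, the two modes discussed above differ mainly in the `accel=` machine property. The following is a minimal sketch that builds both command lines. The TCG variant is an assumption (the report shows only the KVM invocation), and `build_qemu_args` is a hypothetical helper, not part of QEMU.

```python
def build_qemu_args(image, accel="kvm"):
    """Build a qemu-system-ppc64le argv list mirroring the command in this report."""
    if accel not in ("kvm", "tcg"):
        raise ValueError("accel must be 'kvm' or 'tcg'")
    args = [
        "qemu-system-ppc64le",
        "-machine", f"pseries,accel={accel},usb=off",
        "-m", "1G",
    ]
    if accel == "kvm":
        # -enable-kvm is redundant with accel=kvm; kept only to match the report.
        args.append("-enable-kvm")
    args += ["-cpu", "POWER8E", "-display", "none", "-nographic", image]
    return args


# Assumed TCG fallback command line (not shown in the original report).
print(" ".join(build_qemu_args("cirros-d161007-ppc64le-disk.img", accel="tcg")))
```

With `accel="kvm"` the helper reproduces the command above; with `accel="tcg"` it drops `-enable-kvm`, which would otherwise conflict with the TCG request.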

lsmod |grep kvm
kvm_pr 96452 1
kvm 152984 4 kvm_pr
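
The lsmod output shows that the guest is using the kvm_pr module, which matches the kvmppc_handle_exit_pr messages in the logs. A small sketch for checking this from captured lsmod output follows. `kvm_flavour` is a hypothetical helper name, and the sample text is copied from the output above.

```python
def kvm_flavour(lsmod_text):
    """Return 'hv', 'pr', or None depending on which KVM backend module is listed."""
    # The first whitespace-separated field of each lsmod line is the module name.
    modules = {line.split()[0] for line in lsmod_text.splitlines() if line.strip()}
    if "kvm_hv" in modules:
        return "hv"
    if "kvm_pr" in modules:
        return "pr"
    return None


sample = """kvm_pr 96452 1
kvm 152984 4 kvm_pr"""
print(kvm_flavour(sample))  # prints "pr"
```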

Nested VM console:

OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 4.4.0-28-generic (buildd@bos01-ppc64el-018) (gcc version 5.3.1 20160413 (Ubuntu/IBM 5.3.1-14ubuntu2.1) ) #47-Ubuntu SMP Fri Jun 24 10:09:20 UTC 2016 (Ubuntu 4.4.0-28.47-generic 4.4.13)
Detected machine type: 0000000000000101
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
command line: BOOT_IMAGE=/boot/vmlinux-4.4.0-28-generic LABEL=cirros-rootfs ro
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000004210000
  alloc_top : 0000000010000000
  alloc_top_hi : 0000000040000000
  rmo_top : 0000000010000000
  ram_top : 0000000040000000
found display : /pci@800000020000000/vga@0, opening... done
instantiating rtas at 0x000000000daf0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000004220000 -> 0x0000000004220aa9
Device tree struct 0x0000000004230000 -> 0x0000000004240000
Quiescing Open Firmware ...
Booting Linux via __start() ?

<The nested VM hangs here>

/var/log/syslog & /var/log/kern.log

Oct 13 14:07:38 patricia-ub16-10 kernel: [64072.186975] kvmppc_handle_exit_pr: emulation at 700 failed (7813427c)
Oct 13 14:07:38 patricia-ub16-10 kernel: [64072.187023] Couldn't emulate instruction 0x7813427c (op 30 xop 318)
Oct 13 14:07:38 patricia-ub16-10 kernel: [64072.187066] kvmppc_handle_exit_pr: emulation at 700 failed (7813427c)
Oct 13 14:07:38 patricia-ub16-10 kernel: [64072.187113] Couldn't emulate instruction 0x7813427c (op 30 xop 318)
Oct 13 14:07:38 patricia-ub16-10 kernel: [64072.187156] kvmppc_handle_exit_pr: emulation at 700 failed (7813427c)

<syslog and kern.log fill up with this error indefinitely, until the disk is full>
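
The "op 30 xop 318" values in the log can be reproduced from the instruction word itself. The sketch below assumes the usual PowerPC field layout (primary opcode in the top 6 bits, a 10-bit extended opcode just above the lowest bit). The function names and the regular expression are illustrative only and are not taken from the kernel source.

```python
import re

# Matches the "Couldn't emulate instruction" lines shown in the log above.
LINE_RE = re.compile(
    r"Couldn't emulate instruction 0x([0-9a-f]{8}) \(op (\d+) xop (\d+)\)"
)


def decode(inst):
    """Split a 32-bit instruction word into (primary opcode, extended opcode)."""
    op = inst >> 26            # top 6 bits
    xop = (inst >> 1) & 0x3FF  # 10 bits above the lowest bit
    return op, xop


def failing_instructions(log_text):
    """Map each distinct failing instruction word to the (op, xop) pair the log reports."""
    found = {}
    for m in LINE_RE.finditer(log_text):
        found[int(m.group(1), 16)] = (int(m.group(2)), int(m.group(3)))
    return found


log = ("Oct 13 14:07:38 patricia-ub16-10 kernel: [64072.187023] "
       "Couldn't emulate instruction 0x7813427c (op 30 xop 318)")
for inst, logged in failing_instructions(log).items():
    print(hex(inst), decode(inst), decode(inst) == logged)
```

Because the log repeats the same line many times, collecting the distinct instruction words first keeps the output short even for a very large kern.log.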

Host:
cpu : POWER8E (raw), altivec supported
clock : 3690.000000MHz
revision : 2.1 (pvr 004b 0201)

timebase : 512000000
platform : PowerNV
model : 8247-22L
machine : PowerNV 8247-22L
firmware : OPAL v3

Guest: Xenial or Yakkety
Description: Ubuntu 16.10
Release: 16.10
Codename: yakkety

Nested VM:
CirrOS
http://download.cirros-cloud.net/daily/20161007/cirros-d161007-ppc64le-disk.img

This seems to be related
https://patchwork.kernel.org/patch/9121881/

bugproxy (bugproxy) on 2016-10-17
tags: added: architecture-ppc64le bugnameltc-147569 severity-critical targetmilestone-inin16041
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
affects: ubuntu → linux (Ubuntu)

------- Comment From <email address hidden> 2016-11-03 23:42 EDT-------
I think we probably need commit fa73c3b25bd8 ("KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register", 2016-09-21). It has been back-ported to the 4.4.x stable series in 4.4.25 as commit 418fdccd410e.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-11-06 19:55 EDT-------
To test, I built kernel packages based on 4.4.0-45.66 with the two patches mentioned in comments 6 and 11, and with that running in a HV KVM guest I was able to boot a nested guest running the cirros image.

Tim Gardner (timg-tpi) wrote :

<email address hidden> - what are the 2 commits ? I can only see commit fa73c3b25bd8 ("KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register", 2016-09-21) mentioned.

Breno Leitão (breno-leitao) wrote :

Tim,

I think that the other commit is:

 https://patchwork.kernel.org/patch/9121881/

It seems to be commit-id 708e75a3ee750dce1072134e630d66c4e6eaf63c

Tim Gardner (timg-tpi) wrote :

commit fa73c3b25bd8 ("KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register") was part of v4.4.25 stable, which was released in Ubuntu-4.4.0-48.69.

I've submitted commit 708e75a3ee750dce1072134e630d66c4e6eaf63c ('KVM: PPC: Book3S PR: Fix illegal opcode emulation') for review:

https://lists.ubuntu.com/archives/kernel-team/2017-January/081774.html

Changed in linux (Ubuntu Xenial):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux (Ubuntu Zesty):
status: New → Fix Released
assignee: Taco Screen team (taco-screen-team) → nobody
Luis Henriques (henrix) on 2017-01-10
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
John Donnelly (jpdonnelly) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Rafael Folco (rafaelfolco) wrote :

Seems to be FIXED, tested linux-image-4.4.0-62-generic/xenial-proposed.

I could not reproduce the hang with version 4.4.0-62.83 from xenial-proposed, so I assume the issue has been fixed. I confirm that I am able to boot nested VMs with the above kernel from the 4.4 series.

root@bug-147569:~# uname -a
Linux bug-147569 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:09:19 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

--
Installed with:
 apt-get install linux-image-4.4.0-62-generic/xenial-proposed
linux-image-4.4.0-62-generic is already the newest version (4.4.0-62.83).
Selected version '4.4.0-62.83' (Ubuntu:16.04/xenial-proposed [ppc64el]) for 'linux-image-4.4.0-62-generic'
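
A quick way to check whether a Xenial 4.4.0 kernel is new enough is to compare its ABI number against 60, since the changelog for this bug lists "KVM: PPC: Book3S PR: Fix illegal opcode emulation" under 4.4.0-60.81. This is a rough sketch only: `has_fix` is a hypothetical helper, and it assumes a `uname -r` string from the Xenial 4.4.0 series and nothing else.

```python
import re

# First Xenial ABI whose changelog lists the fix for LP: #1634129 (4.4.0-60.81).
FIXED_ABI = 60


def has_fix(release):
    """Return True if a Xenial 4.4.0-N release string has N >= FIXED_ABI."""
    m = re.match(r"4\.4\.0-(\d+)", release)
    if m is None:
        # Other kernel series are out of scope for this simple check.
        raise ValueError(f"not a Xenial 4.4.0 kernel release: {release!r}")
    return int(m.group(1)) >= FIXED_ABI


print(has_fix("4.4.0-62-generic"))  # kernel verified above
print(has_fix("4.4.0-28-generic"))  # kernel from the nested VM console log
```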

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-62.83

---------------
linux (4.4.0-62.83) xenial; urgency=low

  [ Thadeu Lima de Souza Cascardo ]

  * Release Tracking Bug
    - LP: #1657430

  * Backport DP MST fixes to i915 (LP: #1657353)
    - SAUCE: i915_bpo: Fix DP link rate math
    - SAUCE: i915_bpo: Validate mode against max. link data rate for DP MST

  * Ubuntu xenial - 4.4.0-59-generic i3 I/O performance issue (LP: #1657281)
    - blk-mq: really fix plug list flushing for nomerge queues

linux (4.4.0-61.82) xenial; urgency=low

  [ Thadeu Lima de Souza Cascardo ]

  * Release Tracking Bug
    - LP: #1656810

  * Xen MSI setup code incorrectly re-uses cached pirq (LP: #1656381)
    - SAUCE: xen: do not re-use pirq number cached in pci device msi msg data

  * nvme drive probe failure (LP: #1626894)
    - nvme: revert NVMe: only setup MSIX once

linux (4.4.0-60.81) xenial; urgency=low

  [ John Donnelly ]

  * Release Tracking Bug
    - LP: #1656084

  * Couldn't emulate instruction 0x7813427c (LP: #1634129)
    - KVM: PPC: Book3S PR: Fix illegal opcode emulation

  * perf: 24x7: Eliminate domain name suffix in event names (LP: #1560482)
    - powerpc/perf/hv-24x7: Fix usage with chip events.
    - powerpc/perf/hv-24x7: Display change in counter values
    - powerpc/perf/hv-24x7: Display domain indices in sysfs
    - powerpc/perf/24x7: Eliminate domain suffix in event names

  * i386 ftrace tests hang on ADT testing (LP: #1655040)
    - ftrace/x86_32: Set ftrace_stub to weak to prevent gcc from using short jumps
      to it

  * VMX module autoloading if available (LP: #1651322)
    - powerpc: Add module autoloading based on CPU features
    - crypto: vmx - Convert to CPU feature based module autoloading

  * ACPI probe support for AD5592/3 configurable multi-channel converter
    (LP: #1654497)
    - SAUCE: iio: dac: ad5592r: Add ACPI support
    - SAUCE: iio: dac: ad5593r: Add ACPI support

  * Xenial update to v4.4.40 stable release (LP: #1654602)
    - btrfs: limit async_work allocation and worker func duration
    - Btrfs: fix tree search logic when replaying directory entry deletes
    - btrfs: store and load values of stripes_min/stripes_max in balance status
      item
    - Btrfs: fix qgroup rescan worker initialization
    - USB: serial: option: add support for Telit LE922A PIDs 0x1040, 0x1041
    - USB: serial: option: add dlink dwm-158
    - USB: serial: kl5kusb105: fix open error path
    - USB: cdc-acm: add device id for GW Instek AFG-125
    - usb: hub: Fix auto-remount of safely removed or ejected USB-3 devices
    - usb: gadget: f_uac2: fix error handling at afunc_bind
    - usb: gadget: composite: correctly initialize ep->maxpacket
    - USB: UHCI: report non-PME wakeup signalling for Intel hardware
    - ALSA: usb-audio: Add QuickCam Communicate Deluxe/S7500 to
      volume_control_quirks
    - ALSA: hiface: Fix M2Tech hiFace driver sampling rate change
    - ALSA: hda/ca0132 - Add quirk for Alienware 15 R2 2016
    - ALSA: hda - ignore the assoc and seq when comparing pin configurations
    - ALSA: hda - fix headset-mic problem on a Dell laptop
    - ALSA: hda - Gate the mic jack on HP Z1 Gen3 AiO
    - ALSA: hd...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released