qemu 1:2.11+dfsg-1ubuntu7.4 hangs when -cpu POWER9 is specified

Bug #1787408 reported by bugproxy on 2018-08-16
This bug affects 2 people
Affects          Importance   Assigned to
qemu (Ubuntu)    Undecided    Unassigned
  Bionic         Undecided    Unassigned
  Cosmic         Undecided    Unassigned

Bug Description

[Impact]

 * QEMU before 3.0 does not handle an instruction variant that guests
   issue when run with -cpu POWER9 (a new form of eieio from the P9
   spec), so the guest hangs during early boot. QEMU has to be adapted
   to accept that variant instead of breaking.

 * backport of upstream fix https://git.qemu.org/?p=qemu.git;a=commitdiff;h=c8fd8373e42821984400382cd91b8bf4e7c14e3b

[Test Case]

 * Run a guest in P9 mode, on guest init it will hang - with the fix it
   will reach a login prompt.

   Feel free to use the provided initrd to boot from.
   Any host Bionic+ kernel seems to do, so we are just reusing the hosts
   vmlinuz.

   $ wget https://openpower.xyz/job/initramfs/job/buildroot-master/lastSuccessfulBuild/artifact/rootfs-le.cpio.xz
   $ qemu-system-ppc64 -nographic -vga none -M pseries,cap-htm=off -cpu POWER9 -m 1G -kernel /boot/vmlinux-$(uname -r) -initrd rootfs-le.cpio.xz
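
   The pass/fail criterion above (hang versus login prompt) can be checked
   mechanically once the guest console output has been saved to a file. A
   minimal sketch, assuming the buildroot image prints a prompt containing
   "login:"; the helper name boot_reached_login and the file name
   console.log are made up for illustration, and how the console gets
   captured is left open:

```shell
#!/bin/sh
# Hypothetical helper, not part of qemu or of this bug report.
# Succeeds if the saved guest console log contains a login prompt.
# Assumption: the prompt contains the literal text "login:".
boot_reached_login() {
    grep -q 'login:' "$1"
}

# Example: report the result for a previously captured console.log.
if [ -f console.log ]; then
    if boot_reached_login console.log; then
        echo "PASS: guest reached the login prompt"
    else
        echo "FAIL: no login prompt, guest probably hung"
    fi
fi
```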

[Regression Potential]

 * The change is limited to the PPC eieio instruction, so if anything
   regresses it can only be that instruction. The new variant is only
   part of the P9 spec, so guests running in older modes are not
   affected either.
   The guests that could be hit by an emulation error in the new code
   are those that did not work at all so far. The change might
   therefore contain a bug, but it should not regress anything that
   works today.
   Only if there were "another" source of eieio instructions with bit 6
   set (I know of none) could the new code regress those cases.
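
   To make "bit 6" concrete: the Power ISA numbers instruction bits from
   the most significant end, so bit 6 of a 32-bit instruction word is the
   mask 0x02000000. The sketch below only illustrates that arithmetic; it
   is not the qemu patch. The function name and output strings are made
   up, and the base encoding 0x7C0006AC (primary opcode 31, extended
   opcode 854) is taken from the ISA rather than from this report:

```shell
#!/bin/sh
# Illustration only: classify a 32-bit PowerPC instruction word.
# IBM bit numbering counts from the most significant bit, so "bit 6"
# of a 32-bit word is 1 << (31 - 6) = 0x02000000.
EIEIO_BASE=$((0x7C0006AC))   # plain eieio: opcode 31, extended opcode 854
BIT6_MASK=$((0x02000000))

# Hypothetical helper: prints "eieio", "eieio-bit6" or "not-eieio".
classify_eieio() {
    insn=$(($1))
    if [ $((insn & ~BIT6_MASK)) -ne "$EIEIO_BASE" ]; then
        echo "not-eieio"
    elif [ $((insn & BIT6_MASK)) -ne 0 ]; then
        echo "eieio-bit6"
    else
        echo "eieio"
    fi
}

classify_eieio 0x7C0006AC   # prints "eieio"
classify_eieio 0x7E0006AC   # prints "eieio-bit6"
```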

[Other Info]

 * n/a

----

== Comment: #0 - Murilo Opsfelder Araujo <email address hidden> - 2018-08-15 15:08:52 ==
---Problem Description---
qemu 1:2.11+dfsg-1ubuntu7.4 hangs when -cpu POWER9 is specified.

Bisecting qemu, I found this patch:

https://git.qemu.org/?p=qemu.git;a=commitdiff;h=c8fd8373e42821984400382cd91b8bf4e7c14e3b

With a small tweak, it applies on qemu 2.11.1 from bionic and fixes the hang.

This was originally reported as a kernel bug at https://github.com/linuxppc/linux/issues/168

Contact Information = Murilo Opsfelder Araujo <email address hidden>

---uname output---
Linux jaspion1 4.15.0-30-generic #32-Ubuntu SMP Thu Jul 26 17:43:11 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = na

---Debugger---
A debugger is not configured

---Steps to Reproduce---
 wget https://openpower.xyz/job/initramfs/job/buildroot-master/lastSuccessfulBuild/artifact/rootfs-le.cpio.xz

qemu-system-ppc64 -nographic -vga none -M pseries,cap-htm=off -cpu POWER9 -m 1G -kernel /boot/vmlinux-$(uname -r) -initrd rootfs-le.cpio.xz

Userspace tool common name: qemu

The userspace tool has the following bit modes: 64-bit

Userspace rpm: qemu

Userspace tool obtained from project website: na

*Additional Instructions for Murilo Opsfelder Araujo <email address hidden>:
-Attach ltrace and strace of userspace application.

== Comment: #1 - Murilo Opsfelder Araujo <email address hidden> - 2018-08-15 15:26:53 ==
I'll provide a debdiff.

bugproxy (bugproxy) on 2018-08-16
tags: added: architecture-ppc64le bugnameltc-170602 severity-medium targetmilestone-inin---

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1787408/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ubuntu:
status: New → Confirmed

------- Comment (attachment only) From <email address hidden> 2018-08-16 12:24 EDT-------

affects: ubuntu → qemu (Ubuntu)

Discussed on IRC, great initial patch.
I'll need to put this into Cosmic before being able to SRU, but other than that there seems nothing blocking it.

Changed in qemu (Ubuntu Bionic):
status: New → Triaged
Changed in qemu (Ubuntu Cosmic):
status: Confirmed → Triaged

The offending patch that changed the DisasContext struct is:
commit b6bac4bc7016531405d117cfc1bf64145799e164
Author: Emilio G. Cota <email address hidden>
Date: Thu Feb 15 14:51:48 2018 -0500

    target/ppc: convert to DisasContextBase

    A couple of notes:

    - removed ctx->nip in favour of base->pc_next. Yes, it is annoying,
      but didn't want to waste its 4 bytes.

So on 2.12 (Cosmic) we can use the patch as-is, but for Bionic we will need the modified backport.
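
Going by the commit notes above, the Bionic-specific change comes down to the name of the program-counter field: 2.12 has ctx->base.pc_next where 2.11 still has ctx->nip. A minimal sketch of that rename applied to a patch file, assuming this field is the only thing that differs. The actual debdiff is not shown in this report, the function and file names are hypothetical, and the result would still need to be checked against the 2.11 tree:

```shell
#!/bin/sh
# Hypothetical helper: rewrite references to the 2.12 field name
# (ctx->base.pc_next) into the 2.11 one (ctx->nip) in text read from stdin.
adapt_for_2_11() {
    sed 's/ctx->base\.pc_next/ctx->nip/g'
}

# Assumed usage, with made-up file names:
#   adapt_for_2_11 < upstream.patch > bionic.patch
```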

I made it work for Cosmic (without the modification we discussed) and did some minor cleanups (LP bug mention in changelog, matching the patch filename in changelog), but this stays a great initial patch submission.

It is built for cosmic in [1], could you please verify that this is doing exactly as required by you and then report back here so I can go on pushing?

P.S.: I'll try to get hold of a PPC system myself to verify, but in case I can't it would be great if you could verify as requested.

[1]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3362

Got hold of a P9 system and was glad that it reproduces there as well (I was afraid it might only show on P8, of which I had no free system at the moment).
- confirmed to affect Bionic
- confirmed to affect Cosmic
- proposed PPA confirmed to fix the issue

Without the fix the guest hangs very early while booting the initrd; with it the boot passes completely.

With that I can go on with the fix.
It being a PPC-only change from an IBM PPC engineer adds further confidence; pushing this to Cosmic now.

[1] is building and has to pass proposed-migration.
I'll take a look tomorrow once the tests have had some time to run.

[1]: https://launchpad.net/ubuntu/+source/qemu/1:2.12+dfsg-3ubuntu4

Hi, Christian.

I'm glad you've already tested this on P9. Thank you!

Just for the record, I've managed to test
https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3362 on Cosmic using
the very same P8 box I used to report this and it fixed the bug, i.e. guest
booted as expected and no hang was observed.

Are you going to update Bionic, or do you want me to submit a new debdiff with
your changes for Cosmic?

Cheers
Murilo

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:2.12+dfsg-3ubuntu4

---------------
qemu (1:2.12+dfsg-3ubuntu4) cosmic; urgency=medium

  [ Murilo Opsfelder Araujo ]
  * d/p/ubuntu/target-ppc-extend-eieio-for-POWER9.patch: Backport to
    extend eieio for POWER9 emulation (LP: #1787408).

 -- Christian Ehrhardt <email address hidden> Mon, 20 Aug 2018 11:52:39 +0200

Changed in qemu (Ubuntu Cosmic):
status: Triaged → Fix Released

Verified what just got released in Cosmic to work as well.

I can confirm that the build ppa:ci-train-ppa-service/3367 (for Bionic) works as expected, i.e.: POWER9 guest boots and no hang is observed.

On my side it has now also passed a set of other tests overnight, so it is ready to SRU now that I have added an SRU template.

description: updated

FYI - in the SRU queue

Hello bugproxy, or anyone else affected,

Accepted qemu into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:2.11+dfsg-1ubuntu7.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in qemu (Ubuntu Bionic):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-bionic

Note: this can be tested in LXD on ppc64el with an additional LXD profile like:
config:
  boot.autostart: "true"
  linux.kernel_modules: openvswitch,nbd,ip_tables,ip6_tables,kvm
  raw.apparmor: mount,
  raw.lxc: |-
    lxc.cgroup.devices.allow = c 10:237 rwm
    lxc.cgroup.devices.allow = b 7:* rwm
    lxc.cgroup.devices.allow = b 259:* rwm
    lxc.cgroup.devices.allow = b 230:* rw
  security.nesting: "true"
  security.privileged: "true"
description: ""
devices:
  eth0:
    mtu: "9000"
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  kvm:
    path: /dev/kvm
    type: unix-char
  mapper:
    path: /dev/mapper/control
    type: unix-char
  mem:
    path: /dev/mem
    type: unix-char
  tun:
    path: /dev/net/tun
    type: unix-char
name: kvm

Before upgrade the test hung at:
[...]
[ 0.000639] clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
[ 0.001298] clocksource: timebase mult[1f40000] shift[24] registered

Upgrading from Proposed:
root@b2:~# apt install qemu-system-ppc qemu-block-extra qemu-system-common qemu-utils
Reading package lists... Done
Building dependency tree
Reading state information... Done
Suggested packages:
  samba vde2 openbios-ppc openhackware debootstrap
The following packages will be upgraded:
  qemu-block-extra qemu-system-common qemu-system-ppc qemu-utils
4 upgraded, 0 newly installed, 0 to remove and 31 not upgraded.
Need to get 8428 kB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://ports.ubuntu.com/ubuntu-ports bionic-proposed/main ppc64el qemu-utils ppc64el 1:2.11+dfsg-1ubuntu7.5 [787 kB]
Get:2 http://ports.ubuntu.com/ubuntu-ports bionic-proposed/main ppc64el qemu-system-common ppc64el 1:2.11+dfsg-1ubuntu7.5 [662 kB]
Get:3 http://ports.ubuntu.com/ubuntu-ports bionic-proposed/main ppc64el qemu-block-extra ppc64el 1:2.11+dfsg-1ubuntu7.5 [38.0 kB]
Get:4 http://ports.ubuntu.com/ubuntu-ports bionic-proposed/main ppc64el qemu-system-ppc ppc64el 1:2.11+dfsg-1ubuntu7.5 [6942 kB]
Fetched 8428 kB in 1s (10.2 MB/s)
(Reading database ... 65433 files and directories currently installed.)
Preparing to unpack .../qemu-utils_1%3a2.11+dfsg-1ubuntu7.5_ppc64el.deb ...
Unpacking qemu-utils (1:2.11+dfsg-1ubuntu7.5) over (1:2.11+dfsg-1ubuntu7.4) ...
Preparing to unpack .../qemu-system-common_1%3a2.11+dfsg-1ubuntu7.5_ppc64el.deb ...
Unpacking qemu-system-common (1:2.11+dfsg-1ubuntu7.5) over (1:2.11+dfsg-1ubuntu7.4) ...
Preparing to unpack .../qemu-block-extra_1%3a2.11+dfsg-1ubuntu7.5_ppc64el.deb ...
Unpacking qemu-block-extra:ppc64el (1:2.11+dfsg-1ubuntu7.5) over (1:2.11+dfsg-1ubuntu7.4) ...
Preparing to unpack .../qemu-system-ppc_1%3a2.11+dfsg-1ubuntu7.5_ppc64el.deb ...
Unpacking qemu-system-ppc (1:2.11+dfsg-1ubuntu7.5) over (1:2.11+dfsg-1ubuntu7.4) ...
Setting up qemu-block-extra:ppc64el (1:2.11+dfsg-1ubuntu7.5) ...
Setting up qemu-utils (1:2.11+dfsg-1ubuntu7.5) ...
Processing triggers for man-db (2.8.3-2) ...
Setting up qemu-system-common (1:2.11+dfsg-1ubuntu7.5) ...
Setting up qemu-system-ppc (1:2.11+dfsg-1ubuntu7.5) ...
root@b2:~# echo $?
0

Testing from proposed:
$ qemu-system-ppc64 -nographic -v...


tags: added: verification-done verification-done-bionic
removed: verification-needed verification-needed-bionic

Note: The code change already passed the general regression checks on the identical content against a PPA (Also on the weekend prior to the full maturing period I'll have another automated run on proposed).

The regression checks over the weekend confirmed what we knew from the PPA:
no regression was seen when testing 1:2.11+dfsg-1ubuntu7.5 from proposed in our Jenkins.

The verification of the Stable Release Update for qemu has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:2.11+dfsg-1ubuntu7.5

---------------
qemu (1:2.11+dfsg-1ubuntu7.5) bionic; urgency=medium

  [ Christian Ehrhardt ]
  * d/p/lp-1755912-qxl-fix-local-renderer-crash.patch: Fix an issue triggered
    by migrations with UI frontends or frequent guest resolution changes
    (LP: #1755912)

  [ Murilo Opsfelder Araujo ]
  * d/p/ubuntu/target-ppc-extend-eieio-for-POWER9.patch: Backport to
    extend eieio for POWER9 emulation (LP: #1787408).

 -- Christian Ehrhardt <email address hidden> Tue, 21 Aug 2018 11:25:45 +0200

Changed in qemu (Ubuntu Bionic):
status: Fix Committed → Fix Released
bugproxy (bugproxy) on 2019-03-08
tags: added: targetmilestone-inin1804
removed: targetmilestone-inin---