Power S822LC (8335-GTB) fails KVM guest cert test with kvm_init_vcpu failed: Invalid argument

Bug #1656112 reported by Mike Rushton on 2017-01-12
This bug affects 2 people
Affects                 Importance  Assigned to
linux (Ubuntu)          High        Canonical Kernel Team
linux (Ubuntu Xenial)   High        Unassigned
qemu (Ubuntu)           High        Ubuntu Server
qemu (Ubuntu Xenial)    Medium      Christian Ehrhardt

Bug Description

[Impact]

 * Some newer POWER8 derivatives fail to work correctly, e.g. Power S822LC
   (8335-GTB)

 * This is a toleration change (no exploitation) for those HW releases
   following the SRU policy of "For Long Term Support releases we
   regularly want to enable new hardware. Such changes are appropriate
   provided that we can ensure not to affect upgrades on existing
   hardware."

 * Without the fix, that hardware won't run Xenial guests under the current
   Xenial QEMU version

 * The fix lets processors that support it run in PowerISA 2.07
   compatibility mode (plus a few no-op changes as backport
   dependencies)

[Test Case]

 * Run a Xenial Guest in KVM on one of the specific HW revisions being
   affected.

[Regression Potential]

 * I'd rate the potential to regress low for powerpc and next to
   impossible for other architectures, reasons:
 * Changes are PPC only, so fallout should be contained to that
 * Patches were created by IBM and have been in upstream QEMU since 2.6
 * The effective change is rather small: it only allows saner CPU model
   versions (dropping some leftover HW development entries) and adds the new types.

[Other Info]

 * Needed for certifying this Hardware for Ubuntu

Upon running the virtualization test from the certification test suite, the kvm guest test fails with the following error:

kvm_init_vcpu failed: Invalid argument

This same test works on multiple other IBM POWER8 and OpenPOWER servers. kvm-ok tells us that KVM virtualization is supported. I have tried with SMT enabled and disabled. I have tried the latest cloud image as well as previous ones we had saved. I have tried running the qemu-system-ppc64 command found below manually, with the same error.

The full output from the test is as follows:

Executing KVM Test
DEBUG:root:Starting KVM Test
DEBUG:root:Cloud image location specified: http://10.1.10.2/cloud/xenial-server-cloudimg-ppc64el-disk1.img.
DEBUG:root:Downloading xenial-server-cloudimg-ppc64el-disk1.img, from http://10.1.10.2
DEBUG:root:Creating cloud user-data
DEBUG:root:Creating cloud meta-data
I: -input-charset not specified, using utf-8 (detected in locale settings)
Total translation table size: 0
Total rockridge attributes bytes: 331
Total directory bytes: 0
Path table size(bytes): 10
Max brk space used 0
183 extents written (0 MB)
DEBUG:root:Attempting boot for:xenial-server-cloudimg-ppc64el-disk1.img
DEBUG:root:Attaching Cloud config disk
DEBUG:root:Using params:qemu-system-ppc64 -m 1024 -display none -nographic -net nic -net user,net=10.0.0.0/8,host=10.0.0.1,hostfwd=tcp::2222-:22 -enable-kvm -machine pseries,usb=off -cpu POWER8 -drive file=xenial-server-cloudimg-ppc64el-disk1.img,if=virtio -drive file=seed.iso,if=virtio
INFO:root:Storing VM console output in /home/ubuntu/.cache/plainbox/sessions/canonical-certification-server-2017-01-12T22.19.34.session/CHECKBOX_DATA/virt_debug

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.4.0-59-generic 4.4.0-59.80
ProcVersionSignature: Ubuntu 4.4.0-59.80-generic 4.4.35
Uname: Linux 4.4.0-59-generic ppc64le
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jan 12 22:18 seq
 crw-rw---- 1 root audio 116, 33 Jan 12 22:18 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.1-0ubuntu2.4
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Thu Jan 12 22:45:34 2017
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
 Bus 002 Device 002: ID 125f:312b A-DATA Technology Co., Ltd. Superior S102 Pro
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 003: ID 046b:ff10 American Megatrends, Inc. Virtual Keyboard and Mouse
 Bus 001 Device 002: ID 046b:ff01 American Megatrends, Inc.
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
PciMultimedia:

ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: root=UUID=a7ce18b4-4614-485f-9346-b19b0415db3a ro fips=1
ProcLoadAvg: 0.03 0.02 0.08 1/1288 11017
ProcLocks:
 1: POSIX ADVISORY WRITE 3709 00:14:665 0 EOF
 2: POSIX ADVISORY WRITE 3593 00:14:655 0 EOF
 3: POSIX ADVISORY WRITE 1658 00:14:376 0 EOF
 4: FLOCK ADVISORY WRITE 3560 00:14:637 0 EOF
 5: POSIX ADVISORY WRITE 3571 00:14:640 0 EOF
ProcSwaps:
 Filename Type Size Used Priority
 /swap.img file 8388544 0 -1
ProcVersion: Linux version 4.4.0-59-generic (buildd@bos01-ppc64el-029) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #80-Ubuntu SMP Fri Jan 6 17:35:59 UTC 2017
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-59-generic N/A
 linux-backports-modules-4.4.0-59-generic N/A
 linux-firmware 1.157.6
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
cpu_cores: Number of cores present = 20
cpu_coreson: Number of cores online = 20
cpu_dscr: DSCR is 0
cpu_freq:
 min: 3.959 GHz (cpu 79)
 max: 3.988 GHz (cpu 81)
 avg: 3.974 GHz
cpu_runmode:
 Could not retrieve current diagnostics mode,
 No kernel interface to firmware
cpu_smt: SMT=8

Mike Rushton (leftyfb) wrote :

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

Hi Mike,

In the logs, it seems that you have SMT enabled. KVM does not work with SMT enabled on ppc64el.

Would you mind rerunning with SMT off and posting the new logs, please?

In order to disable SMT, please run:

# ppc64_cpu --smt=off
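
To confirm the SMT state before each test run, the output of `ppc64_cpu --smt` can be checked from a script. This is only a minimal sketch: it assumes the tool prints either `SMT=N` (as in the `cpu_smt: SMT=8` line of the report above) or `SMT is off`, and the function name `smt_threads` is made up for illustration.

```python
import re


def smt_threads(ppc64_cpu_output: str) -> int:
    """Return threads per core from `ppc64_cpu --smt` output (1 means SMT is off).

    Assumption: the tool prints either "SMT=N" or "SMT is off".
    """
    text = ppc64_cpu_output.strip()
    if text == "SMT is off":
        return 1
    match = re.fullmatch(r"SMT=(\d+)", text)
    if match is None:
        # Anything else (e.g. an inconsistent per-core state) is reported as an error.
        raise ValueError(f"unrecognised ppc64_cpu output: {text!r}")
    return int(match.group(1))
```

On the host, the string passed in would come from running `ppc64_cpu --smt` and capturing its standard output.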

Mike Rushton (leftyfb) wrote :

As mentioned above, I have run the test multiple times with SMT both enabled and disabled. I get the same error. This is from running the test just now with SMT disabled:

WARNING: Image format was not specified for 'seed.iso' and probing guessed raw.
         Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
         Specify the 'raw' format explicitly to remove the restrictions.
kvm_init_vcpu failed: Invalid argument

Also, KVM does in fact work with SMT enabled on ppc64el in PowerNV mode. KVM does not work with SMT enabled on a guest in PowerKVM mode. I have just run the test on the Power S822LC (8335-GTA Non-Virtualized) in PowerNV mode with SMT enabled and the guest booted fine.

That said, to limit variables, I will run the tests on the current server with SMT disabled for the duration of the troubleshooting.

Joseph Salisbury (jsalisbury) wrote :

Are you able to test various kernels? If so, it might be good to test the mainline kernel to see if the bug is already fixed in mainline. It is available from:

 http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10-rc3

Changed in linux (Ubuntu):
importance: Undecided → High
Changed in linux (Ubuntu Xenial):
importance: Undecided → High
status: New → Confirmed
tags: added: kernel-da-key

Dear sir,

please do not send me bug reports. It is impossible for me to
unsubscribe (bug), so I am writing to you. Thank you.

Abalan, Please view the bug page and at the top on the Right Hand side, you should see a box that says the following, or something similar:

You have subscriptions that may cause you to receive notifications, but you are not directly subscribed to this bug's notifications.
Mute bug mail Mute help
Edit bug mail

Please use that to mute the bug mail from this bug.

Rod Smith (rodsmith) wrote :

Abalan, you're subscribed to the "linux in Ubuntu" package, which is why you're receiving bug reports:

https://bugs.launchpad.net/~fyma/+packagebugs

Jeff Lane (bladernr) wrote :

Abalan, you have also subscribed to the package "Linux in Ubuntu" as shown here:

https://bugs.launchpad.net/~fyma/+packagebugs

Because of this, you will receive bug notifications for EVERY bug filed against "Linux". You will need to unsubscribe from this package if you do not wish to receive these updates.

OR, you will need to unsubscribe from each bug separately in the Bug Mail settings box that is towards the top of the page on the right hand side.

Good Luck

Jeff

Mike Rushton (leftyfb) wrote :

4.10 RC3 gives a kernel panic:

[ 3.863436] ahci 0009:04:00.0: AHCI 0001.0000 32 slots 4 ports 6 Gbps 0xf impl SATA mode
[ 3.863631] ahci 0009:04:00.0: flags: 64bit ncq sntf led only pmp fbs pio slum part sxs
[ 3.864221] scsi host0: ahci
[ 3.864404] scsi host1: ahci
[ 3.864562] scsi host2: ahci
[ 3.864715] scsi host3: ahci
[ 3.864781] ata1: SATA max UDMA/133 abar m2048@0x3fe280810000 port 0x3fe280810100 irq 447
[ 3.864826] ata2: SATA max UDMA/133 abar m2048@0x3fe280810000 port 0x3fe280810180 irq 447
[ 3.864871] ata3: SATA max UDMA/133 abar m2048@0x3fe280810000 port 0x3fe280810200 irq 447
[ 3.864917] ata4: SATA max UDMA/133 abar m2048@0x3fe280810000 port 0x3fe280810280 irq 447
[ 3.869801] [drm] platform has no IO space, trying MMIO
[ 3.869832] [drm] AST 2400 detected
[ 3.869863] [drm] Analog VGA only
[ 3.869896] [drm] dram 1632000000 7 16 00c00000
[ 3.869970] [TTM] Zone kernel: Available graphics memory: 267838144 kiB
[ 3.870007] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
[ 3.870044] [TTM] Initializing pool allocator
[ 3.870076] [TTM] Initializing DMA pool allocator
[ 3.874823] fb: switching to astdrmfb from OFfb vga
[ 3.874938] Console: switching to colour dummy device 80x25
[ 3.899718] nouveau 0002:01:00.0: enabling device (0140 -> 0142)
[ 3.928774] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: c0000000002ad0d4
[ 3.928774]
[ 3.928776] CPU: 11 PID: 1118 Comm: systemd-udevd Not tainted 4.10.0-041000r

Joseph Salisbury (jsalisbury) wrote :

Thanks for testing, Mike. Do you know if this is a regression? Were there prior kernel versions that did not exhibit this bug?

Mike Rushton (leftyfb) wrote :

KVM was never tested successfully on this particular model. The virtualization test works fine on all other open power servers we have in the labs.

Leonardo Garcia (lagarcia) wrote :

Hi Mike,

I could not understand your comment 4. Even though you said you see the same error, you are showing only a warning message there. Also, the description about KVM working with SMT on is really confusing. Could you please detail what test steps you are taking? On a POWER8 machine it is impossible to create guests if you have SMT enabled on the KVM host.

Leonardo Garcia (lagarcia) wrote :

The issue being described here may be caused by the fact that older kernel and QEMU cannot properly identify POWER8NVL processors being used in the S822LC (8335-GTB) machines.

You might need 7cc851039d643a2ee7df4d18177150f2c3a484f5 kernel commit (from kernel 4.7), and the following series of patches from QEMU, which are available since QEMU 2.7:

http://git.qemu.org/?p=qemu.git;a=commitdiff;h=7386ae6372cc07c77a39cb
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=8cd2ce7aaa3c3fadc561f4
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=52b2519c4ea9a6aa4df7ab
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=eac4fba965136f61cc239a
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=b30ff227c27c931155f768
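
To check whether a host is one of the POWER8NVL machines, the processor model can be read from `/proc/cpuinfo`. The following is a rough sketch, not part of any of the patches above: it assumes the ppc64el `cpu` line looks like `cpu : POWER8NVL (raw), altivec supported`, and both the helper name `host_cpu_model` and the sample text are illustrative.

```python
from typing import Optional


def host_cpu_model(cpuinfo_text: str) -> Optional[str]:
    """Return the model name from the first 'cpu' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu":
            # e.g. "POWER8NVL (raw), altivec supported" -> "POWER8NVL"
            fields = value.split()
            return fields[0] if fields else None
    return None


# Assumed sample of what an 8335-GTB host might report (not taken from this bug).
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "cpu\t\t: POWER8NVL (raw), altivec supported\n"
    "clock\t\t: 3988.000000MHz\n"
)
```

On a real host the input would be the contents of `/proc/cpuinfo`.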

bugproxy (bugproxy) on 2017-01-20
tags: added: architecture-ppc64le bugnameltc-150834 severity-medium targetmilestone-inin16042
tags: added: kernel-key
removed: kernel-da-key
Manoj Iyer (manjo) on 2017-01-25
Changed in linux (Ubuntu):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in qemu (Ubuntu):
assignee: nobody → Jon Grimm (jgrimm)

Default Comment by Bridge

bugproxy (bugproxy) wrote : nvram.gz

Added a tracker for the qemu package, given the speculation in comment #14 that it may need to be updated for the POWER8NVL processor.

Jon Grimm (jgrimm) on 2017-01-25
Changed in qemu (Ubuntu):
assignee: Jon Grimm (jgrimm) → nobody
assignee: nobody → Ubuntu Server Team (ubuntu-server)
Mike Rushton (leftyfb) wrote :

In my testing, only qemu-system-ppc version 1:2.6.1+dfsg-0ubuntu5.2 available in yakkety and the template to set -cpu POWER8NVL was needed to resolve the issue. No kernel patches were necessary.
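
For illustration, the template change amounts to making the `-cpu` value a parameter instead of hard-coding POWER8. The sketch below rebuilds the command line from the test log in the bug description; the function `build_qemu_cmd` is hypothetical and is not the actual code of the certification script.

```python
from typing import List


def build_qemu_cmd(cpu_model: str, image: str, seed: str = "seed.iso") -> List[str]:
    """Build the qemu-system-ppc64 argument list used by the KVM guest test.

    All options are copied from the 'Using params:' log line; only the
    -cpu value is made configurable.
    """
    return [
        "qemu-system-ppc64", "-m", "1024", "-display", "none", "-nographic",
        "-net", "nic",
        "-net", "user,net=10.0.0.0/8,host=10.0.0.1,hostfwd=tcp::2222-:22",
        "-enable-kvm", "-machine", "pseries,usb=off",
        "-cpu", cpu_model,
        "-drive", f"file={image},if=virtio",
        "-drive", f"file={seed},if=virtio",
    ]
```

Calling `build_qemu_cmd("POWER8NVL", "xenial-server-cloudimg-ppc64el-disk1.img")` yields the same command as in the log, except for the CPU model.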

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Changed in linux (Ubuntu Xenial):
status: Confirmed → Incomplete

------- Comment From <email address hidden> 2017-01-25 17:21 EDT-------
Mike,

If you are running the modified QEMU in a host, you are correct.

You only need to patch the guest kernel if the QEMU in the host does not contain the patches I pointed here.

I suggest the inclusion of both QEMU and kernel patches in order to cover any combinations.

Hi,
thank you lagarcia for your expertise on this.

I try to summarize how I understand the last post - please correct me if I'm wrong.

1. on POWER8NVL "Xenial qemu" in host and "Xenial kernel" in the guest fail (that is the bug)

2. on POWER8NVL "Xenial qemu + the fixes" in host and "Xenial kernel" in the guest would work

3. on POWER8NVL "Xenial qemu" in host and "Xenial kernel + fix" in the guest would work

Is that correct?

The Qemu patches are only ppc, which leaves some hope that the potential to affect other architectures should be minimal :-)

$ git show 7386ae6372cc07c77a39cb 8cd2ce7aaa3c3fadc561f4 52b2519c4ea9a6aa4df7ab eac4fba965136f61cc239a b30ff227c27c931155f768 | diffstat
 hw/ppc/spapr_hcall.c | 71 ++++++++++++++++++++++++++------------------
 target-ppc/cpu-qom.h | 3 +
 target-ppc/cpu.h | 3 +
 target-ppc/kvm.c | 19 ++++++++---
 target-ppc/kvm_ppc.h | 7 ++++
 target-ppc/translate_init.c | 24 +++++++++++---
 6 files changed, 86 insertions(+), 41 deletions(-)

The backport of the qemu fixes is not rocket-science, but also not applying as-is and at the moment I'd personally consider it more a feature than a fix. The series consists of 4 patches reworking code for the final one being "ppc: Add PowerISA 2.07 compatibility mode".
The question how much this is "fix or feature" might be important in the context of SRUing it eventually.

For now I have prepared a qemu ppa to allow the reporter more testing.
https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/2409/

That has the mentioned patches on top of Xenial qemu and should allow testing case #2 of my list above.

OTOH the kernel patch is much much smaller:
$ git show 7cc851039d643a2ee7df4d18177150f2c3a484f5 | diffstat
 prom_init.c | 1 +
 1 file changed, 1 insertion(+)
See https://github.com/torvalds/linux/commit/7cc851039d643a2ee7df4d18177150f2c3a484f5

IMHO I'd almost expect to only SRU the kernel change as it is more a "fix" in the SRU sense - and also that much smaller.
I'm clearly open for discussion, and since the risk is almost ppc64 only, to some extent it is IBM's call.
The ppa above shall help testing the qemu change for now.

Also we need to hear from the Kernel Team about potential side effects after their review of the kernel change to see the full picture.

bugproxy (bugproxy) wrote : nvram.gz

Changed in linux (Ubuntu Xenial):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
status: Incomplete → Confirmed

Just learned that SRU for that would be more ok than I expected, quoting the SRU policy:
"For Long Term Support releases we regularly want to enable new hardware. Such changes are appropriate provided that we can ensure not to affect upgrades on existing hardware."

That said waiting for verification on the ppa before I continue (also another SRU has to clear the queue first anyway).

Changed in qemu (Ubuntu):
status: New → Triaged
Changed in qemu (Ubuntu Xenial):
status: New → Triaged
importance: Undecided → Medium
Changed in qemu (Ubuntu):
importance: Undecided → High

bugproxy (bugproxy) wrote : nvram.gz

@Mike - is the branch you linked meant to augment or to replace the suggested SRU (which should be tested from the ppa I linked)?

Mike Rushton (leftyfb) wrote :

@Christian The branch is for a fix in the plainbox-provider-checkbox package which is part of the certification suite. It is not for qemu-system-ppc but is a necessary part of the overall fix since the script was originally assuming POWER8.

On Mon, Jan 30, 2017 at 3:38 PM, Mike Rushton <email address hidden>
wrote:

> @Christian The branch is for a fix in the plainbox-provider-checkbox
> package which is part of the certification suite
>

@Mike - sure, I saw what it patches :-)
    I wanted a clarification whether this cancels the need for the qemu fixes or
if we will still need them?

@Mike - also I wanted to ask, since you have access to the HW, if you can
confirm that the ppa fixes the issue.

tags: added: kernel-da-key
removed: kernel-key

@Christian

The qemu fix is still necessary in order to resolve the issue.

While I do have access to the hardware, it is currently having issues booting. Once we resolve the issue with IBM, I plan on testing out the qemu fix from the PPA.

Changed in qemu (Ubuntu):
status: Triaged → Fix Released

Preliminary builds as preparation for an SRU currently building at

Xenial https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/2502

My testing will start on those, but if you can please give it as much testing as you can as well.

@leftyfb - how about your machine, is it running again to be able to test that?

Mike Rushton (leftyfb) wrote :

@paelzer the machine was sent out for repair/replacement. We expect it to return later this week. I will test the package from the above PPA as soon as it is back.

Tests for any regressions on the ppa look good so far, going to try to test the fixed case explicitly.

Since I don't have the HW that will be hard, but Mike already said he will test as soon as the HW is back so that shall do it then.

description: updated

@Mike - is the machine back, any chance to test so far?

Mike Rushton (leftyfb) wrote :

We still have not received the machine back from IBM.

Descheduled from next SRU to unblock other fixes to be able to move into the release.
Setting to incomplete until test results provided.

Changed in qemu (Ubuntu Xenial):
status: Triaged → Incomplete
Mike Rushton (leftyfb) wrote :

We received the hardware back and have started testing all blockers again. The latest test of the above mentioned PPA did not yield positive results:

"Unable to find PowerPC CPU definition"

$ apt-cache policy qemu-system-ppc
qemu-system-ppc:
  Installed: 1:2.5+dfsg-5ubuntu10.10
  Candidate: 1:2.5+dfsg-5ubuntu10.10
  Version table:
 *** 1:2.5+dfsg-5ubuntu10.10 500
        500 http://ppa.launchpad.net/ci-train-ppa-service/2502/ubuntu xenial/main ppc64el Packages
        100 /var/lib/dpkg/status
     1:2.5+dfsg-5ubuntu10.9 500
        500 http://ports.ubuntu.com/ubuntu-ports xenial-updates/main ppc64el Packages
     1:2.5+dfsg-5ubuntu10.6 500
        500 http://ports.ubuntu.com/ubuntu-ports xenial-security/main ppc64el Packages
     1:2.5+dfsg-5ubuntu10 500
        500 http://ports.ubuntu.com/ubuntu-ports xenial/main ppc64el Packages
$ uname -a
Linux fesenkov 4.4.0-66-generic #87-Ubuntu SMP Fri Mar 3 15:30:20 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

Mike Rushton (leftyfb) wrote :

Just to be sure I installed qemu-system-ppc version 1:2.6.1+dfsg-0ubuntu5.3 from yakkety-updates and the tests pass.

Thanks Mike for the check.
Given the results even better that we decoupled this from last SRU.
The ppa you tested had all patches that were identified by IBM backported.

@lagarcia:
 - do you think there are more than the 5 you identified?
 - Given the report that 2.6.1 from yakkety works, we must be missing something that landed between 2.5 and 2.6.1, rather than in 2.7 where the patches identified so far came from.

------- Comment From <email address hidden> 2017-03-17 13:59 EDT-------
Hi all,

I am looking through patches to see if there might have been others. Is there a way we can log into the machine to look around? We are looking to see if we can find a similar machine to use as well. Thanks!

Mike Rushton (leftyfb) on 2017-03-27
summary: - Power S822LC (8335-GTB) failes KVM guest cert test with kvm_init_vcpu
+ Power S822LC (8335-GTB) fails KVM guest cert test with kvm_init_vcpu
failed: Invalid argument
Mike Rushton (leftyfb) on 2017-03-28
Changed in qemu (Ubuntu):
status: Fix Released → In Progress
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-03-28 20:05 EDT-------
Hi all,

Just an update: we did get a similar machine set up, but so far I don't have a recreate. I will see if I can find a copy of the same test case and try that and see if I have better luck. Thx all!

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-04-04 14:08 EDT-------
(In reply to comment #38)
> - do you think there are more than the 5 you identified?
> - Given the report that the 2.6.1 from yakkety works there must be something
> between 2.5 to 2.6.1 that we miss not from 2.7 as the identified patches so
> far were.

Yes, there were 2 patches from 2.6.1 that were needed in the backport of 2.5.0 together with the 5 patches already mentioned by lagarcia.

I've backported all of them in a 2.5.0 tree and successfully ran a sanity test on an 8335-GTB machine. I've uploaded the backported 2.5.0 tree to my github mirror:

https://github.com/danielhb/qemu/tree/2.5.0_8335-GTB

This is the breakdown of the patches that were backported to 2.5.0:

from 2.6.1:

commit 9d6ba75df26d699a6e23d4817983bd029898f5c7
Author: Benjamin Herrenschmidt <email address hidden>
Date: Wed Nov 11 11:15:46 2015 +1100

target-ppc: Use sensible POWER8/POWER8E versions
commit a88dced8eb69730df39cb04bb3e262e5b98d5f5c
Author: Alexey Kardashevskiy <email address hidden>
Date: Thu Mar 3 11:08:19 2016 +1100

target-ppc: Add PVR for POWER8NVL processor

from 2.7.0:

http://git.qemu.org/?p=qemu.git;a=commitdiff;h=7386ae6372cc07c77a39cb
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=8cd2ce7aaa3c3fadc561f4
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=52b2519c4ea9a6aa4df7ab
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=eac4fba965136f61cc239a
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=b30ff227c27c931155f768

Let me know if you need further assistance.

Daniel

Hi,
thank you very much, Daniel, for identifying the potentially missing two patches.
I wonder though if we really need all 7 then, or maybe just the two new ones.
While the extra 5 are valid fixes in general, I'm not sure they are 100% needed, as Xenial seems to be fine without them and, at least according to Mike, he was good with 2.6.1 from Yakkety, which does not have them.

Now I'd have two requests:
1. I want Mike to test two PPAs - one of them with just the two and one with all seven fixes applied.
2. I'd appreciate an IBM statement on whether the extra 5 patches from qemu 2.7 are really required; since this is an SRU, I want to keep the scope minimal to the fix that is needed.

@Mike - the fixes are building via bileto to have all architectures, could you check:
1. minimal: 2.5+dfsg-5ubuntu10.11~ppa1 from https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/2697
2. full: 2.5+dfsg-5ubuntu10.11~ppa2 from https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/2698

Changed in qemu (Ubuntu Xenial):
status: Incomplete → Triaged
Mike Rushton (leftyfb) wrote :

I have tested the qemu-system-ppc package from both PPA's and both work without issue.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-04-10 09:03 EDT-------
> Hi,
> thank you a lot Daniel to identify the potentially missing two patches.
> I wonder thou if we really need all 7 then, or maybe just the two new ones.
> While the extra 5 are valid fixes in general I'm not sure they are 100%
> needed as Xenial seems to be fine without and at least according to Mike he
> was good with 2.6.1 from Yakkety which does not have them.

The 5 patches from 2.7.0 are performance boosts/fixes. Given that the 8335-GTB
is a high performance system they are quite interesting to have. But they
are not needed in the sense that QEMU does not work without them. In fact, I've tested
2.5.0 with only the 2 patches from 2.6.1 and the sanity tests worked just fine.

> 2. I'd be happy on an IBM statement on the extra 5 patches from qemu 2.7 if
> they are really required especially since this is in the sense of an SRU I
> want to keep the scope minimal to the fix that is needed.

You don't need the 5 patches from 2.7.0 to make QEMU work in 8335-GTB.

Changed in qemu (Ubuntu Xenial):
assignee: nobody → ChristianEhrhardt (paelzer)
Changed in qemu (Ubuntu):
status: In Progress → Fix Released
description: updated

So I think this is ready for SRU now:
- Template complete
- Updated triaging states to match reality
- Xenial SRU queue free of older qemu
- Minimized the patches down to the SRU'ed case
- Tested on ppa to work
- bileto dep8 tests are all good
- ppa checked on regressions (once more with X-proposed then by our Jenkins jobs)

Now in xenial-unapproved for SRU Team review

Changed in qemu (Ubuntu Xenial):
status: Triaged → In Progress
tags: removed: kernel-da-key

FYI - Recent security update burned the version number that this was supposed to be, re-preparing an adapted version now.

Re-prepared, and merged with another SRU that was ready now.
Pushed to SRU unapproved queue.

P.S. Since I pretest a lot, one can already look at [1]. Please note that the iscsi dep8 test failure is a notoriously noisy transient issue unrelated to this update, but it will likely be seen in -proposed as well :-/ Later releases are at least somewhat improved, but for xenial we are down to retrying many times until it works, or ignoring it.

[1]: https://bileto.ubuntu.com/excuses/2697/xenial.html

Changed in qemu (Ubuntu Xenial):
status: In Progress → Fix Committed

Hello Mike, or anyone else affected,

Accepted qemu into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:2.5+dfsg-5ubuntu10.13 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-needed
Jeff Lane (bladernr) wrote :

Due to security cert work, we won't be able to get to this until next week. I've got this on my list to test out on Monday.

Jeff Lane (bladernr) wrote :

I've tested the updated qemu in proposed and it fails.

ubuntu@fesenkov:~$ sudo qemu-system-ppc64 -m 1024 -display none -nographic -net nic -net user,net=10.0.0.0/8,host=10.0.0.1,hostfwd=tcp::2222-:22 -enable-kvm -machine pseries,usb=off -cpu host -drive file=/home/ubuntu/xenial-server-cloudimg-ppc64el-disk1.img,if=virtio -drive file=/home/ubuntu/seed.iso,if=virtio
Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-iscsi.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-curl.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-rbd.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-dmg.so
Note: only modules from the same build can be loaded.
WARNING: Image format was not specified for '/home/ubuntu/seed.iso' and probing guessed raw.
         Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
         Specify the 'raw' format explicitly to remove the restrictions.
Unable to find PowerPC CPU definition

There is no CPU definition for the Power8NVL cpu:
ubuntu@fesenkov:~$ qemu-system-ppc64 -cpu help
Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-iscsi.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-curl.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-rbd.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-dmg.so
Note: only modules from the same build can be loaded.
PowerPC 601_v0 PVR 00010001
PowerPC 601_v1 PVR 00010001
PowerPC 601_v2 PVR 00010002
PowerPC 601 (alias for 601_v2)
PowerPC 601v (alias for 601_v2)
PowerPC 603 PVR 00030100
PowerPC MPC8240 (alias for 603)
PowerPC Vanilla (alias for 603)
PowerPC 604 PVR 00040103
PowerPC ppc32 (alias for 604)
PowerPC ppc (alias for 604)
PowerPC default (alias for 604)
PowerPC 602 PVR 00050100
PowerPC 603e_v1.1 PVR 00060101
PowerPC 603e_v1.2 PVR 00060102
PowerPC 603e_v1.3 PVR 00060103
PowerPC 603e_v1.4 PVR 00060104
PowerPC 603e_v2.2 PVR 00060202
PowerPC 603e_v3 PVR 00060300
PowerPC 603e_v4 PVR 00060400
PowerPC 603e_v4.1 PVR 00060401
PowerPC 603e (alias for 603e_v4.1)
PowerPC Stretch (alias for 603e_v4.1)
PowerPC 603p PVR 00070000
PowerPC 603e7v PVR 00070100
PowerPC Vaillant (alias for 603e7v)
PowerPC 603e7v1 PVR 00070101
PowerPC 603e7 PVR 00070200
PowerPC 603e7v2 PVR 00070201
PowerPC 603e7t PVR 00071201
PowerPC 603r (alias for 603e7t)
PowerPC Goldeneye (alias for 603e7t)
PowerPC 740_v1.0 PVR 00080100
PowerPC 750_v1.0 PVR 0...

Jeff Lane (bladernr) wrote :

Additionally, as noted above, qemu-block-extra modules fail to load:

Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-iscsi.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-curl.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-rbd.so
Note: only modules from the same build can be loaded.
Failed to initialize module: /usr/lib/powerpc64le-linux-gnu/qemu/block-dmg.so
Note: only modules from the same build can be loaded.

Jeff Lane (bladernr) wrote :

ubuntu@fesenkov:~$ dpkg -l |grep qemu | awk '{print $2"\t\t"$3}'
ipxe-qemu 1.0.0+git-20150424.a25a16d-1ubuntu1
qemu-block-extra:ppc64el 1:2.5+dfsg-5ubuntu10.13
qemu-slof 20151103+dfsg-1ubuntu1
qemu-system 1:2.5+dfsg-5ubuntu10.13
qemu-system-arm 1:2.5+dfsg-5ubuntu10.13
qemu-system-common 1:2.5+dfsg-5ubuntu10.13
qemu-system-mips 1:2.5+dfsg-5ubuntu10.11
qemu-system-misc 1:2.5+dfsg-5ubuntu10.11
qemu-system-ppc 1:2.5+dfsg-5ubuntu10.11
qemu-system-sparc 1:2.5+dfsg-5ubuntu10.11
qemu-system-x86 1:2.5+dfsg-5ubuntu10.13
qemu-utils 1:2.5+dfsg-5ubuntu10.13

Hi Jeff,

This bug always was, and seems to remain, an interesting one, since on the ppc side this is identical to what Mike tested successfully before.

Yet I think the dpkg listing you posted holds the answer.
It seems only some of the qemu packages got updated.

The needed fix is in /usr/bin/qemu-system-ppc64, which ships in qemu-system-ppc, and that package was not updated (it is still at 5ubuntu10.11 in your listing). The packages providing the extra libs (qemu-block-extra) are already updated, which is why you see the .so load failures.

I checked whether there were build errors in proposed, but [1] exists as it should.

And while I have no POWER8NVL machine of my own to test on, I checked on a different Power box that enabling proposed provides all of them as expected; in my case "qemu-block-extra qemu-kvm qemu-system-common qemu-system-ppc qemu-utils".

Could you please make sure all packages are correctly upgraded and retry?
Maybe there is some pinning left on the machine from older experiments. Mike may also have left some notes because, as I said, the same ppc code was already verified good from a PPA, so I would be puzzled if it did not work now.

[1]: https://launchpad.net/ubuntu/xenial/amd64/qemu-system-ppc/1:2.5+dfsg-5ubuntu10.13
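
For anyone hitting the same partial-upgrade symptom, here is a minimal sketch of how the mismatch could be spotted automatically. The helper name find_stale is hypothetical (it is not part of any package), and the expected version string is simply the one from this SRU. It reads "package<TAB>version" lines and prints every package that is not at the expected version.

```shell
#!/bin/sh
# Hypothetical helper, not from the bug report: reads "package<TAB>version"
# lines on stdin and prints each package whose version differs from $1.
# Lines with an empty version (package known but not installed) are skipped.
find_stale() {
    expected="$1"
    awk -F '\t' -v want="$expected" \
        '$2 != "" && $2 != want { print $1 " " $2 }'
}

# Example use on a Debian/Ubuntu host (assumes dpkg-query is available):
#   dpkg-query -W -f='${Package}\t${Version}\n' 'qemu-*' \
#       | find_stale '1:2.5+dfsg-5ubuntu10.13'
```

Note that the 'qemu-*' pattern also matches packages built from other source packages (qemu-slof in the listing above), so those will be reported as well and can be ignored. Anything else in the output is a candidate for the kind of skipped upgrade described here.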

Jeff Lane (bladernr) wrote :

I thought that was a bit weird, so look below:

First, I wanted to see what was installed:
ubuntu@fesenkov:~$ apt-cache policy qemu-system-ppc*
qemu-system-ppcemb:
  Installed: (none)
  Candidate: (none)
  Version table:
qemu-system-ppc: ###this is what I had installed, and I'm not entirely sure why it was down-rev. I presume this is the package you mean... I did a dist-upgrade on the node, so I'm not entirely sure why it was skipped before. Oh well, it's upgraded now, I'll re-try.
  Installed: 1:2.5+dfsg-5ubuntu10.11
  Candidate: 1:2.5+dfsg-5ubuntu10.13
  Version table:
     1:2.5+dfsg-5ubuntu10.13 500
        500 http://ports.ubuntu.com/ubuntu-ports xenial-proposed/main ppc64el Packages
 *** 1:2.5+dfsg-5ubuntu10.11 500
        500 http://ports.ubuntu.com/ubuntu-ports xenial-updates/main ppc64el Packages
        500 http://ports.ubuntu.com/ubuntu-ports xenial-security/main ppc64el Packages
        100 /var/lib/dpkg/status
     1:2.5+dfsg-5ubuntu10 500
        500 http://ports.ubuntu.com/ubuntu-ports xenial/main ppc64el Packages
qemu-system-ppc64: ### I expected to see this installed, I didn't realize it was now just a blank pointer to qemu-system-ppc
  Installed: (none)
  Candidate: (none)
  Version table:
ubuntu@fesenkov:~$ sudo apt-get install qemu-system-ppc64
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'qemu-system-ppc' instead of 'qemu-system-ppc64'
Suggested packages:
  samba vde2 openbios-ppc openhackware
The following packages will be upgraded:
  qemu-system-ppc
1 upgraded, 0 newly installed, 0 to remove and 56 not upgraded.
Need to get 2,431 kB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://ports.ubuntu.com/ubuntu-ports xenial-proposed/main ppc64el qemu-system-ppc ppc64el 1:2.5+dfsg-5ubuntu10.13 [2,431 kB]
Fetched 2,431 kB in 0s (12.0 MB/s)
(Reading database ... 79950 files and directories currently installed.)
Preparing to unpack .../qemu-system-ppc_1%3a2.5+dfsg-5ubuntu10.13_ppc64el.deb ...
Unpacking qemu-system-ppc (1:2.5+dfsg-5ubuntu10.13) over (1:2.5+dfsg-5ubuntu10.11) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up qemu-system-ppc (1:2.5+dfsg-5ubuntu10.13) ...

So now I'll re-try as qemu-system-ppc is updated.

Jeff Lane (bladernr) wrote :

OK, THAT worked. Sorry for the worry. I have no idea why qemu-system-ppc was skipped over when I upgraded everything, but I did retry and indeed it does now pass the virt test.

Jeff Lane (bladernr) wrote :

Call it pebkac. It's possible I missed the package when I manually updated the qemu-* packages.

On Mon, May 15, 2017 at 5:10 PM, Jeff Lane <email address hidden>
wrote:

> Call it pebkac. It's possible I missed the package when I manually
> updated the qemu-* packages.
>

I call it "thank you for sticking on and working through it"!
So as I understand it, we can set verification-done now.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:2.5+dfsg-5ubuntu10.13

---------------
qemu (1:2.5+dfsg-5ubuntu10.13) xenial; urgency=medium

  * debian/patches/ubuntu/vvfat-fix-volume-name-assertion.patch:
    Fix the volume name assertion in vvfat rw mode (LP: #1684239)

qemu (1:2.5+dfsg-5ubuntu10.12) xenial; urgency=medium

  * debian/patches/ubuntu/bug-1656112-POWER8NVL-[12]-*.patch:
    Add PowerISA 2.07 compatibility mode to fix execution on POWER8NVL
    processors such as in S822LC (8335-GTB) machines (LP: #1656112)

 -- Christian Ehrhardt <email address hidden> Tue, 25 Apr 2017 13:58:10 +0200

Changed in qemu (Ubuntu Xenial):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for qemu has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Jeff Lane (bladernr) wrote :

Also verified that the SRU resolves the issue on the previously failing hardware (POWER8NVL).

This was tested on Xenial with the 4.4 kernel.

Can we set the Ubuntu and Xenial tasks to Fix Released now?

Yeah we can, the kernel has had the fixes for a while now.
It was a bit complex in this case as we had fix A, fix B, and all four combinations thereof.
Thanks Jeff!

Changed in linux (Ubuntu Xenial):
status: Confirmed → Fix Released
Changed in linux (Ubuntu):
status: Confirmed → Fix Released