QEMU - count cache flush Spectre v2 mitigation (CVE) (required for POWER9 DD2.3)
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
The Ubuntu-power-systems project | | High | Unassigned | |
linux (Ubuntu) | | Undecided | Unassigned | |
linux (Ubuntu Bionic) | | Undecided | Unassigned | |
linux (Ubuntu Disco) | | High | Unassigned | |
qemu (Ubuntu) | | Undecided | Unassigned | |
qemu (Ubuntu Xenial) | | Undecided | Unassigned | |
qemu (Ubuntu Bionic) | | High | Canonical Server Team | |
qemu (Ubuntu Cosmic) | | Undecided | Unassigned | |
qemu (Ubuntu Disco) | | High | Canonical Server Team | |
qemu (Ubuntu Eoan) | | Undecided | Unassigned | |
Bug Description
[Impact]
* This belongs to the overall context of Spectre mitigations and, even
more so, to the effort to minimize the related performance impact.
On ppc64el there is a new chip revision (DD 2.3) which provides
a facility that helps to better mitigate some of this.
* Backport the patches that pass the capability (if supported by the
HW) to the guest, so that guests which support the improved
mitigation can use it.
[Test Case]
* Start guests with and without this capability
* Check if the capability is guest visible as intended
* Check if there are any issues on pre DD2.3 HW
* Test migrations (IBM outlined the intended paths that will work
below)
* The problem with the above (and also the reason I didn't add a list
of commands this time) is that it needs special HW (the mentioned
DD2.3 revision of the chips) which isn't available to us right now.
Due to that, testing / verification of this on all releases is on IBM
[Regression Potential]
* Adding new capabilities usually works fine; there are three common
pitfalls, which are the regression potential here.
- (severe) the code would announce a capability that isn't really
available. The guest tries to use it and crashes
- (medium) several migration paths especially from systems with the
new cap to older (un-updated systems) will fail. But that applies
to any "from machine with Feature to machine without that feature"
and isn't really a new regression. As outlined by IBM below they
even tried to make it somewhat compatible (by being a new value in
an existing cap)
- (low) the guest will see new caps and/or facilities. A really odd
guest could stumble due to that (which would actually be a guest
bug)
Overall all of the above was considered by IBM when developing this
and should be ok. For archive wide SRU considerations, this has NO
effect on non ppc64el.
[Other Info]
* n/a
---
Power9 DD 2.3 CPUs running updated firmware will use a new Spectre v2 mitigation. The new mitigation improves performance of branch heavy workloads, but also requires kernel support in order to be fully secure.
Without the kernel support there is a risk of a Spectre v2 attack across a process context switch, though it has not been demonstrated in practice.
QEMU portion: the platform definition needs to account for this new mitigation, so an attribute for it needs to be added.
In terms of support for virtualisation there are two sides, KVM and QEMU. The patch list for each:
KVM:
2b57ecd0208f KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_
This is part of LP1822870 already.
QEMU:
8ff43ee404 target/ppc/spapr: Add SPAPR_CAP_
399b2896d4 target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
The KVM side is upstream as of v5.1-rc1.
The QEMU side is upstream as of v4.0.0-rc0.
In terms of migration the state is as follows.
In order to tell the guest to use the count cache flush workaround we use the spapr-cap cap-ibs (indirect branch speculation) with the value workaround. Previously the only valid values were broken, fixed-ibs (indirect branch serialisation) and fixed-ccd (count cache disabled). We also add a new cap, cap-ccf-assist (count cache flush assist), to specify the availability of the hardware assisted flush variant.
Note that, given the way spapr caps work, you can migrate to a host that supports a higher value, but not to one which doesn't support the current value (i.e. only supports lower values). For cap-ibs these are defined as:
0 - Broken
1 - Workaround
2 - fixed-ibs
3 - fixed-ccd
So the following migrations would be valid for example:
broken -> fixed-ccd, broken -> workaround, workaround -> fixed-ccd
While the following would be invalid:
fixed-ccd -> workaround, workaround -> broken, fixed-ccd -> broken
This is done to maintain at least the level of protection specified on the command line on migration.
Since the workaround must be communicated to the guest kernel at boot we cannot migrate a guest from a host with fixed-ccd to one with workaround since the guest wouldn't know to do the flush and so would be wholly unprotected.
This means that migrating a guest from 2.2 and before to 2.3 requires the guest either to have been booted with broken previously, or to be rebooted with workaround specified on the command line, which would allow the migration to a 2.3 to succeed.
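The ordering rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this report, not the code in QEMU; the function name and the result labels are hypothetical. The "warning" outcome for a lower incoming level reflects the behaviour reported in the test results in this bug.

```python
# Illustrative only: models the cap-ibs ordering described in this bug,
# not QEMU's implementation. All names below are hypothetical.
CAP_IBS_LEVELS = {
    "broken": 0,
    "workaround": 1,
    "fixed-ibs": 2,
    "fixed-ccd": 3,
}


def check_migration(source: str, destination: str) -> str:
    """Classify a migration by comparing source and destination cap-ibs levels."""
    src = CAP_IBS_LEVELS[source]
    dst = CAP_IBS_LEVELS[destination]
    if src > dst:
        # The destination cannot provide the level the guest was started with.
        return "fail"
    if src < dst:
        # Allowed; the tests in this report saw a warning in this case.
        return "warning"
    return "success"


if __name__ == "__main__":
    examples = [
        ("broken", "fixed-ccd"),
        ("broken", "workaround"),
        ("workaround", "fixed-ccd"),
        ("fixed-ccd", "workaround"),
        ("workaround", "broken"),
        ("fixed-ccd", "broken"),
    ]
    for src, dst in examples:
        print(f"{src} -> {dst}: {check_migration(src, dst)}")
```

The six example pairs are the valid and invalid migrations listed above; the first three are classified as allowed and the last three as failing.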
== MICHAEL D. ROTH ==
I've tested a backport of count-cache-flush support consisting of the following patches applied (cleanly) on top of bionic's QEMU 2.11+dfsg-
target/ppc/spapr: Add SPAPR_CAP_
ppc/spapr-caps: Change migration macro to take full spapr-cap name
target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
target/ppc: Factor out the parsing in kvmppc_
The following tests were done using a DD 2.3 Witherspoon machine and the results seem to align with what's expected in the original summary:
== enablement tests (using 4.15.0-51-generic in both host and guests) ==
with cap-ibs=
mdroth@ubuntu:~$ dmesg | grep cache-flush
[ 0.000000] count-cache-flush: hardware assisted flush sequence enabled
with cap-ibs=
mdroth@ubuntu:~$ dmesg | grep cache-flush
[ 0.000000] count-cache-flush: full software flush sequence enabled.
with cap-ibs=broken
mdroth@ubuntu:~$ dmesg | grep cache-flush
[ 0.000000] count-cache-flush: software flush disabled.
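A check like the one above could be scripted. The following is a minimal sketch that classifies the three count-cache-flush messages quoted above; the function name and the returned labels are hypothetical, and only the message strings are taken from this report.

```python
import re
from typing import Optional

# Message fragments as quoted in this report; the labels are made up here.
_MODES = [
    ("hardware assisted flush sequence enabled", "hardware-assisted"),
    ("full software flush sequence enabled", "software"),
    ("software flush disabled", "disabled"),
]


def count_cache_flush_mode(dmesg_text: str) -> Optional[str]:
    """Return a label for the last count-cache-flush line, or None if absent."""
    mode = None
    for line in dmesg_text.splitlines():
        match = re.search(r"count-cache-flush:\s*(.*)", line)
        if not match:
            continue
        message = match.group(1)
        for fragment, label in _MODES:
            if fragment in message:
                mode = label
                break
    return mode
```

The function takes the text of a guest's dmesg output (captured however is convenient) and returns None when no count-cache-flush line is present, which is what one would expect on a guest kernel without this support.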
== migration tests (using 4.15.0-51-generic in both host and guests) ==
Note that pseries-
smc-
smc-
smc-
but SPAPR_CAP_FIXED_CCD is not available on the DD 2.3 system I tested on (no fw-count-
cross-migration: qemu 2.11+dfsg-
source: -M bionic-
target: -M bionic-
expected: warning
actual: warning
"cap-ibs lower level (0) in incoming stream than on destination (1))"
software ccf enabled after reboot? yes
target: -M bionic-
expected: warning
actual: warning
"
hardware ccf enabled after reboot? yes
target: -M bionic-
expected: success
actual: success
migration: 2.11+dfsg-
source: -M bionic-
target: -M bionic-
expected: success
actual: success
target: -M bionic-
expected: warning
actual: warning
"
hardware ccf enabled after reboot? yes
target: -M bionic-
expected: fail
actual: fail
"cap-ibs higher level (1) in incoming stream than on destination (0)"
source: -M bionic-
target: -M bionic-
expected: success
actual: success
target: -M bionic-
expected: fail
actual: fail, "cap-ccf-assist higher level (1) in incoming stream than on destination (0)"
target: cap-ibs=broken (expected: fail, actual: )
expected: fail
actual: fail
"cap-ibs higher level (1) in incoming stream than on destination (0)"
"
Sorry, I forgot that I needed some fix-ups for the 4th/last patch, "target/ppc/spapr: Add SPAPR_CAP_
I've gone ahead and posted my git tree, which is based on top of the qemu_2.
https:/
Related branches
- Rafael David Tinoco: Approve on 2019-06-26
- Canonical Server packageset reviewers: Pending requested 2019-06-13
- Ubuntu Server Dev import team: Pending requested 2019-06-13
Diff: 629 lines (+589/-0), 6 files modified:
debian/changelog (+7/-0)
debian/patches/series (+4/-0)
debian/patches/ubuntu/lp-1832622-0001-target-ppc-Factor-out-the-parsing-in-kvmppc_get_cpu_.patch (+101/-0)
debian/patches/ubuntu/lp-1832622-0002-target-ppc-spapr-Add-workaround-option-to-SPAPR_CAP_.patch (+159/-0)
debian/patches/ubuntu/lp-1832622-0003-ppc-spapr-caps-Change-migration-macro-to-take-full-s.patch (+79/-0)
debian/patches/ubuntu/lp-1832622-0004-target-ppc-spapr-Add-SPAPR_CAP_CCF_ASSIST.patch (+239/-0)
- Rafael David Tinoco: Approve on 2019-06-26
- Canonical Server Team: Pending requested 2019-06-13
- Ubuntu Server Dev import team: Pending requested 2019-06-13
Diff: 528 lines (+494/-0), 5 files modified:
debian/changelog (+7/-0)
debian/patches/series (+3/-0)
debian/patches/ubuntu/lp-1832622-0001-target-ppc-Factor-out-the-parsing-in-kvmppc_get_cpu_.patch (+101/-0)
debian/patches/ubuntu/lp-1832622-0002-target-ppc-spapr-Add-workaround-option-to-SPAPR_CAP_.patch (+159/-0)
debian/patches/ubuntu/lp-1832622-0004-target-ppc-spapr-Add-SPAPR_CAP_CCF_ASSIST.patch (+224/-0)
- Rafael David Tinoco: Approve on 2019-06-26
- Canonical Server packageset reviewers: Pending requested 2019-06-13
- Ubuntu Server Dev import team: Pending requested 2019-06-13
Diff: 412 lines (+384/-0), 4 files modified:
debian/changelog (+7/-0)
debian/patches/series (+2/-0)
debian/patches/ubuntu/lp-1832622-0002-target-ppc-spapr-Add-workaround-option-to-SPAPR_CAP_.patch (+159/-0)
debian/patches/ubuntu/lp-1832622-0004-target-ppc-spapr-Add-SPAPR_CAP_CCF_ASSIST.patch (+216/-0)
- Rafael David Tinoco: Approve on 2019-06-26
- Canonical Server Team: Pending requested 2019-06-26
- Canonical Server packageset reviewers: Pending requested 2019-06-13
- Ubuntu Server Dev import team: Pending requested 2019-06-13
Diff: 412 lines (+384/-0), 4 files modified:
debian/changelog (+7/-0)
debian/patches/series (+2/-0)
debian/patches/ubuntu/lp-1832622-0002-target-ppc-spapr-Add-workaround-option-to-SPAPR_CAP_.patch (+159/-0)
debian/patches/ubuntu/lp-1832622-0004-target-ppc-spapr-Add-SPAPR_CAP_CCF_ASSIST.patch (+216/-0)
tags: | added: architecture-ppc64le bugnameltc-176932 severity-critical targetmilestone-inin18041 |
Changed in ubuntu: | |
assignee: | nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) |
affects: | ubuntu → qemu (Ubuntu) |
Changed in ubuntu-power-systems: | |
importance: | Undecided → Critical |
assignee: | nobody → Canonical Server Team (canonical-server) |
Christian Ehrhardt (paelzer) wrote : | #1 |
Changed in qemu (Ubuntu Xenial): | |
status: | New → Won't Fix |
Changed in qemu (Ubuntu Bionic): | |
status: | New → Triaged |
Changed in qemu (Ubuntu Cosmic): | |
status: | New → Triaged |
Changed in qemu (Ubuntu Disco): | |
status: | New → Triaged |
Changed in qemu (Ubuntu Eoan): | |
status: | New → Triaged |
description: | updated |
Christian Ehrhardt (paelzer) wrote : | #2 |
There is a rather similar set of patches for new Intel CPU revisions in the pipeline, and in between there will be a set of general security fixes to the virt stack.
I'd prefer to push both in the same upload, to avoid users having to download qemu too often.
I'd assume that this bug is important, but not super-urgent, as DD2.3 availability should (right now) still be very low anyway, right?
If this is rather urgent then please let us know and test the PPA asap on all releases. If that is ok I'll ask the Security Team to base their coming fixes on this instead of what is in proposed.
Mike Ranweiler (mranweil) wrote : | #3 |
That's correct on DD 2.3 - still not very available - and is ok. Will still post test results.
Changed in ubuntu-power-systems: | |
status: | New → Triaged |
Christian Ehrhardt (paelzer) wrote : | #4 |
In Eoan the merge of qemu 4.0 will fix this. This is ongoing, and I added a bug reference to its changelog so this bug will get an update once it is complete.
Rafael started to review my MPs for B/C/D and it seems ok so far.
The work on the similar and to-be-grouped upload for bug 1828495 is going well too.
A precheck by IBM on the PPA that the backports are working as expected on Bionic/Cosmic/Disco DD 2.3 HW would help tremendously to raise the confidence in this going forward towards SRUs then.
tags: | added: qemu-19.10 |
Launchpad Janitor (janitor) wrote : | #5 |
This bug was fixed in the package qemu - 1:4.0+dfsg-0ubuntu1
---------------
qemu (1:4.0+
* Merge with Upstream release of qemu 4.0.
Among many other things this fixes LP Bugs:
LP: #1782206 - SnowRidge Accelerator Interfacing Architecture (AIA)
LP: #1828038 - Update s390x CPU Model for more HW support
LP: #1832622 - count cache flush Spectre v2 mitigation for ppc64el
Remaining Changes:
- qemu-kvm to systemd unit
- d/qemu-kvm-init: script for QEMU KVM preparation modules, ksm,
hugepages and architecture specifics
- d/qemu-
- d/qemu-
- d/qemu-
- d/qemu-
- d/rules: call dh_installinit and dh_installsystemd for qemu-kvm
- Enable nesting by default
- d/qemu-
(is default on amd)
- d/qemu-
without nested=1
- d/p/ubuntu/
in qemu64 cpu type.
- d/p/ubuntu/
in qemu64 on amd
- d/qemu-
default is comfort, not full support
- Distribution specific machine type (LP: 1304107 1621042)
- d/p/ubuntu/
types
- d/qemu-
for host-phys-bits=true (LP: 1776189)
- add an info about -hpb machine type in debian/
- provide pseries-
- improved dependencies
- Make qemu-system-common depend on qemu-block-extra
- Make qemu-utils depend on qemu-block-extra
- let qemu-utils recommend sharutils
- s390x support
- Create qemu-system-s390x package
- Enable numa support for s390x
- arch aware kvm wrappers
- d/control: update VCS links
- qemu-guest-agent: freeze-hook fixes (LP: 1484990)
- d/qemu-
- d/qemu-
- d/control-in: enable RDMA support in qemu (LP: 1692476)
- enable RDMA config option
- add libibumad-dev build-dep
- tolerate ipxe size change on migrations to >=18.04 (LP: 1713490)
- d/p/ubuntu/
reference 256k path
- d/control-in: depend on ipxe-qemu-
handle incoming migrations from former releases.
- d/control-in: Disable capstone disassembler library support (universe)
- Move s390x roms to a new qemu-system-
- d/qemu-
Changed in qemu (Ubuntu Eoan): | |
status: | Triaged → Fix Released |
Changed in ubuntu-power-systems: | |
status: | Triaged → In Progress |
Christian Ehrhardt (paelzer) wrote : | #6 |
Done in Eoan.
Setting the SRU tasks to incomplete to better reflect that we would at least want a positive reply from a sniff test on Bionic from the PPA [1] before throwing that into the SRU queue.
[1]: https:/
Changed in qemu (Ubuntu Disco): | |
status: | Triaged → Incomplete |
Changed in qemu (Ubuntu Cosmic): | |
status: | Triaged → Incomplete |
Changed in qemu (Ubuntu Bionic): | |
status: | Triaged → Incomplete |
Changed in qemu (Ubuntu Eoan): | |
assignee: | Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Server Team (canonical-server) |
Changed in qemu (Ubuntu Disco): | |
assignee: | nobody → Canonical Server Team (canonical-server) |
Changed in qemu (Ubuntu Cosmic): | |
assignee: | nobody → Canonical Server Team (canonical-server) |
Changed in qemu (Ubuntu Bionic): | |
assignee: | nobody → Canonical Server Team (canonical-server) |
Christian Ehrhardt (paelzer) wrote : | #7 |
Cosmic is about to end full support, so let's reduce the test matrix a bit by already dropping the Cosmic task.
@IBM - I'm still waiting for positive feedback on this sniff test.
Without it I can't reliably make it part of the next (soon to come) qemu upload.
Also be aware that once SRUs for this are accepted by the SRU Team, the same tests will be needed for Bionic and Disco.
Changed in qemu (Ubuntu Cosmic): | |
status: | Incomplete → Won't Fix |
Christian Ehrhardt (paelzer) wrote : | #8 |
FYI: Since I can't check this on the HW shared with us, and lacking feedback on the PPA, I have backed these changes out of the SRU update that has now started.
That gives you some more time to get this testing done ... and gives me the confidence of not rushing something that fails in a way we could have known about if only we had checked in advance.
Christian Ehrhardt (paelzer) wrote : | #9 |
The next Qemu SRU is about to start - probably sometime this week.
Any chance that these checks are completed now to include this fix?
Christian Ehrhardt (paelzer) wrote : | #10 |
Given there was no reply I can't see how we hold this up as "critical" severity.
I have marked our tasks as low, given that without the feedback they aren't actionable at all.
I'd ask for the project tracking task to be lowered as well and unassigned from the server team (for now at least).
Changed in qemu (Ubuntu Bionic): | |
importance: | Undecided → Low |
Changed in qemu (Ubuntu Disco): | |
importance: | Undecided → Low |
Changed in ubuntu-power-systems: | |
importance: | Critical → Medium |
assignee: | Canonical Server Team (canonical-server) → nobody |
Diane Brent (drbrent) wrote : | #11 |
What causes the status for Bionic to be "incomplete" and low priority?
Frank Heimes (fheimes) wrote : | #12 |
Hello, a test of the qemu test-build package was requested (available from the PPA mentioned in comment #1, made available in mid June), and the engineer/maintainer has been waiting for feedback for a while (please note that we cannot test this ourselves). A re-prioritization was therefore needed to free up resources and to re-focus on other tickets (partly also other qemu bugs).
Once the package has been successfully tested, the work on this one will promptly proceed and the states will be adjusted again. Hope this explains the procedure ...
Christian Ehrhardt (paelzer) wrote : | #13 |
Hi,
since we have been waiting quite some time for the verification of the version in the PPA, it got surpassed by other SRUs. I know your engineers know how to test explicit versions from the PPA (with apt install <pkg>=version), but to make things even easier I created (just for bionic) a respin rebased to the new version.
If it helps you, then you might use PPA [1] for your test on the DD2.3 HW.
[1]: https:/
------- Comment From <email address hidden> 2019-08-21 03:21 EDT-------
I did testing on this and got the same results. The different scenarios are listed here and all match up with original results. I tested with 1:2.11+
No migration:
max-cpu-
count-cache-flush: hardware assisted flush sequence enabled
max-cpu-
count-cache-flush: full software flush sequence enabled.
max-cpu-
count-cache-flush: software flush disabled.
First set:
Source: max-cpu-
Target: max-cpu-
Result: worked w/warning:
qemu-system-
Source: max-cpu-
Target: max-cpu-
Result: worked w/warning:
qemu-system-
qemu-system-
Source: max-cpu-
Target: max-cpu-
Result: worked
Second set:
Source: max-cpu-
Target: max-cpu-
Result: worked
Source: max-cpu-
Target: max-cpu-
Result: worked w/warning
qemu-system-
[ 0.000000] count-cache-flush: full software flush sequence enabled.
Source: max-cpu-
Target: max-cpu-
Result: fail
qemu-system-
qemu-system-
qemu-system-
Third set:
Source: max-cpu-
Target: max-cpu-
Result: worked
count-cache-flush: hardware assisted flush sequence enabled
Source: max-cpu-
Target: max-cpu-
Result: fail
qemu-system-
qemu-system-
qemu-system-
Source: max-cpu-
Target: max-cpu-
Result: fail
qemu-system-
qemu-system-
qemu-system-
qemu-system-
Andrew Cloke (andrew-cloke) wrote : | #15 |
Moving 'bionic' series back to 'triaged' to review Michael's test results (comment #14).
Changed in qemu (Ubuntu Bionic): | |
status: | Incomplete → Confirmed |
Andrew Cloke (andrew-cloke) wrote : | #16 |
...correction: moved to 'confirmed'.
Christian Ehrhardt (paelzer) wrote : | #17 |
Thanks for doing that test, Michael.
It is a lot of text so I'll summarize (e.g. for the SRU team later):
Section "No migration"
=> mitigation in the guest is detected correctly
Section with migrations has three elements:
=> source == target config -> migration works
=> source older than target config -> migration works with warning
=> source newer than target config -> migration fails
That is exactly as predicted/expected which means we can go on with this as an SRU.
Changed in qemu (Ubuntu Disco): | |
status: | Incomplete → Confirmed |
importance: | Low → High |
Changed in qemu (Ubuntu Bionic): | |
importance: | Low → High |
bugproxy (bugproxy) wrote : | #18 |
------- Comment From <email address hidden> 2019-08-21 08:29 EDT-------
(In reply to comment #28)
> Thanks for doign that Test Michael.
> It is a lot of text so I'll summarize (e.g. for the SRU team later):
> Section "No migration"
> => mitigation in the guest is detected correctly
> Section with migrations has three elements:
> => source == target config -> migration works
> => source older than target config -> migration works with warning
> => source newer than target config -> migration fails
>
> That is exactly as predicted/expected which means we can go on with this as
> an SRU.
Have tested and raised two issues
One is on migration:
Migration from cap-ibs=workaround -> cap-ibs=broken crashes the guest rather than failing the migration gracefully.
We expected the source guest to continue running after the migration failure, but the guest crashes at the destination and the guest on the source is left in paused state.
Raised Bug 180734 for the same.
Another is on usability of the hardware assisted flush(cap-
Raised Bug 180735 for the same.
Regards,
-Satheesh
bugproxy (bugproxy) wrote : Test logs | #19 |
------- Comment on attachment From <email address hidden> 2019-08-21 08:34 EDT-------
Complete test logs, apart from above two cases, other scenarios are working fine.
Regards,
-Satheesh
Christian Ehrhardt (paelzer) wrote : | #20 |
@IBM - so my working assumption is that you'll get back to us later with whatever is needed/recommended for your new bugs 180734 / 180735, but for now want the patches we discussed and tested here to be pushed.
TL;DR: provide the security fix as tested now, potentially refine it later.
A confirmation of this would be great.
Changed in qemu (Ubuntu Eoan): | |
assignee: | Canonical Server Team (canonical-server) → nobody |
Changed in qemu (Ubuntu Cosmic): | |
assignee: | Canonical Server Team (canonical-server) → nobody |
------- Comment From <email address hidden> 2019-08-21 11:25 EDT-------
Is this ready to move out of Reopened state and to submitted or verified or something?
Dimitri John Ledkov (xnox) wrote : | #22 |
I'm not sure if that is a question about internal bugzilla statuses, or about external launchpad statuses.
In launchpad, this issue is https:/
It has been fixed in the development series already (eoan), and will not be fixed in xenial/cosmic.
bugproxy (bugproxy) wrote : | #23 |
------- Comment From <email address hidden> 2019-08-21 18:53 EDT-------
It was IBM bugzilla status, I'll move it all back.
I took a look at the new bugs - 180734 and 180735. The first (180734) I can recreate on my system if I do it exactly (or nearly so) as you do - the status shows paused (postmigrate) and it's no longer responsive. With my setup with more options it works fine for me. I have not yet figured out which option triggers the change for me.
When I use my original options and directly to qemu-system-ppc64 it doesn't crash. It is an invalid migration - going from workaround to broken should fail. I get a similar warning message when I try it - but then the source remains active. Here's what I was originally using:
/usr/bin/
The second (180735) is a feature request.
It seems like we should move forward with the SRU now and fix bug 180734 once a fix becomes available - it doesn't look like there is one now.
Suraj/Satheesh - you agree?
bugproxy (bugproxy) wrote : | #24 |
------- Comment From <email address hidden> 2019-08-22 04:00 EDT-------
Michael, sounds like the correct approach to take
Christian Ehrhardt (paelzer) wrote : | #25 |
We have reviewed and tested the branch individually already.
I now had a test set running over night with the ones applied together that I intend to push in one SRU. All worked fine, uploading to -unapproved for the SRU Team to take a look.
Changed in qemu (Ubuntu Bionic): | |
status: | Confirmed → In Progress |
Changed in qemu (Ubuntu Disco): | |
status: | Confirmed → In Progress |
Hello bugproxy, or anyone else affected,
Accepted qemu into disco-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-
Further information regarding the verification process can be found at https:/
Changed in qemu (Ubuntu Disco): | |
status: | In Progress → Fix Committed |
tags: | added: verification-needed verification-needed-disco |
Changed in qemu (Ubuntu Bionic): | |
status: | In Progress → Fix Committed |
tags: | added: verification-needed-bionic |
Robie Basak (racb) wrote : | #27 |
Hello bugproxy, or anyone else affected,
Accepted qemu into bionic-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-
Further information regarding the verification process can be found at https:/
Changed in ubuntu-power-systems: | |
status: | In Progress → Fix Committed |
Diane Brent (drbrent) wrote : | #28 |
IBM will verify this today.
------- Comment From <email address hidden> 2019-08-27 12:42 EDT-------
I tested this out with the same verification checks as above and got the same results. The summary is that the mitigations are detected correctly in the guest and the migrations work when they should, warn when they should, and fail when they should.
ii qemu-block-
ii qemu-kvm 1:2.11+
ii qemu-system-common 1:2.11+
ii qemu-system-ppc 1:2.11+
ii qemu-utils 1:2.11+
No migration:
max-cpu-
count-cache-flush: hardware assisted flush sequence enabled
max-cpu-
count-cache-flush: full software flush sequence enabled.
max-cpu-
count-cache-flush: software flush disabled.
Migrations:
Source: max-cpu-
Target: max-cpu-
Worked w/warning:
qemu-system-
count-cache-flush: software flush disabled.
Source: max-cpu-
Target: max-cpu-
Worked w/warning:
count-cache-flush: software flush disabled.
qemu-system-
qemu-system-
count-cache-flush: software flush disabled.
Source: max-cpu-
Target: max-cpu-
Worked
count-cache-flush: software flush disabled.
Set 2:
Source: max-cpu-
Target: max-cpu-
Worked
count-cache-flush: full software flush sequence enabled.
Source: max-cpu-
Target: max-cpu-
Worked w/warning
count-cache-flush: full software flush sequence enabled.
qemu-system-
Source: max-cpu-
Target: max-cpu-
Failed:
qemu-system-
qemu-system-
qemu-system-
Third set:
Source: max-cpu-
Target: max-cpu-
Andrew Cloke (andrew-cloke) wrote : | #30 |
Many thanks Michael for the bionic testing. Updating the bionic tags accordingly.
Are you also able to test the disco -proposed package 1:3.1+dfsg-
tags: |
added: verification-done-bionic removed: verification-needed-bionic |
bugproxy (bugproxy) wrote : | #31 |
------- Comment From <email address hidden> 2019-08-29 03:08 EDT-------
Sorry, my machine had a hw issue, but Satheesh made a DD 2.3 available with a fresh disco install. I had trouble with the disco qemu, though:
root@ws-
QEMU emulator version 3.1.0 (Debian 1:3.1+dfsg-
For either cap-ccf-assist=off or cap-ccf-assist=on qemu doesn't start:
qemu-system-
So maybe we're missing a patch, here or in the kernel, for disco.
Andrew Cloke (andrew-cloke) wrote : | #32 |
Thanks for testing Michael. I've marked disco as verification-
tags: |
added: verification-failed-disco removed: verification-needed-disco |
Christian Ehrhardt (paelzer) wrote : | #33 |
It is the same set of patches as we have on Bionic.
Bionic has
1. 8fea70440eb0d09
2. 399b2896d4948a1
3. 8c5909c41916f25
4. 8ff43ee404d3e29
Disco for this bug has #2+#4 while #1+#3 are already part of the base version that is in qemu of Disco.
Due to different contexts they are slightly different.
Upstream defines it as
+#define SPAPR_CAP_
Due to the different contexts it is 0x06 in Bionic and 0x08 in Disco.
That index matters if it would be off in the capability_
I recounted the field to ensure there is no off by one and also otherwise compared the diffs of the upstream commits and the bionic/disco backports. There doesn't seem to be an issue in those.
@Michael could you retest this on Disco with the kernel from Bionic that you used (and that worked)?
If it is a kernel issue I'm fine, and we can open a kernel task for it for Disco. That would help, as we would not have to stop/gate qemu in that case.
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (qemu/1:2.11+dfsg-1ubuntu7.18) | #34 |
All autopkgtests for the newly accepted qemu (1:2.11+
The following regressions have been reported in tests triggered by the package:
ubuntu-
Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUp
https:/
[1] https:/
Thank you!
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (qemu/1:3.1+dfsg-2ubuntu3.4) | #35 |
All autopkgtests for the newly accepted qemu (1:3.1+
The following regressions have been reported in tests triggered by the package:
systemd/
nova/2:
Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUp
https:/
[1] https:/
Thank you!
------- Comment From <email address hidden> 2019-08-30 00:57 EDT-------
I did confirm it on bionic as a kernel issue - I could recreate the error on bionic with the bionic proposed qemu and the disco kernel (and additionally with an older bionic kernel, too). I wasn't able to get a setup for disco to confirm it working, or get the exact patch needed yet.
Changed in linux (Ubuntu Disco): | |
status: | New → Confirmed |
importance: | Undecided → High |
no longer affects: | linux (Ubuntu Cosmic) |
no longer affects: | linux (Ubuntu Eoan) |
no longer affects: | linux (Ubuntu Xenial) |
Changed in linux (Ubuntu): | |
status: | New → Fix Released |
Changed in linux (Ubuntu Bionic): | |
status: | New → Fix Released |
Changed in ubuntu-power-systems: | |
status: | Fix Committed → Confirmed |
Frank Heimes (fheimes) wrote : | #37 |
May I ask which kernel was used while testing on disco - was it the kernel from main/updates or proposed (5.0.0.27)?
bugproxy (bugproxy) wrote : | #38 |
------- Comment From <email address hidden> 2019-08-30 12:17 EDT-------
From -proposed - 5.0.0-27.
Christian Ehrhardt (paelzer) wrote : | #39 |
FYI - the related autopkgtest issues would now be resolved.
Christian Ehrhardt (paelzer) wrote : | #40 |
Lacking better options I gave this some extra testing on a pre DD2.3 P9 box.
revision : 2.2 (pvr 004e 1202)
I thought I should at least be able to test CCF=off with these chips, and that worked fine.
Summary:
- the new versions make cap-ibs=fixed-ibs work on DD2.2
- CCF=off works with Bionic and Disco kernels on DD 2.2
- CCF=on untestable without DD 2.3 HW as expected
- Working in Disco just as much as in Bionic
Are you 100% sure on the FW and HW levels that are on the DD2.3 machine that you used to test Disco?
Given my results are all good and your Bionic results were good with essentially the same code as in Disco I'm beginning to wonder if it might be an issue on the borrowed DD2.3 machine that you used for the Disco test.
@IBM - can you get a machine on which you first check that it works for CCF with Bionic (to ensure we know the HW/FW is good) and then directly upgrade this very same machine to Disco to verify it there?
FYI - the ongoing SRU contains more than just this change, and at some point I'll need to unblock the others.
Therefore I'd set a limit of ~48h from now. If we can't find a way to resolve the verification issue on this bug as-is until then I'll have to reroll the current SRU without this fix to get things going.
--- Tests Details ---
Note:
- Start basic guest with (and check it boots the bootloader):
This can be done after just installing qemu-system-ppc
sudo /usr/bin/
This can be done with disks for a full linux boot, but doesn't have to for this test. To do so add:
-boot strict=on -drive file=/var/
#1: Bionic as-is
- qemu: 1:2.11+
=> works (guest can be started as-is)
=> reports (-machine...?):
cap-sbbc=string (Speculation Barrier Bounds Checking (broken, workaround, fixed)(null))
cap-cfpc=string (Cache Flush on Privilege Change (broken, workaround, fixed)(null))
cap-ibs=string (Indirect Branch Speculation (broken, fixed-ibs, fixed-ccd)(null))
Test IBS modes adding ,cap-ibs=:
- broken - ok
- fixed-ccd - ok
- fixed-ibs - "not supported by kvm"
Test CCF modes ,cap-ccf-assist=
- (doesn't exist here)
#2: Bionic proposed qemu
- qemu 1:2.11+
=> works (guest can be started as-is)
=> reports (-machine...?):
cap-sbbc=string (Speculation Barrier Bounds Checking (broken, workaround, fixed)(null))
cap-cfpc=string (Cache Flush on Privilege Change (broken, workaround, fixed)(null))
cap-ibs=string (Indirect Branch Speculation (broken, fixed-ibs, fixed-ccd)(null))
+cap-ccf-
Test IBS modes adding ,cap-ibs=:
- broken - ok
- fixed-ccd - ok
- fixed-ibs - ok
Test CCF modes adding ,cap-ccf-assist=
- o...
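The mode checks listed in these test details can be enumerated mechanically. Below is a minimal Python sketch, assuming the `-machine pseries,<cap>=<value>` form and the mode names reported above. The helper name `machine_args` is hypothetical, and the rest of the QEMU command line is left out on purpose because it is truncated in this report.

```python
# Mode names as reported by the qemu under test in this comment.
IBS_MODES = ["broken", "fixed-ibs", "fixed-ccd"]
CCF_MODES = ["on", "off"]


def machine_args(ibs_modes=IBS_MODES, ccf_modes=CCF_MODES):
    """Return the values to pass to -machine, one per test run.

    The first entry is the plain machine type (guest started as-is),
    followed by one entry per cap-ibs mode and one per cap-ccf-assist mode.
    """
    args = ["pseries"]
    args += [f"pseries,cap-ibs={mode}" for mode in ibs_modes]
    args += [f"pseries,cap-ccf-assist={mode}" for mode in ccf_modes]
    return args


for value in machine_args():
    print("-machine", value)
```

Each printed value is only the `-machine` argument; whether a given mode is accepted still depends on the host kernel, firmware and chip revision.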
Christian Ehrhardt (paelzer) wrote : | #41 |
I think I found the missing kernel bit.
As reported it needs:
2b57ecd0208f KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_
Which was brought into Bionic/Cosmic already as part of bug LP1822870.
This is only needed when I'd be on new HW/FW
Bionic: $ grep -Hrn KVM_PPC_
arch/powerpc/
arch/powerpc/
arch/powerpc/
arch/powerpc/
Disco: the same grep finds nothing.
$ git tag --contains 2b57ecd0208f
v5.1
...
Disco is on 5.0.0.27.28, so it needs this commit.
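The reasoning above (commit first tagged in v5.1, Disco based on 5.0) can be written down as a small check. This is only a sketch in Python; the function names are hypothetical, and it compares just the major.minor of the upstream base.

```python
import re


def parse_kernel_version(text):
    """Return (major, minor) from strings like 'v5.1' or '5.0.0.27.28'."""
    match = re.match(r"v?(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {text!r}")
    return int(match.group(1)), int(match.group(2))


def needs_backport(first_upstream_tag, distro_kernel):
    """True when the distro kernel base predates the first tag with the commit."""
    return parse_kernel_version(distro_kernel) < parse_kernel_version(first_upstream_tag)


print(needs_backport("v5.1", "5.0.0.27.28"))
```

A True result only says the commit is not inherited from the upstream base. A distro kernel may still carry it as a backport, as Bionic does here, so the grep on the actual source tree remains the authoritative check.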
Comparing git://kernel.
@IBM - can we release the qemu portion of this now and have the kernel Team include that in the next kernel SRU cycle? Or does adding this to QEMU without the related kernel change break anything? It didn't seem so to me in my DD 2.2 tests.
Christian Ehrhardt (paelzer) wrote : | #42 |
Back in bug 1822870 it was reported that the Disco kernel is only missing 92edf8df which is still applied to Disco these days. Maybe due to that 2b57ecd0208f was lost.
@Kernel Team - could you go through all changes that made up bug 1822870 and ensure whatever is missing will be added to Disco?
Andrew Cloke (andrew-cloke) wrote : | #43 |
Bumping priority up to high after discussions with IBM.
Changed in ubuntu-power-systems: | |
importance: | Medium → High |
Juerg Haefliger (juergh) wrote : | #44 |
Confirmed that the Disco kernel is only missing 2b57ecd0208f ("KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_
Changed in linux (Ubuntu Disco): | |
status: | Confirmed → In Progress |
Christian Ehrhardt (paelzer) wrote : | #45 |
Per my Tests we already know that on DD2.0 HW things are fine, you can't enable CCF which is expected, but it doesn't break formerly working cases there.
And I'm not sure if there is DD2.3 HW in the wild already.
Furthermore I was in contact with Leonardo yesterday, he is working with the Authors of the patches to let us know if we can safely release the qemu changes before the kernel OR if we have to unroll them for now until this is fixed in the kernel.
bugproxy (bugproxy) wrote : | #46 |
------- Comment From <email address hidden> 2019-09-03 19:45 EDT-------
I ran the tests mentioned in launchpad comment #40 on a DD2.3 witherspoon machine with GA firmware. Aside from the issue caused by the missing kernel patch, QEMU behaved as expected.
One thing of note is that the following firmware features are disabled:
ibm,opal/
ibm,opal/
which means that 'cap-ibs=fixed-ibs' and 'cap-ibs=fixed-ccd' are always refused by KVM in this machine.
I attached the test results as qemu-dd2.
------- Comment (attachment only) From <email address hidden> 2019-09-03 19:44 EDT-------
Changed in linux (Ubuntu Disco): | |
status: | In Progress → Fix Committed |
Changed in ubuntu-power-systems: | |
status: | Confirmed → Fix Committed |
Christian Ehrhardt (paelzer) wrote : | #48 |
Thanks a lot <email address hidden>.
Especially for noting the known firmware features influencing this in your case and then combining cap-ibs=
I see that cap-ccf-assist=on can be used and successfully grants the guest
[ 0.000000] count-cache-flush: hardware assisted flush sequence enabled
The one thing I wondered is your #7 showing cap-ibs=workaround not working.
Could that be another missed kernel patch, given that we have seen it working in #2?
Could you please add and run the following cases to your list:
*** 8- Bionic-proposed kernel + Disco-updates QEMU
*** 9- Bionic-proposed kernel + Disco-proposed QEMU
In those (at least) test "cap-ibs=
With those two tests on top we can check if:
- if cap-ibs=workaround works in #8 but we know it failed in #7
=> the Disco kernel broke it in #7
=> We'd need to find what else the Disco kernel misses vs Bionic.
- if cap-ibs=workaround works in #8 but fails in #9
=> the new disco qemu update breaks it
=> We'd need to find why
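The two rules above can be expressed as a tiny decision helper. A sketch in Python, assuming each argument records whether cap-ibs=workaround worked in that test case; the function name and the wording of the findings are hypothetical.

```python
def diagnose(t7_ok, t8_ok, t9_ok):
    """Apply the two rules for cap-ibs=workaround across tests #7, #8 and #9."""
    findings = []
    if t8_ok and not t7_ok:
        # Works in #8 but failed in #7: the Disco kernel is the suspect.
        findings.append("Disco kernel: find what it misses vs Bionic")
    if t8_ok and not t9_ok:
        # Works in #8 but fails in #9: the new Disco qemu is the suspect.
        findings.append("Disco-proposed QEMU: find why the update breaks it")
    return findings


print(diagnose(t7_ok=False, t8_ok=True, t9_ok=True))
```

If #8 itself does not work, neither rule fires and the comparison gives no conclusion, so both rules depend on #8 being a valid baseline.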
Fabiano Rosas (farosas) wrote : | #49 |
That is the effect of the lack of "2b57ecd0208f KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_
QEMU checks for KVM_PPC_
(From lp-1832622-
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index f0f5bf9391.
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -2392,7 +2392,13 @@ static int parse_cap_
static int parse_cap_
{
- if (c.character & c.character_mask & H_CPU_CHAR_
+ if ((~c.behaviour & c.behaviour_mask & H_CPU_BEHAV_
+ (~c.character & c.character_mask & H_CPU_CHAR_
+ (~c.character & c.character_mask & H_CPU_CHAR_
+ return SPAPR_CAP_FIXED_NA;
+ } else if (c.behaviour & c.behaviour_mask & H_CPU_BEHAV_
+ return SPAPR_CAP_
+ } else if (c.character & c.character_mask & H_CPU_CHAR_
return SPAPR_CAP_
} else if (c.character & c.character_mask & H_CPU_CHAR_
return SPAPR_CAP_
But I'll test the extra two scenarios anyway.
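For readers less used to the `value & mask & BIT` idiom in that diff: a flag only counts when the mask marks it as valid, and the `~value` form asks whether a valid flag is explicitly clear. Below is a minimal Python sketch of that pattern only; `EXAMPLE_BIT` and the helper names are hypothetical and do not correspond to any real H_CPU_CHAR or H_CPU_BEHAV constant.

```python
EXAMPLE_BIT = 1 << 3  # hypothetical flag position, for illustration only


def flag_set(value, mask, bit):
    """True when the mask marks the flag as valid and the flag is set."""
    return bool(value & mask & bit)


def flag_clear(value, mask, bit):
    """True when the mask marks the flag as valid and the flag is clear."""
    return bool(~value & mask & bit)


# Valid and set.
print(flag_set(EXAMPLE_BIT, EXAMPLE_BIT, EXAMPLE_BIT))
# Valid and clear.
print(flag_clear(0, EXAMPLE_BIT, EXAMPLE_BIT))
# Not covered by the mask: neither set nor clear is reported.
print(flag_set(EXAMPLE_BIT, 0, EXAMPLE_BIT), flag_clear(0, 0, EXAMPLE_BIT))
```

The third case is the important one: when the mask does not cover a bit, both helpers return False, so a missing report is never mistaken for "flag off".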
Fabiano Rosas (farosas) wrote : | #50 |
Here is test #9 (#8 is the same as #4 from my previous tests. And not of much help since Disco-updates QEMU (v=1:3.
*** 9- Bionic-proposed kernel + Disco-proposed QEMU
$ uname -r; qemu-system-ppc64 --version | head -n 1
4.15.0-60-generic
QEMU emulator version 3.1.0 (Debian 1:3.1+dfsg-
$ dmesg | grep count-cache
[ 0.000000] count-cache-flush: hardware assisted flush sequence enabled
$ qemu-system-ppc64 -machine pseries,? 2>&1 | grep "\|ibs\|ccf"
cap-ibs=string (Indirect Branch Speculation (broken, workaround, fixed-ibs,
cap-ccf-
- cap-ibs=broken
$ dmesg | grep count-cache
[ 0.000000] count-cache-flush: software flush disabled.
- cap-ibs=workaround
$ dmesg | grep count-cache
[ 0.000000] count-cache-flush: full software flush sequence enabled.
- cap-ibs=fixed-ccd
qemu-system-ppc64: Requested safe indirect branch capability level not supported by kvm, try cap-ibs=workaround
- cap-ibs=fixed-ibs
qemu-system-ppc64: Requested safe indirect branch capability level not supported by kvm, try cap-ibs=workaround
- cap-ccf-assist=off
$ dmesg | grep count-cache
[ 0.000000] count-cache-flush: software flush disabled.
- cap-ccf-assist=on
$ dmesg | grep count-cache
[ 0.000000] count-cache-flush: software flush disabled.
- cap-ibs=
$ dmesg | grep count-cache
[ 0.000000] count-cache-flush: hardware assisted flush sequence enabled
- cap-ibs=
$ dmesg | grep count-cache
[ 0.000000] count-cache-flush: full software flush sequence enabled.
So my interpretation of the results is that the Disco kernel is indeed to blame for cap-ibs=workaround not working with QEMU 1:3.1+dfsg-
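When repeating these runs, the three guest dmesg messages seen above can be classified automatically. A small Python sketch; the short mode labels on the right-hand side are my own shorthand, not kernel terminology.

```python
# Guest kernel messages exactly as they appear in the test output above,
# mapped to hypothetical short labels.
COUNT_CACHE_MODES = {
    "hardware assisted flush sequence enabled": "hardware-assisted",
    "full software flush sequence enabled": "software",
    "software flush disabled": "disabled",
}


def classify(line):
    """Map one dmesg line to a flush mode label, or None if it is unrelated."""
    marker = "count-cache-flush:"
    if marker not in line:
        return None
    message = line.split(marker, 1)[1].strip().rstrip(".")
    return COUNT_CACHE_MODES.get(message, "unknown")


print(classify("[ 0.000000] count-cache-flush: software flush disabled."))
```

Feeding it the output of `dmesg | grep count-cache` from each guest gives one label per run, which makes the before/after comparison between kernel and QEMU combinations easier to tabulate.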
Christian Ehrhardt (paelzer) wrote : | #51 |
Thanks a lot Fabiano!
So I summarize:
- #7 is in no way a degradation to #4:
- all cap-ibs= modes are failing on that before and after
- that means the new qemu didn't break anything in that regard
- #9 confirms that as soon as we have a fixed kernel under that new disco-qemu it will work for cap-ibs=workaround as well as cap-ccf-
And IMHO that means we have confirmed that:
a) the new fix in qemu works
b) the new fix in qemu does not degrade it if used on the current kernel
c) we need the kernel change to eventually fully work (well we have known that)
With that I think we can declare qemu in disco verified and let it release.
And the upcoming kernel update will resolve ibs/ccf to be really usable in Disco.
Christian Ehrhardt (paelzer) wrote : | #52 |
After discussing this with the Team I really think it is ok to release this.
As stated before we confirmed:
- that on a good kernel the fix works
- the fix doesn't break features if not running on the new kernel
- the fix is confirmed to get in the kernel soon (this kernel cycle)
In addition, releasing this now lets the change reach the CloudArchive based on Disco earlier, where it will work right away on the Bionic kernel.
People can always run with a newer or older kernel. So, just as with other SRUs that we confirm by install plus "configuration", the "configuration" for now in Disco is to provide a kernel with the change applied.
Therefore I'm now marking it verified in Disco.
Thanks everyone for all your involvement. I'm looking forward to the kernel change being verified and then landing, probably at the end of this month.
tags: |
added: verification-done verification-done-disco removed: verification-failed-disco verification-needed |
The verification of the Stable Release Update for qemu has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
Launchpad Janitor (janitor) wrote : | #54 |
This bug was fixed in the package qemu - 1:2.11+
---------------
qemu (1:2.11+
* d/p/ubuntu/
(LP: #1832622)
* d/p/ubuntu/
* d/p/ubuntu/
(LP: #1836154)
-- Christian Ehrhardt <email address hidden> Thu, 13 Jun 2019 08:08:33 +0200
Changed in qemu (Ubuntu Bionic): | |
status: | Fix Committed → Fix Released |
Launchpad Janitor (janitor) wrote : | #55 |
This bug was fixed in the package qemu - 1:3.1+dfsg-
---------------
qemu (1:3.1+
* d/p/ubuntu/
(LP: #1832622)
* d/p/ubuntu/
(LP: #1836154)
-- Christian Ehrhardt <email address hidden> Thu, 13 Jun 2019 08:40:55 +0200
Changed in qemu (Ubuntu Disco): | |
status: | Fix Committed → Fix Released |
Launchpad Janitor (janitor) wrote : | #56 |
This bug was fixed in the package linux - 5.0.0-31.33
---------------
linux (5.0.0-31.33) disco; urgency=medium
* disco/linux: 5.0.0-31.33 -proposed tracker (LP: #1846026)
* Packaging resync (LP: #1786013)
- [Packaging] update helper scripts
* /proc/self/maps paths missing on live session (was vlc won't start; eoan
19.10 & bionic 18.04 ubuntu/
(LP: #1842382)
- SAUCE: Revert "UBUNTU: SAUCE: shiftfs: enable overlayfs on shiftfs"
linux (5.0.0-30.32) disco; urgency=medium
* disco/linux: 5.0.0-30.32 -proposed tracker (LP: #1844362)
* Disco update: upstream stable patchset 2019-08-20 (LP: #1840846)
- Revert "e1000e: fix cyclic resets at link up with active tx"
- e1000e: start network tx queue only when link is up
- Input: synaptics - enable SMBUS on T480 thinkpad trackpad
- nilfs2: do not use unexported cpu_to_
- drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
- firmware: improve LSM/IMA security behaviour
- irqchip/gic-v3-its: Fix command queue pointer comparison bug
- clk: ti: clkctrl: Fix returning uninitialized data
- efi/bgrt: Drop BGRT status field reserved bits check
- perf/core: Fix perf_sample_
- ARM: dts: gemini Fix up DNS-313 compatible string
- ARM: omap2: remove incorrect __init annotation
- afs: Fix uninitialised spinlock afs_volume:
- x86/apic: Fix integer overflow on 10 bit left shift of cpu_khz
- be2net: fix link failure after ethtool offline test
- ppp: mppe: Add softdep to arc4
- sis900: fix TX completion
- ARM: dts: imx6ul: fix PWM[1-4] interrupts
- pinctrl: mcp23s08: Fix add_data and irqchip_add_nested call order
- dm table: don't copy from a NULL pointer in realloc_argv()
- dm verity: use message limit for data block corruption message
- x86/boot/64: Fix crash if kernel image crosses page table boundary
- x86/boot/64: Add missing fixup_pointer() for next_early_pgt access
- HID: chicony: add another quirk for PixArt mouse
- pinctrl: mediatek: Ignore interrupts that are wake only during resume
- cpu/hotplug: Fix out-of-bounds read when setting fail state
- pinctrl: mediatek: Update cur_mask in mask/mask ops
- linux/kernel.h: fix overflow for DIV_ROUND_UP_ULL
- genirq: Delay deactivation in free_irq()
- genirq: Fix misleading synchronize_irq() documentation
- genirq: Add optional hardware synchronization for shutdown
- x86/ioapic: Implement irq_get_
- x86/irq: Handle spurious interrupt after shutdown gracefully
- x86/irq: Seperate unused system vectors from spurious entry again
- ARC: hide unused function unw_hdr_alloc
- s390: fix stfle zero padding
- s390/qdio: (re-)initialize tiqdio list entries
- s390/qdio: don't touch the dsci in tiqdio_
- crypto: talitos - move struct talitos_edesc into talitos.h
- crypto: talitos - fix hash on SEC1.
- crypto/NX: Set receive window credits to max number of CRBs in RxFIFO
- drm/udl: introduce a macro to convert dev t...
Changed in linux (Ubuntu Disco): | |
status: | Fix Committed → Fix Released |
Changed in ubuntu-power-systems: | |
status: | Fix Committed → Fix Released |
I'm glad that the kernel patch is already integrated by bug 1822870 in >=Bionic - no dependency on the kernel here then.
The patches themselves look small and clean.
Thanks for identifying the extra dependencies to:
- 8fea7044 (>=3.0) target/ppc: Factor out the parsing in kvmppc_get_cpu_characteristics()
- 8c5909c4 (>=2.12) ppc/spapr-caps: Change migration macro to take full spapr-cap name
That overall makes the request to apply:
- 8c5909c4 (>=2.12) ppc/spapr-caps: Change migration macro to take full spapr-cap name
- 8fea7044 (>=3.0) target/ppc: Factor out the parsing in kvmppc_get_cpu_characteristics()
- 399b2896 (>=4.0) target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
- 8ff43ee4 (>=4.0) target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST
By reading the bug top down I ran into issues with patch #4, but then I read the rest and found that you already handled that. Taking the backport from the referenced git worked great, thanks Michael.
There was some minor noise bringing that to 2.12 and 3.0, but it worked rather straightforwardly, as expected, for 2.12. In qemu 3.0 though we need something else for the fourth patch. Neither the upstream original (9 rejects) nor the backport you provided for 2.11 (10 rejects) applies.
Upstream is a bit closer; the lack of "large decr" in qemu 3.0 shows up as a context change a few times, but those were resolvable.
For "SPAPR_CAP_CCF_ASSIST" I followed your backport of leaving no holes in the cap numbering (the alternative would be to retain it being 0x9, but leave some in between undefined, which would break when iterating).
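To illustrate why holes matter: code that walks the caps from 0 up to the highest number expects every slot to be defined. A minimal Python sketch of a hole check follows. Only the name SPAPR_CAP_CCF_ASSIST and the value 0x9 come from this discussion; the other entries and the helper name are hypothetical placeholders.

```python
def find_holes(cap_ids):
    """Return the cap numbers missing from the range 0..max(cap_ids)."""
    present = set(cap_ids.values())
    return [number for number in range(max(present) + 1) if number not in present]


# Hypothetical table: six placeholder caps, then CCF_ASSIST kept at 0x9.
with_gap = {f"CAP_{n}": n for n in range(7)}
with_gap["SPAPR_CAP_CCF_ASSIST"] = 0x9

# Same caps, but CCF_ASSIST renumbered to the next free slot.
contiguous = {f"CAP_{n}": n for n in range(7)}
contiguous["SPAPR_CAP_CCF_ASSIST"] = 0x7

print(find_holes(with_gap))
print(find_holes(contiguous))
```

The first table reports slots 7 and 8 as undefined, which is the situation the renumbering in the backport avoids.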
TODO: check hw/ppc/spapr.h SPAPR_CAP_CCF_ASSIST for holes
check cosmic applied include/
IIRC Xenial has no P9 support and would probably be much harder to backport to, so barring further discussion this is a Won't Fix for Xenial.
Timing: we have a qemu SRU in the pipe that needs verification and release. Once done we will enqueue that one.
But until then we can still work on this.
I opened MPs for internal review with the backports for Bionic/Cosmic/Disco/Eoan (linked to the bug here) and a PPA [1].
If you want to test anything ahead of proposed please feel free to take a look at MPs and/or the PPA.
[1]: https://launchpad.net/~paelzer/+archive/ubuntu/bug-1832622-qemu-spectre-ppc