Ubuntu18.04 - qemu patch to align memory to allow 2MB THP

Bug #1781526 reported by bugproxy on 2018-07-13
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
The Ubuntu-power-systems project
Medium
Unassigned
qemu (Ubuntu)
Medium
Unassigned
Bionic
Medium
Unassigned

Bug Description

[Impact]

 * Memory on ppc64el was only aligned at pagesize, but for THP to benefit
   aligning on 2MB would be beneficial for translations (performance)

[Test Case]

 * It was reported that running specjbb2005 in guests on a P9 system can
   gain up to 10% performance. Therefore running that with a guest that
   workload (guest pinned, THP set to always) would be the test that shall
   be done (on a non shared performance capable machine). IBM committed to
   verify the PPA and eventually the SRU with such a setup (or
   comparable).

[Fix]

 * Backport https://git.qemu.org/?p=qemu.git;a=commit;h=0c1272cc7c72dfe0ef66be8f283cf67c74b58586

[Regression Potential]

 * It adds a PPC64 check for the alignment, that does not change any other
   architectures (unless this one line change is wrong). The change runs
   fine in later versions and is requested by IBM owning the ppc64el arch.
   The effect should be just some speedup by THP being able to pick up
   some more memory. As we have seen (comment #8) side effects are
   possible, but all that are known are sorted out now.

[Other Info]

 * This was postponed for a while as the change would have made another
   issue more likely. But that is now fixed (in the kernel), so we can go
   on.

--- original bug ---

== Comment: #0 - JENIFER HOPPER <email address hidden> - 2018-07-12 11:41:50 ==

---Problem Description---
We would like to request the following simple qemu patch be applied because it can provide significant performance improvements for Power9 KVM guests that benefit from 2MB THP usage:

commit 0c1272cc7c72dfe0ef66be8f283cf67c74b58586
osdep: powerpc64 align memory to allow 2MB radix THP page tables

This allows KVM with the Book3S radix MMU mode to take advantage of
THP and install larger pages in the partition scope page tables (the
host translation).

--

For instance, when THP is enabled I measured a 10% throughput improvement with this patch for a full system P9 Radix guest running the SPECjbb2005 workload.

Contact Information = <email address hidden>

---uname output---
Linux perfwsp6 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = 8335-GTH

---Debugger---
A debugger is not configured

---Steps to Reproduce---
 Run a workload that benefits from THP usage in a P9 KVM guest.

bugproxy (bugproxy) on 2018-07-13
tags: added: architecture-ppc64le bugnameltc-169712 severity-high targetmilestone-inin1804
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → qemu (Ubuntu)
tags: added: triage-g
Changed in ubuntu-power-systems:
importance: Undecided → High
assignee: nobody → Canonical Server Team (canonical-server)

Hi Jennifer,
I'll take a look putting it into the coming qemu 2.12 merge for Cosmic first (needs to be there for the SRU policy anyway). I saw that it is so new, that is isn't even in 2.12 (upstream) so I'll add it as a patch to Cosmic as well.

Changed in qemu (Ubuntu Bionic):
status: New → Triaged
Changed in qemu (Ubuntu):
status: New → In Progress
Changed in ubuntu-power-systems:
status: New → Triaged

Part of the 2.12 Test builds at [1] for Cosmic.
It seems good to me, if you can afford a quick test on the ppa feel free to do so.
Otherwise the final 2.12 upload for Cosmic will be what you could test.

From there - once complete - we can consider backporting to Bionic.

[1]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3306

Manoj Iyer (manjo) on 2018-07-16
Changed in qemu (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → David Britton (davidpbritton)
Changed in qemu (Ubuntu Bionic):
assignee: nobody → David Britton (davidpbritton)
Changed in qemu (Ubuntu):
importance: Undecided → High
Changed in qemu (Ubuntu Bionic):
importance: Undecided → High

------- Comment From <email address hidden> 2018-07-16 12:16 EDT-------
Yes it is a recent patch, it is tagged for qemu-3.0.
Thanks I will try to provide ppa test results soon to confirm.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:2.12+dfsg-3ubuntu3

---------------
qemu (1:2.12+dfsg-3ubuntu3) cosmic; urgency=medium

  * d/p/lp-1755912-qxl-fix-local-renderer-crash.patch: Fix an issue triggered
    by migrations with UI frontends or frequent guest resolution changes
    (LP: #1755912)

 -- Christian Ehrhardt <email address hidden> Thu, 19 Jul 2018 08:26:52 +0200

Changed in qemu (Ubuntu):
status: In Progress → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-08-07 17:34 EDT-------
I was able to verify the fix in cosmic qemu version 1:2.12+dfsg-3.

The 10% overhead was eliminated for a pinned full system guest running SPECjbb2005 with this new qemu package when THP is set to always.

Thanks Jenifer, the Bionic backport is planned.
FYI on the further timing - Atm I want to push (other) new bits into Cosmic before the Feature freeze.
Then I'll collect libvirt/qemu changes we made in Cosmic and backport all applicable including this one to Bionic (which would allow a better efficiency to test all of those together).

If the backport of this is more urgent and can't wait these up-to 2 extra weeks delay please let me know.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-08-08 10:04 EDT-------
(In reply to comment #9)
> Thanks Jenifer, the Bionic backport is planned.
> FYI on the further timing - Atm I want to push (other) new bits into Cosmic
> before the Feature freeze.
> Then I'll collect libvirt/qemu changes we made in Cosmic and backport all
> applicable including this one to Bionic (which would allow a better
> efficiency to test all of those together).
>
> If the backport of this is more urgent and can't wait these up-to 2 extra
> weeks delay please let me know.

No problem, it is not too urgent, thanks!

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-08-20 17:12 EDT-------
Hi, in some environments it was observed that this qemu patch to enable THP made it more likely to hit guest migration issues, however the following kernel patch resolves those migration issues:

https://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git/commit/?h=kvm-ppc-next&id=c066fafc595eef5ae3c83ae3a8305956b8c3ef15
KVM: PPC: Book3S HV: Use correct pagesize in kvm_unmap_radix()

Once merged upsrteam, it would be good to include that change as well to avoid potential migration problems. Should I open a new bug for that or is it better to track here?

New Problem, new Bug - at least since the current one is already closed in Cosmic.
Fortunately the SRU itself was still on hold waiting for another PPC fix to complete to bundle them so we can make this part of it still.

I forked bug 1788098 for it.

Please set that for reverse bug mirroring to track it.

Furthermore you could help there by adding the most simple steps to reproduce.
I assume doing migration on ppc64el with some memory dirty workload plus mem verification would do?
But exact steps would help for sure.

FYI - atm this is blocked on 1788098 being delivered
I'm tracking the other bug and we can pick up here again once resolved.

Manoj Iyer (manjo) on 2018-11-05
Changed in qemu (Ubuntu Bionic):
assignee: David Britton (davidpbritton) → nobody
Changed in qemu (Ubuntu):
assignee: David Britton (davidpbritton) → nobody
Changed in ubuntu-power-systems:
status: Triaged → Incomplete
Andrew Cloke (andrew-cloke) wrote :

Marking as incomplete, while waiting for IBM to help unblock LP#1788098. Please add a comment if this dependency can be problem, or the bug can be progressed via an alternate route.
Thanks.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-11-27 02:13 EDT-------
(In reply to comment #14)
> Marking as incomplete, while waiting for IBM to help unblock LP#1788098.
> Please add a comment if this dependency can be problem, or the bug can be
> progressed via an alternate route.
> Thanks.

Hi Jennifer,

Please have a look at above query above from Canonical.

Blocked on bug 1788098 still - lacking a "block" status I'll call it incomplete (as it is still feedback we need, yet on another bug).

Changed in qemu (Ubuntu Bionic):
status: Triaged → Incomplete
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-12-07 04:51 EDT-------
(In reply to comment #16)
> (In reply to comment #14)
> > Marking as incomplete, while waiting for IBM to help unblock LP#1788098.
> > Please add a comment if this dependency can be problem, or the bug can be
> > progressed via an alternate route.
> > Thanks.
>
>
> Hi Jennifer,
>
> Please have a look at above query above from Canonical.

Hi Leonardo,

Can you please have a look at this.
Got to know that Jennifer has already left IBM.

Stopped SRU considerations for now due to bug 1788098 being non verifiable (for now).

Changed in qemu (Ubuntu Bionic):
status: Incomplete → Won't Fix
Changed in ubuntu-power-systems:
status: Incomplete → Won't Fix
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-03-15 14:40 EDT-------
Just updating:
I was able to verify bug 1788098, and backported the kernel patches to Ubuntu-18.04.
The patches are already upstrem on kernel v4.18.20.

So this qemy patch may be unblocked.

Thanks Leonardo!

Since it is a performance fix I'll need your setup (my machines aren't suited for that) to verify the fix.

Therefore I wanted to ask before the SRU gets started if someone in your Org:
a) are willing and able to do the performance check on the SRU in -proposed then
b) could provide a summary of a good testcase that can show the benefit

I got the feedback per mail that this will in fact be performance tested by IBM - thanks!
OTOH there is still some uncertainty which benchmark/setup can be used to do so.

To let us move on with this together I prepared a PPA, on that you can set up and validate your tests. Please share when you are ready and the PPA was verified (vs what you have in Bionic right now). This should include some visible perf improvement that we can later on use to call it verified.

Once that is confirmed we can move it to become the actual SRU into Bionic (where you would do the last final verification then, so don't throw away the test setup after the PPA test).

The test PPA is available at:
https://launchpad.net/~paelzer/+archive/ubuntu/bug-1781526-ppc64-qemu-alignment

Status: Incomplete until a test setup is established.

Changed in qemu (Ubuntu Bionic):
status: Won't Fix → Incomplete
description: updated
Changed in ubuntu-power-systems:
status: Won't Fix → Incomplete

FYI bug 1788098 was found to be blocked again, but this does not stop you from prepping the perf-testbed - it only gates the final release of the fix.

Manoj Iyer (manjo) wrote :

Lowered priority to Medium based on recommendation from IBM.

Changed in ubuntu-power-systems:
importance: High → Medium
Changed in qemu (Ubuntu Bionic):
importance: High → Medium
Manoj Iyer (manjo) on 2019-07-22
Changed in qemu (Ubuntu):
importance: High → Medium
Changed in ubuntu-power-systems:
assignee: Canonical Server Team (canonical-server) → nobody
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-09-26 15:35 EDT-------
Thank you for all the work on this. I think we can close this for bionic now. We need to test the blocking bug, 1788098, to make sure the migration issues are fixed in 19.10. Then we would be ok with this fixed in Eoan and 20.04.

Ok, closing as Won't Fix then - thanks Michael

Changed in qemu (Ubuntu Bionic):
status: Incomplete → Won't Fix
Changed in ubuntu-power-systems:
status: Incomplete → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers