Nova hugepage support does not include aarch64

Bug #1623871 reported by Veena
This bug affects 2 people
Affects                   Status        Importance  Assigned to   Milestone
OpenStack Compute (nova)  Fix Released  Wishlist    Veena
nova (Ubuntu)             Fix Released  Undecided   dann frazier
  Xenial                  Fix Released  Undecided   dann frazier

Bug Description

[Impact]
Although aarch64 supports spawning a VM with hugepages, the libvirt driver in Nova considers only x86_64 and I686 when checking for NUMA and hugepage support. AARCH64 needs to be added to both checks. Because of this bug, a VM cannot be launched with hugepages using OpenStack on aarch64 servers.

Note: this depends on the fix for LP: #1627926.
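
The kind of architecture check described above can be sketched as follows. This is only an illustration of an allow-list that omits aarch64 and how adding it changes the result; the constant and function names are hypothetical and are not taken from Nova's code.

```python
# Illustrative sketch only: all names here are hypothetical, not Nova's.
X86_64 = "x86_64"
I686 = "i686"
AARCH64 = "aarch64"

# Before the fix: only the x86 architectures are accepted.
ARCHES_BEFORE_FIX = frozenset({X86_64, I686})
# After the fix: aarch64 is accepted as well.
ARCHES_AFTER_FIX = ARCHES_BEFORE_FIX | {AARCH64}


def host_supports_hugepages(host_arch, supported_arches):
    """Return True if the host architecture is in the allow-list."""
    return host_arch in supported_arches


print(host_supports_hugepages(AARCH64, ARCHES_BEFORE_FIX))  # False
print(host_supports_hugepages(AARCH64, ARCHES_AFTER_FIX))   # True
```

With the narrower list the check fails without raising anything, which is consistent with the guest booting without hugepages and no error being logged.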

[Test Case]
Steps to reproduce:
On an OpenStack environment running on aarch64:
1. Configure the compute node to use hugepages.
2. Set hw:mem_page_size=2048 on a flavor.
3. Launch a VM using that flavor.

Expected result:
The VM should be launched with hugepages, and the libvirt XML should contain:

  <memoryBacking>
      <hugepages>
        <page size='2048' unit='KiB' nodeset='0'/>
      </hugepages>
  </memoryBacking>

Actual result:
VM is launched without hugepages.

There are no error logs in nova-scheduler.
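
The expected result can also be checked programmatically. The following is a minimal sketch, assuming the guest's domain XML has already been obtained as a string (for example from `virsh dumpxml`); the helper name and the sample document are illustrative.

```python
import xml.etree.ElementTree as ET


def hugepage_sizes_kib(domain_xml):
    """Return the hugepage sizes (in KiB) requested in a libvirt domain XML."""
    root = ET.fromstring(domain_xml)
    sizes = []
    # <page> elements live under <memoryBacking>/<hugepages>.
    for page in root.findall("./memoryBacking/hugepages/page"):
        if page.get("unit") == "KiB":
            sizes.append(int(page.get("size")))
    return sizes


# Sample document: the fragment from the expected result wrapped in <domain>.
SAMPLE = """<domain>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
</domain>"""

print(hugepage_sizes_kib(SAMPLE))            # [2048]
print(hugepage_sizes_kib("<domain/>"))       # [] (no hugepages requested)
```

An empty list corresponds to the actual result reported here: the guest runs, but without a memoryBacking element.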

[Regression Risk]
Risk is minimized by the fact that this change is just enabling the same code for arm64 that is already enabled for Ubuntu/x86.

Veena (mveenasl)
Changed in nova:
assignee: nobody → Veena (mveenasl)
status: New → In Progress
Matt Riedemann (mriedem)
tags: added: libvirt numa
Changed in nova:
importance: Undecided → Wishlist
Sylvain Bauza (sylvain-bauza) wrote :

Thanks for proposing to add aarch64 support to hugepages and NUMA instances.

Considering the operator impact, and the fact that we would need to verify that the minimum libvirt and QEMU versions we ship support the above, I think it is really important that this feature request be handled through the proper process, with people able to review your proposal.

In Nova, we follow a specific process for writing new specifications and proposals; you can find more information at http://docs.openstack.org/developer/nova/process.html#how-do-i-get-my-code-merged

Basically, it requires you to write a blueprint and discuss it on IRC to see whether your blueprint needs a formal written specification, called a "spec" file.

Changed in nova:
status: In Progress → Invalid
Veena (mveenasl) wrote :

I think this is a bug in Nova, as the arch-specific versions are hard-coded and aarch64 is ignored there. We just need to add the check for aarch64 as well. So instead of a BP, it would be good to fix it as a bug. The minimum versions of libvirt (1.2.7) and QEMU (2.1.0) that are verified for x86 also hold for aarch64. Please let me know if I can proceed with fixing it as a bug.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/372304

Changed in nova:
status: Invalid → In Progress
dann frazier (dannf)
Changed in nova (Ubuntu):
status: New → Triaged
Changed in nova (Ubuntu Xenial):
status: New → Triaged
dann frazier (dannf) wrote :
description: updated
description: updated
description: updated
tags: added: patch
dann frazier (dannf) wrote :

@mveenasl asked that I describe my testing process - here it is:

I pushed this change to a PPA:
  https://launchpad.net/~ce-hyperscale/+archive/ubuntu/cloud-mitaka

I then deployed OpenStack across a cluster of Cavium ThunderX CRB1s systems
using the current Mitaka Juju charms. I configured the nova-compute charm to pull
packages from this overlay PPA:

  openstack-origin: ppa:ce-hyperscale/cloud-mitaka

I logged into each of the 3 nova-compute nodes and created hugepages:

  $ echo 4096 | sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

And restarted libvirt/nova-compute just to be sure they were detected:

  $ sudo service libvirt-bin restart; sudo service nova-compute restart

I configured the m1.small flavor type to use hugepages:
  $ nova flavor-key m1.small set hw:mem_page_size=2048

I then launched a guest:
  $ nova boot --image xenial-uefi --flavor m1.small --nic net-id=0fed1d06-2c7c-48ab-b81b-112af6d362d7 uefi

Then found the corresponding hypervisor node and logged in.

I verified that QEMU was started w/ the appropriate memory-backend settings:

$ ps -ef | grep qemu
libvirt+ 850307 1 42 16:17 ? 00:00:12 /usr/bin/qemu-system-aarch64 -name instance-00000001 -S -machine virt,accel=kvm,usb=off,gic-version=3 -cpu host -drive file=/usr/share/AAVMF/AAVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/instance-00000001_VARS.fd,if=pflash,format=raw,unit=1 -m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=yes,size=2147483648,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0,memdev=ram-node0 [...]

And that the hugepage pool was actually depleted:
 $ cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/free_hugepages
 3072
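
As a sanity check on the figure above, the arithmetic can be sketched as follows. It assumes that all 4096 pages of the pool sit on node0 and that this guest is the only hugepage consumer on the host; the variable names are illustrative.

```python
PAGE_SIZE_KIB = 2048            # hugepage size used for the pool and the flavor
POOL_PAGES = 4096               # value written to nr_hugepages
GUEST_MEM_BYTES = 2147483648    # size= on the QEMU memory-backend-file object

# Number of 2048 KiB pages needed to back the guest's memory.
pages_used = GUEST_MEM_BYTES // (PAGE_SIZE_KIB * 1024)
free_pages = POOL_PAGES - pages_used

print(pages_used)   # 1024
print(free_pages)   # 3072
```

The result matches the free_hugepages value read from sysfs, so the guest's 2048 MiB is fully backed by hugepages.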

And finally, checked the console log to make sure the guest OS actually booted:

ubuntu@ubuntu:~$ nova console-log uefi | tail -2
[ 196.813164] cloud-init[1242]: Cloud-init v. 0.7.7 finished at Fri, 30 Sep 2016 16:21:21 +0000. Datasource DataSourceEc2. Up 196.73 seconds

dann frazier (dannf) wrote :

Patch rebased on 2:14.0.0~rc2-0ubuntu2, MP here:
  https://code.launchpad.net/~dannf/ubuntu/+source/nova/+git/nova/+merge/307230

dann frazier (dannf)
Changed in nova (Ubuntu):
status: Triaged → Fix Committed
assignee: nobody → dann frazier (dannf)
Changed in nova (Ubuntu Xenial):
assignee: nobody → dann frazier (dannf)
status: Triaged → In Progress
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 2:14.0.0-0ubuntu1

---------------
nova (2:14.0.0-0ubuntu1) yakkety; urgency=medium

  * New upstream release for OpenStack Newton.
  * d/t/nova-compute-daemons: Skip test execution if running within a
    container, ensuring that autopkgtests don't fail on armhf and s390x.
  * d/t/control,nova-compute-daemons: Don't install nova-compute as part
    of the autopkgtest control setup, direct install hypervisor specific
    nova-compute packages ensuring packages are configured in the correct
    order and that nova-compute can access the libvirt socket.

 -- James Page <email address hidden> Fri, 07 Oct 2016 08:48:28 +0100

Changed in nova (Ubuntu):
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/372304
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=50e0106d35ec1a3204c18f3912b0dc6cf6632305
Submitter: Jenkins
Branch: master

commit 50e0106d35ec1a3204c18f3912b0dc6cf6632305
Author: VeenaSL <email address hidden>
Date: Mon Sep 19 13:36:53 2016 +0530

    Adding hugepage and NUMA support check for aarch64

    Nova ignores aarch64 while verifying for hugepage and NUMA support.
    AARCH64 also supports hugepage and NUMA on the same libvirt versions as of x86.
    Hence adding this chek for aarch64 also.

    Change-Id: I7b5ae1dbdca4fdd0aee2eefd4099c4c4953b609a
    Closes-bug: #1623871

Changed in nova:
status: In Progress → Fix Released
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Veena, or anyone else affected,

Accepted nova into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nova/2:13.1.2-0ubuntu2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nova (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 2:13.1.2-0ubuntu2

---------------
nova (2:13.1.2-0ubuntu2) xenial; urgency=medium

  [ dann frazier ]
  * d/p/libvirt-add-hugepages-support-for-Power.patch (LP: #1568086).
  * d/p/libvirt-add-hugepages-support-for-arm64.patch (LP: #1623871).

 -- Corey Bryant <email address hidden> Tue, 18 Oct 2016 13:56:15 -0400

Changed in nova (Ubuntu Xenial):
status: Fix Committed → Fix Released
Martin Pitt (pitti) wrote : Update Released

The verification of the Stable Release Update for nova has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.0.0b1

This issue was fixed in the openstack/nova 15.0.0.0b1 development milestone.
