hugepages for non-default pagesize need manual setup

Bug #1643675 reported by JuanJo Ciarlante
This bug affects 4 people
Affects                                Status     Importance  Assigned to  Milestone
OpenStack Nova Compute Charm           Triaged    Medium      Unassigned
libvirt (Ubuntu)                       Won't Fix  Undecided   Unassigned
nova-compute (Juju Charms Collection)  Invalid    Medium      Unassigned
systemd (Ubuntu)                       Confirmed  Wishlist    Unassigned

Bug Description

This is an amd64 xenial/mitaka deploy using the 1607 charms release,
where I need to have 1G hugepages available for openstack/kvm.

The nova-compute charm should provide ways to ease e.g.
1G hugepages setup. It currently only exposes a
"hugepages" setting that ends up setting vm.nr_hugepages
for the node, but even after setting kernel cmdline
parameters for 1G hugepages (hugepagesz=1G hugepages=128),
they are not usable by libvirt-bin:

#1 systemd:
Installs /lib/systemd/system/dev-hugepages.mount with
default pagesize
=> no hugetlbfs mount with (e.g.) pagesize=1G

#2 libvirt-bin:
Installs apparmor profile ("libvirt-qemu") which only allows:
  # for access to hugepages
  owner "/run/hugepages/kvm/libvirt/qemu/**" rw,
  owner "/dev/hugepages/libvirt/qemu/**" rw,
=> not possible to have other mount points for other pagesizes.

I've worked around #1 and #2 above by overriding systemd's unit,
creating /etc/systemd/system/dev-hugepages.mount
with the extra line below in the [Mount] section:
Options=pagesize=1G
-> https://paste.ubuntu.com/23513048/
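
As a rough illustration of that override, a small Python helper could render such a mount unit for a given page size. The function names (mount_unit_name, render_hugetlbfs_unit) are made up for this sketch, and the unit body is a trimmed assumption, not the contents of the paste above or of the unit systemd ships (ordering and condition lines are left out):

```python
import re


def mount_unit_name(mount_point: str) -> str:
    """Derive the systemd mount unit name for a simple absolute path.

    Simplified on purpose: only path components made of letters, digits
    and underscores are accepted. For anything else, use
    `systemd-escape --path` instead of this helper.
    """
    if not re.fullmatch(r"(/[A-Za-z0-9_]+)+", mount_point):
        raise ValueError(f"unsupported mount point: {mount_point!r}")
    return mount_point.strip("/").replace("/", "-") + ".mount"


def render_hugetlbfs_unit(mount_point: str, pagesize: str) -> str:
    """Render a minimal hugetlbfs mount unit with an explicit pagesize."""
    return "\n".join([
        "[Unit]",
        f"Description=Huge Pages File System ({pagesize})",
        "",
        "[Mount]",
        "What=hugetlbfs",
        f"Where={mount_point}",
        "Type=hugetlbfs",
        # The one line the workaround adds to the [Mount] section.
        f"Options=pagesize={pagesize}",
        "",
    ])


if __name__ == "__main__":
    print("# /etc/systemd/system/" + mount_unit_name("/dev/hugepages"))
    print(render_hugetlbfs_unit("/dev/hugepages", "1G"))
```

The unit file name has to match the mount point, which is why the name is derived from the path rather than chosen freely.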

JuanJo Ciarlante (jjo)
summary: - amd64: 1G hugepages need manual setup
+ hugepages for non-default pagesize need manual setup
James Page (james-page) wrote :

Where do 1G hugepages get exposed by default? or don't they?

Changed in nova-compute (Juju Charms Collection):
importance: Undecided → Medium
James Page (james-page) wrote :

OK, tried this and they don't get exposed afaict.
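
One way to repeat that check is to list the hugetlbfs entries in /proc/mounts together with any pagesize option. The sketch below is only an assumption about how one might do it: the helper name hugetlbfs_mounts and the sample text are made up, and because the way the kernel prints the pagesize option differs between versions, the raw option value is returned unchanged:

```python
def hugetlbfs_mounts(mounts_text: str) -> dict:
    """Map each hugetlbfs mount point to its pagesize option.

    The value is None when the mount line carries no pagesize option.
    """
    result = {}
    for line in mounts_text.splitlines():
        # /proc/mounts fields: device, mount point, fs type, options, ...
        fields = line.split()
        if len(fields) < 4 or fields[2] != "hugetlbfs":
            continue
        pagesize = None
        for option in fields[3].split(","):
            if option.startswith("pagesize="):
                pagesize = option.split("=", 1)[1]
        result[fields[1]] = pagesize
    return result


# Hypothetical sample, not output captured from the reporter's node.
SAMPLE = """\
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
hugetlbfs /dev/hugepages1G hugetlbfs rw,relatime,pagesize=1G 0 0
"""

if __name__ == "__main__":
    for mount_point, pagesize in hugetlbfs_mounts(SAMPLE).items():
        print(mount_point, pagesize or "(no pagesize option)")
```

On a real node the input would be the contents of /proc/mounts instead of SAMPLE.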

tags: added: sriov
tags: added: telco
removed: sriov
Changed in nova-compute (Juju Charms Collection):
status: New → Triaged
James Page (james-page) wrote :

This feels like this generally needs improving, with automatic mounting of hugepage FS for different pagesize options - so for example:

   /dev/hugepages1G
   /dev/hugepages

with appropriate apparmor changes as well for libvirt to allow access to common paths on Ubuntu systems.

James Page (james-page) wrote :

@jjo

Use of 'default_hugepagesz=1G' might also tweak things a bit.

James Page (james-page) wrote :

FWIW I think kernel boot command line options should be the preferred way of allocating hugepages, as it's generally the most reliable way of doing it.

Christian Ehrhardt  (paelzer) wrote : Re: [Bug 1643675] Re: hugepages for non-default pagesize need manual setup

On Thu, Feb 2, 2017 at 12:58 PM, James Page <email address hidden> wrote:

> This feels like this generally needs improving, with automatic mounting
> of hugepage FS for different pagesize options - so for example:
>
> /dev/hugepages1G
> /dev/hugepages
>
> with appropriate apparmor changes as well for libvirt to allow access to
> common paths on Ubuntu systems.
>

I agree.
I ran into the same thing when we needed a basic hugepage allocation
service for DPDK, and wanted to share a few of my thoughts from back then.
BTW, by starting early, the dpdk service has so far had no problems, but
increasing the allocation and restarting later on was sometimes an issue.
Yet the kernel option would not help there without a reboot either.
The service has a few fallbacks, like dropping caches when trying to get
more huge pages, to mitigate this.
But it already says in the config comments that it is a basic helper and
admins need to cover more complex cases.

FYI:
Config:
https://gerrit.fd.io/r/gitweb?p=deb_dpdk.git;a=blob;f=debian/dpdk.conf;h=a5aea865a8c43623761f9299549cbd2b25fd06a9;hb=refs/heads/16.11.x
Code:
https://gerrit.fd.io/r/gitweb?p=deb_dpdk.git;a=blob;f=debian/dpdk-init;h=103488edce6592a5f93e1b32cd5d476416374f81;hb=refs/heads/16.11.x

Back then I noted a todo that at some point all applications will need a
joint hugepage allocation strategy - just what you suggested here.

In addition to your correct point about some (configurable) default paths,
there should also be a better way for applications to work together.
One program sets 2x1G pages, the next 4x1G and the next 200x2M - what goes
into the boot parameters?
Also, since these don't know of each other, they often interfere with each
other.

I can't find my old notes, but at the core in addition to what you suggest
it was about establishing a common shared way to be reused and not stomping
on each other.
In my vision it was like:
/etc/hugepaged/config <- core config
/etc/hugepaged/config.d/ <- every Package/Admin could drop their need
(libvirt, dpdk, databases, java, users adding custom things)

The service would then process the aggregate of that - ensure that it has
no conflicting specifications and allocates the sum of all those configs.
It could auto generate kernel commandline args to be added to make
allocation more reliable and still fall back to late allocation as we do
today.
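
To make the aggregation idea concrete, here is a minimal sketch of summing per-package requests and turning the totals into kernel command line arguments. No such hugepaged service exists; the request format (a list of page size and count pairs) and the function names are assumptions for illustration, and conflict detection between requests is left out:

```python
from collections import Counter

# Multipliers for the size suffixes used in hugepagesz= values.
_UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def size_bytes(pagesize: str) -> int:
    """Convert a page size such as '2M' or '1G' to bytes."""
    return int(pagesize[:-1]) * _UNITS[pagesize[-1]]


def aggregate(requests):
    """Sum (pagesize, count) requests into one total per page size."""
    totals = Counter()
    for pagesize, count in requests:
        if count < 0:
            raise ValueError(f"negative page count for {pagesize}")
        totals[pagesize] += count
    return dict(totals)


def kernel_cmdline(totals) -> str:
    """Render totals as hugepagesz=/hugepages= pairs, largest size first."""
    sizes = sorted(totals, key=size_bytes, reverse=True)
    return " ".join(
        f"hugepagesz={size} hugepages={totals[size]}" for size in sizes
    )


if __name__ == "__main__":
    # The three requests from the example above: 2x1G, 4x1G and 200x2M.
    requests = [("1G", 2), ("1G", 4), ("2M", 200)]
    print(kernel_cmdline(aggregate(requests)))
```

For the three requests in the example this yields one hugepagesz=/hugepages= pair per page size, with the 1G requests summed to 6 pages.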

Note: there is also a package hugepages (2.19-0ubuntu1) which provides:
  This package contains a number of utilities that will help administrate the
  use of huge pages on your system. hugeedit modifies binaries to set default
  segment remapping behavior. hugectl sets environment variables for using huge
  pages and then execs the target program. hugeadm gives easy access to huge page
  pool size control. pagesize lists page sizes available on the machine.

Back then I thought creating a proper service infrastructure as addon to
that project might be a good way to go.
Of course the systemd task suggests that this might also be a way to go.

Christian Ehrhardt  (paelzer) wrote :

E.g. /etc/default/qemu-kvm also adds an extra mount at /run/hugepages/kvm if you set KVM_HUGEPAGES=1

So yeah a unified approach is what we should strive for these days.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in libvirt (Ubuntu):
status: New → Confirmed
Changed in systemd (Ubuntu):
status: New → Confirmed
James Page (james-page)
Changed in charm-nova-compute:
importance: Undecided → Medium
status: New → Triaged
Changed in nova-compute (Juju Charms Collection):
status: Triaged → Invalid
Dan Streetman (ddstreet) wrote :

please reopen if this is still an issue

Changed in systemd (Ubuntu):
status: Confirmed → Won't Fix
Christian Ehrhardt  (paelzer) wrote :

Yes, having centralized multi-size hugepage mountpoints available by default would still be nice to have.

Changed in libvirt (Ubuntu):
status: Confirmed → Won't Fix
Changed in nova-compute (Juju Charms Collection):
status: Invalid → Confirmed
status: Confirmed → Invalid
Changed in systemd (Ubuntu):
status: Won't Fix → Confirmed
Nick Rosbrook (enr0n)
Changed in systemd (Ubuntu):
importance: Undecided → Wishlist