Kata runtime does not support hugepages

Bug #1864383 reported by Brent Rowsell
This bug affects 1 person
Affects     Status    Importance  Assigned to  Milestone
StarlingX   Triaged   Low         Unassigned

Bug Description

Brief Description
-----------------
Kata runtime does not support hugepages assigned via k8s
I launch a pod with the following spec

apiVersion: v1
kind: Pod
metadata:
  name: testpod2
spec:
  runtimeClassName: kata
  containers:
  - name: appcntr1
    image: centos/tools
    imagePullPolicy: IfNotPresent
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 300000; done;" ]
    volumeMounts:
    - mountPath: /hugepages
      name: hugepage
    resources:
      requests:
        cpu: 2
        memory: "1Gi"
        hugepages-1Gi: 1Gi
      limits:
        cpu: 2
        memory: "1Gi"
        hugepages-1Gi: 1Gi
  volumes:
    - name: hugepage
      emptyDir:
        medium: HugePages

I would expect to see a hugepages mount in the container, as follows:

nodev on /hugepages type hugetlbfs (rw,relatime,pagesize=1Gi)

The mount is not present with kata; it works fine with runc.
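A minimal sketch of how the check above could be scripted. The helper name has_hugetlbfs_mount is hypothetical (not part of kubectl or Kata); it only parses standard `mount` output, and the kubectl invocation in the comment assumes the testpod2 pod from the spec above is running.

```shell
#!/bin/sh
# Hypothetical helper: reads `mount` output on stdin and succeeds only if a
# hugetlbfs filesystem is mounted at the path given as the first argument.
has_hugetlbfs_mount() {
    # mount lines look like: <source> on <target> type <fstype> (<options>)
    awk -v target="$1" '
        $2 == "on" && $3 == target && $5 == "hugetlbfs" { found = 1 }
        END { exit !found }
    '
}

# Example use against the pod from this report (assumes it is running):
#   kubectl exec testpod2 -- mount | has_hugetlbfs_mount /hugepages \
#       && echo "hugetlbfs mounted" || echo "hugetlbfs missing"
```

Per the report, this would be expected to succeed with runc and fail with kata.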

Severity
--------
Major performance feature not available with kata runtime

Steps to Reproduce
------------------
See above

Expected Behavior
------------------
See above

Actual Behavior
----------------
See above

Reproducibility
---------------
100%

System Configuration
--------------------
All

Branch/Pull Time/Commit
-----------------------
BUILD_DATE="2020-02-22 04:15:31 -0500"

Last Pass
---------
Likely never

Timestamp/Logs
--------------
See above

Test Activity
-------------
Developer Testing

Workaround
----------
None

Revision history for this message
Ghada Khalil (gkhalil) wrote :

stx.4.0 / high priority - serious limitation with kata containers.

This will likely require follow-up with the upstream kata container project.

tags: added: stx.4.0 stx.containers
Changed in starlingx:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Lin Shuicheng (shuicheng)
Revision history for this message
Ghada Khalil (gkhalil) wrote :

Assigning to the Kata Containers Feature Prime

Revision history for this message
Lin Shuicheng (shuicheng) wrote :

Huge page support in kata currently has known issues,
and they are already tracked in the kata community.
Here are the open issues related to huge pages with the kata runtime:
https://github.com/kata-containers/runtime/issues/2353
https://github.com/kata-containers/runtime/issues/2172
https://github.com/kata-containers/runtime/issues/1548

Revision history for this message
Steven Webster (swebster-wr) wrote :

Just as a data point here's what I had to do to get kata+hugepages somewhat working in a StarlingX lab:

Option 1. Custom-built kata-containers kernel and rootfs, with empty hugepages
directories mirroring those on the host system

Kernel config:

CONFIG_CGROUP_HUGETLB=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y

Empty directories created in the rootfs:

/mnt/huge-1048576kB
/mnt/huge-2048kB

Modify the containerd config.toml with:

        [plugins.cri.containerd.runtimes.kata]
          runtime_type = "io.containerd.kata.v2"
+         pod_annotations = ["io.katacontainers.*"]

The user could then specify the hugepage configuration in the pod annotation:

  annotations:
    io.katacontainers.config.hypervisor.kernel_params: "default_hugepagesz=1G hugepagesz=1G hugepages=2"

Note that, with the above annotation, hugepages=2 does not seem to actually set the number of hugepages.
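One way to check that observation is to read HugePages_Total from /proc/meminfo as seen inside the container. This is only a sketch: the helper name hugepages_total is hypothetical, and the kubectl line assumes a running pod named testpod2 whose /proc/meminfo reflects the Kata guest.

```shell
#!/bin/sh
# Hypothetical helper: reads /proc/meminfo content on stdin and prints the
# number of pre-allocated huge pages of the default size.
hugepages_total() {
    awk '$1 == "HugePages_Total:" { print $2 }'
}

# Example use (assumes the pod is running under the kata runtime class):
#   kubectl exec testpod2 -- cat /proc/meminfo | hugepages_total
```

A printed value of 0 would be consistent with the kernel parameter not having allocated any pages.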

Since Kata does not support the k8s sysctl facility
(https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/),
it's possible instead to write a systemd service/target which takes the value of hugepages from the kernel parameters and sets the sysctl on VM startup.
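The script such a service runs could look roughly like the following. This is a sketch under assumptions, not the script actually used in the lab: the function name parse_hugepages is hypothetical, and it assumes the count should be written to vm.nr_hugepages, which applies to the default huge page size (1G with the kernel parameters shown above).

```shell
#!/bin/sh
# Hypothetical helper: prints the value of the last hugepages=<N> token found
# in the kernel command line passed as the first argument. Returns non-zero
# if no such token exists. Runs in a subshell so variables do not leak.
parse_hugepages() (
    set -f          # disable globbing while word-splitting the command line
    count=""
    for tok in $1; do
        case "$tok" in
            hugepages=*) count="${tok#hugepages=}" ;;
        esac
    done
    [ -n "$count" ] && printf '%s\n' "$count"
)

# Intended use at guest boot (needs root), e.g. from a oneshot systemd unit:
#   n=$(parse_hugepages "$(cat /proc/cmdline)") && sysctl -w vm.nr_hugepages="$n"
```

Allocating 1G pages at runtime can fail if guest memory is already fragmented, so running this as early as possible in boot is probably safer.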

Finally, in the container itself, it's up to the end user application to create a hugetlbfs mount for hugepages:

mount -t hugetlbfs -o pagesize=1G none /mnt/huge-1048576kB

or

mount -t hugetlbfs -o pagesize=2M none /mnt/huge-2048kB

Option 2. In the kata-containers configuration.toml, set enable_hugepages = true

This results in all memory in the kata VM being backed by hugepages, based on
the memory resource request/limits

As in Option 1, the container application itself would then need to create a
hugepage directory and mount hugetlbfs on it appropriately.

Revision history for this message
Frank Miller (sensfan22) wrote :

This support is not feasible in the stx.4.0 timeline. Moving this to stx.5.0.

tags: added: stx.5.0
removed: stx.4.0
Changed in starlingx:
assignee: Lin Shuicheng (shuicheng) → nobody
Revision history for this message
Ghada Khalil (gkhalil) wrote :

Lowering the priority as nobody seems to be working on this. We will not hold up stx.5.0 for this issue.

tags: removed: stx.5.0
Changed in starlingx:
importance: High → Low