Comment 10 for bug 2073214

Christian Ehrhardt  (paelzer) wrote : Re: hugepages causes permissions error [apparmor profile]

Thank you Tim for continuing on this.

Thanks for all your efforts in comment #6.
I wonder how you can create a new 24.04 system and trigger this while I cannot.
At some point we need a third person to decide which of us has the uncommon case.

Thanks for tracking down the denial that you are seeing.
This is all tied to huge page operations, and your strace shows it around the memfd_create operations.

Those are in qemu.git/util/memfd.c, and inspired by that code I've written a test program:

$ cat /home/paelzer/work/qemu/lp-2073214-hugepage-noble/test.c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

// Create an anonymous file in memory with huge pages
int main() {
    off_t reqsize = (2*1024*1024);

    // Test flags one by one
    int mfd = memfd_create("test", MFD_CLOEXEC|MFD_ALLOW_SEALING);
    if (mfd >= 0)
        close(mfd);
    else
        return -1;
    mfd = memfd_create("test", MFD_CLOEXEC|MFD_HUGETLB);
    if (mfd >= 0)
        close(mfd);
    else
        return -1;

    int fd = memfd_create("memory-backend-memfd", MFD_CLOEXEC | MFD_HUGETLB | MFD_ALLOW_SEALING);
    if (fd == -1) {
        perror("memfd_create");
        exit(EXIT_FAILURE);
    }
    printf("Created\n");

    // Truncate it to the desired length
    if (ftruncate(fd, reqsize) == -1) {
        perror("truncate");
        goto err;
    }
    printf("Truncated\n");

    // Add seals (F_SEAL_SEAL has the numeric value 1)
    if (fcntl(fd, F_ADD_SEALS, F_SEAL_SEAL) == -1) {
        perror("seal");
        goto err;
    }
    printf("Added seal\n");

    // Close the file descriptor
    close(fd);
    printf("Closed\n");

    return 0;

err:
    close(fd);
    exit(EXIT_FAILURE);
}

Just like your more complex test with qemu, without an apparmor profile this works just fine.
(Interestingly even without allocating huge pages).

$ ./test
Created
Truncated
Added seal
Closed

Running it with the default profile for guests (no custom paths needed) should trigger the same denial.
So I use the following profile (one might need to adapt the paths accordingly).
It is inspired by the files we usually generate for the guests:

$ sudo cat /etc/apparmor.d/home.paelzer.work.qemu.lp-2073214-hugepage-noble.test
# Last Modified: Mon Jul 22 10:23:53 2024
abi <abi/3.0>,

include <tunables/global>

/home/paelzer/work/qemu/lp-2073214-hugepage-noble/test flags=(attach_disconnected) {
  include <abstractions/libvirt-qemu>

  /home/paelzer/work/qemu/lp-2073214-hugepage-noble/test mr,

}

$ sudo apparmor_parser --replace /etc/apparmor.d/home.paelzer.work.qemu.lp-2073214-hugepage-noble.test

With that in place I can recreate the problem and the denial matches yours:

$ ./test
Created
truncate: Permission denied

[5365065.306262] audit: type=1400 audit(1721636961.319:9592): apparmor="DENIED" operation="truncate" class="file" profile="/home/paelzer/work/qemu/lp-2073214-hugepage-noble/test" name="/" pid=3113938 comm="test" requested_mask="w" denied_mask="w" fsuid=1000 ouid=1000

I think apparmor struggles to resolve that path and mediate it properly.
name="/" is only there for lack of anything better.
If I did not use attach_disconnected it would instead show as:
- info="Failed name lookup - disconnected path"
- name=""

I tried the same in a Jammy system:

$ apt install libvirt-daemon-system apparmor-utils gcc strace
# place the test.c in /root
$ gcc -Wall -Werror test.c -o test
# create the matching /etc/apparmor.d/root.test profile
$ sudo apparmor_parser --replace /etc/apparmor.d/root.test

But unlike in your case, 22.04 behaved the same for me.
I also get:
[ 444.373044] audit: type=1400 audit(1721637616.280:51): apparmor="DENIED" operation="truncate" profile="/root/test" name="/" pid=5039 comm="test" requested_mask="w" denied_mask="w" fsuid=0 ouid=0

So in summary, on the one hand the suggested rule fixes it according to your test.
But I think the rule "/ rw" is a bit too open to just add :-)

On the other hand, I still can't see why my huge page guests work and yours do not, but I could now at least recreate this outside of qemu (with qemu it still works for me).
And with my instructions everyone can recreate this more easily now.

Hence I'm subscribing the apparmor package maintainer and expert, as he is much better placed to suggest which rule would be more appropriate for such code (which is now much easier to understand).

@JJ - how should we write the apparmor rules to correctly allow the use of anonymous files created via memfd_create as allocation backends?