Comment 16 for bug 1808412

Marc Gariépy (mgariepy) wrote:

I see the same behavior on both Bionic and Xenial on the 4.15.0-43 kernel.

Kernel 4.4.0-141 doesn't show the same issue, but it still behaves oddly.

This happens with libvirt and QEMU from both Ubuntu updates and the Ubuntu Cloud Archive.

The problem is more obvious when starting the VM on a server with multiple NUMA nodes.

I reproduced the issue on a dual-socket AMD EPYC 7301 16-core system:

# numactl -H
available: 8 nodes (0-7)
node 0 cpus: 0 1 2 3 32 33 34 35
node 0 size: 32095 MB
node 0 free: 31947 MB
node 1 cpus: 4 5 6 7 36 37 38 39
node 1 size: 32252 MB
node 1 free: 32052 MB
node 2 cpus: 8 9 10 11 40 41 42 43
node 2 size: 32252 MB
node 2 free: 31729 MB
node 3 cpus: 12 13 14 15 44 45 46 47
node 3 size: 32252 MB
node 3 free: 31999 MB
node 4 cpus: 16 17 18 19 48 49 50 51
node 4 size: 32252 MB
node 4 free: 32166 MB
node 5 cpus: 20 21 22 23 52 53 54 55
node 5 size: 32252 MB
node 5 free: 32185 MB
node 6 cpus: 24 25 26 27 56 57 58 59
node 6 size: 32231 MB
node 6 free: 32161 MB
node 7 cpus: 28 29 30 31 60 61 62 63
node 7 size: 32250 MB
node 7 free: 32183 MB
node distances:
node   0   1   2   3   4   5   6   7
  0:  10  16  16  16  32  32  32  32
  1:  16  10  16  16  32  32  32  32
  2:  16  16  10  16  32  32  32  32
  3:  16  16  16  10  32  32  32  32
  4:  32  32  32  32  10  16  16  16
  5:  32  32  32  32  16  10  16  16
  6:  32  32  32  32  16  16  10  16
  7:  32  32  32  32  16  16  16  10

Steps to reproduce:
1- Install Ubuntu with libvirt.
2- Configure PCI passthrough.
3- Create a VM that uses more RAM than a single NUMA node has.
4- Add a PCI device to your VM (this step makes QEMU pre-allocate the VM's RAM); see the sketch after this list.
5- One qemu-system-x86_64 thread takes one core at 100% CPU while RAM usage climbs, until the memory of that core's NUMA node is full; it then stalls until the process is scheduled onto a core on another node.
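
For step 4, this is roughly how the device gets attached; a minimal sketch, where the domain name "myvm" and the PCI address 0000:41:00.0 are placeholders for illustration, not values from my setup:

# cat > hostdev.xml <<'EOF'
<!-- VFIO passthrough of a host PCI device; attaching one is what
     makes QEMU pin and pre-allocate all guest RAM up front. -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x41' slot='0x00' function='0x0'/>
  </source>
</hostdev>
EOF
# virsh attach-device myvm hostdev.xml --config

While the VM starts, the per-node allocation can be watched with numastat -p $(pidof qemu-system-x86_64); one node's total fills up before the allocation spills over to the next.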

Setting /sys/kernel/mm/transparent_hugepage/enabled to [never] helps mitigate the issue.
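
For reference, that can be applied at runtime like this (as root; not persistent across reboots):

# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]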