Comment 14 for bug 1832915

Christian Ehrhardt (paelzer) wrote:

While this kind of CPU numbering is pretty common on Power (all non-SMT systems) and on s390x (which scales the number of CPUs with load), it is uncommon on x86. Nevertheless, in theory the issue should exist there as well.
But I tried this for an hour and it did not trigger (plenty of assignments happened).

Repro (x86)
1. Get a KVM guest with NUMA memory nodes
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='2097152' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='2097152' unit='KiB'/>
    </numa>
  </cpu>
2. Disable some CPUs in the middle of the ID range
  $ echo 0 | sudo tee /sys/bus/cpu/devices/cpu1/online
  $ echo 0 | sudo tee /sys/bus/cpu/devices/cpu2/online
  $ lscpu
  CPU(s): 4
  On-line CPU(s) list: 0,3
  Off-line CPU(s) list: 1,2
3. Install numad, start it and follow its log (see the note after these steps for verbose output)
  $ sudo apt install numad
  $ sudo systemctl start numad
  $ journalctl -f -u numad
4. Run some memory load that will make numad assign processes
  $ sudo apt install stress-ng
  $ stress-ng --vm 2 --vm-bytes 90% -t 5m
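
To get the verbose output referenced below, a minimal sketch (assuming the -d debug switch and the default /var/log/numad.log location described in numad(8)) is to stop the service and run numad by hand:
  $ sudo systemctl stop numad
  $ sudo numad -d
  $ tail -f /var/log/numad.log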

If we follow the numad log with verbose output enabled, we will after a while see NUMA assignments like:
Mon Jun 17 10:32:05 2019: Advising pid 3416 (stress-ng-vm) move from nodes (0-1) to nodes (0)
Mon Jun 17 10:32:23 2019: Advising pid 3417 (stress-ng-vm) move from nodes (0-1) to nodes (1)
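
To cross-check that a process really ended up where numad advised, something like the following should show the CPU and memory placement (taskset and the /proc status fields are standard, numastat ships with the numactl tools; pid 3416 is just the one from the log above):
  $ taskset -cp 3416
  $ numastat -p 3416
  $ grep -i allowed_list /proc/3416/status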

Maybe on ppc the NUMA node numbering is also non-linear; I remember working on fixes for numactl in that regard, and maybe that is important here as well.
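
Not part of the repro, but a quick way to check whether the node numbering on a given machine is non-linear (gaps in the node IDs; output like "available: 2 nodes (0,8)" would be such a case, the exact IDs there are only an example):
  $ numactl --hardware | grep ^available
  $ ls -d /sys/devices/system/node/node*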