From the kernel log, the 2M hugepage count is 6807, which is smaller than the requested 7024. It looks like the system cannot provide 7024 2M pages.
Total expected hugepage size is: 7024*2M + 1G = ~14.7GB
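A quick check of that arithmetic (2MB per page, 1024MB per GB):

```python
# Expected hugepage allocation: 7024 x 2M pages plus one 1G page.
pages_2m = 7024
total_mb = pages_2m * 2 + 1024      # 14048 MB + 1024 MB
print(total_mb)                     # 15072 MB
print(round(total_mb / 1024.0, 1))  # 14.7 GB
```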
This is not reasonable, since we have several reserved memory resources, as below:
1. 8GB reserved by worker_reserved.conf: WORKER_BASE_RESERVED=("node0:8000MB:1" "node1:2000MB:1")
2. 10% reserved by the code below:
sysinv host.py: vm_hugepages_nr_2M = int(m.vm_hugepages_possible_2M * 0.9)
vm_hugepages_possible_2M is calculated by the _inode_get_memory_hugepages() function with the following logic:
node_total_kb = total_hp_mb * SIZE_KB + free_kb + pss_mb * SIZE_KB
total_hp_mb is 0 since no 2M hugepages are reserved on the kernel command line
free_kb is from /sys/devices/system/node/node0/meminfo
pss_mb is collected from /proc/*/smaps
vm_hugepages_possible_2M: node_total_kb - base_mem_mb - vswitch_mem_kb
base_mem_mb is 8GB from WORKER_BASE_RESERVED in worker_reserved.conf
vswitch_mem_kb is 1GB from COMPUTE_VSWITCH_MEMORY in worker_reserved.conf
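The calculation above can be sketched as follows (this is not the real sysinv code; the unit conversions and the example input values are my assumptions for illustration):

```python
SIZE_KB = 1024  # KB per MB, matching the constant used in sysinv

def possible_2m_pages(total_hp_mb, free_kb, pss_mb, base_mem_mb, vswitch_mem_kb):
    """Sketch of the _inode_get_memory_hugepages() logic described above."""
    node_total_kb = total_hp_mb * SIZE_KB + free_kb + pss_mb * SIZE_KB
    avail_kb = node_total_kb - base_mem_mb * SIZE_KB - vswitch_mem_kb
    vm_hugepages_possible_2M = avail_kb // (2 * SIZE_KB)  # 2MB per page
    return int(vm_hugepages_possible_2M * 0.9)            # 10% held back

# Hypothetical example: 16GB free, no boot-time 2M pages, zero PSS,
# 8000MB base reserved (node0), 1GB vswitch.
print(possible_2m_pages(0, 16 * 1024 * SIZE_KB, 0, 8000, 1024 * SIZE_KB))  # 3312
```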
So far, vm_hugepages_possible_2M has always been correct in the Shanghai bare metal tests.
Could the reporters help to provide more info when this bug is triggered?
1. run system host-memory-list <compute node>
2. run cat /proc/sys/vm/overcommit_* #the mode will impact the free_kb calculation
3. run cat /proc/*/smaps 2>/dev/null | awk '/^Pss:/ {a += $2;} END {printf "%d\n", a/1024.0;}' on compute nodes
4. run cat /sys/devices/system/node/node*/meminfo on compute nodes
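For reference, the Pss sum from step 3 can also be collected with a short Python equivalent of the awk one-liner (a sketch; processes may exit while we iterate over /proc, hence the exception handling):

```python
import glob

def total_pss_mb():
    """Sum Pss across all /proc/*/smaps, in MB (mirrors the awk one-liner)."""
    total_kb = 0
    for path in glob.glob('/proc/[0-9]*/smaps'):
        try:
            with open(path) as f:
                for line in f:
                    if line.startswith('Pss:'):
                        total_kb += int(line.split()[1])  # value is in kB
        except (IOError, OSError):
            pass  # process exited or access denied while reading
    return int(total_kb / 1024.0)

print(total_pss_mb())
```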
from compute-1_20190507.124154/var/log/kern.log:
2019-05-06T16:24:12.266 localhost kernel: debug [    0.000000] On node 0 totalpages: 4174118
...
2019-05-06T17:28:56.749 compute-1 kernel: info [ 1515.471986] Node 0 hugepages_total=1 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
2019-05-06T17:28:56.749 compute-1 kernel: info [ 1515.471987] Node 0 hugepages_total=6807 hugepages_free=6807 hugepages_surp=0 hugepages_size=2048kB
...
from hieradata/192.168.204.77.yaml:
platform::compute::hugepage::params::vm_2M_pages: '"7024,7172"'
...
platform::compute::params::worker_base_reserved: ("node0:8000MB:1" "node1:2000MB:1")
from puppet.log:
...
Exec[Allocate 7024 /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages]
...
Exec[Allocate 7172 /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages]
...
The total memory on node 0 is 16GB.
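A quick cross-check (using the totalpages value from kern.log above, assuming 4KB per page, and the node0 base reservation) shows the requested allocation cannot fit:

```python
# Cross-check: totalpages from kern.log vs. the requested allocation on node 0.
total_mb = 4174118 * 4 // 1024    # 4KB pages -> ~16305 MB total on node 0
requested_mb = 7024 * 2 + 1024    # 7024 x 2M pages + one 1G page = 15072 MB
reserved_mb = 8000                # WORKER_BASE_RESERVED for node0
print(total_mb, requested_mb + reserved_mb)  # 16305 vs 23072 -> does not fit
```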
thanks,
Bin