When the memory allocation was stalled, here is what perf top was giving me:

Samples: 483K of event 'cycles:ppp', Event count (approx.): 52114089074
Overhead  Shared Object  Symbol
  28.53%  [kernel]       [k] total_mapcount
  25.34%  [kernel]       [k] kvm_age_rmapp
  13.54%  [kernel]       [k] slot_rmap_walk_next
  11.24%  [kernel]       [k] kvm_handle_hva_range
   6.35%  [kernel]       [k] rmap_get_first
   3.69%  [kernel]       [k] __x86_indirect_thunk_r13
   1.33%  [kernel]       [k] __isolate_lru_page
   0.63%  [kernel]       [k] isolate_lru_pages.isra.58
   0.48%  [kernel]       [k] page_vma_mapped_walk
   0.40%  [kernel]       [k] __mod_node_page_state
   0.35%  [kernel]       [k] clear_page_erms
   0.31%  [kernel]       [k] shrink_page_list
   0.28%  [kernel]       [k] _find_next_bit
   0.27%  [kernel]       [k] putback_inactive_pages
   0.27%  [kernel]       [k] move_active_pages_to_lru
   0.27%  [kernel]       [k] inactive_list_is_low
   0.22%  [kernel]       [k] __mod_zone_page_state
numactl -H when the memory allocation stalled:

root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 55983 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63810 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 366 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63782 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 368 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63757 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 368 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63744 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 366 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63504 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
Then I killed the process.
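
In case it helps anyone who wants to track the same numbers over time: the numactl -H samples show node 0 free memory falling from 55983 MB to about 366 MB while node 1 stays almost entirely free. Below is a minimal sketch for pulling the per-node free values out of one numactl -H output. The helper name node_free_mb and the regular expression are my own for illustration, not part of numactl, and the pattern assumes the "node N free: X MB" wording shown in the output here.

```python
import re

# Hypothetical helper, not part of numactl: matches lines such as
# "node 0 free: 366 MB" as printed by `numactl -H` in this post.
FREE_RE = re.compile(r"node (\d+) free: (\d+) MB")


def node_free_mb(numactl_output):
    """Return {node_id: free_MB} for the text of one `numactl -H` run."""
    return {int(node): int(mb) for node, mb in FREE_RE.findall(numactl_output)}


# One of the samples taken during the stall, copied from this post.
sample = """available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 366 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63782 MB
"""

free = node_free_mb(sample)
print(free)
```

On a live machine the same function could presumably be fed from subprocess.run(["numactl", "-H"], capture_output=True, text=True).stdout in a loop, to log how quickly one node drains relative to the other.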