test_maps in ubuntu_bpf failed on AWS c4.large (OOM, ssh session killed)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
ubuntu-kernel-tests | New | Undecided | Unassigned |
Bug Description
This is not a regression.
Issue found while digging into the ubuntu_bpf failure on this instance across different kernels:
* J-aws-5.
* J-aws-5.
It's only failing on this instance.
In the test log you will see an incomplete test_maps result like this:
10:22:54 INFO | START ubuntu_
10:22:54 DEBUG| Persistent state client.
10:22:54 DEBUG| Persistent state client.
10:22:54 DEBUG| Waiting for pid 24841 for 1800 seconds
10:22:54 WARNI| System python is too old, crash handling disabled
10:22:54 DEBUG| Running './test_maps'
10:22:55 DEBUG| [stdout] Fork 1024 tasks to 'test_update_
10:22:55 DEBUG| [stdout] Fork 1024 tasks to 'test_update_
10:22:55 DEBUG| [stdout] Fork 100 tasks to 'test_hashmap'
10:22:55 DEBUG| [stdout] Fork 100 tasks to 'test_hashmap_
10:22:55 DEBUG| [stdout] Fork 100 tasks to 'test_hashmap_
-------
R E S U L T S
-------
For a successful run, you should see the test print "test_maps: OK, 0 SKIPPED" at the end, along with the "END GOOD" tag from autotest.
Digging further, you will see that the OOM killer was invoked on this instance, and sshd was sacrificed:
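The two success markers described above can be checked mechanically. The following is a minimal sketch, not part of the test suite: the function name `maps_run_completed` is hypothetical, and it assumes the whole autotest log is available as a single string.

```python
import re


def maps_run_completed(log_text: str) -> bool:
    """Return True only if both success markers appear in the log.

    Hypothetical helper: it looks for the test_maps summary line and
    the autotest "END GOOD" tag mentioned in this report.
    """
    # The summary line printed by test_maps when it finishes.
    has_summary = re.search(r"test_maps: OK, \d+ SKIPPED", log_text) is not None
    # The tag autotest writes when it considers the test passed.
    has_end_good = "END GOOD" in log_text
    return has_summary and has_end_good


# A log that stops in the middle of the fork messages, as in this bug.
truncated = "10:22:55 DEBUG| [stdout] Fork 100 tasks to 'test_hashmap'\n"
print(maps_run_completed(truncated))
```

A log that ends in the middle of the "Fork ... tasks" messages, like the one above, makes this check return False.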
[12266.071294] Out of memory: Killed process 3335 (packagekitd) total-vm:295980kB, anon-rss:2920kB, file-rss:3236kB, shmem-rss:0kB, UID:0 pgtables:156kB oom_score_adj:0
[12266.084117] systemd-logind invoked oom-killer: gfp_mask=
[12266.084122] CPU: 1 PID: 634 Comm: systemd-logind Tainted: G W 5.15.0-1031-aws #35-Ubuntu
[12266.084125] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
[12266.084126] Call Trace:
[12266.084128] <TASK>
[12266.084130] show_stack+
[12266.084133] dump_stack_
[12266.084137] dump_stack+
[12266.084140] dump_header+
[12266.084144] oom_kill_
[12266.084147] out_of_
[12266.084151] __alloc_
[12266.084156] __alloc_
[12266.084159] alloc_pages+
[12266.084162] __page_
[12266.084164] pagecache_
[12266.084167] ? page_cache_
[12266.084171] filemap_
[12266.084174] ? filemap_
[12266.084177] __do_fault+
[12266.084181] do_read_
[12266.084184] do_fault+0xa0/0x2e0
[12266.084187] handle_
[12266.084191] __handle_
[12266.084195] handle_
[12266.084199] do_user_
[12266.084202] exc_page_
[12266.084206] asm_exc_
[12266.084209] RIP: 0033:0x7f196ad58f91
[12266.084215] Code: Unable to access opcode bytes at RIP 0x7f196ad58f67.
[12266.084216] RSP: 002b:00007ffdc9
[12266.084219] RAX: 000055cbcd55a078 RBX: 000055cbcd559f30 RCX: 00000000000dccda
[12266.084222] RDX: 0000000000000000 RSI: 00000002db0d0c40 RDI: 431bde82d7b634db
[12266.084224] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
[12266.084225] R10: 00007ffdc9554080 R11: 0000000000000286 R12: 00000000000000c8
[12266.084227] R13: 0000000000000022 R14: 0000000000000014 R15: 000055cbcd55a0c0
[12266.084230] </TASK>
[12266.084232] Mem-Info:
[12266.084233] active_anon:449 inactive_anon:16013 isolated_anon:0
[12266.084239] Node 0 active_anon:1796kB inactive_
[12266.084245] Node 0 DMA free:14848kB min:180kB low:224kB high:268kB reserved_
[12266.084251] lowmem_reserve[]: 0 3670 3670 3670 3670
[12266.084256] Node 0 DMA32 free:50000kB min:51016kB low:62232kB high:73448kB reserved_
[12266.084263] lowmem_reserve[]: 0 0 0 0 0
[12266.084268] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4096kB (M) = 14848kB
[12266.084283] Node 0 DMA32: 4113*4kB (UME) 1818*8kB (UME) 688*16kB (UME) 234*32kB (UME) 6*64kB (UM) 1*128kB (U) 2*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 50516kB
[12266.084303] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_
[12266.084305] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_
[12266.084307] 2606 total pagecache pages
[12266.084309] 0 pages in swap cache
[12266.084310] Swap cache stats: add 0, delete 0, find 0/0
[12266.084311] Free swap = 0kB
[12266.084312] Total swap = 0kB
[12266.084314] 982941 pages RAM
[12266.084315] 0 pages HighMem/MovableOnly
[12266.084316] 27103 pages reserved
[12266.084317] 0 pages hwpoisoned
[12266.084318] Tasks state (memory values in pages):
[12266.084319] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[12266.084322] [ 364] 0 364 14015 626 118784 0 -250 systemd-journal
[12266.084326] [ 401] 0 401 72328 6775 110592 0 -1000 multipathd
[12266.084329] [ 404] 0 404 5773 1060 65536 0 -1000 systemd-udevd
[12266.084332] [ 577] 100 577 4063 818 73728 0 0 systemd-network
[12266.084335] [ 579] 101 579 6315 1499 86016 0 0 systemd-resolve
[12266.084338] [ 613] 0 613 703 285 45056 0 0 acpid
[12266.084341] [ 618] 0 618 1821 599 53248 0 0 cron
[12266.084344] [ 619] 102 619 2180 965 57344 0 -900 dbus-daemon
[12266.084347] [ 626] 0 626 20674 746 57344 0 0 irqbalance
[12266.084350] [ 629] 104 629 55600 937 77824 0 0 rsyslogd
[12266.084352] [ 631] 0 631 326277 1249 180224 0 0 amazon-ssm-agen
[12266.084355] [ 633] 0 633 218807 4135 253952 0 -900 snapd
[12266.084358] [ 634] 0 634 3872 806 69632 0 0 systemd-logind
[12266.084361] [ 705] 114 705 4722 713 61440 0 0 chronyd
[12266.084363] [ 709] 114 709 2640 132 61440 0 0 chronyd
[12266.084366] [ 896] 0 896 58864 1094 86016 0 0 polkitd
[12266.084369] [ 898] 0 898 1554 193 53248 0 0 agetty
[12266.084372] [ 901] 0 901 1543 214 45056 0 0 agetty
[12266.084374] [ 904] 0 904 3856 1169 73728 0 -1000 sshd
[12266.084377] [ 1204] 0 1204 4228 1211 73728 0 0 sshd
[12266.084380] [ 1207] 1000 1207 4245 1049 73728 0 0 systemd
[12266.084383] [ 1208] 1000 1208 42587 1214 98304 0 0 (sd-pam)
[12266.084386] [ 1291] 1000 1291 4303 1094 73728 0 0 sshd
[12266.084389] [ 1292] 1000 1292 2308 997 53248 0 0 bash
[12266.084392] [ 1302] 0 1302 4230 1252 65536 0 0 sshd
[12266.084395] [ 1349] 1000 1349 4305 800 65536 0 0 sshd
[12266.084398] [ 1350] 1000 1350 2275 951 57344 0 0 bash
[12266.084400] [ 1431] 1000 1431 2974 634 57344 0 0 sudo
[12266.084403] [ 1432] 1000 1432 2974 223 53248 0 0 sudo
[12266.084406] [ 1433] 0 1433 1606 474 45056 0 0 dmesg
[12266.084409] [ 2811] 0 2811 4228 1263 69632 0 0 sshd
[12266.084411] [ 2860] 1000 2860 4303 1085 69632 0 0 sshd
[12266.084414] [ 3404] 1000 3404 722 214 45056 0 0 sh
[12266.084417] [ 3406] 1000 3406 2973 775 61440 0 0 sudo
[12266.084425] [ 21228] 0 21228 867 480 45056 0 0 test_maps
[12266.084429] [ 23493] 0 23493 867 37 45056 0 0 test_maps
[12266.084431] [ 23495] 0 23495 867 37 45056 0 0 test_maps
[12266.084434] [ 23499] 0 23499 867 37 45056 0 0 test_maps
[12266.084437] [ 23502] 0 23502 867 37 45056 0 0 test_maps
[12266.084439] [ 23503] 0 23503 867 37 45056 0 0 test_maps
[12266.084442] [ 23504] 0 23504 867 37 45056 0 0 test_maps
[12266.084444] [ 23507] 0 23507 867 37 45056 0 0 test_maps
[12266.084446] [ 23508] 0 23508 867 37 45056 0 0 test_maps
[12266.084449] [ 23509] 0 23509 867 37 45056 0 0 test_maps
[12266.084451] [ 23510] 0 23510 867 37 45056 0 0 test_maps
[12266.084460] [ 23512] 0 23512 867 37 45056 0 0 test_maps
[12266.084462] [ 23513] 0 23513 867 37 45056 0 0 test_maps
[12266.084465] [ 23514] 0 23514 867 37 45056 0 0 test_maps
[12266.084467] [ 23515] 0 23515 867 37 45056 0 0 test_maps
[12266.084470] [ 23519] 0 23519 867 37 45056 0 0 test_maps
[12266.084472] [ 23520] 0 23520 867 37 45056 0 0 test_maps
[12266.084475] [ 23521] 0 23521 867 37 45056 0 0 test_maps
[12266.084477] [ 23523] 0 23523 867 37 45056 0 0 test_maps
[12266.084479] [ 23524] 0 23524 867 37 45056 0 0 test_maps
[12266.084482] [ 23525] 0 23525 867 37 45056 0 0 test_maps
[12266.084484] [ 23526] 0 23526 867 37 45056 0 0 test_maps
[12266.084487] [ 23528] 0 23528 867 37 45056 0 0 test_maps
[12266.084489] [ 23529] 0 23529 867 37 45056 0 0 test_maps
[12266.084491] [ 23530] 0 23530 867 37 45056 0 0 test_maps
[12266.084494] [ 23531] 0 23531 867 37 45056 0 0 test_maps
[12266.084496] [ 23533] 0 23533 867 37 45056 0 0 test_maps
[12266.084499] [ 23534] 0 23534 867 37 45056 0 0 test_maps
[12266.084501] [ 23535] 0 23535 867 37 45056 0 0 test_maps
[12266.084504] [ 23537] 0 23537 867 37 45056 0 0 test_maps
[12266.084507] [ 23538] 0 23538 867 37 45056 0 0 test_maps
[12266.084510] [ 23540] 0 23540 867 37 45056 0 0 test_maps
[12266.084513] [ 23542] 0 23542 867 37 45056 0 0 test_maps
[12266.084515] [ 23543] 0 23543 867 37 45056 0 0 test_maps
[12266.084517] [ 23544] 0 23544 867 37 45056 0 0 test_maps
[12266.084520] [ 23545] 0 23545 867 37 45056 0 0 test_maps
[12266.084523] [ 23546] 0 23546 867 37 45056 0 0 test_maps
[12266.084526] [ 23547] 0 23547 867 37 45056 0 0 test_maps
[12266.084528] [ 23549] 0 23549 867 37 45056 0 0 test_maps
[12266.084531] [ 23551] 0 23551 867 37 45056 0 0 test_maps
[12266.084533] [ 23552] 0 23552 867 37 45056 0 0 test_maps
[12266.084536] [ 23553] 0 23553 867 37 45056 0 0 test_maps
[12266.084538] [ 23554] 0 23554 867 37 45056 0 0 test_maps
[12266.084541] [ 23555] 0 23555 867 37 45056 0 0 test_maps
[12266.084544] [ 23557] 0 23557 867 37 45056 0 0 test_maps
[12266.084547] [ 23558] 0 23558 867 37 45056 0 0 test_maps
[12266.084549] [ 23559] 0 23559 867 37 45056 0 0 test_maps
[12266.084552] [ 23561] 0 23561 867 37 45056 0 0 test_maps
[12266.084555] [ 23563] 0 23563 867 37 45056 0 0 test_maps
[12266.084557] [ 23564] 0 23564 867 37 45056 0 0 test_maps
[12266.084560] [ 23565] 0 23565 867 37 45056 0 0 test_maps
[12266.084563] [ 23566] 0 23566 867 37 45056 0 0 test_maps
[12266.084566] [ 23568] 0 23568 867 37 45056 0 0 test_maps
[12266.084568] [ 23570] 0 23570 867 37 45056 0 0 test_maps
[12266.084571] [ 23571] 0 23571 867 37 45056 0 0 test_maps
[12266.084574] [ 23573] 0 23573 867 37 45056 0 0 test_maps
[12266.084577] [ 23574] 0 23574 867 83 45056 0 0 test_maps
[12266.084579] [ 23575] 0 23575 867 37 45056 0 0 test_maps
[12266.084582] [ 23576] 0 23576 867 37 45056 0 0 test_maps
[12266.084585] [ 23578] 0 23578 867 37 45056 0 0 test_maps
[12266.084587] [ 23579] 0 23579 867 37 45056 0 0 test_maps
[12266.084590] [ 23581] 0 23581 867 37 45056 0 0 test_maps
[12266.084593] [ 23582] 0 23582 867 37 45056 0 0 test_maps
[12266.084596] [ 23583] 0 23583 867 37 45056 0 0 test_maps
[12266.084599] [ 23584] 0 23584 867 37 45056 0 0 test_maps
[12266.084601] [ 23585] 0 23585 867 37 45056 0 0 test_maps
[12266.084604] [ 23586] 0 23586 867 111 45056 0 0 test_maps
[12266.084606] [ 23588] 0 23588 867 37 45056 0 0 test_maps
[12266.084609] [ 23589] 0 23589 867 37 45056 0 0 test_maps
[12266.084611] [ 23590] 0 23590 867 37 45056 0 0 test_maps
[12266.084614] [ 23592] 0 23592 867 37 45056 0 0 test_maps
[12266.084622] oom-kill:
[12266.084642] Out of memory: Killed process 579 (systemd-resolve) total-vm:25260kB, anon-rss:3940kB, file-rss:2056kB, shmem-rss:0kB, UID:101 pgtables:84kB oom_score_adj:0
[12266.640694] Out of memory: Killed process 631 (amazon-ssm-agen) total-vm:1305108kB, anon-rss:4996kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:176kB oom_score_adj:0
[12267.120369] Out of memory: Killed process 2811 (sshd) total-vm:16912kB, anon-rss:2156kB, file-rss:2896kB, shmem-rss:0kB, UID:0 pgtables:68kB oom_score_adj:0
[12267.161590] Out of memory: Killed process 1302 (sshd) total-vm:16920kB, anon-rss:2164kB, file-rss:2844kB, shmem-rss:0kB, UID:0 pgtables:64kB oom_score_adj:0
[12267.257516] Out of memory: Killed process 1208 ((sd-pam)) total-vm:170348kB, anon-rss:4856kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:96kB oom_score_adj:0
[12267.294204] Out of memory: Killed process 1204 (sshd) total-vm:16912kB, anon-rss:2156kB, file-rss:2688kB, shmem-rss:0kB, UID:0 pgtables:72kB oom_score_adj:0
[12267.427756] Out of memory: Killed process 1291 (sshd) total-vm:17212kB, anon-rss:2472kB, file-rss:1904kB, shmem-rss:0kB, UID:1000 pgtables:72kB oom_score_adj:0
[12267.537564] Out of memory: Killed process 629 (rsyslogd) total-vm:222400kB, anon-rss:1644kB, file-rss:2732kB, shmem-rss:0kB, UID:104 pgtables:76kB oom_score_adj:0
[12267.607202] Out of memory: Killed process 2860 (sshd) total-vm:17212kB, anon-rss:2440kB, file-rss:1420kB, shmem-rss:0kB, UID:1000 pgtables:68kB oom_score_adj:0
One of the sshd processes killed here is the one we use to trigger the test from our Jenkins.
This is why the test is marked as failed: Jenkins loses its ssh session before the result is reported.
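To see at a glance which processes the OOM killer picked, the "Out of memory: Killed process" lines can be pulled out of the dmesg output. This is a rough sketch: the helper name `oom_victims` is hypothetical, and the regular expression assumes the line format shown in the log above.

```python
import re

# Matches lines such as:
# "Out of memory: Killed process 2811 (sshd) total-vm:16912kB, anon-rss:2156kB, ..."
OOM_KILL = re.compile(
    r"Out of memory: Killed process (\d+) \((.+)\) "
    r"total-vm:(\d+)kB, anon-rss:(\d+)kB"
)


def oom_victims(dmesg_text: str):
    """Return a list of (pid, name, anon_rss_kb) for each OOM kill line."""
    victims = []
    for line in dmesg_text.splitlines():
        match = OOM_KILL.search(line)
        if match:
            pid, name, _total_vm, anon_rss = match.groups()
            victims.append((int(pid), name, int(anon_rss)))
    return victims


sample = (
    "[12267.120369] Out of memory: Killed process 2811 (sshd) "
    "total-vm:16912kB, anon-rss:2156kB, file-rss:2896kB, shmem-rss:0kB\n"
    "[12267.257516] Out of memory: Killed process 1208 ((sd-pam)) "
    "total-vm:170348kB, anon-rss:4856kB, file-rss:0kB, shmem-rss:0kB\n"
)
for pid, name, anon_rss_kb in oom_victims(sample):
    print(pid, name, anon_rss_kb)
```

Filtering the result on the name "sshd" gives the PIDs of the killed sessions, which can then be compared with the PID of the session Jenkins opened.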
Memory size on AWS c4.large:
$ free -mh
        total   used    free    shared  buff/cache  available
Mem:    3.6Gi   203Mi   3.3Gi   0.0Ki   184Mi       3.3Gi
Swap:   0B      0B      0B
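As a cross-check, the page counts in the OOM report can be converted to GiB and compared with the `free` output. This is a back-of-the-envelope sketch that assumes 4 KiB pages (the usual x86-64 default, and consistent with the 4kB smallest order in the buddy list above). The numbers are copied from the dmesg excerpt; the helper name `pages_to_gib` is hypothetical.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages


def pages_to_gib(pages: int, page_kib: int = PAGE_KIB) -> float:
    """Convert a page count to GiB."""
    return pages * page_kib / (1024 * 1024)


# Values taken from the Mem-Info section of the OOM report.
total_pages = 982941      # "982941 pages RAM"
reserved_pages = 27103    # "27103 pages reserved"
usable_gib = pages_to_gib(total_pages - reserved_pages)
print(f"usable RAM: {usable_gib:.2f} GiB")

# DMA32 zone: free memory versus the min watermark, both in kB.
dma32_free_kb = 50000
dma32_min_kb = 51016
print("DMA32 free below min watermark:", dma32_free_kb < dma32_min_kb)
```

The result is roughly in line with the 3.6Gi total that `free` reports. The DMA32 zone had dropped below its min watermark when the OOM killer ran, and with no swap configured there was nowhere to push anonymous pages out to.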