stress-ng memory test is (unexpectedly) triggering oom

Bug #1996595 reported by Frank Heimes
This bug affects 1 person
Affects                  Status     Importance  Assigned to      Milestone
Stress-ng                Won't Fix  Low         Colin Ian King
Ubuntu on IBM z Systems  New        Undecided   Unassigned
stress-ng (Ubuntu)       Won't Fix  Undecided   Unassigned

Bug Description

Running stress-ng (here as part of the server certification suite 'test-memory') on Ubuntu Server 22.10/kinetic/5.19 fails on s390x (though the issue may not be specific to this architecture) with the following log messages:

Nov 2 11:47:56 hwe0008 stress-ng: invoked with 'stress-ng --aggressive --verify --timeout 300 --mlock 0' by user 0 'root'
Nov 2 11:47:56 hwe0008 stress-ng: system: 'hwe0008' Linux 5.19.0-23-generic #24-Ubuntu SMP Fri Oct 14 15:39:36 UTC 2022 s390x
Nov 2 11:47:56 hwe0008 stress-ng: memory (MB): total 10020.65, free 6822.25, shared 0.80, buffer 13.85, swap 11136.01, free swap 11136.01
Nov 2 11:48:02 hwe0008 kernel: [ 5953.666806] stress-ng invoked oom-killer: gfp_mask=0x140dca(GFP_HIGHUSER_MOVABLE|__GFP_COMP|__GFP_ZERO), order=0, oom_score_adj=1000
Nov 2 11:48:18 hwe0008 kernel: [ 5953.666823] CPU: 4 PID: 11352 Comm: stress-ng Not tainted 5.19.0-23-generic #24-Ubuntu
Nov 2 11:48:37 hwe0008 kernel: [ 5953.666827] Hardware name: IBM 2964 N63 400 (z/VM 6.4.0)
Nov 2 11:49:08 hwe0008 kernel: [ 5953.666828] Call Trace:
Nov 2 11:49:27 hwe0008 kernel: [ 5953.666830] [<0000000109129c0a>] dump_stack_lvl+0x62/0x90
Nov 2 11:49:42 hwe0008 kernel: [ 5953.666841] [<0000000109122cca>] dump_header+0x62/0x270
Nov 2 11:49:45 hwe0008 kernel: [ 5953.666843] [<00000001087ee634>] oom_kill_process+0x214/0x220
Nov 2 11:50:03 hwe0008 kernel: [ 5953.666850] [<00000001087ef814>] out_of_memory+0xf4/0x3c0
Nov 2 11:50:30 hwe0008 kernel: [ 5953.666853] [<0000000108866e58>] __alloc_pages_slowpath.constprop.0+0x938/0xc00
Nov 2 11:50:38 hwe0008 kernel: [ 5953.666857] [<000000010886745e>] __alloc_pages+0x33e/0x370
Nov 2 11:50:50 hwe0008 kernel: [ 5953.666859] [<0000000108867f9e>] __folio_alloc+0x2e/0x80
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666862] [<000000010888fb02>] vma_alloc_folio+0x92/0x400
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666866] [<0000000108837a34>] do_anonymous_page+0x1f4/0x590
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666869] [<000000010883d44e>] __handle_mm_fault+0x2ae/0x4f0
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666872] [<000000010883d75e>] handle_mm_fault+0xce/0x240
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666875] [<00000001088323c8>] __get_user_pages+0x258/0x400
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666878] [<000000010883377a>] populate_vma_page_range+0x6a/0xd0
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666880] [<0000000108833940>] __mm_populate+0xc0/0x1c0
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666883] [<0000000108842238>] __s390x_sys_mlockall+0x1b8/0x1f0
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666886] [<000000010912ea48>] __do_syscall+0x1e8/0x210
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666889] [<000000010913e342>] system_call+0x82/0xb0
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666892] Mem-Info:
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666894] active_anon:23 inactive_anon:15 isolated_anon:0
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666894] active_file:23 inactive_file:0 isolated_file:0
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666894] unevictable:2455507 dirty:0 writeback:5
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666894] slab_reclaimable:27315 slab_unreclaimable:26081
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666894] mapped:2807 shmem:256 pagetables:14370 bounce:0
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666894] kernel_misc_reclaimable:0
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666894] free:13526 free_pcp:0 free_cma:0
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666899] Node 0 active_anon:92kB inactive_anon:60kB active_file:92kB inactive_file:0kB unevictable:9822028kB isolated(anon):0kB isolated(file):0kB mapped:11228kB dirty:0kB writeback:20kB shmem:1024kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:5888kB pagetables:57480kB all_unreclaimable? no
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666905] Node 0 DMA free:36132kB boost:0kB min:4560kB low:6628kB high:8696kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:2052340kB writepending:0kB present:2097152kB managed:2097068kB mlocked:2052340kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666911] lowmem_reserve[]: 0 7972 7972
Nov 2 11:50:57 hwe0008 kernel: [ 5953.666915] Node 0 Normal free:17972kB boost:1024kB min:18988kB low:27144kB high:35300kB reserved_highatomic:0KB active_anon:372kB inactive_anon:60kB active_file:0kB inactive_file:332kB unevictable:7769824kB writepending:192kB present:8388608kB managed:8164076kB mlocked:7769824kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB

It looks like mlock is causing the OOM killer to kick in, but this test should not trigger it (or at least is not expected to, based on what the mlock man page says).
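
For illustration, the code path in the trace (mlockall() walking all current mappings and faulting their pages in via __mm_populate) can be exercised with a small standalone program; the mapping size below is arbitrary and only meant to show the mechanism, not what stress-ng itself does:

/* Minimal sketch: mlockall() forces population of all current (and, with
 * MCL_FUTURE, future) mappings, so a large anonymous mapping becomes
 * unevictable. Size is illustrative only. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 1UL << 30;   /* 1 GiB, hypothetical; adjust to taste */

    /* Anonymous mapping; pages are not yet backed by physical memory. */
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        perror("mmap");
        return EXIT_FAILURE;
    }

    /* mlockall(MCL_CURRENT) walks every existing mapping and faults its
     * pages in (the __mm_populate step in the trace above), making them
     * unevictable. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return EXIT_FAILURE;
    }

    memset(buf, 0xaa, len);   /* pages are already resident and locked */
    munlockall();
    munmap(buf, len);
    return EXIT_SUCCESS;
}

Run enough copies of this concurrently (or pick a size close to total RAM) and the populate step presumably ends in the same kind of OOM.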

This does not seem to happen on jammy/22.04.1 running on the same system(s).

(A separate CPU stress run seems to be fine.)

I've attached the full syslog.

Tags: s390x
Revision history for this message
Frank Heimes (fheimes) wrote :

On top of that, I've opened an upstream issue: https://github.com/ColinIanKing/stress-ng/issues/243

Revision history for this message
Colin Ian King (colin-king) wrote :

The kernel stack trace shows that this occurs when mmap populates the pages, that is, when the mmap'd range is requested to be backed by physical memory. The OOM is not occurring at the point where mlock is called. This behavior is expected: pages are mmap'd and locked by many of the stressor instances, leaving little physical memory free. Some stressors get unlucky; there is no physical memory left to satisfy their mmap + populate request, which causes an OOM.
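
To make that failure mode concrete, here is a rough sketch (worker count and chunk size are hypothetical, not stress-ng's actual values) of several processes each mapping and locking memory up front; once the earlier workers have pinned most of RAM, a later worker's populate step can only be satisfied by invoking the OOM killer:

/* Sketch (hypothetical sizes/worker count): several workers each mmap a
 * chunk with MAP_POPULATE | MAP_LOCKED, so the pages must be faulted in and
 * pinned at mmap time. Once earlier workers have locked most of RAM, a later
 * worker's populate has nowhere to get pages from. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define WORKERS    8              /* hypothetical */
#define CHUNK_SIZE (1UL << 30)    /* 1 GiB per worker, hypothetical */

static void worker(void)
{
    /* MAP_POPULATE backs the range with physical pages immediately;
     * MAP_LOCKED keeps them unevictable, much like mlock() would. */
    void *p = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE | MAP_LOCKED,
                   -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        _exit(EXIT_FAILURE);
    }
    sleep(30);                    /* hold the locked memory for a while */
    munmap(p, CHUNK_SIZE);
    _exit(EXIT_SUCCESS);
}

int main(void)
{
    for (int i = 0; i < WORKERS; i++) {
        if (fork() == 0)
            worker();
    }
    while (wait(NULL) > 0)
        ;                         /* reap children */
    return 0;
}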

Changed in stress-ng:
status: New → Won't Fix
assignee: nobody → Colin Ian King (colin-king)
importance: Undecided → High
Changed in stress-ng:
importance: High → Low
Changed in stress-ng (Ubuntu):
status: New → Won't Fix