test_maps test from ubuntu_bpf will cause OOM on Eoan s390x LPAR

Bug #1856163 reported by Po-Hsu Lin
This bug affects 1 person
Affects                  Status     Importance  Assigned to  Milestone
Ubuntu on IBM z Systems  Invalid    Undecided   Unassigned
ubuntu-kernel-tests      Won't Fix  Undecided   Unassigned
linux (Ubuntu)           Won't Fix  Undecided   Unassigned
Eoan                     Won't Fix  Undecided   Unassigned

Bug Description

This issue can also be reproduced with the 5.3.0-24 kernel, so it is not a regression introduced in this cycle.

Reproduction rate: 3/3

On the s390x LPAR (s2lp4), the test_maps test from ubuntu_bpf triggers the OOM killer, and the tests that follow it are aborted.

[ 591.755446] test_maps invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[ 591.755451] CPU: 1 PID: 19054 Comm: test_maps Tainted: P W O 5.3.0-25-generic #27-Ubuntu
[ 591.755452] Hardware name: IBM 2964 N63 400 (LPAR)
[ 591.755453] Call Trace:
[ 591.755460] ([<000000023d11b98e>] show_stack+0x8e/0xd0)
[ 591.755464] [<000000023d995f6a>] dump_stack+0x8a/0xb8
[ 591.755469] [<000000023d30e082>] dump_header+0x62/0x250
[ 591.755470] [<000000023d30d152>] oom_kill_process+0x172/0x178
[ 591.755472] [<000000023d30d322>] out_of_memory.part.0+0x1ca/0x4e0
[ 591.755473] [<000000023d30df1e>] out_of_memory+0x6e/0xf8
[ 591.755477] [<000000023d36af02>] __alloc_pages_slowpath+0xda2/0xeb0
[ 591.755478] [<000000023d36b2b6>] __alloc_pages_nodemask+0x2a6/0x318
[ 591.755481] [<000000023d387d6c>] alloc_pages_vma+0x104/0x1d8
[ 591.755482] [<000000023d372ca4>] __read_swap_cache_async+0x18c/0x268
[ 591.755483] [<000000023d372daa>] read_swap_cache_async+0x2a/0x60
[ 591.755485] [<000000023d37300c>] swap_cluster_readahead+0x22c/0x2e8
[ 591.755486] [<000000023d3734f8>] swapin_readahead+0x2d0/0x408
[ 591.755488] [<000000023d34a14c>] do_swap_page+0x1f4/0x880
[ 591.755489] [<000000023d34bf2c>] __handle_mm_fault+0x7f4/0x910
[ 591.755491] [<000000023d34c10e>] handle_mm_fault+0xc6/0x1a0
[ 591.755492] [<000000023d12c2bc>] do_exception+0x12c/0x3e0
[ 591.755494] [<000000023d12d122>] do_dat_exception+0x2a/0x58
[ 591.755497] [<000000023d9b67f0>] pgm_check_handler+0x1cc/0x220
[ 591.755497] Mem-Info:
[ 591.755501] active_anon:0 inactive_anon:33 isolated_anon:6
                active_file:59 inactive_file:93 isolated_file:3
                unevictable:6786 dirty:0 writeback:2 unstable:0
                slab_reclaimable:27072 slab_unreclaimable:44918
                mapped:3670 shmem:0 pagetables:12555 bounce:0
                free:19327 free_pcp:356 free_cma:0
[ 591.755505] Node 0 active_anon:0kB inactive_anon:132kB active_file:236kB inactive_file:372kB unevictable:27144kB isolated(anon):24kB isolated(file):12kB mapped:14680kB dirty:0kB writeback:8kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[ 591.755505] Node 0 DMA free:57800kB min:2876kB low:4944kB high:7012kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:2097152kB managed:2097056kB mlocked:0kB kernel_stack:496kB pagetables:4964kB bounce:0kB free_pcp:620kB local_pcp:248kB free_cma:0kB
[ 591.755509] lowmem_reserve[]: 0 13806 13806
[ 591.755511] Node 0 Normal free:19508kB min:19648kB low:33784kB high:47920kB active_anon:0kB inactive_anon:132kB active_file:0kB inactive_file:576kB unevictable:27144kB writepending:8kB present:14680064kB managed:14141200kB mlocked:27144kB kernel_stack:25040kB pagetables:45256kB bounce:0kB free_pcp:804kB local_pcp:120kB free_cma:0kB
[ 591.755515] lowmem_reserve[]: 0 0 0
[ 591.755516] Node 0 DMA: 1*4kB (M) 5*8kB (UME) 6*16kB (UME) 2*32kB (M) 4*64kB (UME) 4*128kB (ME) 5*256kB (UME) 4*512kB (ME) 52*1024kB (UM) = 57548kB
[ 591.755525] Node 0 Normal: 244*4kB (UME) 198*8kB (UMEH) 94*16kB (UMEH) 149*32kB (MEH) 31*64kB (UME) 10*128kB (UMEH) 2*256kB (MH) 2*512kB (MH) 5*1024kB (UM) = 18752kB
[ 591.755534] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1024kB
[ 591.755535] 3724 total pagecache pages
[ 591.755536] 74 pages in swap cache
[ 591.755537] Swap cache stats: add 98390, delete 98316, find 11531/42289
[ 591.755538] Free swap = 935816kB
[ 591.755538] Total swap = 1001096kB
[ 591.755539] 4194304 pages RAM
[ 591.755540] 0 pages HighMem/MovableOnly
[ 591.755540] 134740 pages reserved
[ 591.755541] 0 pages cma reserved
[ 591.755542] Unreclaimable slab info:
[ 591.755542] Name Used Total
[ 591.755614] nf_conntrack 128KB 128KB
[ 591.755680] abd_t 3KB 3KB
[ 591.755682] zio_data_buf_16384 64KB 64KB
[ 591.755683] zio_buf_16384 32KB 32KB
[ 591.755695] mod_hash_entries 3KB 3KB
[ 591.755697] spl_vn_file_cache 8KB 8KB
[ 591.755699] spl_vn_cache 8KB 8KB
[ 591.755740] scsi_sense_cache 1592KB 1592KB
[ 591.755741] qeth_buf 15KB 15KB
[ 591.755743] qdio_q 63KB 63KB
[ 591.755746] RAWv6 843KB 1000KB
[ 591.755747] TCPv6 90KB 90KB
[ 591.755749] mqueue_inode_cache 64KB 64KB
[ 591.755750] fuse_request 47KB 47KB
[ 591.755754] kioctx 126KB 126KB
[ 591.755755] dnotify_struct 4KB 4KB
[ 591.755756] pid_namespace 15KB 15KB
[ 591.755758] posix_timers_cache 15KB 15KB
[ 591.755759] UNIX 157KB 157KB
[ 591.755762] RAW 1152KB 1312KB
[ 591.755763] tw_sock_TCP 63KB 63KB
[ 591.755765] request_sock_TCP 31KB 31KB
[ 591.755766] TCP 126KB 126KB
[ 591.755767] hugetlbfs_inode_cache 63KB 63KB
[ 591.755769] eventpoll_pwq 23KB 23KB
[ 591.755770] PCI_DMA_region_tables 128KB 128KB
[ 591.755771] request_queue 122KB 122KB
[ 591.755772] blkdev_ioc 31KB 31KB
[ 591.755775] biovec-max 592KB 704KB
[ 591.755776] biovec-128 320KB 320KB
[ 591.755777] khugepaged_mm_slot 3KB 3KB
[ 591.755779] ksm_rmap_item 252KB 272KB
[ 591.755781] skbuff_ext_cache 32KB 32KB
[ 591.755783] skbuff_head_cache 432KB 496KB
[ 591.755784] configfs_dir_cache 3KB 3KB
[ 591.755785] file_lock_cache 63KB 63KB
[ 591.755786] fsnotify_mark_connector 16KB 16KB
[ 591.755787] net_namespace 56KB 56KB
[ 591.755789] task_delay_info 127KB 127KB
[ 591.755790] taskstats 63KB 63KB
[ 591.755791] proc_dir_entry 141KB 141KB
[ 591.755792] pde_opener 15KB 15KB
[ 591.755793] shmem_inode_cache 727KB 727KB
[ 591.755794] kernfs_iattrs_cache 31KB 31KB
[ 591.755923] kernfs_node_cache 14837KB 16391KB
[ 591.755924] mnt_cache 768KB 768KB
[ 591.755926] names_cache 288KB 352KB
[ 591.755927] uts_namespace 95KB 95KB
[ 591.755929] nsproxy 15KB 15KB
[ 591.755930] vm_area_struct 1124KB 1124KB
[ 591.755931] mm_struct 160KB 160KB
[ 591.755932] files_cache 416KB 416KB
[ 591.755933] signal_cache 1875KB 1875KB
[ 591.755934] sighand_cache 3339KB 3339KB
[ 591.755936] task_struct 6234KB 6247KB
[ 591.755941] cred_jar 766KB 1008KB
[ 591.755943] anon_vma_chain 292KB 292KB
[ 591.755944] anon_vma 284KB 284KB
[ 591.755945] pid 176KB 176KB
[ 591.755946] numa_policy 79KB 79KB
[ 591.755947] ftrace_event_field 187KB 187KB
[ 591.755948] pool_workqueue 704KB 704KB
[ 591.755949] task_group 352KB 352KB
[ 591.755970] vmap_area 714KB 921KB
[ 591.755971] dma-kmalloc-8k 64KB 64KB
[ 591.755974] dma-kmalloc-1k 1106KB 1376KB
[ 591.755976] dma-kmalloc-512 690KB 864KB
[ 591.755978] dma-kmalloc-256 32KB 32KB
[ 591.755979] dma-kmalloc-128 32KB 32KB
[ 591.755981] dma-kmalloc-64 63KB 108KB
[ 591.755982] dma-kmalloc-16 4KB 4KB
[ 591.755983] dma-kmalloc-8 4KB 4KB
[ 591.755985] dma-kmalloc-192 23KB 23KB
[ 591.755986] dma-kmalloc-96 7KB 7KB
[ 591.755989] kmalloc-8k 5984KB 6144KB
[ 591.755991] kmalloc-4k 5188KB 5248KB
[ 591.755999] kmalloc-2k 6284KB 6864KB
[ 591.756000] kmalloc-1k 53600KB 53600KB
[ 591.756003] kmalloc-512 2307KB 2368KB
[ 591.756004] kmalloc-256 1328KB 1328KB
[ 591.756005] kmalloc-192 1874KB 1874KB
[ 591.756006] kmalloc-128 232KB 232KB
[ 591.756007] kmalloc-96 1382KB 1382KB
[ 591.756010] kmalloc-64 1037KB 1044KB
[ 591.756012] kmalloc-32 890KB 900KB
[ 591.756015] kmalloc-16 337KB 356KB
[ 591.756016] kmalloc-8 256KB 256KB
[ 591.756017] kmem_cache_node 116KB 116KB
[ 591.756018] kmem_cache 928KB 928KB
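
For anyone triaging the slab table above, here is a minimal sketch that ranks the "Unreclaimable slab info" entries by total size. The function name `top_slabs` and the `SAMPLE` text are only illustrative (the sample lines are copied from the log above), and the script assumes the `[ timestamp] name usedKB totalKB` line layout shown there. In this excerpt, kmalloc-1k (53600KB) is the largest cache listed.

```python
import re

# Matches lines such as "[ 591.756000] kmalloc-1k 53600KB 53600KB".
# The header line "Name Used Total" does not match and is skipped.
SLAB_LINE = re.compile(r"^\[\s*\d+\.\d+\]\s+(\S+)\s+(\d+)KB\s+(\d+)KB\s*$")


def top_slabs(dmesg_text, limit=5):
    """Return the `limit` largest caches as (name, used_kb, total_kb) tuples."""
    rows = []
    for line in dmesg_text.splitlines():
        match = SLAB_LINE.match(line.strip())
        if match:
            name, used, total = match.groups()
            rows.append((name, int(used), int(total)))
    # Sort by the "Total" column, largest first.
    return sorted(rows, key=lambda row: row[2], reverse=True)[:limit]


# Illustrative sample: a few lines copied from the OOM report above.
SAMPLE = """\
[ 591.755542] Name Used Total
[ 591.755923] kernfs_node_cache 14837KB 16391KB
[ 591.755936] task_struct 6234KB 6247KB
[ 591.755999] kmalloc-2k 6284KB 6864KB
[ 591.756000] kmalloc-1k 53600KB 53600KB
"""

if __name__ == "__main__":
    for name, used, total in top_slabs(SAMPLE, limit=3):
        print(f"{name:20s} used={used}KB total={total}KB")
```

To run it against a full log, read the saved dmesg file into a string and pass it to `top_slabs` in place of `SAMPLE`.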

The complete dmesg output captured after running the test_maps test is attached.
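
When going through a saved dmesg, one quick check is whether any zone reported free memory below its `min` watermark at the time of the OOM. In the excerpt in this description, the Normal zone shows free:19508kB against min:19648kB. The following is a rough sketch of that check; `zones_below_min` and `SAMPLE` are hypothetical names, the sample lines are shortened copies of the per-zone lines from the log, and the regular expression assumes the `Node N ZONE free:...kB min:...kB low:...kB high:...kB` layout seen there.

```python
import re

# Matches the per-zone summary lines, e.g.
# "Node 0 Normal free:19508kB min:19648kB low:33784kB high:47920kB ..."
ZONE_LINE = re.compile(
    r"Node (\d+) (\w+) free:(\d+)kB min:(\d+)kB low:(\d+)kB high:(\d+)kB"
)


def zones_below_min(dmesg_text):
    """Return (node, zone, free_kb, min_kb) for zones with free < min."""
    hits = []
    for match in ZONE_LINE.finditer(dmesg_text):
        node, zone, free_kb, min_kb = match.group(1, 2, 3, 4)
        if int(free_kb) < int(min_kb):
            hits.append((int(node), zone, int(free_kb), int(min_kb)))
    return hits


# Illustrative sample: the two zone lines from the report, truncated
# after the watermark fields.
SAMPLE = (
    "[ 591.755505] Node 0 DMA free:57800kB min:2876kB low:4944kB high:7012kB\n"
    "[ 591.755511] Node 0 Normal free:19508kB min:19648kB low:33784kB high:47920kB\n"
)

if __name__ == "__main__":
    for node, zone, free_kb, min_kb in zones_below_min(SAMPLE):
        print(f"Node {node} {zone}: free {free_kb}kB is below min {min_kb}kB")
```

This only compares `free` with `min`; it does not take `lowmem_reserve` into account, so it is a first-pass filter rather than a full model of the allocator's decision.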

ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: linux-image-5.3.0-25-generic 5.3.0-25.27
ProcVersionSignature: Ubuntu 5.3.0-25.27-generic 5.3.13
Uname: Linux 5.3.0-25-generic s390x
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.11-0ubuntu8.3
Architecture: s390x
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
Date: Thu Dec 12 04:44:29 2019
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb: Error: command ['lsusb'] failed with exit code 1:
PciMultimedia:

ProcFB:

ProcKernelCmdLine: root=UUID=d15a5734-c073-4c9a-80ad-65a914214bb8 crashkernel=196M BOOT_IMAGE=1
RelatedPackageVersions:
 linux-restricted-modules-5.3.0-25-generic N/A
 linux-backports-modules-5.3.0-25-generic N/A
 linux-firmware 1.183.2
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)

Po-Hsu Lin (cypressyew) wrote :
Po-Hsu Lin (cypressyew)
description: updated
tags: added: 5.3
tags: added: ubuntu-bpf
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu Eoan):
status: New → Confirmed
Po-Hsu Lin (cypressyew) wrote :

It looks like the test triggers the OOM killer, but this time it was autotest that got killed.

I tried the older 5.3.0-24 kernel: test_maps passed once, but the OOM error message could still be seen.

Po-Hsu Lin (cypressyew) wrote :

This can be reproduced on the Eoan s390x LPAR with 5.3.0-28-generic, in 4 out of 4 attempts.

If you are connected to the node over SSH when this happens, you will be disconnected. This is why the test got interrupted on Jenkins.

tags: added: sru-20200106
Po-Hsu Lin (cypressyew)
tags: added: sru-20200217
Sean Feole (sfeole)
Changed in ubuntu-kernel-tests:
status: New → Triaged
Sean Feole (sfeole) wrote :

Confirmed. I was able to reproduce this using the kernel stated in the bug. Networking also appears to be broken after the failure.

transaction with reduced feature level UDP.
Mar 3 02:55:01 s2lp4 CRON[4294]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Mar 3 02:55:20 s2lp4 systemd[1]: Started Session 11 of user ubuntu.
Mar 3 02:56:48 s2lp4 systemd[1]: Started Session 12 of user ubuntu.
Mar 3 02:58:26 s2lp4 kernel: [ 320.392257]
Mar 3 02:58:26 s2lp4 kernel: [ 320.392258] **********************************************************
Mar 3 02:58:26 s2lp4 kernel: [ 320.392259] ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE **
Mar 3 02:58:26 s2lp4 kernel: [ 320.392259] ** **
Mar 3 02:58:26 s2lp4 kernel: [ 320.392260] ** trace_printk() being used. Allocating extra memory. **
Mar 3 02:58:26 s2lp4 kernel: [ 320.392260] ** **
Mar 3 02:58:26 s2lp4 kernel: [ 320.392261] ** This means that this is a DEBUG kernel and it is **
Mar 3 02:58:26 s2lp4 kernel: [ 320.392261] ** unsafe for production use. **
Mar 3 02:58:26 s2lp4 kernel: [ 320.392262] ** **
Mar 3 02:58:26 s2lp4 kernel: [ 320.392263] ** If you see this message and you are not debugging **
Mar 3 02:58:26 s2lp4 kernel: [ 320.392263] ** the kernel, report this immediately to your vendor! **
Mar 3 02:58:26 s2lp4 kernel: [ 320.392264] ** **
Mar 3 02:58:26 s2lp4 kernel: [ 320.392264] ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE **
Mar 3 02:58:26 s2lp4 kernel: [ 320.392265] **********************************************************

Mar 3 02:58:37 s2lp4 kernel: [ 330.840545] sshd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0

Connection to 10.245.80.42 closed by remote host.
Connection to 10.245.80.42 closed.
sfeole@bsg75:~$ ssh ubuntu@10.245.80.42
ssh_exchange_identification: read: Connection reset by peer

4: enP1p0s0d1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000

Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: New → Confirmed
Po-Hsu Lin (cypressyew) wrote :

I increased the free disk space on / of s2lp4 from 6.6G to 8.2G, and this time the sru_misc test set passed.

http://10.246.72.4:8080/view/sut-s2lp4/job/sru-misc__E_s390x.LPAR-generic__using_s2lp4__for_kernel/lastSuccessfulBuild/

I will give it another try.

Po-Hsu Lin (cypressyew) wrote :

The test did not pass on the second attempt (run #11).

Sean Feole (sfeole) wrote :

Eoan is EOL and no longer part of the usual round of testing. Closing this bug.

Changed in linux (Ubuntu Eoan):
status: Confirmed → Won't Fix
Changed in linux (Ubuntu):
status: Confirmed → Won't Fix
Changed in ubuntu-kernel-tests:
status: Triaged → Won't Fix
Changed in ubuntu-z-systems:
status: Confirmed → Invalid