move_pages12 test from ubuntu_ltp_syscalls failed on X/B/D

Bug #1831043 reported by Po-Hsu Lin
This bug affects 2 people
Affects                Status     Importance  Assigned to  Milestone
ubuntu-kernel-tests    Triaged    Undecided   Unassigned
linux (Ubuntu)         Confirmed  Undecided   Unassigned
  Xenial               Confirmed  Undecided   Unassigned
  Bionic               Confirmed  Undecided   Unassigned
  Disco                Won't Fix  Undecided   Unassigned
linux-azure (Ubuntu)   Confirmed  Undecided   Unassigned
  Xenial               Confirmed  Undecided   Unassigned
  Bionic               Confirmed  Undecided   Unassigned
  Disco                Won't Fix  Undecided   Unassigned

Bug Description

This is a new test case that landed 8 days ago. We already have the patch in B, yet the test fails, though not on all nodes:
    move_pages12.c:114: FAIL: move_pages failed: ENOMEM

<<<test_start>>>
tag=move_pages12 stime=1559209337
cmdline="move_pages12"
contacts=""
analysis=exit
<<<test_output>>>
incrementing stop
tst_test.c:1096: INFO: Timeout per run is 0h 05m 00s
move_pages12.c:235: INFO: Free RAM 31883452 kB
move_pages12.c:253: INFO: Increasing 2048kB hugepages pool on node 0 to 4
move_pages12.c:263: INFO: Increasing 2048kB hugepages pool on node 1 to 4
move_pages12.c:179: INFO: Allocating and freeing 4 hugepages on node 0
move_pages12.c:179: INFO: Allocating and freeing 4 hugepages on node 1
move_pages12.c:169: PASS: Bug not reproduced
move_pages12.c:114: FAIL: move_pages failed: ENOMEM
move_pages12.c:81: FAIL: madvise failed: SUCCESS
move_pages12.c:81: FAIL: madvise failed: SUCCESS
....
move_pages12.c:81: FAIL: madvise failed: SUCCESS
move_pages12.c:81: FAIL: madvise failed: SUCCESS

Summary:
passed 1
failed 998
skipped 0
warnings 0
<<<execution_status>>>
initiation_status="ok"
duration=4 termination_type=exited termination_id=1 corefile=no
cutime=93 cstime=474
<<<test_end>>>
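
Since the failure signature varies between reports (ENOMEM, SUCCESS, EIO and others), it can help to tally the FAIL lines of a log by syscall and errno name before comparing runs. The following is only a minimal sketch: the function name `tally_failures` and the regular expression are illustrative assumptions, not part of LTP or of the test harness. It accepts both the `FAIL` and `TFAIL` spellings and ignores any prefix the harness puts before the file name.

```python
import re
from collections import Counter

# Hypothetical pattern for LTP result lines such as
#   move_pages12.c:114: FAIL: move_pages failed: ENOMEM
#   move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
RESULT_RE = re.compile(
    r"(?P<file>\w+\.c):(?P<line>\d+): T?FAIL: (?P<call>\w+) failed: (?P<err>\w+)"
)


def tally_failures(log_text):
    """Count FAIL/TFAIL lines in log_text, keyed by (syscall, errno name)."""
    counts = Counter()
    for match in RESULT_RE.finditer(log_text):
        counts[(match.group("call"), match.group("err"))] += 1
    return counts


# A few lines copied from the report above, used as sample input.
SAMPLE = """\
move_pages12.c:169: PASS: Bug not reproduced
move_pages12.c:114: FAIL: move_pages failed: ENOMEM
move_pages12.c:81: FAIL: madvise failed: SUCCESS
move_pages12.c:81: FAIL: madvise failed: SUCCESS
"""

if __name__ == "__main__":
    for (call, err), n in sorted(tally_failures(SAMPLE).items()):
        print(f"{call}: {err} x{n}")
```

Run against a full console log, this gives one line per distinct failure signature, which makes it easier to see whether two machines fail in the same way.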

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-50-generic 4.15.0-50.54
ProcVersionSignature: Ubuntu 4.15.0-50.54-generic 4.15.18
Uname: Linux 4.15.0-50-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 May 30 09:29 seq
 crw-rw---- 1 root audio 116, 33 May 30 09:29 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Thu May 30 09:42:46 2019
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
MachineType: HP ProLiant DL360 Gen9
PciMultimedia:

ProcFB: 0 mgadrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-50-generic root=UUID=6422cfdd-2a69-4c0b-9784-6809a77ab980 ro
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-50-generic N/A
 linux-backports-modules-4.15.0-50-generic N/A
 linux-firmware 1.173.6
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/25/2017
dmi.bios.vendor: HP
dmi.bios.version: P89
dmi.board.name: ProLiant DL360 Gen9
dmi.board.vendor: HP
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP89:bd04/25/2017:svnHP:pnProLiantDL360Gen9:pvr:rvnHP:rnProLiantDL360Gen9:rvr:cvnHP:ct23:cvr:
dmi.product.family: ProLiant
dmi.product.name: ProLiant DL360 Gen9
dmi.sys.vendor: HP

Po-Hsu Lin (cypressyew) wrote :

Test passed on Cosmic.

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu Bionic):
status: New → Confirmed
Po-Hsu Lin (cypressyew)
description: updated
Revision history for this message
Po-Hsu Lin (cypressyew) wrote : Re: move_pages12 test from ubuntu_ltp_syscalls failed on B

Issue found on a P8 node with 4.15 kernel on Xenial as well.

tags: added: ppc64el
tags: added: ubuntu-ltp-syscalls
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Issue found with 4.4 kernel on Trusty, only spotted on a Power8 node.

tags: added: 4.15 4.4 sru-20190701
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

And Xenial 4.4, with the Power8 node as well.

Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

The error message on the Xenial kernel is a bit different. It does not report ENOMEM, only:
move_pages12.c:81: FAIL: madvise failed: SUCCESS

Brad Figg (brad-figg)
tags: added: ubuntu-certified
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Of the 18 AWS test nodes running B-5.0, this fails only on i3.metal and i3.en

Revision history for this message
Po-Hsu Lin (cypressyew) wrote : Re: move_pages12 test from ubuntu_ltp_syscalls failed on X/B

In this cycle it's failing with the following message on X-ppc:
    FAIL: madvise failed: EIO

summary: - move_pages12 test from ubuntu_ltp_syscalls failed on B
+ move_pages12 test from ubuntu_ltp_syscalls failed on X/B
tags: added: xenial
Po-Hsu Lin (cypressyew)
summary: - move_pages12 test from ubuntu_ltp_syscalls failed on X/B
+ move_pages12 test from ubuntu_ltp_syscalls failed on X/B/D
tags: added: disco
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

P9 Disco is failing with:
 tag=move_pages12 stime=1568942645 dur=0 exit=exited stat=2 core=no cu=11 cs=113
 startup='Fri Sep 20 01:24:05 2019'
 tst_test.c:1118: INFO: Timeout per run is 0h 05m 00s
 move_pages12.c:263: INFO: Free RAM 111047168 kB
 move_pages12.c:281: INFO: Increasing 2048kB hugepages pool on node 0 to 4
 move_pages12.c:291: INFO: Increasing 2048kB hugepages pool on node 8 to 4
 move_pages12.c:207: INFO: Allocating and freeing 4 hugepages on node 0
 move_pages12.c:207: INFO: Allocating and freeing 4 hugepages on node 8
 move_pages12.c:197: PASS: Bug not reproduced
 tst_test.c:1163: BROK: Test killed by SIGBUS!

 Summary:
 passed 1
 failed 0
 skipped 0
 warnings 0
 move_pages12.c:131: FAIL: move_pages failed: EINVAL (22)

Po-Hsu Lin (cypressyew)
tags: added: sru-20190902
Po-Hsu Lin (cypressyew)
tags: added: sru-20191111
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

For X-Azure 4.15, it does not happen on all instances. Spotted on:
* Standard_DS15_v2
* Standard_D48_v3

Sean Feole (sfeole)
tags: added: sru-20191202
Revision history for this message
Sean Feole (sfeole) wrote :

12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:263: INFO: Free RAM 260417808 kB
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:281: INFO: Increasing 2048kB hugepages pool on node 0 to 4
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:291: INFO: Increasing 2048kB hugepages pool on node 1 to 4
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:207: INFO: Allocating and freeing 4 hugepages on node 0
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:207: INFO: Allocating and freeing 4 hugepages on node 1
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:197: PASS: Bug not reproduced
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)
12/09 20:34:48 DEBUG| utils:0153| [stdout] move_pages12.c:95: FAIL: madvise failed: EIO (5)

Changed in ubuntu-kernel-tests:
status: New → Triaged
Revision history for this message
Sean Feole (sfeole) wrote :

Also found on Bionic ARM64 (4.15.0-73.82-generic); the console output is in the previous comment.

Sean Feole (sfeole)
tags: added: sru-20200106
Po-Hsu Lin (cypressyew)
tags: added: sru-20200127
Sean Feole (sfeole)
tags: added: sru-20200217
Steve Langasek (vorlon)
Changed in linux (Ubuntu Disco):
status: New → Won't Fix
Changed in linux-azure (Ubuntu Disco):
status: New → Won't Fix
tags: added: sru-20200921
Revision history for this message
Kelsey Steele (kelsey-steele) wrote :

failed on Bionic/azure-4.15 : 4.15.0-1097.107 : amd64

tags: added: azure
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Still visible in Xenial 4.4.0-1118.132 AWS c5.metal

tags: added: sru-20201109
tags: added: aws
Po-Hsu Lin (cypressyew)
tags: added: oracle
Revision history for this message
Kleber Sacilotto de Souza (kleber-souza) wrote :

Failed on xenial/linux-oracle: 4.15.0-1069.77~16.04.1 for sru-20210315

tags: added: sru-20210315
tags: added: aws-hwe sru-20210412
Revision history for this message
Krzysztof Kozlowski (krzk) wrote :

On bionic/azure 4.15.0-1122.135 and bionic/azure-fips 4.15.0-2034.38, only the Azure instance Standard_D48_v3 fails, with a slightly different error message:
04:06:33 DEBUG| [stdout] move_pages12.c:274: TINFO: Free RAM 194497904 kB
04:06:33 DEBUG| [stdout] move_pages12.c:292: TINFO: Increasing 2048kB hugepages pool on node 0 to 4
04:06:33 DEBUG| [stdout] move_pages12.c:302: TINFO: Increasing 2048kB hugepages pool on node 1 to 4
04:06:33 DEBUG| [stdout] move_pages12.c:218: TINFO: Allocating and freeing 4 hugepages on node 0
04:06:33 DEBUG| [stdout] move_pages12.c:218: TINFO: Allocating and freeing 4 hugepages on node 1
04:06:33 DEBUG| [stdout] move_pages12.c:208: TPASS: Bug not reproduced
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
04:06:33 DEBUG| [stdout] move_pages...


tags: added: fips-test sru-20210719
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Spotted on Azure cloud with instance "Standard_D48_v3" on T-azure-4.15.0-1124.137~14.04.1

Note that sometimes the test is marked as skipped on this instance with:
  move_pages_support.c:407: TCONF: at least 2 allowed NUMA nodes are required
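
To see quickly whether a given instance exposes at least two NUMA nodes, and what its per-node 2048kB hugepage pools look like, the Linux sysfs tree can be inspected directly. The sketch below is an approximation under stated assumptions: it lists `nodeN` directories under `/sys/devices/system/node`, whereas the TCONF message refers to *allowed* nodes, which also depend on the calling process's cpuset and memory policy. The helper names `numa_node_ids` and `hugepage_pool` are hypothetical and are not taken from LTP.

```python
import os
import re

SYSFS_NODE_ROOT = "/sys/devices/system/node"


def numa_node_ids(root=SYSFS_NODE_ROOT):
    """Return the sorted ids of all nodeN directories found under root."""
    try:
        names = os.listdir(root)
    except OSError:
        return []  # no NUMA sysfs tree available (or not Linux)
    matches = (re.fullmatch(r"node(\d+)", name) for name in names)
    return sorted(int(m.group(1)) for m in matches if m)


def hugepage_pool(node, size_kb=2048, root=SYSFS_NODE_ROOT):
    """Return the per-node hugepage pool size, or None if it cannot be read."""
    path = os.path.join(root, f"node{node}", "hugepages",
                        f"hugepages-{size_kb}kB", "nr_hugepages")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    nodes = numa_node_ids()
    print(f"NUMA nodes: {nodes}")
    if len(nodes) < 2:
        print("fewer than 2 nodes visible; the test would likely be skipped")
    for node in nodes:
        print(f"node {node}: {hugepage_pool(node)} x 2048kB hugepages")
```

Running this right before the test on an affected instance may help tell apart runs where the second node is not visible (the skipped case) from runs where both nodes are present and the test fails.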

Test failed with:
move_pages12.c:274: TINFO: Free RAM 195061872 kB
move_pages12.c:292: TINFO: Increasing 2048kB hugepages pool on node 0 to 4
move_pages12.c:302: TINFO: Increasing 2048kB hugepages pool on node 1 to 4
move_pages12.c:218: TINFO: Allocating and freeing 4 hugepages on node 0
move_pages12.c:218: TINFO: Allocating and freeing 4 hugepages on node 1
move_pages12.c:208: TPASS: Bug not reproduced
....
move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
move_pages12.c:106: TFAIL: madvise failed: ENOMEM (12)
move_pages12.c:208: TPASS: Bug not reproduced

HINT: You _MAY_ be missing kernel fixes, see:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e66f17ff7177
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c9d398fa2378
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4643d67e8cb0
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f4b815a439a

Summary:
passed 2
failed 876
broken 0
skipped 0
warnings 0
INFO: ltp-pan reported some tests FAIL

tags: added: sru-20210906
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Xenial):
status: New → Confirmed
Changed in linux-azure (Ubuntu Bionic):
status: New → Confirmed
Changed in linux-azure (Ubuntu Xenial):
status: New → Confirmed
Changed in linux-azure (Ubuntu):
status: New → Confirmed
Revision history for this message
Krzysztof Kozlowski (krzk) wrote :

Found on 2021.09.27/bionic/linux-azure-fips/4.15.0-2037.41, only on instance Standard_D48_v3

Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Failed on T-aws-4.4 with:
Running tests.......
tst_test.c:1313: TINFO: Timeout per run is 0h 05m 00s
move_pages12.c:274: TINFO: Free RAM 525748832 kB
move_pages12.c:292: TINFO: Increasing 2048kB hugepages pool on node 0 to 4
move_pages12.c:302: TINFO: Increasing 2048kB hugepages pool on node 1 to 4
move_pages12.c:218: TINFO: Allocating and freeing 4 hugepages on node 0
move_pages12.c:218: TINFO: Allocating and freeing 4 hugepages on node 1
move_pages12.c:208: TPASS: Bug not reproduced
tst_test.c:1363: TBROK: Test killed by SIGBUS!
move_pages12.c:142: TFAIL: move_pages failed: ESRCH (3)

On:
i3.metal|i3en.24xlarge|r5.metal
