Test in ubuntu_ltp_syscalls / ubuntu_ltp_stable.commands failed with TWARN: ioctl(/dev/loop5, LOOP_CLR_FD, 0) no ENXIO for too long on Google N2D instances

Bug #2012695 reported by Po-Hsu Lin
Affects: ubuntu-kernel-tests
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Issue spotted on:
 * F-gcp-fips 5.4.0-1102.111+fips1
 * F-gcp 5.4.0-1102.111
 * F-gcp-5.15 5.15.0-1031.38~20.04.1
 * F-gke 5.4.0-1096.103
 * F-gke-5.15 5.15.0-1029.34~20.04.1

With n2d-standard-2 and n2d-standard-64 instances only.

Jammy 5.15 gcp / gke kernel does not have this issue.

This is not a regression: I can see this issue as far back as sru-20220509, so it has probably existed since we first added N2D instances to the pool.

The test itself does not fail, but the return value is 1 because of the warning issued (same as bug 2009943).

Output example:
INFO: Test start time: Sat Mar 4 00:54:03 UTC 2023
COMMAND: /opt/ltp/bin/ltp-pan -q -e -S -a 18684 -n 18684 -f /tmp/ltp-3k5fq0dsFO/alltests -l /dev/null -C /dev/null -T /dev/null
LOG File: /dev/null
FAILED COMMAND File: /dev/null
TCONF COMMAND File: /dev/null
Running tests.......
tst_test.c:1526: TINFO: Timeout per run is 0h 00m 30s
tst_device.c:89: TINFO: Found free device 5 '/dev/loop5'
ioctl_loop01.c:85: TPASS: /sys/block/loop5/loop/partscan = 0
ioctl_loop01.c:86: TPASS: /sys/block/loop5/loop/autoclear = 0
ioctl_loop01.c:87: TPASS: /sys/block/loop5/loop/backing_file = '/tmp/ltp-3k5fq0dsFO/ioceZc4p8/test.img'
ioctl_loop01.c:57: TPASS: get expected lo_flag 12
ioctl_loop01.c:59: TPASS: /sys/block/loop5/loop/partscan = 1
ioctl_loop01.c:60: TPASS: /sys/block/loop5/loop/autoclear = 1
ioctl_loop01.c:69: TPASS: access /dev/loop5p1 succeeds
ioctl_loop01.c:75: TPASS: access /sys/block/loop5/loop5p1 succeeds
ioctl_loop01.c:91: TINFO: Test flag can be clear
ioctl_loop01.c:57: TPASS: get expected lo_flag 8
ioctl_loop01.c:59: TPASS: /sys/block/loop5/loop/partscan = 1
ioctl_loop01.c:60: TPASS: /sys/block/loop5/loop/autoclear = 0
ioctl_loop01.c:69: TPASS: access /dev/loop5p1 succeeds
ioctl_loop01.c:75: TPASS: access /sys/block/loop5/loop5p1 succeeds
tst_device.c:255: TWARN: ioctl(/dev/loop5, LOOP_CLR_FD, 0) no ENXIO for too long

Summary:
passed 13
failed 0
broken 0
skipped 0
warnings 1
INFO: ltp-pan reported some tests FAIL
LTP Version: 20220527
INFO: Test end time: Sat Mar 4 00:54:14 UTC 2023
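
The run above is reported as a failure even though nothing failed or broke; only the warnings counter is non-zero. As a rough sketch of how a harness could tell such warning-only runs apart, the snippet below parses the Summary block. The helper names `parse_summary` and `warning_only` are hypothetical and are not part of LTP or ubuntu-kernel-tests.

```python
import re

# Counter names as they appear in the LTP "Summary:" block.
_COUNTER = re.compile(
    r"^[ \t]*(passed|failed|broken|skipped|warnings)[ \t]+(\d+)[ \t]*$",
    re.MULTILINE,
)


def parse_summary(log_text):
    """Return the summary counters found in log_text as a dict of ints."""
    return {key: int(value) for key, value in _COUNTER.findall(log_text)}


def warning_only(counts):
    """True when the run is unclean solely because of warnings."""
    return (
        counts.get("warnings", 0) > 0
        and counts.get("failed", 0) == 0
        and counts.get("broken", 0) == 0
    )
```

Feeding the summary from the output example into `parse_summary` and then `warning_only` would flag it as a warning-only run, which could be triaged separately from real failures.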

Affected test cases in ubuntu_ltp_syscalls are:
 * ioctl_loop01
 * ioctl_loop03
 * ioctl_loop04
 * ioctl_loop05
 * ioctl_loop06
 * ioctl_loop07
 * lchown03
 * lchown03_16
 * linkat02
 * mknod07
 * mknodat02
 * mount01
 * mount04
 * mount06

Affected test cases in ubuntu_ltp_commands are:
(on 5.4)
 * df01_exfat_sh
 * df01_ext2_sh
 * df01_ext3_sh
 * df01_ext4_sh
 * df01_vfat_sh
 * df01_xfs_sh
 * mkfs01_btrfs_sh
 * mkfs01_ext2_sh
 * mkfs01_ext3_sh
 * mkfs01_minix_sh
 * mkfs01_msdos_sh
 * mkfs01_vfat_sh
 * mkfs01_xfs_sh

(on 5.15)
  * df01_exfat_sh
  * df01_ext2_sh
  * df01_ext3_sh
  * df01_ext4_sh
  * df01_ntfs_sh
  * df01_vfat_sh
  * df01_xfs_sh
  * mkfs01_btrfs_sh
  * mkfs01_ext2_sh
  * mkfs01_ext3_sh
  * mkfs01_ext4_sh
  * mkfs01_minix_sh
  * mkfs01_msdos_sh
  * mkfs01_ntfs_sh
  * mkfs01_sh
  * mkfs01_vfat_sh
  * mkfs01_xfs_sh

One example output:
 startup='Sat Mar 18 02:20:51 2023'
 df01 1 TINFO: timeout per run is 0h 5m 0s
 tst_device.c:89: TINFO: Found free device 6 '/dev/loop6'
 df01 1 TCONF: 'mkfs.exfat' not found
 tst_device.c:255: TWARN: ioctl(/dev/loop6, LOOP_CLR_FD, 0) no ENXIO for too long

 Usage: tst_device acquire [size [filename]]
    or: tst_device release /path/to/device

 df01 1 TWARN: Failed to release device '/dev/loop6'
 df01 1 TINFO: AppArmor enabled, this may affect test results
 df01 1 TINFO: it can be disabled with TST_DISABLE_APPARMOR=1 (requires super/root)
 df01 1 TINFO: loaded AppArmor profiles: none

 Summary:
 passed 0
 failed 0
 broken 0
 skipped 1
 warnings 1

Po-Hsu Lin (cypressyew) wrote:

This issue is also affecting ubuntu_ltp_stable tests on N2D instances. Bug description updated.

summary: - ioctl related test in ubuntu_ltp_syscalls failed with TWARN:
- ioctl(/dev/loop5, LOOP_CLR_FD, 0) no ENXIO for too long on Google N2D
- instances
+ Test in ubuntu_ltp_syscalls / ubuntu_ltp_stable.commands failed with
+ TWARN: ioctl(/dev/loop5, LOOP_CLR_FD, 0) no ENXIO for too long on Google
+ N2D instances
tags: added: ubuntu-ltp-stable
Po-Hsu Lin (cypressyew)
description: updated
Thadeu Lima de Souza Cascardo (cascardo) wrote:

Hey, Po-Hsu Lin.

I marked this as a duplicate of #1999554, as I had researched it before. Many of these tests use a loop device and then try to clear it. The clear function retries multiple times because udev may have a hold on the device. After a number of tries with some timeout, it emits that warning because the loop device failed to clear, which happens because udev still has a hold on it.
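
To illustrate the retry behaviour described above, here is a minimal sketch in Python. It is not the LTP code (that lives in tst_device.c and is written in C). The `clear_fd` callable is a hypothetical stand-in for `ioctl(dev_fd, LOOP_CLR_FD, 0)`, and the retry count and delay are illustrative assumptions.

```python
import errno
import time


def detach_with_retry(clear_fd, retries=40, delay=0.05, sleep=time.sleep):
    """Call clear_fd() until it reports ENXIO, meaning nothing is attached.

    clear_fd is a hypothetical stand-in for ioctl(dev_fd, LOOP_CLR_FD, 0):
    it returns normally when the ioctl succeeds and raises OSError otherwise.
    Returns True once ENXIO is seen, False when the retries run out, which
    corresponds to the "no ENXIO for too long" warning.
    """
    for _ in range(retries):
        try:
            clear_fd()
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                return True  # device is detached and free again
            if exc.errno != errno.EBUSY:
                raise  # unexpected error: do not keep retrying
        # Either the ioctl succeeded or the device was busy; the backing
        # file may still be attached while another process holds the device.
        sleep(delay)
    return False
```

If another process such as udev keeps the device open for longer than `retries * delay` seconds, the loop never sees ENXIO and gives up, which matches the symptom in this report.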

I debugged this, and the reason is that udev reads efivars every once in a while (instead of caching the results), and reading an efivar on those GCP instance types takes multiple seconds. That has been causing other issues, such as nbd being busy and that test failing as well. I believe this should be fixed in udev.
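
One way to check the slow-read claim on a given instance is to time reads of the EFI variable files directly. The sketch below is an assumption-laden diagnostic, not something taken from the debugging session above: `time_reads` is a hypothetical helper, and it assumes efivarfs is mounted at the usual /sys/firmware/efi/efivars path.

```python
import os
import time


def time_reads(directory, limit=20):
    """Return (filename, seconds) pairs for reading up to `limit` files."""
    timings = []
    for name in sorted(os.listdir(directory))[:limit]:
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        start = time.monotonic()
        try:
            with open(path, "rb") as fh:
                fh.read()
        except OSError:
            continue  # skip variables that cannot be read
        timings.append((name, time.monotonic() - start))
    return timings
```

Calling `time_reads("/sys/firmware/efi/efivars")` on an affected N2D instance and on an unaffected instance type would show whether individual reads really take seconds on the former.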

Cascardo.
