LTP uevent01 will fail on GCP n2d-standard-64

Bug #1999554 reported by Thadeu Lima de Souza Cascardo
This bug affects 2 people
Affects              Status     Importance  Assigned to  Milestone
ubuntu-kernel-tests  New        Undecided   Unassigned
linux (Ubuntu)       Confirmed  Undecided   Unassigned
udev (Ubuntu)        Confirmed  Undecided   Unassigned

Bug Description

Because udev takes too long to release the loop device, LTP test uevent01 (and some other tests) may warn that the loop device could not be detached. The warning is counted as a test failure.

ubuntu@selfprovisioned-cascardo-n2d:~$ sudo ./uevent01
tst_test.c:1423: TINFO: Timeout per run is 0h 05m 00s
tst_device.c:88: TINFO: Found free device 6 '/dev/loop6'
uevent01.c:24: TINFO: Attaching device /dev/loop6
uevent.h:49: TINFO: Got uevent:
uevent.h:52: TINFO: change@/devices/virtual/block/loop6
uevent.h:52: TINFO: ACTION=change
uevent.h:52: TINFO: DEVPATH=/devices/virtual/block/loop6
uevent.h:52: TINFO: SUBSYSTEM=block
uevent.h:52: TINFO: DISK_MEDIA_CHANGE=1
uevent.h:52: TINFO: MAJOR=7
uevent.h:52: TINFO: MINOR=6
uevent.h:52: TINFO: DEVNAME=loop6
uevent.h:52: TINFO: DEVTYPE=disk
uevent.h:52: TINFO: DISKSEQ=7
uevent.h:52: TINFO: SEQNUM=2959
uevent.h:140: TPASS: Got expected UEVENT
uevent.h:49: TINFO: Got uevent:
uevent.h:52: TINFO: change@/devices/virtual/block/loop6
uevent.h:52: TINFO: ACTION=change
uevent.h:52: TINFO: DEVPATH=/devices/virtual/block/loop6
uevent.h:52: TINFO: SUBSYSTEM=block
uevent.h:52: TINFO: MAJOR=7
uevent.h:52: TINFO: MINOR=6
uevent.h:52: TINFO: DEVNAME=loop6
uevent.h:52: TINFO: DEVTYPE=disk
uevent.h:52: TINFO: DISKSEQ=7
uevent.h:52: TINFO: SEQNUM=2960
uevent.h:140: TPASS: Got expected UEVENT
uevent01.c:26: TINFO: Detaching device /dev/loop6
tst_device.c:254: TWARN: ioctl(/dev/loop6, LOOP_CLR_FD, 0) no ENXIO for too long

Summary:
passed 2
failed 0
broken 0
skipped 0
warnings 1
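
To confirm that the device is still attached after a TWARN like the one above, one option is to look at the loop device's sysfs attributes. The following is only a sketch: the function name `loop_backing_file` and the `sysfs_root` parameter are made up for illustration, and it assumes the kernel exposes `loop/backing_file` under `/sys/block/loopN` only while a backing file is attached.

```python
from pathlib import Path


def loop_backing_file(name, sysfs_root="/sys/block"):
    """Return the backing file of loop device `name`, or None if detached.

    `name` is the kernel device name, for example "loop6".
    `sysfs_root` is a parameter only so the function can be tested
    against a fake directory tree.
    """
    attr = Path(sysfs_root) / name / "loop" / "backing_file"
    try:
        return attr.read_text().strip()
    except FileNotFoundError:
        # No loop/backing_file attribute: nothing is attached.
        return None
```

Calling `loop_backing_file("loop6")` right after the warning should return a path if the device has not been released yet, and `None` once it has.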

Thadeu Lima de Souza Cascardo (cascardo) wrote :

On focal, this reproduces with 5.4 and 5.15 kernels.

One cause of the delay is that udev reads efivars for Secure Boot, which takes 0.3 seconds, and it does so multiple times.
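
A rough way to measure the efivars read cost on a given instance is to time repeated reads of one variable. This is a minimal sketch, not what udev does internally: the helper name `time_reads` is hypothetical, and which variable to read is an assumption (the report only says the reads are for Secure Boot).

```python
import time
from pathlib import Path


def time_reads(path, repeats=5):
    """Read `path` `repeats` times and return each read's duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        Path(path).read_bytes()
        durations.append(time.perf_counter() - start)
    return durations
```

On an EFI system, a candidate path would be `/sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c` (assumed here, and it may need root). If every read costs about the same, nothing is being cached between reads.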

When running a 5.19 kernel on focal, however, the test seems to be able to detach the device before udev starts reading the efivars. Same udev version here. Perhaps events are pushed in a different order?

Cascardo.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1999554

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Changed in linux (Ubuntu):
status: Expired → Confirmed
Thadeu Lima de Souza Cascardo (cascardo) wrote :

Hey, Po-Hsu Lin.

I marked this as a duplicate of #1999554, as I did some research on it before. Many of these tests use a loop device and then try to clear it. The clear function retries multiple times because udev may have a hold on the device. After a number of tries with some timeout, it emits that warning because the loop device failed to clear, which means udev still has a hold on it.
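
The retry behaviour described above can be sketched roughly as follows. This is an illustration, not LTP's code: `detach_with_retry` and `clear_loop_fd` are hypothetical names, and the retry count and delay are placeholder values rather than numbers taken from this report. The loop keeps issuing LOOP_CLR_FD until the kernel answers ENXIO (nothing attached any more), and gives up once the retry budget is spent.

```python
import errno
import fcntl
import time

LOOP_CLR_FD = 0x4C01  # ioctl request number from <linux/loop.h>


def clear_loop_fd(fd):
    """Issue LOOP_CLR_FD on an open loop device; raises OSError on failure."""
    fcntl.ioctl(fd, LOOP_CLR_FD, 0)


def detach_with_retry(clear_fd, retries=40, delay=0.05, sleep=time.sleep):
    """Call clear_fd() until it fails with ENXIO.

    Returns True once ENXIO is seen (device fully detached) and False
    if the retry budget runs out, which is the case the TWARN reports.
    `retries` and `delay` are assumed values for this sketch.
    """
    for _ in range(retries):
        try:
            clear_fd()
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                return True
            if exc.errno != errno.EBUSY:
                raise  # unexpected error, do not keep retrying
        # The call succeeded or the device was busy: something may
        # still hold the device, so wait and try again.
        sleep(delay)
    return False
```

With a real device this would be used as `detach_with_retry(lambda: clear_loop_fd(fd))`, where `fd` is an open file descriptor for the loop device. If another process holds the device open for longer than `retries * delay`, the function returns False.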

I debugged this and found that udev reads efivars every once in a while (instead of caching the results), and reading efivars on those GCP instance types takes multiple seconds. That has been causing other issues, such as nbd being busy and that test failing as well. I believe this should be fixed in udev.

Cascardo.

Po-Hsu Lin (cypressyew) wrote :

Awesome, let me mark another one as dup too.

Thanks for the info. I will add the ubuntu-kernel-tests project here for tracking purposes.

tags: added: 5.15 5.4 focal ubuntu-ltp ubuntu-ltp-stable ubuntu-ltp-syscalls
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in udev (Ubuntu):
status: New → Confirmed
Po-Hsu Lin (cypressyew)
tags: added: sru-20230227
tags: added: gcp gke sru-20221010