Ubuntu
ubuntu-drivers-common package

Race condition on builders results in 100% fail rate

Bug #2077654 reported by Kuba Pawlak on 2024-08-22

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	ubuntu-drivers-common (Ubuntu)	Invalid	Undecided	Unassigned
	Jammy	Fix Released	Undecided	Unassigned

Bug Description

[ Impact ]
ADT testing fails on arm builders due to a racy test

[ Fix ]
Split the test to avoid the race.

[ Test Plan ]
Run ADT tests in PPA or locally with:
PYTHONPATH=. python3 tests/run test_ubuntu_drivers.DetectTest.test_system_driver_packages_force_install_nvidia

The error message to look for is:
======================================================================
FAIL: test_system_driver_packages_force_install_nvidia (test_ubuntu_drivers.DetectTest)
system_driver_packages() force install config points to an older version.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.10/unittest/mock.py", line 1379, in patched
    return func(*newargs, **newkeywargs)
  File "/home/ubuntu/ubuntu-drivers-common-0.9.6.2~0.22.04.6~ppa0/tests/test_ubuntu_drivers.py", line 1479, in test_system_driver_packages_force_install_nvidia
    self.assertFalse('nvidia-driver-510' in res_wrong_json)
AssertionError: True is not false

[ Where problems could occur ]
During ADT testing on the builders.

[ Other Info ]
Cause unknown. The test sometimes fails locally on an x86 machine, but only in a Jammy container. The Jammy package run in a Noble container does not fail.
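
Since the failure is intermittent on x86, one way to estimate how often it occurs is to repeat the Test Plan command and count the failing runs. The following is only a sketch: `count_failures` is a hypothetical helper, not part of ubuntu-drivers-common, and it assumes that `tests/run` exits with a non-zero status when a test fails.

```python
import subprocess
import sys


def count_failures(cmd, runs=20, env=None):
    """Run `cmd` `runs` times and return how many runs exited non-zero.

    Hypothetical helper for reproducing an intermittent test failure.
    Assumes the command signals failure through its exit status.
    """
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures
```

To apply it to the test from the Test Plan, call it from the source tree with `cmd` set to `[sys.executable, "tests/run", "test_ubuntu_drivers.DetectTest.test_system_driver_packages_force_install_nvidia"]` and `env` set to a copy of `os.environ` with `PYTHONPATH` set to `.`.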

Kuba Pawlak (kuba-t-pawlak) on 2024-08-22

summary:	- Race condition on arm builders result in 100% fail rate
	+ Race condition on builders result in 100% fail rate
description:	updated

Kuba Pawlak (kuba-t-pawlak) on 2024-08-23

description:	updated

Robie Basak (racb) wrote on 2024-08-28:

> + # Race condition may happen and the test fails. This only happens in Jammy containers.
> + # Disable the test for now.
> + return

I don't think it's appropriate to disable a test without an analysis that considers what it was testing, how to mitigate the gap created by disabling the test or an explanation of why it isn't necessary to run the test.

We also need to make arrangements to ensure that a future SRU or security update of this package will make the same mitigation.

In the absence of such an analysis or a proper fix, if it only sometimes fails, wouldn't it be better to just retry a few times to land the SRU?

Kuba Pawlak (kuba-t-pawlak) wrote on 2024-08-28:

Hi Robie,

The test fails 100% of the time on the arm builder in the PPA, and it sometimes fails on amd64 too.
The issue was already present when the previous version of this package was introduced; that upload was also retried until the test somehow passed. I could not make it pass myself.

Kuba Pawlak (kuba-t-pawlak) wrote on 2024-08-28:

The test was originally testing 6 different scenarios by injecting a dependency in a separate file.
I split it into 6 tests, each testing one thing only. For now it seems to work.

Robie Basak (racb) wrote on 2024-08-28:

OK, but how does this relate to the upload in Jammy unapproved that disables the test outright?

Kuba Pawlak (kuba-t-pawlak) wrote on 2024-08-28:

I disabled the test because there are races between the different scenarios running in one context.
I have just tested that splitting this one big test, so that each smaller test checks only one thing, makes all of them pass.
So instead of disabling the test, I will refactor it.

Timo Aaltonen (tjaalton) on 2024-08-28

Changed in ubuntu-drivers-common (Ubuntu):
status:	New → Invalid
description:	updated

Łukasz Zemczak (sil2100) wrote on 2024-09-02: Please test proposed package

Hello Kuba, or anyone else affected,

Accepted ubuntu-drivers-common into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ubuntu-drivers-common/1:0.9.6.2~0.22.04.7 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ubuntu-drivers-common (Ubuntu Jammy):
status:	New → Fix Committed
tags:	added: verification-needed verification-needed-jammy

Kuba Pawlak (kuba-t-pawlak) wrote on 2024-09-09:

Test cases verified on Jammy ubuntu-drivers-common 1:0.9.6.2~0.22.04.7

tags:	added: verification-done-jammy removed: verification-needed-jammy
tags:	added: verification-done removed: verification-needed

Launchpad Janitor (janitor) wrote on 2024-09-10:

This bug was fixed in the package ubuntu-drivers-common - 1:0.9.6.2~0.22.04.7

---------------
ubuntu-drivers-common (1:0.9.6.2~0.22.04.7) jammy; urgency=medium

  * share/hybrid/71-u-d-c-gpu-detection.rules
    - Remove SimpleDRM device when nvidia-drm loads (LP: #2060268)
  * share/hybrid/gpu-manager.c
    - Wait for nvidia-drm to settle before opening /dev/dri/card*
    (LP: #2060268)
  * tests/test_ubuntu_drivers.py
   - Split a test to fix a race condition
    (LP: #2077654)
   - revert RLIMIT_NOFILE set and test (LP: #2077646)

-- Kuba Pawlak <email address hidden> Wed, 07 Aug 2024 13:45:38 +0200

Changed in ubuntu-drivers-common (Ubuntu Jammy):
status:	Fix Committed → Fix Released

Łukasz Zemczak (sil2100) wrote on 2024-09-10: Update Released

The verification of the Stable Release Update for ubuntu-drivers-common has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
