autopkgtest failures are ignored, apparently for packages with only superficial tests

Bug #2035101 reported by Jeremy Bícha
Affects               Status  Importance  Assigned to  Milestone
Auto Package Testing  New     Undecided   Unassigned
britney               New     Undecided   Unassigned

Bug Description

Example 1
=========
Yesterday, I noticed that gnome-metronome 1.3.0-0ubuntu2 was allowed to migrate from mantic-proposed to mantic despite its own autopkgtest failing on every architecture. The migration happened after an automatic migration-reference run returned the result "neutral". "neutral" is the expected result for superficial autopkgtests like the one in gnome-metronome. Before that migration-reference run, the failures were correctly shown on the excuses page.

Screenshots attached.

https://autopkgtest.ubuntu.com/packages/gnome-metronome/mantic/amd64

(I then uploaded 1.3.0-0ubuntu3 which fixed the autopkgtest regression)

Here's the gnome-metronome autopkgtest code:
https://salsa.debian.org/a-wai/gnome-metronome/-/blob/debian/master/debian/tests/control

Example 2
=========
https://ubuntu-archive-team.ubuntu.com/proposed-migration/update_excuses.html#webkit2gtk

does not show that the 2.41 series of webkit2gtk triggered an autopkgtest failure for devhelp

https://autopkgtest.ubuntu.com/packages/devhelp/mantic/amd64

However, Debian's britney is showing the failure:
https://release.debian.org/britney/pseudo-excuses-experimental.html#webkit2gtk

Here's the devhelp autopkgtest code:
https://salsa.debian.org/gnome-team/devhelp/-/blob/debian/latest/debian/tests/control

Theory
======
My theory is that failures are being ignored for packages which only have superficial autopkgtests, like for devhelp and gnome-metronome.

Possibly, this issue has already been fixed in Debian's copy of britney.

Tags: adt-561
Jeremy Bícha (jbicha) wrote :
summary: - autopkgtest failures are ignored, apparently for superficial tests
+ autopkgtest failures are ignored, apparently for packages with only
+ superficial tests
Tim Andersson (andersson123) wrote :

So, just so I understand this correctly, is this a new failure you've noticed or just something that could've always been happening and you noticed now?

Jeremy Bícha (jbicha) wrote :

I believe this used to work correctly in previous years.

I believe it has been failing for several weeks or more, but it took me until now to be alert enough to see it happen live, with screenshots for my first example. It is still happening while we wait for webkit2gtk to be eligible for migration.

(We are fixing the devhelp regression regardless.)

Brian Murray (brian-murray) wrote :

Example 2 could happen in the event that the autopkgtest.db files were out of sync on the autopkgtest-web units and britney used the one with old results. However, querying the database we can see that there are results for webkit2gtk and devhelp in both databases.

ubuntu@juju-4d1272-prod-proposed-migration-11:~$ sqlite3 autopkgtest.db "SELECT DISTINCT test.arch, test.package, result.triggers, result.run_id, result.exitcode FROM RESULT INNER JOIN test on result.test_id=test.id WHERE test.arch = 'amd64' AND test.package = 'devhelp' AND result.run_id LIKE '%202309%'"
amd64|devhelp|glib2.0/2.77.3-1|20230904_021459_84fbc@|8
amd64|devhelp|webkit2gtk/2.41.91-2|20230907_075337_8d55b@|4
amd64|devhelp|gsettings-desktop-schemas/45~rc-1ubuntu1|20230907_195316_59de4@|8
amd64|devhelp|webkit2gtk/2.41.92-1|20230909_052611_fd604@|4
amd64|devhelp|glib2.0/2.78.0-1|20230909_105044_9533a@|8
ubuntu@juju-4d1272-prod-proposed-migration-10:~/public$ sqlite3 autopkgtest.db "SELECT DISTINCT test.arch, test.package, result.triggers, result.run_id, result.exitcode FROM RESULT INNER JOIN test on result.test_id=test.id WHERE test.arch = 'amd64' AND test.package = 'devhelp' AND result.run_id LIKE '%202309%'"
amd64|devhelp|glib2.0/2.77.3-1|20230904_021459_84fbc@|8
amd64|devhelp|webkit2gtk/2.41.91-2|20230907_075337_8d55b@|4
amd64|devhelp|gsettings-desktop-schemas/45~rc-1ubuntu1|20230907_195316_59de4@|8
amd64|devhelp|webkit2gtk/2.41.92-1|20230909_052611_fd604@|4
amd64|devhelp|glib2.0/2.78.0-1|20230909_105044_9533a@|8
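
For readers decoding the last column: per the autopkgtest documentation, exit code 4 means at least one test failed, and 8 means there were no tests or all non-superficial tests were skipped, which as far as I understand is what gets reported as "neutral" here. The following is a minimal sketch of running the same query from Python and labelling the results. The in-memory table layout is an assumption inferred only from the column names in the query above, and `classify`, `load_demo_db` and `devhelp_results` are hypothetical helpers, not part of autopkgtest-cloud or britney.

```python
import sqlite3

# Assumed minimal schema, inferred from the columns used in the query above.
SCHEMA = """
CREATE TABLE test (id INTEGER PRIMARY KEY, arch TEXT, package TEXT);
CREATE TABLE result (test_id INTEGER, run_id TEXT, triggers TEXT, exitcode INTEGER);
"""

# Rows copied from the query output above (amd64, devhelp).
ROWS = [
    ("glib2.0/2.77.3-1", "20230904_021459_84fbc@", 8),
    ("webkit2gtk/2.41.91-2", "20230907_075337_8d55b@", 4),
    ("gsettings-desktop-schemas/45~rc-1ubuntu1", "20230907_195316_59de4@", 8),
    ("webkit2gtk/2.41.92-1", "20230909_052611_fd604@", 4),
    ("glib2.0/2.78.0-1", "20230909_105044_9533a@", 8),
]


def classify(exitcode):
    """Hypothetical helper: map an autopkgtest exit code to a coarse label."""
    if exitcode in (0, 2):
        return "pass"
    if exitcode == 8:
        return "neutral"
    if exitcode in (4, 6):
        return "fail"
    return "other"


def load_demo_db():
    """Build an in-memory database holding the rows shown in this comment."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO test (id, arch, package) VALUES (1, 'amd64', 'devhelp')"
    )
    conn.executemany(
        "INSERT INTO result (test_id, triggers, run_id, exitcode) "
        "VALUES (1, ?, ?, ?)",
        ROWS,
    )
    return conn


def devhelp_results(conn):
    """Run the same query as above, with parameters instead of literals."""
    query = """
        SELECT DISTINCT test.arch, test.package, result.triggers,
                        result.run_id, result.exitcode
        FROM result INNER JOIN test ON result.test_id = test.id
        WHERE test.arch = ? AND test.package = ? AND result.run_id LIKE ?
        ORDER BY result.run_id
    """
    return conn.execute(query, ("amd64", "devhelp", "%202309%")).fetchall()


if __name__ == "__main__":
    for arch, package, triggers, run_id, exitcode in devhelp_results(load_demo_db()):
        print(run_id, triggers, classify(exitcode))
```

With the rows above, both webkit2gtk-triggered runs are labelled as failures, while the glib2.0 and gsettings-desktop-schemas runs are neutral.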

Jeremy Bícha (jbicha) wrote :

It looks like gnome-feeds 2.2.0-1 is being correctly prevented from migrating today, even though it also has only a superficial autopkgtest. Its migration-reference run was last year.

Jeremy Bícha (jbicha) wrote :

I did some migration-reference/0 tests on the other architectures, but maybe this is another example? With a neutral migration-reference/0, this package or things triggering it should never have migrated, right?

https://autopkgtest.ubuntu.com/packages/r/rust-ureq/mantic/s390x

Jeremy Bícha (jbicha) wrote :

Here's another one today. libwacom 2.9.0-1 has a clear regression detected by its superficial autopkgtest, but Ubuntu's britney is reporting "Not a regression" for s390x.

The regression was reported as https://bugs.debian.org/1060687 and Debian's britney is correctly detecting it as an autopkgtest regression.

https://autopkgtest.ubuntu.com/packages/libw/libwacom/noble/s390x

https://ubuntu-archive-team.ubuntu.com/proposed-migration/update_excuses.html#libwacom

Jeremy Bícha (jbicha) wrote :

Here's another screenshot. amd64 is reported as "Reference test in progress, but real test failed already". When the migration-reference/0 run returns "neutral", it will be reported as "Not a regression" and would be allowed to migrate if I hadn't noticed the issue and filed a block-proposed bug.

Tim Andersson (andersson123) wrote :

Just to understand correctly, the issue is results getting marked as neutral, then Britney sees these neutral results and incorrectly migrates the package? Is the issue at its core with Britney or autopkgtest?

Tim Andersson (andersson123) wrote :

Or that the test should never be returning Neutral at all?

Tim Andersson (andersson123) wrote :

When I get a bit more info I can look into a fix for you if it's autopkgtest related

Jeremy Bícha (jbicha) wrote :

Superficial tests that pass are marked as neutral. In Debian's implementation, this means that a package that only has superficial autopkgtests will NOT get the autopkgtest "bounty", which reduces the number of days a package must wait in Unstable before automatically migrating to Testing. (The bounty is set to 3, so a typical upload will take 2 days instead of 5 days.)

https://salsa.debian.org/ci-team/autopkgtest/-/blob/master/doc/README.package-tests.rst

https://salsa.debian.org/release-team/britney2/-/blob/master/etc/britney.conf#L81

However, if a superficial autopkgtest fails, as happens with libwacom 2.9.0-1, it must be treated as a failure.

A migration-reference/0 of neutral should not allow a failing autopkgtest to migrate.
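
To make the expected behaviour concrete, here is a minimal sketch of the policy as I understand it. This is not britney's actual code; `Result`, `is_regression`, `earns_bounty` and `required_age` are hypothetical names used only for illustration, and the 5-day base age and 3-day bounty are the Debian numbers mentioned above.

```python
from enum import Enum


class Result(Enum):
    """Coarse test outcomes used in this sketch (hypothetical names)."""
    PASS = "pass"
    NEUTRAL = "neutral"
    FAIL = "fail"


def is_regression(test_result, reference_result):
    """A failing test is a regression unless the reference run also failed.

    In particular, a neutral reference result must not excuse a failure.
    """
    return test_result is Result.FAIL and reference_result is not Result.FAIL


def earns_bounty(test_result):
    """Only a real pass earns the age bounty; a neutral result does not."""
    return test_result is Result.PASS


def required_age(base_days, bounty_days, test_result):
    """Days a package must wait before migrating, after any bounty."""
    reduction = bounty_days if earns_bounty(test_result) else 0
    return max(base_days - reduction, 0)


if __name__ == "__main__":
    # libwacom-style case: the new test fails, the reference run is neutral.
    print(is_regression(Result.FAIL, Result.NEUTRAL))
    # Package with only superficial tests: no bounty, full waiting period.
    print(required_age(5, 3, Result.NEUTRAL))
    # Package with a real passing test: bounty applies.
    print(required_age(5, 3, Result.PASS))
```

Under these assumptions, a failure against a neutral reference counts as a regression, and a neutral result neither blocks migration nor shortens the waiting period.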

As of today, libwacom is still not fixed and is only held in proposed by my block-proposed bug so I think this is a clear test case right now.

Jeremy Bícha (jbicha) wrote :

My blind guess is that this is a britney configuration issue but I never looked very deeply into how britney and autopkgtest work.

Tim Andersson (andersson123) wrote :

To me, this seems like an issue best solved in Britney rather than in autopkgtest. I believe that, from the autopkgtest-cloud side, all we can do is start reporting neutral migration-reference/0 results as "failures" instead, to stop the packages migrating, but I don't think this is ideal. I think it's better that this gets solved via Britney.

@brian @paride, do you guys have any thoughts?

Steve Langasek (vorlon) wrote :

https://ubuntu-archive-team.ubuntu.com/proposed-migration/log/noble/2024-01-12/23:28:17.log.gz shows that libwacom is in the 'autopkgtest-pending.json' for amd64 and s390x. So that's a bug, to start with.

(We have a bug in general that state/autopkgtest-pending.json does not get properly garbage-collected, so it grows without bound. The file is currently 112477 lines long. There are 22,000 results marked as 'Test in progress' on update_excuses, but the file size still seems excessive.)

A deep dive into https://ubuntu-archive-team.ubuntu.com/proposed-migration/noble/update_excuses.yaml.xz (which shows information that update_excuses.html hides, for browser performance reasons) shows:

- excuses:
  - 'Migration status for libwacom (2.8.0-1 to 2.9.0-1): BLOCKED: Rejected/violates
    migration policy/introduces a regression'
  - 'Issues preventing migration:'
  - 'Additional info:'
  - 4 days old
  - Not touching package as requested in <a href="https://launchpad.net/bugs/2049237">bug
    2049237</a> on Fri Jan 12 22:25:14 2024
  is-candidate: false
  item-name: libwacom
  maintainer: Timo Aaltonen
  migration-policy-verdict: REJECTED_PERMANENTLY
  new-version: 2.9.0-1
  old-version: 2.8.0-1
  policy_info:
    age:
      age-requirement: 0
      current-age: 4.585891203703704
      verdict: PASS
    autopkgtest:
      cinnamon-control-center/5.8.2-1:
        amd64:
        - PASS
        - https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/amd64/c/cinnamon-control-center/20240113_163413_fd9be@/log.gz
        - https://autopkgtest.ubuntu.com/packages/c/cinnamon-control-center/noble/amd64
        - null
        - null
        arm64:
        - PASS
        - https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/arm64/c/cinnamon-control-center/20240113_084446_18593@/log.gz
        - https://autopkgtest.ubuntu.com/packages/c/cinnamon-control-center/noble/arm64
        - null
        - null
        armhf:
        - PASS
        - https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/armhf/c/cinnamon-control-center/20240113_030203_158e7@/log.gz
        - https://autopkgtest.ubuntu.com/packages/c/cinnamon-control-center/noble/armhf
        - null
        - null
        ppc64el:
        - PASS
        - https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/ppc64el/c/cinnamon-control-center/20240113_120123_f31b7@/log.gz
        - https://autopkgtest.ubuntu.com/packages/c/cinnamon-control-center/noble/ppc64el
        - null
        - null
        s390x:
        - PASS
        - https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/s390x/c/cinnamon-control-center/20240112_191145_ac481@/log.gz
        - https://autopkgtest.ubuntu.com/packages/c/cinnamon-control-center/noble/s390x
        - null
        - null
      libinput/1.23.0-2.1:
        amd64:
        - PASS
        - https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/amd64/libi/libinput/20240113_163212_eefe2@/log.gz
        - https://autopkgtest.ubuntu.com/packages/libi/libinput/noble/amd64
        - null
        - null
        arm64:
        - PASS
        - https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/arm64/libi/libinput/20240113_084221_c...


Steve Langasek (vorlon) wrote :

There are upstream changes to britney2/policies/autopkgtest.py touching relevant code. I think the right next step here is to rebase our britney2 branch against upstream and see if this persists.

tags: added: adt-561