bionic autopkgtests failing (and blocking since for whatever reason they worked once before)

Bug #1734148 reported by Christian Ehrhardt 
This bug affects 1 person
Affects                    Status        Importance  Assigned to  Milestone
resource-agents (Debian)   Fix Released  Unknown
resource-agents (Ubuntu)   Fix Released  Undecided   Unassigned

Bug Description

Failures like this [1] currently block resource-agents and every package that depends on it.

I could isolate three issues:
- mysql not starting correctly
- a sed failing on ipaddr2
- named failing on s390x

[1]: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-bionic/bionic/s390x/r/resource-agents/20171122_132610_df704@/log.gz

Christian Ehrhardt  (paelzer) wrote :

The mysqld failure can be fixed by correcting the socket path in debian/patches/mysql-path.patch.

Christian Ehrhardt  (paelzer) wrote :

The iproute (IPaddr2) test can be fixed by changing the sed to no longer use patterns but literal replacements (a device name could contain characters that sed would interpret as pattern syntax).
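
To illustrate why a pattern-based replacement is fragile here, the following is a minimal Python sketch contrasting a pattern substitution with a literal one. The actual test uses sed, not Python; the function names, the device name `eth0.100` and the sample text are hypothetical and only serve the illustration.

```python
import re


def replace_pattern(text, device, new):
    # Treats `device` as a regular expression, so characters such as "."
    # are interpreted as pattern syntax rather than as themselves.
    return re.sub(device, new, text)


def replace_literal(text, device, new):
    # Treats `device` as plain text: no character has a special meaning.
    return text.replace(device, new)


def replace_escaped(text, device, new):
    # Alternative: keep the regex engine but escape the device name, and
    # pass the replacement via a function so it is inserted verbatim.
    return re.sub(re.escape(device), lambda _match: new, text)


# Hypothetical input: two similar names, only the second is the real device.
sample = "eth0x100 eth0.100"

print(replace_pattern(sample, "eth0.100", "DEV"))  # DEV DEV (over-matches)
print(replace_literal(sample, "eth0.100", "DEV"))  # eth0x100 DEV
print(replace_escaped(sample, "eth0.100", "DEV"))  # eth0x100 DEV
```

The same idea carries over to sed: either escape the device name before using it in an expression, or avoid treating it as a pattern at all.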

Christian Ehrhardt  (paelzer) wrote :

The named failure only shows up on s390x, so I'm testing there now.

Christian Ehrhardt  (paelzer) wrote :

To be clear, the history of these tests is very bad [1], but if we can fix them, that is better than a force-badtest hint.

[1]: http://autopkgtest.ubuntu.com/packages/resource-agents

Christian Ehrhardt  (paelzer) wrote :

In the autopkgtest environment it failed in all of the last 5 runs, but it isn't reproducible in an s390x VM.

autopkgtest [12:09:59]: test command6: debian/tests/run-ocft named
autopkgtest [12:09:59]: test command6: [-----------------------
Making 'named':
    - case 0: check base env
    - case 1: check base env: invalid 'OCF_RESKEY_named'
    - case 2: normal start
    - case 3: normal stop
    - case 4: double start
    - case 5: double stop
    - case 6: running monitor
    - case 7: not running monitor
    - case 8: unimplemented command
    - case 9: non-existent user
Initializing 'named' ...
Starting bind9 (via systemctl): bind9.service.
Stopping bind9 (via systemctl): bind9.service.
Done.

named: check base env - /usr/share/resource-agents/ocft/caselib: line 83: 3227 Terminated setsid $aroot/$agent $cmd > /tmp/.ocft_runlog 2>&1
ERROR: The agent was hanging, killed it, maybe you damaged the agent or system's environment, see details below:
ocf-exit-reason:named didn't answer properly for localhost.
2017/11/22_12:10:12 ERROR: Expected: 127.0.0.1.
2017/11/22_12:10:12 ERROR: Got: ;; connection timed out; no servers could be reached

named: check base env: invalid 'OCF_RESKEY_named' - ERROR: './named monitor' failed, the return code is 1.
named: normal start - ERROR: './named monitor' failed, the return code is 1.
named: normal stop - ERROR: './named monitor' failed, the return code is 1.
named: double start - ERROR: './named monitor' failed, the return code is 1.
named: double stop - ERROR: './named monitor' failed, the return code is 1.
named: running monitor - ERROR: './named monitor' failed, the return code is 1.
named: not running monitor - ERROR: './named monitor' failed, the return code is 1.
named: unimplemented command - ERROR: './named monitor' failed, the return code is 1.
named: non-existent user - ERROR: './named monitor' failed, the return code is 1.

Therefore, mask this test on s390x.

Christian Ehrhardt  (paelzer) wrote :

Testing in a local autopkgtest run is now successful on amd64.
Unfortunately Bileto no longer gives me cross-arch LP tests, so I have to hope that my runs in containers were good enough (those worked).

While we discussed most of it on ubuntu-devel yesterday, I'm prepping an MP so someone can take a second look at the approach I've taken (before making things worse).

Christian Ehrhardt  (paelzer) wrote :
Changed in resource-agents (Ubuntu):
status: New → Confirmed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package resource-agents - 1:4.1.0~rc1-1ubuntu1

---------------
resource-agents (1:4.1.0~rc1-1ubuntu1) bionic; urgency=medium

  * fix autopkgtests (LP: #1734148)
    - debian/tests/run-ocft: debian/tests/IPaddr2: run tests in verbose mode
    - debian/tests/IPaddr2: pick only one device to test (could break tests
      if multiple devices are on the default routes network)
    - debian/tests/IPaddr2: devices can have special characters, do not use
      patterns but literal sed replacements
    - debian/patches/mysql-path.patch: refresh for up to date paths
      - debian/tests/run-ocft: no need to disable apparmor on that test
        anymore
    - debian/tests/control: skip named test on Ubuntu autopkgtest
      infrastructure (retains the coverage of all the other cases compared
      to a force badtest hint)

 -- Christian Ehrhardt <email address hidden> Wed, 22 Nov 2017 15:57:30 +0100
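
The "pick only one device" item above can be sketched as follows. This is a minimal Python illustration, assuming the device is taken from `ip route` style output; how debian/tests/IPaddr2 actually selects the device is not shown in this report, and the function name and sample routing table are hypothetical.

```python
def first_default_device(ip_route_output):
    """Return the device of the first default route, or None if there is none."""
    for line in ip_route_output.splitlines():
        fields = line.split()
        # Only consider default routes that name a device.
        if fields and fields[0] == "default" and "dev" in fields:
            idx = fields.index("dev")
            if idx + 1 < len(fields):
                return fields[idx + 1]
    return None


# Hypothetical routing table with two default routes on different devices.
sample = """\
default via 10.0.0.1 dev ens3 proto dhcp metric 100
default via 10.0.1.1 dev ens4 proto dhcp metric 200
10.0.0.0/24 dev ens3 proto kernel scope link src 10.0.0.5
"""

print(first_default_device(sample))  # ens3
```

Returning after the first match avoids handing several device names to the test when more than one device has a default route.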

Changed in resource-agents (Ubuntu):
status: Confirmed → Fix Released
Christian Ehrhardt  (paelzer) wrote :

Retried everything in update-excuses that is currently blocked on these, so the new version can unblock those packages as well.

Christian Ehrhardt  (paelzer) wrote :

Reported to Debian so they can harden the tests as well (also a chance for this to become a sync again).
I'm interested in opinions on whether to fix mysql via the path or to stick with disabling apparmor.

Changed in resource-agents (Debian):
status: Unknown → New
Changed in resource-agents (Debian):
status: New → Fix Released
Christian Ehrhardt  (paelzer) wrote :

Debian has fixed this by picking up many of the changes I carried in this delta, so it works again.
That means this can become a sync again.
Thanks to Valentin Vidic for testing my changes and accepting the fixes, including the special Ubuntu-only workaround for named.

All former tests seem to work now, but there is a new issue on pgsql.

autopkgtest [14:24:42]: test command5: [-----------------------
Running: ocft make pgsql
Making 'pgsql':
    - case 0: check base env
    - case 1: check base env: invalid 'OCF_RESKEY_pgctl'
    - case 2: normal start
    - case 3: normal stop
    - case 4: double start
    - case 5: double stop
    - case 6: running monitor
    - case 7: not running monitor
    - case 8: unimplemented command
    - case 9: non-existent user
    - case 10: invalid user
Running: ocft test -v pgsql
Initializing 'pgsql' ...

ERROR: Install 'postgresql-server' failed.
WARNING: SETUP failed, break all tests of 'pgsql'.

Logging in right after the failure, all of it works.
A race?
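
If a race is suspected, one common way to probe it is to poll for readiness instead of checking once. The following is a minimal Python sketch of such a helper; it is not the fix applied to this package, whether a race is the real cause is not established in this report, and the names `wait_until` and `ready_on_third_try` are hypothetical.

```python
import time


def wait_until(check, timeout=30.0, interval=1.0):
    """Call check() repeatedly until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical stand-in for "is the service usable yet?": succeeds on the
# third attempt, mimicking a service that needs a moment after installation.
attempts = []


def ready_on_third_try():
    attempts.append(1)
    return len(attempts) >= 3


print(wait_until(ready_on_third_try, timeout=5.0, interval=0.01))  # True
```

If a check like this passes only after a few attempts, that would support the race theory; if it never passes within the timeout, the cause lies elsewhere.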

Hmm, so for the old issues it could be a sync, but for the new issues it can't yet :-/
