fs064 tests are failing on master and wallaby - FATAL | try modifying forward dns record | undercloud | error={"changed": false, "msg": "`arecord` not found."

Bug #1988347 reported by Ronelle Landy
Affects   Status    Importance   Assigned to   Milestone
tripleo   Invalid   Critical     Unassigned

Bug Description

fs064 tests are failing during overcloud deployment on master and wallaby with:

    FATAL | try modifying forward dns record | undercloud | error={"changed": false, "msg": "`arecord` not found."}

...

Error: Execution of '/usr/bin/mysql --defaults-extra-file=/root/.my.cnf -NBe create database if not exists `glance` character set `utf8` collate `utf8_general_ci`' returned 1: ERROR 1205 (HY000) at line 1: Lock wait timeout exceeded; try restarting transaction
2022-08-31 19:04:18 | Error: /Stage[main]/Glance::Db::Mysql/Openstacklib::Db::Mysql[glance]/Mysql_database[glance]/ensure: change from 'absent' to 'present' failed: Execution of '/usr/bin/mysql --defaults-extra-file=/root/.my.cnf -NBe create database if not exists `glance` character set `utf8` collate `utf8_general_ci`' returned 1: ERROR 1205 (HY000) at line 1: Lock wait timeout exceeded; try restarting transaction
2022-08-31 19:04:18 | Error: Execution of '/usr/bin/mysql --defaults-extra-file=/root/.my.cnf -NBe create database if not exists `heat` character set `utf8` collate `utf8_general_ci`' returned 1: ERROR 1205 (HY000) at line 1: Lock wait timeout exceeded; try restarting transaction

Full overcloud deploy log:

https://logserver.rdoproject.org/56/44656/7/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-master/c4f8bcc/logs/undercloud/home/zuul/overcloud1_deploy.log.txt.gz

Errors in Neutron API container:

https://logserver.rdoproject.org/56/44656/7/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-master/c4f8bcc/logs/undercloud/var/log/extra/podman/containers/neutron_api/stdout.log.txt.gz

Other logs:

https://logserver.rdoproject.org/61/44661/7/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-wallaby/a30ae4b/logs/undercloud/home/zuul/overcloud1_deploy.log.txt.gz

Ronelle Landy (rlandy) wrote :

Comment from security team:

<dwilde> rlandy: thanks, it looks like https://opendev.org/x/tripleo-ipa/src/branch/master/tripleo_ipa/roles/tripleo_ipa_dns/tasks/dns.yaml#L57 is failing because the task isn’t parsing the zone item properly

tags: added: promotion-blocker
Changed in tripleo:
milestone: none → zed-1
importance: Undecided → Critical
status: New → Triaged
description: updated
Ronelle Landy (rlandy) wrote :
Jakob Meng (jm1337) wrote :

On most (all?) jobs, however, this error does not consistently coincide with a job failure. For example, after periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-master [2] had failed twice with this arecord error, it passed [0].

Besides fs064, this issue also affects other jobs:
* periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039-security-master [1]
* periodic-tripleo-ci-centos-9-standalone-on-multinode-ipa-tripleo-master [3][4]
* periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039-master [5][6]

[0] https://review.rdoproject.org/zuul/build/63a7bd6281634855a5260c2a9c56e197

[1] https://logserver.rdoproject.org/openstack-component-security/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039-security-master/6ea2096/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

[2] https://logserver.rdoproject.org/56/44656/5/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-master/0c48b0f/logs/undercloud/home/zuul/overcloud1_deploy.log.txt.gz

[3] https://logserver.rdoproject.org/openstack-component-tripleo/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-standalone-on-multinode-ipa-tripleo-master/a0594a2/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz

[4] https://logserver.rdoproject.org/57/44657/6/check/periodic-tripleo-ci-centos-9-standalone-on-multinode-ipa-tripleo-master/8226ab0/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz

[5] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039-master/545f2f1/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

[6] https://logserver.rdoproject.org/56/44656/5/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039-master/40d3797/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

Ade Lee (alee-3) wrote :

I don't see how this error could be a cause for the failure of this job.

The code that is being executed is here:

https://opendev.org/x/tripleo-ipa/src/branch/master/tripleo_ipa/roles/tripleo_ipa_dns/tasks/dns.yaml#L50-L78

Essentially, the code attempts to modify the existing DNS entry and, if that fails, adds the required entry in the rescue section.

We see this in the logs.

2022-08-31 18:42:18 | 2022-08-31 18:42:18.080724 | fa163edd-246c-3638-fe65-000000005895 | TIMING | tripleo_ipa_dns : set record type | undercloud | 0:14:39.445888 | 0.04s
2022-08-31 18:42:18 | 2022-08-31 18:42:18.089570 | fa163edd-246c-3638-fe65-000000005896 | TASK | add dns zone
2022-08-31 18:42:18 | 2022-08-31 18:42:18.929152 | fa163edd-246c-3638-fe65-000000005896 | OK | add dns zone | undercloud
2022-08-31 18:42:18 | 2022-08-31 18:42:18.931318 | fa163edd-246c-3638-fe65-000000005896 | TIMING | tripleo_ipa_dns : add dns zone | undercloud | 0:14:40.296473 | 0.84s
2022-08-31 18:42:18 | 2022-08-31 18:42:18.943993 | fa163edd-246c-3638-fe65-000000005898 | TASK | try modifying forward dns record
2022-08-31 18:42:19 | 2022-08-31 18:42:19.889906 | fa163edd-246c-3638-fe65-000000005898 | FATAL | try modifying forward dns record | undercloud | error={"changed": false, "msg": "`arecord` not found."}
2022-08-31 18:42:19 | 2022-08-31 18:42:19.891074 | fa163edd-246c-3638-fe65-000000005898 | TIMING | tripleo_ipa_dns : try modifying forward dns record | undercloud | 0:14:41.256232 | 0.95s
2022-08-31 18:42:19 | 2022-08-31 18:42:19.903305 | fa163edd-246c-3638-fe65-00000000589a | TASK | add forward dns record
2022-08-31 18:42:20 | 2022-08-31 18:42:20.806253 | fa163edd-246c-3638-fe65-00000000589a | CHANGED | add forward dns record | undercloud
2022-08-31 18:42:20 | 2022-08-31 18:42:20.807581 | fa163edd-246c-3638-fe65-00000000589a | TIMING | tripleo_ipa_dns : add forward dns record | undercloud | 0:14:42.172740 | 0.90s

Note that execution does not halt at this point. Rather, the deployment fails later on while trying to set up mysql, which of course causes a number of other tasks to fail.

I think this error is a red herring -- unless it points to some underlying change in DNS that may be causing the real failure.
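
The modify-then-add behaviour described in this comment can be sketched roughly as follows. This is only an illustration in Python, not the tripleo-ipa role itself (which is an Ansible block with a rescue section). `FakeDns`, `RecordNotFound` and `ensure_a_record` are hypothetical names invented for the sketch, and the in-memory dictionary merely stands in for the IPA server.

```python
class RecordNotFound(Exception):
    """Hypothetical error raised when the record to modify does not exist."""


class FakeDns:
    """In-memory stand-in for a DNS zone; this is not the real IPA API."""

    def __init__(self):
        self.records = {}

    def modify_a_record(self, name, ip):
        # Modifying only works if the record is already present.
        if name not in self.records:
            raise RecordNotFound("`arecord` not found.")
        self.records[name] = ip

    def add_a_record(self, name, ip):
        self.records[name] = ip


def ensure_a_record(dns, name, ip):
    """Try to modify the record; if that fails, add it instead."""
    try:
        dns.modify_a_record(name, ip)
        return "modified"
    except RecordNotFound as exc:
        # Corresponds to the FATAL line in the log: the failure is
        # reported, but the fallback runs and nothing halts.
        print(f"modify failed ({exc}); falling back to add")
        dns.add_a_record(name, ip)
        return "added"


if __name__ == "__main__":
    dns = FakeDns()
    print(ensure_a_record(dns, "overcloud.example.test", "192.0.2.10"))
    print(ensure_a_record(dns, "overcloud.example.test", "192.0.2.11"))
```

On a first run against an empty zone the modify step fails and the add step succeeds, which matches the FATAL line followed by the CHANGED line in the log excerpt above.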

Jakob Meng (jm1337) wrote :

Agree with @alee-3. Digging deeper into this shows that the jobs in comment #3 actually fail on different errors. For example:

* periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039-security-master failed with mysql issue 'WSREP has not yet prepared node for application use' during keystone_bootstrap [1]

* periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-master failed on mysql issues during keystone_bootstrap [2]

* periodic-tripleo-ci-centos-9-standalone-on-multinode-ipa-tripleo-master failed on https://bugs.launchpad.net/tripleo/+bug/1988053 [3]

* periodic-tripleo-ci-centos-9-standalone-on-multinode-ipa-tripleo-master failed on https://bugs.launchpad.net/tripleo/+bug/1988053 [4]

* periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039-master failed on https://bugs.launchpad.net/tripleo/+bug/1987092 [5]

* periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039-master failed on mysql related issues [6]

So whenever one sees

  FATAL | try modifying forward dns record | undercloud | error={"changed": false, "msg": "`arecord` not found."}

the real error is most likely further down in that log file.
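
As a rough aid for that kind of triage, a small filter along these lines could list the failure lines in a deploy log while skipping the rescued arecord message. It is a sketch under assumptions: `real_failures` is a hypothetical helper, and the pattern only covers the two line shapes quoted in this report (Ansible-style `| FATAL |` lines and Puppet-style `Error:` lines), so other failure formats would be missed.

```python
import re

# Known-benign message: the rescue section adds the record right afterwards.
BENIGN = "`arecord` not found."

# Matches "| FATAL |" task lines and "Error:" lines as quoted in this report.
FAILURE = re.compile(r"\|\s*FATAL\s*\||\bError:")


def real_failures(lines):
    """Return failure lines, skipping the rescued arecord message."""
    return [
        line.rstrip("\n")
        for line in lines
        if FAILURE.search(line) and BENIGN not in line
    ]


if __name__ == "__main__":
    sample = [
        '18:42:19 | FATAL | try modifying forward dns record | undercloud | '
        'error={"changed": false, "msg": "`arecord` not found."}',
        "18:42:20 | CHANGED | add forward dns record | undercloud",
        "19:04:18 | Error: ... ERROR 1205 (HY000) at line 1: "
        "Lock wait timeout exceeded; try restarting transaction",
    ]
    for line in real_failures(sample):
        print(line)
```

The same function can be fed an open file object, since it only iterates over lines.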

[1] https://logserver.rdoproject.org/openstack-component-security/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039-security-master/6ea2096/logs/overcloud-controller-0/var/log/containers/keystone/keystone-manage.log.txt.gz

[2] https://logserver.rdoproject.org/56/44656/5/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-master/0c48b0f/logs/overcloud1-controller-0/var/log/containers/keystone/keystone-manage.log.txt.gz

[3] https://logserver.rdoproject.org/openstack-component-tripleo/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-standalone-on-multinode-ipa-tripleo-master/a0594a2/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz

[4] https://logserver.rdoproject.org/57/44657/6/check/periodic-tripleo-ci-centos-9-standalone-on-multinode-ipa-tripleo-master/8226ab0/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz

[5] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039-master/545f2f1/logs/overcloud-controller-1/var/log/containers/stdouts/galera-bundle.log.txt.gz

[6] https://logserver.rdoproject.org/56/44656/5/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039-master/40d3797/logs/overcloud-controller-0/var/log/containers/stdouts/nova_db_sync.log.txt.gz

Ronelle Landy (rlandy) wrote :

Closing this out - tracking real issues elsewhere

Changed in tripleo:
status: Triaged → Invalid