freeipa not resolving mirror.regionone.rdo-cloud-tripleo.rdoproject.org broken trust chain resolving

Bug #1824772 reported by Quique Llorente
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Sagi (Sergey) Shnaidman

Bug Description

In the periodic rocky and stein fs039 jobs, freeipa fails to resolve the RDO mirrors:
http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039-stein/c744d9e/logs/supplemental/var/log/journal.txt.gz
Apr 15 06:53:44 ipa.ooo.test named-pkcs11[18919]: broken trust chain resolving 'mirror.regionone.rdo-cloud-tripleo.rdoproject.org/AAAA/IN': 38.145.33.91#53
Apr 15 06:53:44 ipa.ooo.test named-pkcs11[18919]: validating mirror.regionone.rdo-cloud-tripleo.rdoproject.org/A: bad cache hit (org/DS)
Apr 15 06:53:44 ipa.ooo.test named-pkcs11[18919]: broken trust chain resolving 'mirror.regionone.rdo-cloud-tripleo.rdoproject.org/A/IN': 38.145.33.91#53
Apr 15 06:53:44 ipa.ooo.test named-pkcs11[18919]: validating mirror.regionone.rdo-cloud-tripleo.rdoproject.org/A: bad cache hit (org/DS)
Apr 15 06:53:44 ipa.ooo.test named-pkcs11[18919]: broken trust chain resolving 'mirror.regionone.rdo-cloud-tripleo.rdoproject.org/A/IN': 38.145.33.91#53
Apr 15 06:53:47 ipa.ooo.test named-pkcs11[18919]: validating mirror.regionone.rdo-cloud-tripleo.rdoproject.org/AAAA: bad cache hit (org/DS)
Apr 15 06:53:47 ipa.ooo.test named-pkcs11[18919]: broken trust chain resolving 'mirror.regionone.rdo-cloud-tripleo.rdoproject.org/AAAA/IN': 38.145.33.91#53
Apr 15 06:53:47 ipa.ooo.test named-pkcs11[18919]: validating mirror.regionone.rdo-cloud-tripleo.rdoproject.org/AAAA: bad cache hit (org/DS)
Apr 15 06:53:47 ipa.ooo.test named-pkcs11[18919]: broken trust chain resolving 'mirror.regionone.rdo-cloud-tripleo.rdoproject.org/AAAA/IN': 38.145.33.91#53
Apr 15 06:53:47 ipa.ooo.test named-pkcs11[18919]: validating mirror.regionone.rdo-cloud-tripleo.rdoproject.org/A: bad cache hit (org/DS)
Apr 15 06:53:47 ipa.ooo.test named-pkcs11[18919]: broken trust chain resolving 'mirror.regionone.rdo-cloud-tripleo.rdoproject.org/A/IN': 38.145.33.91#53
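For triage, the "broken trust chain" lines in a journal dump like the one above can be tallied per name, record type and upstream server. The following is only a minimal sketch: the regular expression is inferred from the log lines quoted in this report, and the function name count_broken_chains is hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the named-pkcs11 lines quoted in this report (an assumption,
# not an official log format specification).
BROKEN_CHAIN = re.compile(
    r"broken trust chain resolving '(?P<name>[^/']+)/(?P<rtype>[A-Z0-9]+)/IN': "
    r"(?P<server>[0-9a-fA-F.:]+)#(?P<port>\d+)"
)


def count_broken_chains(lines):
    """Count failures per (name, record type, upstream server)."""
    counts = Counter()
    for line in lines:
        match = BROKEN_CHAIN.search(line)
        if match:
            key = (match["name"], match["rtype"], match["server"])
            counts[key] += 1
    return counts
```

Run against the excerpt above, this would report three A and three AAAA failures, all against 38.145.33.91; the "bad cache hit" lines are ignored.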

Changed in tripleo:
milestone: stein-rc1 → train-1
Ronelle Landy (rlandy) wrote :

fs039 tests have been failing on most periodic and check jobs.
Results are pretty bleak on http://cistatus.tripleo.org/promotion/.

The current trust chain issue may or may not be consistent but the failures here are.

Ronelle Landy (rlandy) wrote :

failed again on current promotion for master:
2019-04-15 16:01:58.913261 | primary | http://mirror.regionone.rdo-cloud-tripleo.rdoproject.org/centos/7/os/x86_64/Packages/dstat-0.7.2-12.el7.noarch.rpm: [Errno 14] curl#6 - "Could not resolve host: mirror.regionone.rdo-cloud-tripleo.rdoproject.org; Unknown error"
2019-04-15 16:01:58.913363 | primary | Trying other mirror.
2019-04-15 16:01:58.913384 | primary |
2019-04-15 16:01:58.913401 | primary |
2019-04-15 16:01:58.913451 | primary | Error downloading packages:
2019-04-15 16:01:58.913559 | primary | dstat-0.7.2-12.el7.noarch: [Errno 256] No more mirrors to try.
2019-04-15 16:01:58.913581 | primary |

but stein passed:
periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039-stein success

Ronelle Landy (rlandy) wrote :

Tried this revert:
https://review.openstack.org/#/c/652798/

This gets past the undercloud install issue, but the overcloud deploy still fails.

Ronelle Landy (rlandy) wrote :

Failing master check jobs

tags: added: alert
Changed in tripleo:
status: Triaged → In Progress
Quique Llorente (quiquell) wrote :
Alan Pevec (apevec)
summary: - freeip not resolving mirror.regionone.rdo-cloud-tripleo.rdoproject.org
+ freeipa not resolving mirror.regionone.rdo-cloud-tripleo.rdoproject.org
broken trust chain resolving
Ronelle Landy (rlandy) wrote :

Bug related to overcloud deploy failures:

<sshnaidm> we have this bug too: https://bugs.launchpad.net/tripleo/+bug/1821377
<openstack> Launchpad bug 1821377 in tripleo "TLS everywhere deployments fail when using many composable networks" [Medium,In progress] - Assigned to Harald Jensås (harald-jensas)

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/653135

Ronelle Landy (rlandy) wrote :

https://review.openstack.org/#/c/646005/ merged - will see if that fixes the overcloud deploy failures

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-quickstart (master)

Change abandoned by Ronelle Landy (<email address hidden>) on branch: master
Review: https://review.openstack.org/653135

Quique Llorente (quiquell) wrote :

Sometimes we get a very slow response from freeipa (;; Query time: 1389 msec):
[zuul@undercloud ~]$ dig @10.0.0.250 pypi.org

; <<>> DiG 9.9.4-RedHat-9.9.4-73.el7_6 <<>> @10.0.0.250 pypi.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22263
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;pypi.org. IN A

;; ANSWER SECTION:
pypi.org. 60 IN A 151.101.64.223
pypi.org. 60 IN A 151.101.128.223
pypi.org. 60 IN A 151.101.0.223
pypi.org. 60 IN A 151.101.192.223

;; Query time: 1389 msec
;; SERVER: 10.0.0.250#53(10.0.0.250)
;; WHEN: Wed Apr 17 06:48:34 UTC 2019
;; MSG SIZE rcvd: 101

Maybe we need to resize the cache on the tripleo infra DNS server (38.145.33.91).
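To see how often the slow responses occur, the dig measurement above can be repeated from a script. Below is a rough sketch using only the Python standard library; the helper names build_query and time_query are hypothetical. It sends a plain A query without EDNS, so it is not an exact equivalent of the dig invocation, and it does not verify that the reply ID matches the request.

```python
import socket
import struct
import time


def build_query(hostname, query_id=0x1234, qtype=1):
    """Build a minimal DNS query packet (qtype 1 = A record, class IN)."""
    # Header: id, flags (recursion desired), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed with its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def time_query(server, hostname, timeout=5.0):
    """Return (elapsed_ms, rcode) for one UDP query; a timeout error is raised on no reply."""
    packet = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        sock.sendto(packet, (server, 53))
        reply, _ = sock.recvfrom(4096)
        elapsed_ms = (time.monotonic() - start) * 1000.0
    # The low 4 bits of the flags word carry the response code (0 = NOERROR).
    flags = struct.unpack("!H", reply[2:4])[0]
    return elapsed_ms, flags & 0x000F
```

Calling time_query("10.0.0.250", "pypi.org") in a loop from the undercloud and logging the elapsed times would show whether delays like the 1389 msec above are occasional spikes or the norm.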

Quique Llorente (quiquell) wrote :

It looks like freeipa was using the router's DNS servers rather than the tripleo infra DNS server or Google's.

https://review.openstack.org/#/c/653174

Changed in tripleo:
assignee: Quique Llorente (quiquell) → Sagi (Sergey) Shnaidman (sshnaidm)
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.openstack.org/653174
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=a33bdae59fecd103e4001ad304e9dc77e0623294
Submitter: Zuul
Branch: master

commit a33bdae59fecd103e4001ad304e9dc77e0623294
Author: Sagi Shnaidman <email address hidden>
Date: Wed Apr 17 02:44:59 2019 +0300

    Add explicit DNS forwarders to TLS job

    Current settings "--auto-forwarders" used as DNS server address
    of the router in the tenant. Because it could delay requests let's
    use explicit DNS forwarders from custome_nameserver variable.
    Closes-Bug: #1824772
    Change-Id: I31fe475843a751aefe3d8b574edf42ff762aaab8
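
The idea in the commit message, passing explicit forwarders instead of relying on auto-detection, can be illustrated roughly as follows. This is not the code from the merged change: the function name forwarder_args and the fallback to --auto-forwarders when the list is empty are assumptions made for illustration. The --forwarder and --auto-forwarders options themselves are existing ipa-server-install options.

```python
def forwarder_args(nameservers):
    """Return ipa-server-install DNS forwarder options for the given servers."""
    # Drop empty or whitespace-only entries.
    servers = [ns.strip() for ns in nameservers if ns and ns.strip()]
    if not servers:
        # Assumption for this sketch: with nothing configured, keep the old behaviour.
        return ["--auto-forwarders"]
    # One --forwarder option per explicit DNS server.
    return [f"--forwarder={ns}" for ns in servers]
```

For example, forwarder_args(["38.145.33.91", "8.8.8.8"]) yields one --forwarder option for the tripleo infra DNS server mentioned in this bug and one for Google Public DNS; the pairing of these two addresses is only illustrative.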

Changed in tripleo:
status: In Progress → Fix Released