Intermittent DNS issues during container build

Bug #1908556 reported by Scott Little
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Scott Little
Milestone: (none)

Bug Description

Brief Description
-----------------
Intermittent failures are seen when building Docker images on CENGN.

In the most recent case, the following images all appear to have failed on DNS lookup.

rvmc
stx-fm-subagent
stx-libvirt
stx-nova-api-proxy
stx-oidc-client

There are no clear logs on the host indicating a general networking outage. I've added a cron job to test the network and log the result.
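
A cron-driven probe of this kind might look like the sketch below. It is not the script actually deployed on the build host; the log path, the `check_dns` function name, and the choice of `getent hosts` as the lookup tool are all assumptions. It also only exercises the host's resolver, which may differ from the resolver configuration seen inside a container during `docker build`.

```shell
#!/bin/sh
# Hypothetical DNS probe intended to be run from cron.
# LOG_FILE is an assumed location; override it via the environment.
LOG_FILE="${LOG_FILE:-/var/log/dns-check.log}"

# Resolve one hostname through the system resolver and append the
# outcome to LOG_FILE. Returns 0 if the name resolves, 1 otherwise.
check_dns() {
    host="$1"
    ts="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    if getent hosts "$host" >/dev/null 2>&1; then
        echo "$ts OK $host" >>"$LOG_FILE"
        return 0
    fi
    echo "$ts FAIL $host" >>"$LOG_FILE"
    return 1
}

# Probe every hostname passed on the command line (none means no probes).
for h in "$@"; do
    check_dns "$h"
done
```

Saved under a hypothetical name such as `dns-check.sh`, it could be scheduled with a crontab entry like `*/5 * * * * /path/to/dns-check.sh github.com mirrorlist.centos.org`, using the two hostnames that fail to resolve in the logs of this report.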

Is it possible that the default name server, 8.8.8.8 (Google), is having a temporary outage and is at fault?

Suggest adding a backup name server within the docker images.

Severity
--------
Major

Steps to Reproduce
------------------
$MY_REPO/build-tools/build-docker-images/build-stx-images.sh --prefix master --latest-prefix master --os centos --stream stable --version 20201215T223645Z --base docker.io/starlingx/stx-centos:master-stable-20201215T223645Z --wheels http://mirror.starlingx.cengn.ca:80//mirror/starlingx/master/centos/containers/20201215T223645Z/outputs/wheels/stx-centos-stable-wheels.tar --user starlingx --registry docker.io --attempts 1 --latest --clean

Expected Behavior
-----------------
Images are built.

Actual Behavior
---------------
Images sometimes fail to build.

Reproducibility
---------------
Intermittent, typically late at night.

System Configuration
--------------------
N/A

Branch/Pull Time/Commit
-----------------------
Dec 15 2020

Last Pass
---------
Dec 14 2020

Timestamp/Logs
--------------
...
fatal: unable to access 'https://github.com/coreos/go-oidc/': Could not resolve host: github.com
package github.com/coreos/go-oidc: exit status 128
The command '/bin/sh -c go get -d -v ./...' returned a non-zero code: 1
Command (docker) failed,
...
Could not retrieve mirrorlist http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os&infra=container error was
14: curl#6 - "Could not resolve host: mirrorlist.centos.org; Unknown error"
The command '/bin/sh -c yum -y update' returned a non-zero code: 1
...

Test Activity
-------------
Build

Workaround
----------
Try again at a later time
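
Until the underlying network issue is resolved, the retry can be automated. The following is a minimal sketch of a generic retry wrapper, not part of the StarlingX build tools; the `retry` function name and its argument order are assumptions made for illustration. The command in Steps to Reproduce passes `--attempts 1`, so raising that value may be a simpler option if it covers the failing step.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS
# is reached, sleeping DELAY_SECONDS between attempts.
retry() {
    max="$1"
    delay="$2"
    shift 2
    attempt=1
    while :; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempt(s): $*" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

For example, `retry 3 600 "$MY_REPO/build-tools/build-docker-images/build-stx-images.sh"` followed by the arguments from Steps to Reproduce would try the build up to three times, waiting ten minutes between attempts.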

Ghada Khalil (gkhalil)
tags: added: stx.build
Ghada Khalil (gkhalil) wrote :

stx.5.0 / medium - Results in intermittent build failures, so it would be good to fix

tags: added: stx.5.0
Changed in starlingx:
importance: Undecided → Medium
assignee: nobody → Scott Little (slittle1)
Scott Little (slittle1) wrote :

Alternate DNS servers wouldn't help.

CENGN has improved its network setup. It seems to be behaving a lot better now. Will continue to monitor.

Changed in starlingx:
status: New → Fix Released