tripleo gate jobs are failing to pull containers when running on ovh provider with "UNAUTHORIZED" error

Bug #1839532 reported by Ronelle Landy
This bug affects 4 people
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: wes hayutin
Milestone:

Bug Description

We have seen up to 30 gate failures in the tripleo queue in a period of 24 hours. Many of these failures show a variation of the following error:

{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"repository","Class":"","Name":"tripleomaster/centos-binary-nova-compute-ironic","Action":"pull"}]}]}

as shown in the log file:

https://logs.opendev.org/63/674363/1/gate/tripleo-ci-centos-7-undercloud-containers/9285c4c/logs/undercloud/var/log/tripleo-container-image-prepare.log.txt.gz?level=ERROR
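
When scanning many job logs for this failure, it can help to pull the error code, repository name, and action out of the JSON body quoted above. The following is a minimal sketch, not part of tripleo; the helper name parse_registry_errors is hypothetical, and it assumes the error body has the same shape as the sample in this report.

```python
import json

# Sample error body, copied from the bug description above.
SAMPLE = (
    '{"errors":[{"code":"UNAUTHORIZED","message":"authentication required",'
    '"detail":[{"Type":"repository","Class":"",'
    '"Name":"tripleomaster/centos-binary-nova-compute-ironic","Action":"pull"}]}]}'
)


def parse_registry_errors(payload):
    """Return (code, repository, action) tuples from a registry error body.

    Hypothetical helper: assumes "detail" is a list of objects with
    "Name" and "Action" keys, as in the sample above.
    """
    results = []
    for err in json.loads(payload).get("errors", []):
        details = err.get("detail")
        if not isinstance(details, list):
            # No usable detail list: still report the error code.
            details = [{}]
        for item in details:
            results.append((err.get("code"), item.get("Name"), item.get("Action")))
    return results


if __name__ == "__main__":
    for code, repo, action in parse_registry_errors(SAMPLE):
        print(code, repo, action)
```

Grouping the resulting tuples by repository makes it easier to see whether the failures are limited to a few images or spread across all of them.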

The following link in logstash shows the frequency of the error:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Name%5C%5C%5C%22%3A%5C%5C%5C%22tripleomaster%2Fcentos-binary%5C%22%20AND%20tags%3A%5C%22console%5C%22

(clicking the node_provider checkbox shows the provider where the nodes display this error).

Since these jobs pass regularly on other providers, we suspect a setup/infra issue on ovh.

Ronelle Landy (rlandy)
tags: added: alert
Changed in tripleo:
importance: Undecided → Critical
status: New → Triaged
milestone: none → train-3
Ronelle Landy (rlandy) wrote :

Related review for tracking error:
https://review.opendev.org/675453 Add query for UNAUTHORIZED error when pulling containers

Ronelle Landy (rlandy) wrote :

See also this comment from <cloudnull>: "and asked for us to work in a re-auth on 401 which I did here" - https://review.opendev.org/#/c/674672/
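
The re-auth on 401 idea mentioned above can be sketched roughly as follows. This is an illustration of the general pattern only, not the code in the linked review; fetch_with_reauth and FakeRegistry are hypothetical names, and the registry is faked so the example runs without network access.

```python
def fetch_with_reauth(do_request, authenticate, max_reauth=1):
    """Call do_request(token); on HTTP 401, fetch a new token and retry.

    do_request and authenticate are caller-supplied callables
    (hypothetical interface): authenticate() returns a token and
    do_request(token) returns a (status, body) tuple.
    """
    token = authenticate()
    for attempt in range(max_reauth + 1):
        status, body = do_request(token)
        if status != 401:
            return status, body
        if attempt < max_reauth:
            # The token was rejected: request a fresh one before retrying.
            token = authenticate()
    # Still unauthorized after all retries; hand the 401 back to the caller.
    return status, body


class FakeRegistry:
    """Stand-in for a registry whose first token is always rejected."""

    def __init__(self):
        self.issued = 0

    def authenticate(self):
        self.issued += 1
        return "token-%d" % self.issued

    def pull(self, token):
        if token == "token-1":
            return 401, "authentication required"
        return 200, "layer-data"


if __name__ == "__main__":
    registry = FakeRegistry()
    print(fetch_with_reauth(registry.pull, registry.authenticate))
```

Capping the number of re-authentication attempts keeps a genuinely unauthorized pull from looping forever; the 401 is returned to the caller once the limit is reached.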

Matt Riedemann (mriedem) wrote :
Ronelle Landy (rlandy) wrote :
wes hayutin (weshayutin)
tags: added: promotion-blocker
chandan kumar (chkumar246) wrote :
chandan kumar (chkumar246) wrote :

http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2019-08-19.log.html#t2019-08-19T10:55:50
There is not an issue at FN with centos. I posted links showing that it was functional yesterday.

Alex Schultz (alex-schultz) wrote :

https://bugs.launchpad.net/tripleo/+bug/1840973 has been identified as an issue where we end up with 0 byte layers on the local registry

wes hayutin (weshayutin) wrote :

elastic-recheck is no longer showing this as an issue, closing

Changed in tripleo:
status: Triaged → Fix Released
wes hayutin (weshayutin)
Changed in tripleo:
assignee: nobody → wes hayutin (weshayutin)