overcloud-prep-containers fails w/ read timeout

Bug #1819979 reported by wes hayutin
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: stein-rc1

Bug Description

oooq keeps failing when deploying on Rocky with this error: [1]

UnixHTTPConnectionPool(host='localhost', port=None): Read timed out.

I'm trying to debug why and find a workaround, and I need help: I've
exhausted my expertise and am hoping someone more knowledgeable can
get me past this gating issue.

It is failing in the task overcloud-prep-containers, which generates
this script (attached):

/home/stack/overcloud-prep-containers.sh

When the script executes on the undercloud, it generates
overcloud_prep_containers.log (attached).

The task exits and oooq aborts with this output:

     MSG:

     non-zero return code

I'm pretty sure this is the result of the read timeout mentioned
above.

I believe the code that is executing is in
tripleo-common/tripleo_common/image/image_uploader.py.
The tripleo_common version is 9.4.1.
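
For context, here is a minimal sketch (not the actual tripleo_common code) of how a read timeout of this shape can surface when pulling images through the local Docker daemon with the docker Python SDK. The socket path, timeout value, and image name are illustrative assumptions:

    import docker
    import requests

    # Assumed reproduction: talk to the local Docker daemon over its UNIX
    # socket with a short client-side timeout, then stream a pull of a large
    # image. If the daemon takes longer than the timeout to answer (e.g. a
    # slow or proxied registry), the failure typically surfaces as
    # requests.exceptions.ReadTimeout, rendered as
    # "UnixHTTPConnectionPool(host='localhost', port=None): Read timed out."
    client = docker.APIClient(base_url='unix://var/run/docker.sock', timeout=10)

    try:
        # illustrative image name, not taken from the failing script
        for line in client.pull('docker.io/tripleorocky/centos-binary-keystone',
                                tag='current-tripleo', stream=True, decode=True):
            print(line.get('status', ''))
    except requests.exceptions.ReadTimeout as exc:
        print('read timed out: %s' % exc)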

Because things seem to run in parallel, I'm not exactly sure who is
catching the timeout error and emitting the error message. This
version of image_uploader seems to be using the DockerImageUploader
class, which leads me to believe the issue discussed in [1], about
Docker defeating the configured proxy, may be the same root cause. If
it is, the article does not help me resolve the problem, because none
of the systemd files it discusses exist on the undercloud, nor do I
understand how to defeat the Docker proxy in oooq when the undercloud
node has yet to be created.

[1] This Red Hat Knowledge Base article seems to describe an almost
identical issue and offers a workaround, but I'm guessing things have
changed, because none of the files it discusses exist on the
undercloud. Also, the article seems to be aimed at a deployment where
the undercloud is up and running and can be manually tweaked, and I
don't know how one would even apply the workaround in the middle of an
oooq deployment.

https://access.redhat.com/solutions/3356971
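
For reference, the workaround described in articles like the one above usually boils down to a systemd drop-in that sets (or clears) the Docker daemon's proxy environment. The sketch below shows the typical shape of such a file; the proxy host and NO_PROXY list are placeholders, and this is exactly the kind of file I could not find on this undercloud:

    # /etc/systemd/system/docker.service.d/http-proxy.conf  (illustrative)
    [Service]
    Environment="HTTP_PROXY=http://proxy.example.com:3128"
    Environment="HTTPS_PROXY=http://proxy.example.com:3128"
    Environment="NO_PROXY=localhost,127.0.0.1,192.168.24.1"

    # then pick up the change:
    #   systemctl daemon-reload && systemctl restart docker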

Tags: alert ci
Revision history for this message
wes hayutin (weshayutin) wrote :

Hey John,
Thanks for the report on the mailing list, but the best place to continue the conversation is a Launchpad bug. I've gone ahead and opened it for you.
https://bugs.launchpad.net/tripleo/+bug/1819979

To apply the patch, simply run the same quickstart.sh command w/o tags "all". The default tags will set up your undercloud and lay down the scripts required for the rest of the deployment in /home/stack; you should find the containers prep script in that dir and can test the fix there.
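
Roughly, that looks like the following; the flags and paths are illustrative (adjust to your original invocation, and the ssh config path assumes the default ~/.quickstart working directory):

    # run with the default tags, i.e. drop "--tags all" from your command
    bash quickstart.sh --release rocky $VIRTHOST

    # then log in to the undercloud and re-run the prep script to test the fix
    ssh -F ~/.quickstart/ssh.config.ansible undercloud
    bash /home/stack/overcloud-prep-containers.sh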

Please move the conversation to the above bug.
Thanks

Revision history for this message
John Dennis (jdennis-a) wrote :

Thanks Wes, but I'm a little confused. What patch are you talking about? Are you saying that if I don't use --tags all (which I was), then something somewhere in oooq patches something and the proxy timeout is fixed?

Revision history for this message
wes hayutin (weshayutin) wrote :

Sorry.. when I said patch I meant attempting the fix you found in the Red Hat docs.
We're seeing container registry issues all over the place atm. The RDO registry is throwing 500s. Can you paste me your invocation of quickstart.sh so I can attempt the same thing you are?

Changed in tripleo:
milestone: stein-3 → stein-rc1
Revision history for this message
John Dennis (jdennis-a) wrote :

Here is the script I was using to run oooq, I was executing from the root of the tripleo-quickstart directory.

Revision history for this message
wes hayutin (weshayutin) wrote :

There are reports that IPs from rdo-cloud were blacklisted by docker.io yesterday; internal RH may also be blocked. We think it's resolved, but we're looking into it.

wes hayutin (weshayutin)
Changed in tripleo:
status: Triaged → Fix Released