periodic-tripleo--master-containers-build is failing with "Failed to import docker-py - cannot import name __version__"

Bug #1829736 reported by Marios Andreou
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
tripleo | Fix Released | Critical | Sorin Sbarnea |

Bug Description

The periodic-tripleo-centos-7-master-containers-build-push is failing consistently at [1][2][3] and more from [4] with a trace like:

        2019-05-20 00:27:40.446462 | LOOP [Login to RDO registry]
        2019-05-20 00:27:40.827524 | primary | ERROR: Item: True
        2019-05-20 00:27:40.828070 | primary | {
        2019-05-20 00:27:40.828163 | primary | "item": true,
        2019-05-20 00:27:40.828230 | primary | "msg": "Failed to import docker-py - cannot import name __version__. Try `pip install docker-py`"
        2019-05-20 00:27:40.828298 | primary | }
        2019-05-20 00:27:41.020116 | primary | ERROR: Item: False

This is obviously a promotion blocker, as the other jobs depend on this one for the containers.

[1] http://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-7-master-containers-build-push/85eee73/job-output.txt.gz
[2] http://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-7-master-containers-build-push/104725f/job-output.txt.gz
[3] http://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-7-master-containers-build-push/0825ed1/job-output.txt.gz
[4] https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-7-master-containers-build-push

Tags: ci
Changed in tripleo:
milestone: none → train-1
wes hayutin (weshayutin) wrote :

https://cbs.centos.org/koji/buildinfo?buildID=19767 docker-py not supported by the rpm

wes hayutin (weshayutin) wrote :
Changed in tripleo:
status: Triaged → In Progress
Marios Andreou (marios-b) wrote :
Marios Andreou (marios-b) wrote :

looks like ansible 2.8 is the problem - ykarel has it green at https://review.rdoproject.org/r/#/c/20171/ with depends-on that blacklists 2.8

Alan Pevec (apevec) wrote :

@Yatin can we get more details on how docker-py is related to Ansible 2.8?

Alan Pevec (apevec) wrote :

from https://docs.ansible.com/ansible/latest/modules/docker_login_module.html

> Docker SDK for Python: Please note that the docker-py Python module has been superseded by docker (see here for details). For Python 2.6, docker-py must be used. Otherwise, it is recommended to install the docker Python module. Note that both modules should not be installed at the same time. Also note that when both modules are installed and one of them is uninstalled, the other might no longer function and a reinstall of it is required.
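The quoted note warns against having both distributions installed at once. A minimal sketch of how one might check that on a host, assuming Python 3.8+ (for `importlib.metadata`) and assuming the distribution names are `docker` and `docker-py`; the helper names `installed_versions` and `conflicting` are hypothetical and not part of Ansible or the Docker SDK:

```python
from importlib import metadata  # available from Python 3.8


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def conflicting(versions):
    """True when both 'docker' and 'docker-py' are reported as installed."""
    return all(versions.get(name) is not None for name in ("docker", "docker-py"))


found = installed_versions(["docker", "docker-py"])
print(found)
if conflicting(found):
    print("both docker and docker-py are installed; the docs advise against this")
```

This only inspects the Python interpreter it runs under, so it would need to be run with the same interpreter the Ansible module uses on the target host.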

Alan Pevec (apevec) wrote :
Alan Pevec (apevec) wrote :

Where exactly is "Failed to import docker-py - cannot import name __version__. Try `pip install docker-py`" coming from?

Alfredo Moralejo (amoralej) wrote :

I've reproduced the issue locally with a minimal docker_login playbook. The actual problem is running the playbook with ansible 2.5.15, which is the version used by zuul, while having 2.8 installed locally.

We'll revert 2.8 in RDO and keep investigating this issue.

Marios Andreou (marios-b) wrote :

i note the similarity in the error message from https://bugs.launchpad.net/tripleo/+bug/1824301 "Failed to import docker or docker-py - No module named docker. Try `pip install docker` or `pip install docker-py` (Python 2.6)"}"

in that case though it was a matter of checking /usr/bin/rpm -q python2-docker and skipping when it wasn't there https://review.opendev.org/#/c/651789/6/roles/tripleo-docker-rm/tasks/main.yaml
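The check described above amounts to "query rpm for the package and skip when it is absent". The actual fix in the linked review is an Ansible task; the following is only a rough Python illustration of the same idea, relying on `rpm -q` exiting non-zero when the package is not installed. The function name `rpm_installed` and its `run` parameter are hypothetical:

```python
import subprocess


def rpm_installed(package, rpm="/usr/bin/rpm", run=subprocess.run):
    """Return True if `rpm -q <package>` succeeds, False otherwise."""
    try:
        result = run(
            [rpm, "-q", package],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # rpm binary missing or not executable: treat the package as absent.
        return False
    return result.returncode == 0


if rpm_installed("python2-docker"):
    print("python2-docker present: docker tasks can run")
else:
    print("python2-docker missing: skipping docker tasks")
```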

Marios Andreou (marios-b) wrote :

Revert "Bump Ansible to 2.8 for Train" https://review.rdoproject.org/r/#/c/20860/

Alfredo Moralejo (amoralej) wrote :

ansible-2.8.0 has been removed from deps repo in train.

Alfredo Moralejo (amoralej) wrote :

Some technical details about the issue:

The problem is related to PYTHONPATH and a mix of versions. When ansible<2.8 runs the playbook (as in zuul-executor) but ansible 2.8 is installed on the test server, it tries to import docker from /usr/lib/python2.7/site-packages/ansible/module_utils/docker/ instead of /usr/lib/python2.7/site-packages/docker.

ansible has refactored the docker module to be a subpackage in https://github.com/ansible/ansible/commit/0c2bb3da043e38e779f91b6037d0999c1e60b862#diff-5f3137ef16fa6f5a89b619f82d4e6c57

This broke it because, apparently, the load order is wrong in previous versions of ansible. Having 2.8 on both systems works fine, so it seems the loading order from paths has been fixed.
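To illustrate the kind of shadowing described here, the sketch below builds two throwaway packages that are both named `docker` and imports `__version__` with each search order. It is a simplified stand-in, not the actual Ansible loading mechanism: the temporary directories only mimic `ansible/module_utils/` and `site-packages/`, the version string is made up, and `make_package` and `import_version` are hypothetical helpers.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, name, body):
    """Create package `name` under `root` with `body` as its __init__.py."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as handle:
        handle.write(body)


def import_version(search_path):
    """Try `from docker import __version__` with `search_path` searched first."""
    saved_path = list(sys.path)
    saved_module = sys.modules.pop("docker", None)
    sys.path[:0] = search_path
    importlib.invalidate_caches()
    try:
        from docker import __version__
        return __version__
    except ImportError as exc:
        return "import failed: {}".format(exc)
    finally:
        # Restore the interpreter state so the demo has no side effects.
        sys.path[:] = saved_path
        sys.modules.pop("docker", None)
        if saved_module is not None:
            sys.modules["docker"] = saved_module


workdir = tempfile.mkdtemp()
shadow_root = os.path.join(workdir, "module_utils")  # mimics ansible/module_utils/
real_root = os.path.join(workdir, "site-packages")   # mimics site-packages/
make_package(shadow_root, "docker", "")               # subpackage without __version__
make_package(real_root, "docker", "__version__ = '0.0.0-demo'\n")

wrong_order = import_version([shadow_root, real_root])
right_order = import_version([real_root, shadow_root])
print(wrong_order)   # the shadowing package is found first and lacks __version__
print(right_order)   # the stand-in for the real SDK is found first
```

With the shadowing directory first, the import fails with a "cannot import name" error much like the one in the job log; with the order reversed, the version string is returned.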

The options to get this fixed with ansible-2.8 are:

1. Update ansible to 2.8 in zuul executors. This depends on SF.
2. Remove ansible from the test server (it's installed before as part of the build-containers role which runs a nested playbook, see "TASK [build-containers : Run ansible playbook to configure docker]" in [1]).
3. Run docker_login as a nested playbook too.

I'd say option 2 is better, but I'm not sure whether removing the system ansible could break a later task.

[1] http://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-7-master-containers-build-push/85eee73/job-output.txt.gz

Marios Andreou (marios-b) wrote :

@alfredo I think option 3 sounds like the least disruptive assuming there are no technical blockers to doing that

thanks ykarel and amoralej - we have 3 green runs today on the periodic https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-7-master-containers-build-push

I am going to remove 'promotion blocker' from this. I am not sure if it is more or less confusing to close this and switch to a new bug with comment #13 to track bumping ansible back to 2.8

tags: removed: promotion-blocker
Marios Andreou (marios-b) wrote :

trying to bump ansible today with https://review.rdoproject.org/r/#/c/20989/ (& details on the fix in the commit message from ykarel there)

Changed in tripleo:
milestone: train-1 → train-2
Changed in tripleo:
assignee: nobody → Sorin Sbarnea (ssbarnea)
Sorin Sbarnea (ssbarnea) wrote :

Closing, as http://cistatus.tripleo.org/promotion/ reports all those jobs as passing now and this can no longer be reproduced.

Changed in tripleo:
status: In Progress → Fix Released