tl;dr _something_ is removing the zuul user from the docker groups and that is the root cause of the bug.
Looking at the undercloud journal just before the failing overcloud prep containers step, you can see the following (this is for the change originally referenced in the bug description @ http://logs.openstack.org/87/572187/2/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades/ca1ee14/logs/undercloud):
Jun 05 12:19:36 centos-7-rax-dfw-0004294805 sudo[17190]: zuul : TTY=pts/1 ; PWD=/home/zuul ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-banjdiooyiqxxhrkxjetvzfpsuqhlbcj; /usr/bin/python /home/zuul/.ansible/tmp/ansible-tmp-1528201174.53-40600594451444/user.py; rm -rf "/home/zuul/.ansible/tmp/ansible-tmp-1528201174.53-40600594451444/" > /dev/null 2>&1
Jun 05 12:19:36 centos-7-rax-dfw-0004294805 python[17193]: ansible-user Invoked with comment=None ssh_key_bits=0 update_password=always non_unique=False force=False ssh_key_type=rsa ssh_key_passphrase=NOT_LOGGING_PARAMETER createhome=True uid=None home=None append=False skeleton=None ssh_key_comment=ansible-generated on centos-7-rax-dfw-0004294805 group=None system=False state=present local=None shell=None expires=None ssh_key_file=None groups=['mock'] move_home=False password=NOT_LOGGING_PARAMETER name=zuul seuser=None remove=False login_class=None generate_ssh_key=None
Jun 05 12:19:37 centos-7-rax-dfw-0004294805 usermod[17201]: delete 'zuul' from group 'docker'
Jun 05 12:19:37 centos-7-rax-dfw-0004294805 usermod[17201]: delete 'zuul' from shadow group 'docker'
(The prep containers step, which fails, starts just after this @ 12:20: http://logs.openstack.org/87/572187/2/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades/ca1ee14/logs/undercloud/home/zuul/upgrade_overcloud_prep_containers.log.txt.gz#_2018-06-05_12_21_08 ).
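One plausible reading of the journal excerpt above: the logged ansible-user call has groups=['mock'] and append=False, and with append=False the Ansible user module sets the user's supplementary groups to exactly the listed ones, so any other membership is dropped. The sketch below only models that set logic; resulting_groups is a hypothetical helper, not Ansible code, and the "before" membership is an assumption based on the usermod lines. Which task issues this call is not established here.

```python
def resulting_groups(current, requested, append):
    """Model supplementary-group membership after a user-module style update.

    Hypothetical helper for illustration only, not Ansible's implementation.
    append=True  -> requested groups are added to the current ones
    append=False -> membership becomes exactly the requested groups
    """
    current, requested = set(current), set(requested)
    return current | requested if append else requested


# Assumed starting state: zuul is at least in 'docker' (inferred from the
# "delete 'zuul' from group 'docker'" journal line, not logged directly).
before = {"docker"}

after = resulting_groups(before, ["mock"], append=False)
removed = before - after  # groups the user would be dropped from

print("after:", sorted(after))
print("removed:", sorted(removed))
```

Under this model, the same call with append=True would leave docker in place, which would be one way to check the theory.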
You can see the same thing on sbaker's test review (undercloud journal @ http://logs.openstack.org/63/572663/1/experimental/tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades/01a5631/logs/undercloud/var/log/journal.txt.gz):
Jun 06 09:46:59 centos-7-rax-iad-0004313964 sudo[18175]: zuul : TTY=pts/1 ; PWD=/home/zuul ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-aqycodoijcsiqhjqpomuilxhhwadxqxi; /usr/bin/python /home/zuul/.ansible/tmp/ansible-tmp-1528278417.64-135284418991881/user.py; rm -rf "/home/zuul/.ansible/tmp/ansible-tmp-1528278417.64-135284418991881/" > /dev/null 2>&1
Jun 06 09:46:59 centos-7-rax-iad-0004313964 python[18178]: ansible-user Invoked with comment=None ssh_key_bits=0 update_password=always non_unique=False force=False ssh_key_type=rsa ssh_key_passphrase=NOT_LOGGING_PARAMETER createhome=True uid=None home=None append=False skeleton=None ssh_key_comment=ansible-generated on centos-7-rax-iad-0004313964 group=None system=False state=present local=None shell=None expires=None ssh_key_file=None groups=['mock'] move_home=False password=NOT_LOGGING_PARAMETER name=zuul seuser=None remove=False login_class=None generate_ssh_key=None
Jun 06 09:46:59 centos-7-rax-iad-0004313964 usermod[18186]: delete 'zuul' from group 'docker'
Jun 06 09:46:59 centos-7-rax-iad-0004313964 usermod[18186]: delete 'zuul' from shadow group 'docker'
and then shortly after, @ 09:48, the upgrade container prep fails: http://logs.openstack.org/63/572663/1/experimental/tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades/01a5631/logs/undercloud/home/zuul/upgrade_overcloud_prep_containers.log.txt.gz
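To check other job runs for the same pattern, something like the following could pull the group removals out of a downloaded journal.txt. This is a minimal sketch: find_group_removals is a hypothetical helper, and the regular expression assumes the syslog-style line layout seen in the excerpts above (one journal entry per line). It deliberately skips the "shadow group" lines so that each removal is reported once.

```python
import re

# Matches lines such as:
#   Jun 05 12:19:37 <host> usermod[17201]: delete 'zuul' from group 'docker'
# Layout is assumed from the journal excerpts quoted in this comment.
USERMOD_DELETE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) \S+ usermod\[\d+\]: "
    r"delete '(?P<user>[^']+)' from group '(?P<group>[^']+)'",
    re.MULTILINE,
)


def find_group_removals(journal_text):
    """Return (timestamp, user, group) for each usermod group removal."""
    return [
        (m.group("ts"), m.group("user"), m.group("group"))
        for m in USERMOD_DELETE.finditer(journal_text)
    ]


sample = (
    "Jun 05 12:19:37 centos-7-rax-dfw-0004294805 usermod[17201]: "
    "delete 'zuul' from group 'docker'\n"
    "Jun 05 12:19:37 centos-7-rax-dfw-0004294805 usermod[17201]: "
    "delete 'zuul' from shadow group 'docker'\n"
)

for ts, user, group in find_group_removals(sample):
    print(ts, user, group)
```

The timestamps can then be compared against the start of upgrade_overcloud_prep_containers.log to see whether the removal precedes the failure, as it does in both runs above.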