overcloud stack update attempts to redeploy servers which have already been deployed
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
tripleo | Fix Released | Critical | Unassigned | |
Bug Description
Using Rocky RC1, I deployed 3 controller, 1 compute and 3 ceph nodes without errors. I then simply re-ran the same 'openstack overcloud deploy ...' command to run a stack update, as you would to reassert a configuration change, and the stack update failed.
Though the failure surfaced as placement errors like "No valid host was found", this seems to be only a side effect of a larger problem outside of Nova, because Nova shouldn't have been asked to create new resources. That is, an `openstack server list` after the stack update showed 6 controller, 2 compute and 6 ceph nodes, where the new ones were all in status ERROR while the existing ones were in status ACTIVE. The deployment tries to create new nodes rather than reuse the existing ones.
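To make the doubling easier to spot, the server list can be grouped by role and status. The following is only a rough sketch: the helper name `summarize_servers` is hypothetical, and it assumes node names end in a numeric index (for example `overcloud-controller-0`), which may not match every deployment.

```shell
# Hypothetical helper: read "<name> <status>" pairs on stdin and print
# "<role> <status> <count>", one line per combination, sorted.
# Assumption: node names end in "-<index>", e.g. overcloud-controller-0.
summarize_servers() {
  awk '{
    role = $1
    sub(/-[0-9]+$/, "", role)   # strip the trailing node index
    count[role " " $2]++
  }
  END { for (k in count) print k, count[k] }' | sort
}
```

It could be fed from the CLI with `openstack server list -f value -c Name -c Status | summarize_servers`, run once before and once after re-running the deploy command. On an unaffected deployment the two summaries should be identical; with this bug the second run would show additional ERROR entries per role.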
An idempotence test that simply reasserts the configuration by re-running 'openstack overcloud deploy' would catch this, but the bug cannot be reproduced when that test uses already-deployed servers [1], which might explain why upstream CI didn't reproduce it.
[1] https:/
Changed in tripleo:
status: New → Triaged
So I noticed that we do not install openstack-tripleo-common in the heat containers and so we miss the undercloud heat plugins:
(undercloud) [root@undercloud-0 nova]# for i in $(docker ps |grep heat|awk '{print $1}'); do docker exec -it $i sh -c 'ls -l /usr/lib/heat/'; done
ls: cannot access /usr/lib/heat/: No such file or directory
ls: cannot access /usr/lib/heat/: No such file or directory
ls: cannot access /usr/lib/heat/: No such file or directory
ls: cannot access /usr/lib/heat/: No such file or directory
I checked on queens on a non-containerized undercloud and we still ship them:
[root@undercloud-0 undercloud_heat_plugins]# ls -l /usr/lib/heat/undercloud_heat_plugins/
total 44
-rw-r--r--. 1 root root 1155 Jul 5 13:29 config.py
-rw-r--r--. 2 root root 1079 Aug 13 21:25 config.pyc
-rw-r--r--. 2 root root 1079 Aug 13 21:25 config.pyo
-rw-r--r--. 1 root root 1745 Jul 5 13:29 immutable_resources.py
-rw-r--r--. 2 root root 2477 Aug 13 21:25 immutable_resources.pyc
-rw-r--r--. 2 root root 2477 Aug 13 21:25 immutable_resources.pyo
-rw-r--r--. 1 root root 0 Jul 5 13:29 __init__.py
-rw-r--r--. 2 root root 136 Aug 13 21:25 __init__.pyc
-rw-r--r--. 2 root root 136 Aug 13 21:25 __init__.pyo
-rw-r--r--. 1 root root 1390 Jul 5 13:29 server_update_allowed.py
-rw-r--r--. 2 root root 1418 Aug 13 21:25 server_update_allowed.pyc
-rw-r--r--. 2 root root 1418 Aug 13 21:25 server_update_allowed.pyo
Not 100% sure this is the culprit though
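For anyone repeating the container check, a variant that labels each result with the container name may be easier to read than the bare `ls` loop. This is a sketch only: the function name `check_heat_plugin_dirs` is made up for illustration, and it assumes the heat containers can be found by filtering `docker ps` on the name "heat".

```shell
# Hypothetical helper: for every running container whose name contains
# "heat", report whether the given directory exists inside it.
# Returns 1 if the directory is missing from at least one container.
check_heat_plugin_dirs() {
  local dir="${1:-/usr/lib/heat}"
  local missing=0
  local name
  for name in $(docker ps --filter name=heat --format '{{.Names}}'); do
    if docker exec "$name" test -d "$dir"; then
      echo "$name: $dir present"
    else
      echo "$name: $dir MISSING"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_heat_plugin_dirs /usr/lib/heat/undercloud_heat_plugins` would check the plugin directory itself rather than its parent.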