upgrade-non-controller.sh getting stuck cleaning up openstack-nova-compute package
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tripleo | Fix Released | Critical | Michele Baldessari | | |
Bug Description
Attempting to do a major upgrade (liberty->mitaka). During `upgrade-
+ echo 'get compute uuid and update compute'
get compute uuid and update compute
++ nova list
++ grep compute
++ awk '{ print $2; }'
+ for compute in '$(nova list | grep compute | awk '\''{ print $2; }'\'')'
+ echo 'Run upgrade on 629e3a10-
Run upgrade on 629e3a10-
+ /bin/upgrade-
Tue Oct 4 19:23:50 UTC 2016 upgrade-
Tue Oct 4 19:23:53 UTC 2016 upgrade-
Warning: Permanently added '192.0.2.7' (ECDSA) to the list of known hosts.
Tue Oct 4 19:23:54 UTC 2016 upgrade-
Tue Oct 4 19:23:57 UTC 2016 upgrade-
Loaded plugins: fastestmirror, priorities
Determining fastest mirrors
* base: mirror.cogentco.com
* extras: mirror.
* updates: mirror.vcu.edu
771 packages excluded due to repository priority protections
Package python-zaqarclient is obsoleted by python2-
Resolving Dependencies
<snip>
Cleanup : 1:python-
Cleanup : 1:openstack-
Cleanup : 1:openstack-
Deployed via TripleO-Quickstart, mimicking a ci.centos periodic job[1] that is getting killed while stuck in the same part of the upgrade. Logs were not collected because Jenkins killed that job before collection, so it is not known whether it is also getting stuck while cleaning up the same package.
Changed in tripleo: | |
status: | New → Confirmed |
importance: | Undecided → High |
Changed in tripleo: | |
importance: | High → Critical |
milestone: | none → newton-rc3 |
assignee: | nobody → Michele Baldessari (michele) |
tags: | added: newton-backport-potential |
Have you checked that rabbit/galera on the controllers are correctly up and running when this script runs on the compute nodes?
openstack-nova-compute gets upgraded, and somewhere in the rpm %post scriptlet it runs "systemctl try-restart openstack-nova-compute". Since the service was already started, it gets restarted, but the restart takes an unusually long time because rabbit on the controller is not reachable.
I have seen this phenomenon happen quite a few times, so I thought I'd throw it in here.
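To rule this out before running the upgrade on a compute node, a quick reachability check from that node can help. The following is only a sketch and not part of upgrade-non-controller.sh: the function names `port_reachable` and `check_controller_services` are made up for illustration, and it assumes the default RabbitMQ (5672) and MySQL/Galera (3306) ports, which may differ in a given deployment. It only tests that a TCP connection can be opened, not that the services are healthy.

```shell
#!/bin/bash
# Hypothetical pre-flight check (illustrative only; not part of the upgrade scripts).

# Return 0 if a TCP connection to host:port can be opened within <secs> seconds.
# Uses bash's /dev/tcp redirection and coreutils `timeout`.
port_reachable() {
    local host=$1 port=$2 secs=${3:-5}
    timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Check the assumed default RabbitMQ (5672) and MySQL/Galera (3306) ports
# on one controller. Returns non-zero if any port is unreachable.
check_controller_services() {
    local controller=$1 port rc=0
    for port in 5672 3306; do
        if port_reachable "$controller" "$port"; then
            echo "port $port on $controller: reachable"
        else
            echo "port $port on $controller: NOT reachable"
            rc=1
        fi
    done
    return $rc
}
```

Run `check_controller_services <controller-ip>` on the compute node, substituting the address of a controller. If it reports the RabbitMQ port as not reachable, the slow try-restart described above is a likely explanation for the apparent hang.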