nova-compute fails to resume after upgrade from Queens to Rocky
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Nova Compute Charm | Fix Released | Undecided | Corey Bryant |
Bug Description
I am trying to upgrade a deployment from Queens to Rocky.
The process I'm following is:
juju run-action nova-compute-kvm/12 --wait pause
juju run-action nova-compute-kvm/12 --wait openstack-upgrade
juju run-action nova-compute-kvm/12 --wait resume
The upgrade succeeds but the unit fails to resume, saying the nova-compute service isn't running.
The nova-compute service failed to restart with the same privsep helper error in nova-compute.log as mentioned in https:/
Because the environment is offline and I don't have keyboard access, I've typed this from the screen. This is the bottom of the trace.
2019-02-14 15:56:01.849 43535 ERROR os_vif [req-179a3b10-
I was able to recover from this by purging the python2 rootwrap and privsep packages before resuming the unit. The scripted upgrade looks something like this:
juju run-action nova-compute-kvm/12 --wait pause
juju run-action nova-compute-kvm/12 --wait openstack-upgrade
juju ssh nova-compute-kvm/12 'sudo apt purge -y python-
sudo apt autoremove -y'
juju run-action nova-compute-kvm/12 --wait resume
# ceilometer-agent and neutron-openvswitch services break after the above apt commands
# the units here are the units subordinate to the particular nova-compute unit
juju run --unit ceilometer-
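The workaround sequence above could be wrapped in a small helper so it can be dry-run before touching a unit. This is only a sketch under assumptions: `upgrade_unit` and the `JUJU` override variable are hypothetical names, the packages to purge are passed as arguments rather than hard-coded, and the two apt commands are chained with `&&` where the original runs them one after the other. It also assumes a non-zero exit status signals a failed step, which may not hold for a failed action, so the `status:` field in the action output should still be checked.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual steps from the report.
# JUJU can be overridden (e.g. JUJU=echo) to print the commands instead of running them.
JUJU="${JUJU:-juju}"

upgrade_unit() {
    unit="$1"
    shift
    # Pause, then upgrade; stop at the first command that exits non-zero.
    "$JUJU" run-action "$unit" --wait pause || return 1
    "$JUJU" run-action "$unit" --wait openstack-upgrade || return 1
    # Purge the packages given as remaining arguments, if any, before resuming.
    if [ "$#" -gt 0 ]; then
        "$JUJU" ssh "$unit" "sudo apt purge -y $* && sudo apt autoremove -y" || return 1
    fi
    "$JUJU" run-action "$unit" --wait resume
}
```

With `JUJU=echo`, calling `upgrade_unit nova-compute-kvm/12 <packages>` prints the four commands in order without contacting the controller.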
I thought this might be resolved by the fix for https:/
The revisions of those charms we're running are:
e464bba nova-compute-292
ee81e0e neutron-
6650284 ceilometer-
Changed in charm-nova-compute:
assignee: nobody → Corey Bryant (corey.bryant)
status: New → In Progress
I was able to reproduce this issue on my orangebox.
Attaching juju crashdump and the bundle to this bug.
Steps:
Deploy queens bundle with enable-dvr: true.
Create an instance.
Try the upgrade (in this case on the unit hosting the instance):
alex@xx:$ juju run-action nova-compute/2 --wait pause
unit-nova-compute-2:
id: d30a0370-94f4-41ae-84e7-7abf3b37bbc4
status: completed
timing:
completed: 2019-02-19 11:57:50 +0000 UTC
alex@xx:$ juju run-action nova-compute/2 --wait openstack-upgrade
unit-nova-compute-2:
id: b93a1681-8408-4f5b-8a38-59cacdd10357
results:
outcome: success, upgrade completed.
status: completed
timing:
completed: 2019-02-19 12:01:03 +0000 UTC
alex@xx:$ juju run-action nova-compute/2 --wait resume
unit-nova-compute-2:
id: 9a339a2c-3802-4540-83a3-174bef2cbc2f
message: 'Action resume failed: Couldn''t resume: Services not running that should
be: nova-compute'
status: failed
timing:
completed: 2019-02-19 12:01:33 +0000 UTC
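When scripting these actions, the `status:` field shown in the outputs above is what tells a completed action from a failed one. A minimal sketch for pulling it out of saved action output follows; `action_status` is a hypothetical helper name, and it assumes the first line whose first field is `status:` is the action status (a nested `status:` key under `results:` would confuse it).

```shell
#!/bin/sh
# Hypothetical helper: read `juju run-action --wait` output on stdin and
# print the value of the first "status:" line (e.g. "completed" or "failed").
action_status() {
    awk '$1 == "status:" { print $2; exit }'
}
```

For example, `juju run-action nova-compute/2 --wait resume | action_status` would print `failed` for the resume output shown above.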
cat nova-compute.log
2019-02-19 12:01:29.240 338412 INFO oslo.privsep.daemon [req-17603cb1-f634-41c2-9699-1b2ad109a143 - - - - -] Running privsep helper: ['sudo', 'nova-rootwrap', '/etc/nova/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/nova/nova.conf', '--config-file', '/etc/nova/nova-compute.conf', '--privsep_context', 'vif_plug_ovs.privsep.vif_plug', '--privsep_sock_path', '/tmp/tmpi55osjtc/privsep.sock']
2019-02-19 12:01:29.788 338412 WARNING oslo.privsep.daemon [-] privsep log: Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
2019-02-19 12:01:29.880 338412 CRITICAL oslo.privsep.daemon [req-17603cb1-f634-41c2-9699-1b2ad109a143 - - - - -] privsep helper command exited non-zero (1)
2019-02-19 12:01:29.880 338412 ERROR os_vif [req-17603cb1-f634-41c2-9699-1b2ad109a143 - - - - -] Failed to plug vif VIFBridge(active=True,address=fa:16:3e:3e:bd:25,bridge_name='qbr2b2c956a-cb',has_traffic_filtering=True,id=2b2c956a-cb49-4131-b2c8-fb124d775c73,network=Network(874247e5-fa3a-4421-8986-d6f9bb8f13b4),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap2b2c956a-cb'): oslo_privsep.daemon.FailedToDropPrivileges: privsep helper command exited non-zero (1)
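To check whether another unit hits the same failure, the log can be filtered for the privsep message seen above. A small sketch, assuming the log path is passed in by the caller; `privsep_failures` is a hypothetical helper name and the match string is copied from the excerpt.

```shell
#!/bin/sh
# Hypothetical helper: print CRITICAL/ERROR lines from the given log file that
# report the privsep helper exiting non-zero.
privsep_failures() {
    grep -E ' (CRITICAL|ERROR) ' "$1" | grep -F 'privsep helper command exited non-zero'
}
```

Run on the unit as, for instance, `privsep_failures nova-compute.log`; no output means no matching lines were found.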