nova-compute fails to start if is_power_on raises exception

Bug #1168610 reported by aeva black
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Won't Fix
Importance: Low
Assigned to: Unassigned

Bug Description

During nova-compute startup, if the baremetal virtual power driver's is_power_on() raises an exception, the exception propagates up through init_host() and prevents nova-compute from starting. This can happen when a compute instance exists in the database but the SSH credentials used by the virtual power driver are wrong.

I think a solution is for the virtual power driver to catch exception.PowerVMConnectionFailed and log it as an error instead of letting it propagate.
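
The idea above can be sketched roughly as follows. This is only an illustration, not the patch that was proposed for nova: the class VirtualPowerManagerSketch, its constructor argument, and the local PowerVMConnectionFailed class are hypothetical stand-ins so the example runs on its own, and returning None for "state unknown" is an assumption about how the caller might want to handle the failure.

```python
import logging

LOG = logging.getLogger(__name__)


class PowerVMConnectionFailed(Exception):
    """Local stand-in for nova's exception.PowerVMConnectionFailed."""


class VirtualPowerManagerSketch:
    """Hypothetical, simplified stand-in for the virtual power driver."""

    def __init__(self, check_for_node):
        # check_for_node: callable returning True/False, or raising
        # PowerVMConnectionFailed when the SSH connection cannot be made.
        self._check_for_node = check_for_node

    def is_power_on(self):
        """Return True/False, or None if the power state is unknown."""
        try:
            return bool(self._check_for_node())
        except PowerVMConnectionFailed as exc:
            # Log the failure as an error instead of letting it propagate
            # and abort service startup.
            LOG.error("Could not query power state: %s", exc)
            return None
```

Note that a caller written as `if pm.is_power_on():`, as in the trace below, would treat None the same as False, so a real fix would also need to decide how get_info() should report an unknown state.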

Here is a trace of this error:

2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/service.py", line 147, in run_server
2013-04-13 06:52:02,213.213 4164 TRACE nova server.start()
2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/service.py", line 429, in start
2013-04-13 06:52:02,213.213 4164 TRACE nova self.manager.init_host()
2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py", line 630, in init_host
2013-04-13 06:52:02,213.213 4164 TRACE nova self._init_instance(context, instance)
2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py", line 539, in _init_instance
2013-04-13 06:52:02,213.213 4164 TRACE nova drv_state = self._get_power_state(context, instance)
2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/compute/manager.py", line 649, in _get_power_state
2013-04-13 06:52:02,213.213 4164 TRACE nova return self.driver.get_info(instance)["state"]
2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/driver.py", line 363, in get_info
2013-04-13 06:52:02,213.213 4164 TRACE nova if pm.is_power_on():
2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/virtual_power_driver.py", line 198, in is_power_on
2013-04-13 06:52:02,213.213 4164 TRACE nova if not self._check_for_node():
2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/virtual_power_driver.py", line 147, in _check_for_node
2013-04-13 06:52:02,213.213 4164 TRACE nova full_node_list = self._get_full_node_list()
2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/virtual_power_driver.py", line 141, in _get_full_node_list
2013-04-13 06:52:02,213.213 4164 TRACE nova full_list = self._run_command(cmd)
2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/virtual_power_driver.py", line 225, in _run_command
2013-04-13 06:52:02,213.213 4164 TRACE nova self._set_connection()
2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/baremetal/virtual_power_driver.py", line 136, in _set_connection
2013-04-13 06:52:02,213.213 4164 TRACE nova self._connection = connection.ssh_connect(self.connection_data)
2013-04-13 06:52:02,213.213 4164 TRACE nova File "/opt/stack/venvs/nova/local/lib/python2.7/site-packages/nova/virt/powervm/common.py", line 62, in ssh_connect
2013-04-13 06:52:02,213.213 4164 TRACE nova raise exception.PowerVMConnectionFailed()
2013-04-13 06:52:02,213.213 4164 TRACE nova PowerVMConnectionFailed: Connection to PowerVM manager failed
2013-04-13 06:52:02,213.213 4164 TRACE nova
2013-04-13 06:52:02,648.648 4217 INFO nova.manager [-] Skipping periodic task _periodic_update_dns because its interval is negative
2013-04-13 06:52:02,705.705 4217 INFO nova.virt.driver [-] Loading compute driver 'baremetal.driver.BareMetalDriver'
2013-04-13 06:52:02,754.754 INFO nova.openstack.common.rpc.common [req-235af7f1-4c57-42b8-8cc7-cbdd1327f12a None None] Connected to AMQP server on 127.0.0.1:5672
2013-04-13 06:52:02,839.839 4217 AUDIT nova.service [-] Starting compute node (version 2013.2)
2013-04-13 06:52:03,176.176 ERROR nova.compute.manager [req-e06f7f34-a2df-44f2-b3f2-ac299d4087d6 None None] Instance bmtest found in the hypervisor, but not in the database

Tags: baremetal
aeva black (tenbrae)
Changed in nova:
status: New → Triaged
importance: Undecided → Low
tags: added: baremetal
Robert Collins (lifeless) wrote :

As with the root disk size one, this isn't really a tripleo issue: we can deliver a great tripleo experience without it.

no longer affects: tripleo
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/27973

Changed in nova:
assignee: nobody → Tim Miller (tim-miller-0)
status: Triaged → In Progress
Joe Gordon (jogo) wrote :

Since we are in the process of deprecating and removing nova baremetal, and the patch is abandoned, I am closing this bug.

Changed in nova:
status: In Progress → Won't Fix
assignee: Tim Miller (tim-miller-0) → nobody