VMware: unable to spin up instance as network not created on host

Bug #1532750 reported by Gary Kotton on 2016-01-11
Affects: OpenStack Compute (nova)

Bug Description

When using Neutron, there are edge cases in which a network created by Neutron has not yet been created on the actual host. VM creation then fails because the network does not yet exist on that host:

2016-01-08 20:56:29.486 DEBUG oslo_vmware.exceptions [-] Fault InvalidDeviceSpec not matched. from (pid=28979) get_fault_class /usr/local/lib/python2.7/dist-packages/oslo_vmware/exceptions.py:295
2016-01-08 20:56:29.486 ERROR oslo_vmware.common.loopingcall [-] in fixed duration looping call
2016-01-08 20:56:29.486 TRACE oslo_vmware.common.loopingcall Traceback (most recent call last):
2016-01-08 20:56:29.486 TRACE oslo_vmware.common.loopingcall   File "/usr/local/lib/python2.7/dist-packages/oslo_vmware/common/loopingcall.py", line 76, in _inner
2016-01-08 20:56:29.486 TRACE oslo_vmware.common.loopingcall     self.f(*self.args, **self.kw)
2016-01-08 20:56:29.486 TRACE oslo_vmware.common.loopingcall   File "/usr/local/lib/python2.7/dist-packages/oslo_vmware/api.py", line 428, in _poll_task
2016-01-08 20:56:29.486 TRACE oslo_vmware.common.loopingcall     raise task_ex
2016-01-08 20:56:29.486 TRACE oslo_vmware.common.loopingcall VimFaultException: Invalid configuration for device '0'.
2016-01-08 20:56:29.486 TRACE oslo_vmware.common.loopingcall Faults: ['InvalidDeviceSpec']
2016-01-08 20:56:29.486 TRACE oslo_vmware.common.loopingcall

Adding a retry addresses this by giving the actual host time to create the network.
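
A minimal sketch of the retry idea, not the proposed patch. DeviceSpecFault, retry_on_fault and create_vm are hypothetical names used only for illustration; DeviceSpecFault stands in for the VimFaultException shown in the trace, and the attempt count and delay are arbitrary assumptions.

```python
import time


class DeviceSpecFault(Exception):
    """Hypothetical stand-in for the VimFaultException in the trace."""

    def __init__(self, fault_list):
        super().__init__("Invalid configuration for device '0'.")
        self.fault_list = fault_list


def retry_on_fault(func, fault_name="InvalidDeviceSpec", attempts=3, delay=0.0):
    """Call func(), retrying while it raises a fault that lists fault_name."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except DeviceSpecFault as exc:
            # Re-raise unrelated faults, and give up after the last attempt.
            if fault_name not in exc.fault_list or attempt == attempts:
                raise
            time.sleep(delay)  # give the host time to create the network


# Simulated VM creation that fails twice before the network appears.
calls = {"count": 0}


def create_vm():
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeviceSpecFault(["InvalidDeviceSpec"])
    return "vm-created"


result = retry_on_fault(create_vm)
```

In this sketch only the named fault is retried, so unrelated failures still surface immediately.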

Gary Kotton (garyk) on 2016-01-11
Changed in nova:
importance: Undecided → High
tags: added: liberty-backport-potential vmware

Fix proposed to branch: master
Review: https://review.openstack.org/265764

Changed in nova:
assignee: nobody → Gary Kotton (garyk)
status: New → In Progress

Change abandoned by garyk on branch: master
Review: https://review.openstack.org/265764

Changed in nova:
assignee: Gary Kotton (garyk) → nobody
status: In Progress → Confirmed

Giridhar Jayavelu (gjayavelu) wrote:

Instead of retrying the entire VM creation method, it would be better to retry get_network_with_the_name(), which is specific to the issue described here.
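
A rough sketch of what retrying only the lookup could look like. wait_for_network and fake_lookup are hypothetical helpers; the lookup argument stands in for a function such as get_network_with_the_name(), whose real signature is not shown in this report, and the assumption here is that it returns None while the network is not yet visible.

```python
import time


def wait_for_network(lookup, name, attempts=5, delay=0.0):
    """Poll lookup(name) until it returns something other than None."""
    for _ in range(attempts):
        network = lookup(name)
        if network is not None:
            return network
        time.sleep(delay)  # wait before asking again
    return None  # caller decides how to report a network that never appears


# Simulated lookup: the network becomes visible on the third call.
seen = []


def fake_lookup(name):
    seen.append(name)
    return {"name": name} if len(seen) >= 3 else None


network = wait_for_network(fake_lookup, "br-int")
```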

Changed in nova:
assignee: nobody → Giridhar Jayavelu (gjayavelu)

Sarafraj Singh (sarafraj-singh) wrote:

Are you working on the fix? If you are, please change the status to In Progress; otherwise, remove yourself as assignee so someone else can pick it up.

Giridhar Jayavelu (gjayavelu) wrote:

I didn't get a chance to test and complete the patch. I'll revisit in a few weeks if no one has picked up this bug. Thanks!

Changed in nova:
assignee: Giridhar Jayavelu (gjayavelu) → nobody

Hong Hui Xiao (xiaohhui) wrote:

But according to the code, if get_network_with_the_name returns None, we should get the exception "NetworkNotFoundForBridge" instead of the exception reported in the bug description. I think the bug happens when the network has been created in the compute cluster but has not yet been provisioned to the cluster hosts. I would propose another solution...
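
To illustrate the distinction drawn here, a toy model of the two failure modes follows. It is not nova's code: spawn, failure_mode and both exception classes are hypothetical stand-ins, and the two sets simply model "known to the cluster" versus "provisioned on the host".

```python
class NetworkNotFoundForBridge(Exception):
    """Hypothetical stand-in for the exception named in the comment."""


class InvalidDeviceSpec(Exception):
    """Hypothetical stand-in for the fault in the bug description."""


def spawn(network_name, cluster_networks, host_networks):
    # Stage 1: cluster-level lookup. A miss here means the network is unknown.
    if network_name not in cluster_networks:
        raise NetworkNotFoundForBridge(network_name)
    # Stage 2: device attach on the host. The lookup already succeeded,
    # so a retry around stage 1 alone would never see this failure.
    if network_name not in host_networks:
        raise InvalidDeviceSpec(network_name)
    return "spawned"


def failure_mode(network_name, cluster_networks, host_networks):
    """Return 'spawned' or the name of the exception that spawn raised."""
    try:
        return spawn(network_name, cluster_networks, host_networks)
    except (NetworkNotFoundForBridge, InvalidDeviceSpec) as exc:
        return type(exc).__name__
```

Under this model, the fault in the bug description corresponds to the second stage, which is why a retry limited to the lookup would not help.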

Changed in nova:
assignee: nobody → Hong Hui Xiao (xiaohhui)
status: Confirmed → In Progress

Sean Dague (sdague) wrote:

There are no currently open reviews on this bug, changing
the status back to the previous state and unassigning. If
there are active reviews related to this bug, please include
links in comments.

Changed in nova:
status: In Progress → Confirmed
assignee: Hong Hui Xiao (xiaohhui) → nobody