nova compute on Hyper-V doesn't wait for vif plugged event from neutron

Bug #1473291 reported by Sonu
This bug affects 2 people
Affects                    Status          Importance   Assigned to    Milestone
OpenStack Compute (nova)   Confirmed       Medium       Claudiu Belu
compute-hyperv             Fix Committed   Medium       Claudiu Belu

Bug Description

1. Exact version of Nova/OpenStack you are running:
Juno/stable

2. Reproduce steps:

* Launch a VM on a Hyper-V cloud.
* Note the boot time of the Virtual Machine in nova-compute.log.
* Note the port up status time in neutron DB for the port of the VM.

The boot time of the Virtual Machine is earlier than the time at which the port went UP, which should not be the case.

Expected result:
* The boot time of the Virtual Machine should be later than port UP status.
* All port rules are applied before the VM is booted and presented to the user.

Actual result:
* The VM boots before neutron has applied the port rules. As a result, the VM may fail to get an IP address or may lack the rules it needs to communicate.

4. Description
When the port binding is complete, neutron uses its notifier to send the port UP event to Nova, so that Nova can power on or resume the VM instance. On Nova compute for Hyper-V, Nova doesn't wait for the port UP event from neutron and boots the instance as soon as disk preparation is done.
This causes VM instances to boot without their security rules in place, so VMs may not get an IP address or the connectivity the user expects. To prevent this, Nova compute should wait for the vif plugged event from neutron and only then power on the instance.
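
The wait-then-power-on ordering proposed above can be modelled with a minimal, self-contained sketch. This is not the Hyper-V driver's or Nova's actual code: `VifPlugWaiter`, `VifPlugTimeout`, `spawn`, `plug_vifs` and `power_on` are hypothetical names invented for illustration, and the `timeout` / `is_fatal` parameters are only loosely modelled on Nova's `vif_plugging_timeout` and `vif_plugging_is_fatal` options.

```python
import threading
import time


class VifPlugTimeout(Exception):
    """Raised when a port does not come UP before the deadline."""


class VifPlugWaiter:
    """Hypothetical helper that tracks 'vif plugged' events per port."""

    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def prepare(self, port_id):
        # Register interest BEFORE plugging the VIF, so that a
        # notification arriving early is not lost.
        with self._lock:
            self._events[port_id] = threading.Event()

    def notify(self, port_id):
        # Stand-in for neutron reporting the port as UP.
        with self._lock:
            event = self._events.get(port_id)
        if event is not None:
            event.set()

    def wait(self, port_ids, timeout):
        # Block until every port is UP or the shared deadline expires.
        deadline = time.monotonic() + timeout
        for port_id in port_ids:
            remaining = max(0.0, deadline - time.monotonic())
            if not self._events[port_id].wait(remaining):
                raise VifPlugTimeout(port_id)


def spawn(waiter, port_ids, plug_vifs, power_on, timeout=300, is_fatal=True):
    """Plug the VIFs, wait for the port UP events, then power on."""
    for port_id in port_ids:
        waiter.prepare(port_id)
    plug_vifs()
    try:
        waiter.wait(port_ids, timeout)
    except VifPlugTimeout:
        if is_fatal:
            raise  # do not boot an instance without its port rules
    power_on()
```

In this sketch the instance is powered on only after every port has been reported UP. With `is_fatal=False` it boots anyway once the timeout expires, which keeps the current behaviour as a fallback.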

tags: added: hyper-v network
Alessandro Pilotti (alexpilotti) wrote :

Usually, by the time the instance reaches the point at which it establishes network connectivity, the neutron agent has already configured the vnic / vswitch connection. In some cases, however, the described behaviour can happen, because the two components race against each other.

The proposed solution is aligned with how the libvirt driver handles the same scenario and can be considered valid for a fix.

Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
Sonu (sonu-sudhakaran) wrote :

This behavior is quite evident when running at high scale (200 VMs/tenant with the default security group), where the Hyper-V neutron agent's WMI operations take a little over 60 seconds, which exceeds the DHCP timeout.

Claudiu Belu (cbelu)
Changed in nova:
assignee: nobody → Claudiu Belu (cbelu)
Changed in compute-hyperv:
assignee: nobody → Claudiu Belu (cbelu)
importance: Undecided → Medium
status: New → Fix Committed