Race condition in nova/neutron when booting instance with XenAPI driver

Bug #1512955 reported by huan
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: huan

Bug Description

1. My environment is:
    XenServer 6.5
    OpenStack latest master branch
    Neutron network with ML2 plugin, OVS driver

2. tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops always fails when I run it.
It fails while looking up the newly created instance's port, before a floating IP is assigned to that port.

Relevant log file of tempest:
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops[compute,id-f323b3ba-82f8-4db7-8ea6-6a895869ec49,network,smoke]
-------------------------------------------------------------------------------------------------------------------------------------------------
Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):
      File "tempest/test.py", line 127, in wrapper
        return f(self, *func_args, **func_kwargs)
      File "tempest/scenario/test_network_basic_ops.py", line 398, in test_network_basic_ops
        self._setup_network_and_servers()
      File "tempest/scenario/test_network_basic_ops.py", line 123, in _setup_network_and_servers
        floating_ip = self.create_floating_ip(server)
      File "tempest/scenario/manager.py", line 774, in create_floating_ip
        port_id, ip4 = self._get_server_port_id_and_ip4(thing)
      File "tempest/scenario/manager.py", line 755, in _get_server_port_id_and_ip4
        % port_map)
      File "/opt/stack/tempest/.tox/all/local/lib/python2.7/site-packages/testtools/testcase.py", line 350, in assertEqual
        self.assertThat(observed, matcher, message)
      File "/opt/stack/tempest/.tox/all/local/lib/python2.7/site-packages/testtools/testcase.py", line 435, in assertThat
        raise mismatch_error
    testtools.matchers._impl.MismatchError: 0 != 1: Found multiple IPv4 addresses: []. Unable to determine which port to target.
----------------------------------------------------------------------------------------------------------------------------------------------------

3. The failure is reproducible every time, as long as xen and neutron are used.

4. I investigated the problem: it is caused by a nova/neutron race condition when booting an instance under the xen driver, because the xen driver doesn't handle neutron's "network-vif-plugged" notification event.
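
To illustrate the race in item 4, here is a minimal, self-contained sketch of the "wait for the port to become active before starting the VM" pattern. It is not nova's or neutron's code: `PortEventWaiter`, `boot_instance`, `plug_vif` and `start_vm` are hypothetical names invented for this example, and the neutron notification is simulated with a timer thread.

```python
import threading


class PortEventWaiter:
    """Toy stand-in for the compute side waiting on 'network-vif-plugged'."""

    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def prepare(self, port_id):
        # Register interest BEFORE plugging the VIF, so that a
        # notification arriving early is not lost.
        with self._lock:
            return self._events.setdefault(port_id, threading.Event())

    def notify(self, port_id):
        # Called by the (simulated) network service once the port is ACTIVE.
        with self._lock:
            event = self._events.get(port_id)
        if event is not None:
            event.set()

    def wait(self, port_id, timeout):
        # Returns True if the notification arrived within `timeout` seconds.
        with self._lock:
            event = self._events.get(port_id)
        return event is not None and event.wait(timeout)


def boot_instance(waiter, port_id, plug_vif, start_vm, timeout=5.0):
    """Plug the VIF, then start the VM only after the port is reported ACTIVE."""
    waiter.prepare(port_id)
    plug_vif(port_id)
    if not waiter.wait(port_id, timeout):
        raise TimeoutError("port %s never became ACTIVE" % port_id)
    start_vm()
    return "ACTIVE"
```

Without the wait step, `start_vm` can return and the instance can be reported as booted while the port is still down, which would match the empty port list that tempest reports above.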

Tags: xenserver
Changed in nova:
assignee: nobody → huan (huan-xie)
status: New → In Progress
Václav Hejral (vhejral) wrote :
Matt Riedemann (mriedem) wrote :

Is this a duplicate of bug 1514935?

Changed in nova:
assignee: huan (huan-xie) → Jianghua Wang (wjh-fresh)
Changed in nova:
assignee: Jianghua Wang (wjh-fresh) → huan (huan-xie)
Changed in nova:
assignee: huan (huan-xie) → Jianghua Wang (wjh-fresh)
Bob Ball (bob-ball)
Changed in nova:
importance: Undecided → High
importance: High → Medium
Bob Ball (bob-ball)
tags: added: xenserver
Changed in nova:
assignee: Jianghua Wang (wjh-fresh) → huan (huan-xie)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/268258

Changed in nova:
assignee: huan (huan-xie) → Rossella Sblendido (rossella-o)
Changed in nova:
assignee: Rossella Sblendido (rossella-o) → huan (huan-xie)
Changed in nova:
assignee: huan (huan-xie) → Rossella Sblendido (rossella-o)
Changed in nova:
assignee: Rossella Sblendido (rossella-o) → Bob Ball (bob-ball)
Changed in nova:
assignee: Bob Ball (bob-ball) → huan (huan-xie)
Bob Ball (bob-ball)
tags: added: mitaka-rc-potential
summary: - Race condition in nova/neutron when booting instance with xen driver
+ Race condition in nova/neutron when booting instance with XenAPI driver
Bob Ball (bob-ball) wrote :

Requesting this to be RC potential because it's a long-standing race condition in the XenAPI driver with a relatively straightforward port of a fix over from the libvirt driver.

Andrew Laski (alaski) wrote :

I agree that this would be nice to fix in M if possible, but I don't think it should block a release as this is essentially new functionality that the XenAPI driver has never had.

Andrew Laski (alaski) wrote :

Is there a fix up for this? I see a patch for libvirt+xen but nothing for the xenapi driver.

Bob Ball (bob-ball) wrote :

Fix is https://review.openstack.org/#/c/241127/ - not sure why Launchpad didn't show it.

The gerrit query from the mitaka-nova-priorities-tracking etherpad did find it though (https://review.openstack.org/#/q/status:open+project:openstack/nova+%28nil+OR+message:%22#1553099%22+OR+message:%22#1552888%22+OR+message:%22#1552303%22+OR+message:%22#1549814%22+OR+message:%22#1536513%22+OR+message:%22#1512955%22+%29,n,z)

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/241127
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ecd3eb7c945a7b7939c29617c2000a30a48c55dd
Submitter: Jenkins
Branch: master

commit ecd3eb7c945a7b7939c29617c2000a30a48c55dd
Author: Huan Xie <email address hidden>
Date: Tue Nov 3 06:47:28 2015 +0000

    XenAPI:Resolve Nova/Neutron race condition

    When booting an instance, nova and neutron has race condition because
    nova don't know whether vif(port) is ready in neutron. There is a
    mechenism that letting neutron notify nova when port status changed
    from down to active. This fix is for xen driver to add usage of this
    event notification to avoid race condition

    Closes-Bug: #1512955

    Change-Id: I77be3bb728db72e01701c94ee292fa0f237358ed

Changed in nova:
status: In Progress → Fix Released
Thierry Carrez (ttx) wrote : Fix included in openstack/nova 13.0.0.0rc1

This issue was fixed in the openstack/nova 13.0.0.0rc1 release candidate.

Matt Riedemann (mriedem)
tags: removed: mitaka-rc-potential