OVS agent updates the wrong port when using XenAPI + Neutron with HVM or PVHVM

Bug #1268955 reported by Simon Pasquier on 2014-01-14
This bug affects 4 people
Affects: OpenStack Compute (nova)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Environment
==========
- Xen Server 6.2
- OpenStack Havana installed with Packstack
- Neutron OVS agent using VLAN

From time to time, when an instance is started, it fails to get network connectivity. As a result the instance cannot get its IP address from DHCP and it remains unreachable.

After further investigation, it appears that the OVS agent running on the compute node is updating the wrong OVS port because on startup, 2 ports exist for the same instance: vifX.0 and tapX.0. The agent updates whatever port is returned in first position (see logs below). Note that the tapX.0 port is only transient and disappears after a few seconds.

Workaround
==========

Manually update the OVS port on dom0:

$ ovs-vsctl set Port vif17.0 tag=1

OVS Agent logs
============

2014-01-14 14:15:11.382 18268 DEBUG neutron.agent.linux.utils [-] Running command: ['/usr/bin/neutron-rootwrap-xen-dom0', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=2', '--', '--columns=external_ids,name,ofport', 'find', 'Interface', 'external_ids:iface-id="98679ab6-b879-4b1b-a524-01696959d468"'] execute /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:43
2014-01-14 14:15:11.403 18268 DEBUG qpid.messaging.io.raw [-] SENT[3350c68]: '\x0f\x01\x00\x19\x00\x01\x00\x00\x00\x00\x00\x00\x04\n\x01\x00\x07\x00\x010\x00\x00\x00\x00\x01\x0f\x00\x00\x1a\x00\x00\x00\x00\x00\x00\x00\x00\x02\n\x01\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x81' writeable /usr/lib/python2.6/site-packages/qpid/messaging/driver.py:480
2014-01-14 14:15:11.649 18268 DEBUG neutron.agent.linux.utils [-]
Command: ['/usr/bin/neutron-rootwrap-xen-dom0', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=2', '--', '--columns=external_ids,name,ofport', 'find', 'Interface', 'external_ids:iface-id="98679ab6-b879-4b1b-a524-01696959d468"']
Exit code: 0
Stdout: 'external_ids : {attached-mac="fa:16:3e:46:1e:91", iface-id="98679ab6-b879-4b1b-a524-01696959d468", iface-status=active, xs-network-uuid="b2bf90df-be17-a4ff-5c1e-3d69851f508a", xs-vif-uuid="2d2718d8-6064-e734-2737-cdcb4e06efc4", xs-vm-uuid="7f7f1918-3773-d97c-673a-37843797f70a"}\nname : "tap29.0"\nofport : 52\n\nexternal_ids : {attached-mac="fa:16:3e:46:1e:91", iface-id="98679ab6-b879-4b1b-a524-01696959d468", iface-status=inactive, xs-network-uuid="b2bf90df-be17-a4ff-5c1e-3d69851f508a", xs-vif-uuid="2d2718d8-6064-e734-2737-cdcb4e06efc4", xs-vm-uuid="7f7f1918-3773-d97c-673a-37843797f70a"}\nname : "vif29.0"\nofport : 51\n\n'
Stderr: '' execute /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:60
2014-01-14 14:15:11.650 18268 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [-] Port 98679ab6-b879-4b1b-a524-01696959d468 updated. Details: {u'admin_state_up': True, u'network_id': u'ad37f107-074b-4c58-8f36-4705533afb8d', u'segmentation_id': 100, u'physical_network': u'default', u'device': u'98679ab6-b879-4b1b-a524-01696959d468', u'port_id': u'98679ab6-b879-4b1b-a524-01696959d468', u'network_type': u'vlan'}
2014-01-14 14:15:11.650 18268 DEBUG neutron.agent.linux.utils [-] Running command: ['/usr/bin/neutron-rootwrap-xen-dom0', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=2', 'set', 'Port', 'tap29.0', 'tag=1'] execute /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:43
2014-01-14 14:15:11.913 18268 DEBUG neutron.agent.linux.utils [-]
Command: ['/usr/bin/neutron-rootwrap-xen-dom0', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=2', 'set', 'Port', 'tap29.0', 'tag=1']
Exit code: 0
Stdout: '\n'
Stderr: '' execute /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:60

tags: added: xenserver
Simon Pasquier (simon-pasquier) wrote :

I just realized that this is because I'm using HVM instances. In that case, Xen Server creates 1 vif and 1 tap and it deletes the tap interface if the domU is able to load the virtualized Xen driver for the NIC.
See http://lists.xen.org/archives/html/xen-devel/2011-12/msg02085.html

no longer affects: nova
summary: - OVS agent updates the wrong port when using Xen + Neutron
+ OVS agent updates the wrong port when using Xen + Neutron with HVM

Can this happen only on initial VM creation or is it possible during every startup?

Have you been able to isolate which file contains the errant code?

As I see it, these are the candidates (my nodes are Debian 7.3, so slightly different locations):
/usr/share/pyshared/neutron/agent/linux/utils.py
/usr/share/pyshared/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py
/usr/bin/neutron-rootwrap-xen-dom0
/etc/xapi.d/plugins/netwrap in dom0

Something is possibly using too loose a filtering method when selecting which interface to tag.

Simon Pasquier (simon-pasquier) wrote :

The issue might happen every time the instance starts.

AFAIU, there is no problem with the neutron OVS agent code. The issue is that the command executed by OVSBridge.get_vif_port_by_id() method [1] returns 2 rows. In all other cases (KVM, Xen PV, ...), this command returns only one row because there is only one port matching a given "external_ids:iface-id". The get_vif_port_by_id() method is called by the OVS agent on every added or updated port [2].

[1] https://github.com/openstack/neutron/blob/aa85a97ca2dcb06996ed133d864705f1dca722b1/neutron/agent/linux/ovs_lib.py#L379
[2] https://github.com/openstack/neutron/blob/aa85a97ca2dcb06996ed133d864705f1dca722b1/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py#L934
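To illustrate the two-row result described above, here is a minimal Python 3 sketch that splits the `ovs-vsctl find` output into rows. It is not the code from ovs_lib.py: the function name `parse_find_output` is hypothetical, and the sample string is an abbreviated copy of the Stdout from the agent log in the bug description (most external_ids keys dropped). It only shows that a consumer taking the first row ends up with the tap port.

```python
def parse_find_output(stdout):
    """Split `ovs-vsctl find` output into one dict per row.

    Rows are separated by blank lines; each line is `column : value`.
    """
    rows = []
    for block in stdout.strip().split("\n\n"):
        row = {}
        for line in block.splitlines():
            key, _, value = line.partition(" : ")
            row[key.strip()] = value.strip().strip('"')
        if row:
            rows.append(row)
    return rows


# Abbreviated sample based on the log above (assumption: only two
# external_ids keys are kept to keep the example short).
SAMPLE = (
    'external_ids : {iface-id="98679ab6-b879-4b1b-a524-01696959d468", iface-status=active}\n'
    'name : "tap29.0"\n'
    'ofport : 52\n'
    '\n'
    'external_ids : {iface-id="98679ab6-b879-4b1b-a524-01696959d468", iface-status=inactive}\n'
    'name : "vif29.0"\n'
    'ofport : 51\n'
    '\n'
)

rows = parse_find_output(SAMPLE)
names = [row["name"] for row in rows]  # both ports share the same iface-id
first = names[0]                       # the row a "first match" consumer would use
```

With this sample, `names` holds both `tap29.0` and `vif29.0`, and the first entry is the tap port, which matches the `set Port tap29.0 tag=1` call seen in the log.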

Andrew Kinney (andykinney) wrote :

This also appears to affect Debian VMs created from the package openstack-debian-image. It appears that it starts out in HVM mode and then switches to PV drivers mid-way through boot, manifesting with an untagged OVS port in dom0. If I restart the OVS agent that manages OVS ports in dom0, the tag gets placed properly. By then, it's too late for dhcp and cloud-init, so the VM never gets network or metadata.

I suspect that it's because the flags supplied with the image in glance aren't set properly, but I'm still investigating. I thought I remembered reading that you can set something in glance along with the image to define the target hypervisor and virtualization method.

Andrew Kinney (andykinney) wrote :

I'm checking to see if adding metadata to the image resolves the issue. It doesn't "fix" the bug, but it's a much more viable workaround than manually updating the OVS port tag on every creation/reboot. I'll report back soon.
http://docs.openstack.org/image-guide/content/image-metadata.html

Andrew Kinney (andykinney) wrote :

VM image properties do not seem to make a difference. Some "nova boot" commands result in the OVS port in dom0 getting the proper tag (everything works) and some show the results detailed in this bug report (dhcp and metadata retrieval fail due to missing VLAN tag on OVS port in dom0).

Andrew Kinney (andykinney) wrote :

I did note, however, that restarting the VM or restarting the OVS agent plugin that works in dom0 applies the proper tags at that moment. VM creation (vs reboot) still experiences the same issue for the same reasons outlined above (HVM to PV change).

Andrew Kinney (andykinney) wrote :

I guess the issue with Debian Wheezy (7.3) is that it uses a PVHVM kernel. PVHVM seems to be subject to this issue as well. Debian being incapable of running on OpenStack if XenServer is the hypervisor seems like a pretty big issue. Not to mention Windows. It makes OpenStack with XenServer practically unusable with neutron and VLANs.

I'm not a coder. I can look around in code and get a general idea what is happening, but I wouldn't know how to trace or patch this issue. How do we get someone with skills to look at this issue?

Andrew Kinney (andykinney) wrote :

It's just random dumb luck when the OVS port gets tagged on VM reboot.

Debian 7.3 VM OVS ports in dom0 at creation or boot/reboot (ovs-vsctl show):
        Port "tap33.0"
            Interface "tap33.0"
        Port "vif33.0"
            Interface "vif33.0"

After some time has elapsed (a few seconds), that changes to:
        Port "vif33.0"
            tag: 5
            Interface "vif33.0"
or
        Port "vif33.0"
            Interface "vif33.0"

depending on which of the two initial ports got the tag.

Considering that conventional linux kernel wisdom of the day is that PVHVM is better than straight PV and most distributions are working to utilize upstream built-in PVHVM support, this bug affects pretty much every new operating system. The only ones immune are older operating systems that are still using the paradigm of strictly PV. Strictly PV isn't even available in the kernel packages for Debian 7 (wheezy) like it was in Debian 6 (squeeze) or Debian 5 (lenny).

@Andrew

What you described in the previous comment is exactly what I explained in the original description. I'll try to propose a fix in the next weeks.

Andrew Kinney (andykinney) wrote :

@Simon

I thought it worthwhile to demonstrate that the failure mode for PVHVM was the same as HVM. PVHVM is becoming ubiquitous in the Linux world since it is more performant than plain PV or plain HVM. It's also how the mainline Linux kernel supports Xen. Nobody even makes PV only Linux kernels any more unless they're done directly from xensource.

I'm going to be examining a potential temporary workaround. Since the timing of the call to get_vif_port_by_id() method can impact whether it gets one or two rows in response, I'm looking into inserting a delay of a few seconds just before that call. I know it's not a real fix, but it's a possible workaround that I think I can muddle through with my limited knowledge of python.

Andrew Kinney (andykinney) wrote :

Workaround (not a real fix!):
On the node with the OVS agent controlling network in dom0 in /usr/share/pyshared/neutron/agent/linux/ovs_lib.py

add to the top near the other import statements:
import time

Change this:
    def get_vif_port_by_id(self, port_id):
        args = ['--', '--columns=external_ids,name,ofport',
                'find', 'Interface',
                'external_ids:iface-id="%s"' % port_id]
        result = self.run_vsctl(args)

to this:
    def get_vif_port_by_id(self, port_id):
        time.sleep(8)
        args = ['--', '--columns=external_ids,name,ofport',
                'find', 'Interface',
                'external_ids:iface-id="%s"' % port_id]
        result = self.run_vsctl(args)

On my system, the tap interface has a life of 5 to 6 seconds at current load levels (mostly idle). If the sleep is 10 seconds, it interferes with the DHCP request emitted from within the VM, so I split the difference and set it to 8 seconds. Obviously, this will vary by machine and load levels, so it is by no means adequate as a real fix. It just increases the success rate of getting the tag on the proper port in dom0 before DHCP sends its first request from within the VM. A side effect is that it takes that number of seconds longer before networking is functional again when restarting the OVS agent that controls dom0.
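A possible variation on the fixed sleep, sketched here in Python 3 under stated assumptions: instead of always waiting 8 seconds, poll until the lookup returns at most one row, and give up after a timeout. The helper `wait_for_single_row` and the `find_rows` callable are hypothetical and not part of ovs_lib.py; the defaults are just the numbers from the workaround above. Like the sleep, this is still not a real fix.

```python
import time


def wait_for_single_row(find_rows, timeout=8.0, interval=0.5,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll find_rows() until it returns at most one row or the timeout expires.

    find_rows is a hypothetical callable returning the list of OVS rows
    that match a port id. The last list seen is returned, so the caller
    still has to cope with two rows if the tap port never goes away.
    """
    deadline = clock() + timeout
    rows = find_rows()
    while len(rows) > 1 and clock() < deadline:
        sleep(interval)
        rows = find_rows()
    return rows
```

When the tap port disappears quickly this returns sooner than a fixed sleep, but it still delays the tag for up to `timeout` seconds when both ports stay around.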

@Andrew

I've modified the title of the bug to mention that it would happen with PVHVM too.

summary: - OVS agent updates the wrong port when using Xen + Neutron with HVM
+ OVS agent updates the wrong port when using Xen + Neutron with HVM or
+ PVHVM
Andrew Kinney (andykinney) wrote :

Will the following touch the code path for this bug as well?
https://review.openstack.org/#/c/66375/

Andrew Kinney (andykinney) wrote :

This also badly affects HVM instances where both the vif and the tap interface remain instead of one of the two being removed. You can see from the following output from "ovs-vsctl show" (trimmed and grouped for easier reading) that it is indeterminate which of the vif or tap will get the tag.

        Port "tap81.0"
            tag: 3
            Interface "tap81.0"
        Port "vif81.0"
            Interface "vif81.0"

        Port "tap81.1"
            Interface "tap81.1"
        Port "vif81.1"
            tag: 2
            Interface "vif81.1"

        Port "tap81.2"
            Interface "tap81.2"
        Port "vif81.2"
            tag: 4
            Interface "vif81.2"

        Port "tap81.3"
            Interface "tap81.3"
        Port "vif81.3"
            tag: 5
            Interface "vif81.3"

Yes, the review you mentioned in comment #14 would probably fix the issue when only the vif interface remains. In which case would the vif and tap interfaces remain together?

Andrew Kinney (andykinney) wrote :

An HVM domU without any PV drivers keeps the tap *and* vif interfaces.

XenServer provides both tap and vif interfaces to an HVM domU. If the HVM domU employs PV drivers, activating the PV drivers tells XenServer to rip out the tap interfaces and keep the vif interfaces. If no PV drivers are employed in the HVM domU, the tap interfaces are used by the domU and XenServer continues to supply the vif interfaces "just in case" the domU installs PV drivers later.

Unfortunately, in this situation, my workaround does not correct the problem. Sometimes the vif gets the tag and sometimes the tap gets the tag. For an HVM domU without PV drivers, the network interface inside the domU can only pass traffic if the tap interface gets the tag. It is extremely "lucky" when multiple interfaces of such a domU get tags on all the tap interfaces all at the same time.

Andrew Kinney (andykinney) wrote :

https://review.openstack.org/#/c/66375/
actually makes the problem worse and it's about to be merged. With those patches, no ports get tags now. I think this comment is key in the new ovs_lib.py in get_vif_port_by_id():
# We won't deal with the possibility of ovs-vsctl return multiple
# rows since the interface identifier is unique

It typifies the current thinking, which is incorrect in the case of XenServer. How do we get them to change course on this?

Andrew Kinney (andykinney) wrote :

https://review.openstack.org/#/c/66375/
apparently requires other changes and silently crashed the ovs agent with no message to the log. That's why no tags were applied. I'm in the process of tracking down where the error messages might have landed to see what went wrong.

Andrew Kinney (andykinney) wrote :

There were no messages in any logs about why the ovs agent crashed, so I've rolled back the changes from https://review.openstack.org/#/c/66375/

Short of doing a git pull and building everything from source, I think I'm SOL until I see it come down the pipeline for Debian packages.

Thanks for the deep dive, Andrew :)

FWIW, the comment you mentioned ("We won't deal with the possibility of ovs-vsctl return multiple rows since the interface identifier is unique") only states something that was already true before (eg Havana code). For reference, here is the associated commit => https://github.com/openstack/neutron/commit/3d24fe5710cbea6d7d1f88c3476f4a856347ab5e

Bob Ball (bob-ball) wrote :

From a XenServer perspective, I think the right answer is that both ports must be tagged. I'd love some input from the Neutron guys around this. The rationale is that the two interfaces are actually the same from the VM's perspective - it can only use one of them and will negotiate up to the PV interface if it can.

HVM and PVHVM are effectively the same in this case as they are both using an HVM container and negotiate around the use of PV drivers.

We might also fix it by only tagging the tap port, then watching in dom0 for this going away and tagging the corresponding vif - but that might be racy since I think the driver negotiation doesn't have a hook that we could grab on to.

Andrew Kinney (andykinney) wrote :

I like tagging both the vif and tap. It feels safe with less chance for odd corner cases.
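A rough sketch of what "tag both" could look like, assuming the agent already has the names of every port that matched the iface-id. The function `tag_commands` is hypothetical; the `ovs-vsctl --timeout=2 set Port <name> tag=<n>` form is taken from the agent log in the bug description. The sketch only builds the argument lists and does not run anything.

```python
def tag_commands(port_names, tag):
    """Return one `ovs-vsctl set Port` argument list per matching port."""
    return [["ovs-vsctl", "--timeout=2", "set", "Port", name, "tag=%d" % tag]
            for name in port_names]


# Both ports returned for the same iface-id get the same VLAN tag.
commands = tag_commands(["tap29.0", "vif29.0"], 1)
```

Whichever port the domU ends up using then carries the tag, so the outcome no longer depends on row order.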

Andrew Kinney (andykinney) wrote :

Simon,
Can you add the tag havana-backport-potential to this bug report? It doesn't look like I have sufficient rights to do so or I'm missing something. I'd like to avoid upgrading to icehouse to get this fixed if I can avoid it.

Andrew,
This is done. I notice that you removed your previous comment saying that tagging the 2 interfaces didn't work. Does it mean that it works indeed?

tags: added: havana-backport-potential
Andrew Kinney (andykinney) wrote :

I don't know if it works. My previous test results were invalidated. I had done a re-install and defined the networks within the web gui (horizon) on the re-install versus defining the networks from command line previously. That caused the wrong physnet to be attached to the network definitions, resulting in a lack of proper OVS flow definition which caused *nothing* to work, no matter whether tags were there or not. I discovered this after making my comment and thought it prudent to retract the comment until I could retest.

Andrew Kinney (andykinney) wrote :

Our workaround so far has been to restart the "neutron-plugin-openvswitch-agent" service in the compute node after every VM reboot/start.

Any progress on a real fix? Our clients don't have access to the "neutron-plugin-openvswitch-agent" service directly, so they have to open a support ticket with our staff every time they reboot/start any VM. As you can imagine, it's a royal pain for everyone and downright problematic when dealing with third parties that don't realize they need to schedule the reboot of the VM with our support staff.

tags: added: ovs
Eugene Nikanorov (enikanorov) wrote :

I think this will not be fixed in Havana.
I wonder if Icehouse works fine for the described case.

Changed in neutron:
status: New → Opinion
Andrew Kinney (andykinney) wrote :

Eugene,
It's a crying shame that it won't be fixed in Havana. The process to go from Havana to Icehouse is so frigging onerous you'd have to be high to attempt it. It's only practical to do a completely fresh install, but who has duplicate clusters sitting around to do those types of transitions with production systems?! Bovine feces!

Tom Carroll (h-thomas-carroll) wrote :

I can confirm that the problem remains in 2014.2.1. A race condition exists: if both the tap and vif devices exist, there is a chance that OVS will add the tag to the tap port.

Changed in neutron:
assignee: nobody → Bob Ball (bob-ball)
Changed in neutron:
status: Opinion → Incomplete

Fix proposed to branch: master
Review: https://review.openstack.org/233498

Changed in neutron:
assignee: Bob Ball (bob-ball) → Jianghua Wang (wjh-fresh)
status: Incomplete → In Progress

Fix proposed to branch: master
Review: https://review.openstack.org/237900

Change abandoned by Jianghua Wang (<email address hidden>) on branch: master
Review: https://review.openstack.org/233498
Reason: design changed; the new patch set is: https://review.openstack.org/#/c/237900/

Change abandoned by Jianghua Wang (<email address hidden>) on branch: master
Review: https://review.openstack.org/237900
Reason: After some discussion, tracking two ports in neutron was judged not to be a good solution, since it may break the neutron design. Instead, we will go with a change in nova that adds an interim bridge between the VM and the integration bridge. Neutron then only needs to monitor the patch port connecting the interim bridge to the integration bridge, so the switch-over between the active and inactive port is transparent to neutron.
  So this patch set will be abandoned and the new solution will be covered by https://review.openstack.org/#/c/242846/
  Thanks all.

Alan Pevec (apevec) on 2015-11-24
tags: removed: havana-backport-potential

I've developed a workaround that works at least for my environment. Note that I have security groups disabled. Following the example of ovs-xapi-sync, which copies external_ids between vifs and taps, neutron-ovs-tag-sync monitors the ovsdb for changes to Port.tag and copies the tag from vifX to tapX and vice versa. Copy the script to dom0's /usr/share/openvswitch/scripts and execute it using

PYTHONPATH=/usr/share/openvswitch/python /usr/share/openvswitch/scripts/neutron-ovs-tag-sync --log-file --pidfile --detach --monitor unix:/var/run/openvswitch/db.sock

A startup script will be necessary to start the script after a reboot.
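The neutron-ovs-tag-sync script itself is not reproduced in this thread. As a rough idea of the tag-copying step it describes, here is a Python 3 sketch with hypothetical helper names (`peer_name`, `tags_to_copy`). It assumes the port names follow the `vifX.Y` / `tapX.Y` pattern seen in the `ovs-vsctl show` output earlier in this report, and it leaves out the ovsdb monitoring entirely.

```python
def peer_name(name):
    """Return the tap/vif counterpart of a port name, or None."""
    if name.startswith("vif"):
        return "tap" + name[3:]
    if name.startswith("tap"):
        return "vif" + name[3:]
    return None


def tags_to_copy(ports):
    """Given {port_name: tag or None}, return {port_name: tag} for every
    port that has no tag while its tap/vif counterpart has one."""
    updates = {}
    for name, tag in ports.items():
        peer = peer_name(name)
        if tag is None and peer in ports and ports[peer] is not None:
            updates[name] = ports[peer]
    return updates


# State modelled on the tap81.x / vif81.x listing earlier in the thread.
ports = {"tap81.0": 3, "vif81.0": None, "tap81.1": None, "vif81.1": 2}
updates = tags_to_copy(ports)
```

Each entry in `updates` would then translate into one `ovs-vsctl set Port <name> tag=<n>` call, as in the manual workaround from the bug description.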

Changed in neutron:
status: In Progress → Incomplete
assignee: Jianghua Wang (wjh-fresh) → nobody

This issue was fixed in the openstack/nova 14.0.0.0b2 development milestone.

Bob Ball (bob-ball) on 2016-07-25
Changed in neutron:
status: Incomplete → Fix Committed
affects: neutron → nova
summary: - OVS agent updates the wrong port when using Xen + Neutron with HVM or
+ OVS agent updates the wrong port when using XenAPI + Neutron with HVM or
PVHVM
Changed in nova:
status: Fix Committed → Fix Released