Potential regression on handing over DHCP addresses to VMs

Bug #1599329 reported by Armando Migliaccio
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Kevin Benton
Tags: l3-ipam-dhcp
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

As the traces show:

info: initramfs: up at 2.08
modprobe: module virtio_blk not found in modules.dep
modprobe: module virtio_net not found in modules.dep
info: copying initramfs to /dev/vda
info: initramfs loading root from /dev/vda
info: /etc/init.d/rc.sysinit: up at 4.50
info: container: none
Starting logging: OK
modprobe: module virtio_blk not found in modules.dep
modprobe: module virtio_net not found in modules.dep
WARN: /etc/rc3.d/S10-load-modules failed
Initializing random number generator... done.
Starting acpid: OK
cirros-ds 'local' up at 5.90
found datasource (configdrive, local)
Starting network...
udhcpc (v1.20.1) started
Sending discover...
Sending discover...

2016-07-05 22:37:40,487 28998 INFO [tempest.lib.common.rest_client] Request (TestNetworkBasicOps:_run_cleanups): 204 DELETE http://127.0.0.1:9696/v2.0/floatingips/fa755c01-9f6d-47cb-bfa5-27bba175624f 0.247s
2016-07-05 22:37:40,487 28998 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'X-Auth-Token': '<omitted>', 'Accept': 'application/json', 'Content-Type': 'application/json'}
        Body: None
    Response - Headers: {'x-openstack-request-id': 'req-947ba53e-e239-43f3-ace9-c068f224f8ee', 'content-length': '0', 'date': 'Tue, 05 Jul 2016 22:37:40 GMT', 'content-location': 'http://127.0.0.1:9696/v2.0/floatingips/fa755c01-9f6d-47cb-bfa5-27bba175624f', 'connection': 'close', 'status': '204'}
        Body:
2016-07-05 22:37:40,635 28998 INFO [tempest.lib.common.rest_client] Request (TestNetworkBasicOps:_run_cleanups): 204 DELETE http://127.0.0.1:8774/v2.1/servers/0747306e-8b5d-415e-8e70-faa19fe83c1c 0.146s
2016-07-05 22:37:40,635 28998 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'X-Auth-Token': '<omitted>', 'Accept': 'application/json', 'Content-Type': 'application/json'}
        Body: None
    Response - Headers: {'vary': 'X-OpenStack-Nova-API-Version', 'x-compute-request-id': 'req-48e30348-70c5-4c9e-8668-33768e43a89d', 'date': 'Tue, 05 Jul 2016 22:37:40 GMT', 'x-openstack-nova-api-version': '2.1', 'content-length': '0', 'connection': 'close', 'content-type': 'application/json', 'content-location': 'http://127.0.0.1:8774/v2.1/servers/0747306e-8b5d-415e-8e70-faa19fe83c1c', 'status': '204', 'openstack-api-version': 'compute 2.1'}
        Body:
2016-07-05 22:37:40,781 28998 INFO [tempest.lib.common.rest_client] Request (TestNetworkBasicOps:_run_cleanups): 200 GET http://127.0.0.1:8774/v2.1/servers/0747306e-8b5d-415e-8e70-faa19fe83c1c 0.145s
2016-07-05 22:37:40,781 28998 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'X-Auth-Token': '<omitted>', 'Accept': 'application/json', 'Content-Type': 'application/json'}
        Body: None
    Response - Headers: {'vary': 'X-Ope

No DHCP lease is received.

Changed in neutron:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Kevin Benton (kevinbenton)
tags: added: l3-bgp
tags: added: l3-ipam-dhcp
removed: l3-bgp
Revision history for this message
Kevin Benton (kevinbenton) wrote :

The cause of this is that multiple ports are getting the same ofport assignment in OVS. This can be seen in the vswitchd log:

2016-07-05T22:35:33.365Z|01321|netdev_linux|INFO|ioctl(SIOCGIFHWADDR) on tap72cf6622-f4 device failed: No such device
2016-07-05T22:35:34.508Z|01322|bridge|INFO|bridge br-int: added interface tapc576ea66-4a on port 372
2016-07-05T22:35:38.347Z|01323|bridge|INFO|bridge br-int: added interface qvocec23074-06 on port 372
2016-07-05T22:35:38.355Z|01324|netdev_linux|INFO|ioctl(SIOCGIFHWADDR) on tapc576ea66-4a device failed: No such device
2016-07-05T22:35:41.467Z|01325|bridge|INFO|bridge br-int: added interface tapc576ea66-4a on port 372
2016-07-05T22:35:41.483Z|01326|bridge|INFO|bridge br-int: added interface tapf227de76-76 on port 378
2016-07-05T22:35:41.683Z|01327|netdev_linux|INFO|ioctl(SIOCGIFHWADDR) on tapf227de76-76 device failed: No such device
2016-07-05T22:35:42.608Z|01328|bridge|INFO|bridge br-int: added interface qvocec23074-06 on port 372
2016-07-05T22:35:42.621Z|01329|netdev_linux|INFO|ioctl(SIOCGIFHWADDR) on tapc576ea66-4a device failed: No such device
2016-07-05T22:35:43.171Z|01330|bridge|INFO|bridge br-int: added interface tapc576ea66-4a on port 372
2016-07-05T22:35:46.147Z|01331|bridge|INFO|bridge br-int: added interface qvocec23074-06 on port 372

My suspicion is that this is a direct result of us ignoring the guideline, outlined in http://lists.openstack.org/pipermail/openstack-dev/2015-February/056834.html and https://bugzilla.redhat.com/show_bug.cgi?id=1160340, not to use OVS internal devices that get moved into network namespaces.
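
To make the duplicate assignment easier to spot than grepping the vswitchd log, here is a minimal sketch (not part of Neutron or the proposed fix; the bridge name and required privileges are assumptions) that lists the interfaces on br-int and flags any ofport number shared by more than one of them:

#!/usr/bin/env python
# Hedged diagnostic sketch: flag duplicate ofport numbers on one bridge.
# Assumes ovs-vsctl is on PATH and the script can reach the local OVSDB.
import collections
import subprocess

BRIDGE = 'br-int'  # assumption: the integration bridge from the log above

def iface_ofports(bridge):
    names = subprocess.check_output(
        ['ovs-vsctl', 'list-ports', bridge]).decode().split()
    return {name: subprocess.check_output(
                ['ovs-vsctl', 'get', 'Interface', name, 'ofport']
            ).decode().strip()
            for name in names}

def duplicates(mapping):
    by_ofport = collections.defaultdict(list)
    for name, ofport in mapping.items():
        by_ofport[ofport].append(name)
    # '-1' or '[]' just mean "not assigned yet", not a real clash
    return {o: n for o, n in by_ofport.items()
            if len(n) > 1 and o not in ('-1', '[]')}

if __name__ == '__main__':
    for ofport, names in sorted(duplicates(iface_ofports(BRIDGE)).items()):
        print('ofport %s shared by: %s' % (ofport, ', '.join(names)))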

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/344545

Changed in neutron:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/344731

Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

It looks like a bug that may be related to the lack of netns support for internal ports in OVS. If that's the case, an alternative could be moving to veths, although that is not desirable (especially for router ports).

With a single-host, CPU-pegged test run [1]:

I've seen that the overhead of using a veth is around +10% in CPU consumption. Please note that the above test passes traffic through two veth pairs (so it's 110% * 110% = 121%, i.e. ~21% more CPU, or a ~21% decrease in bandwidth: 20.5 Gbps down to 16 Gbps for TCP_STREAM, or 93k conn/sec TCP_RR to 93k conn/sec TCP_RR).
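
For clarity, the compounding above in one tiny worked example (the throughput figures are simply the ones quoted in the previous paragraph):

# Worked arithmetic for the two-veth-pair path described above.
per_hop_cpu = 1.10                # ~+10% CPU cost per veth pair
total_cpu = per_hop_cpu ** 2      # two pairs in the path -> 1.21, i.e. ~21% more CPU
print('CPU cost factor: %.2f' % total_cpu)

# Measured effect on the CPU-bound TCP_STREAM run quoted above:
print('bandwidth ratio: %.2f' % (16.0 / 20.5))   # ~0.78, matching the ~21% decrease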

I've talked to some OVS developers and they're willing to fix this (even if it is related to netns usage on internal ports, which is not supported), so I'm continuing kevinbenton's work [2] and trying to put together a bash reproducer.

[1] https://github.com/mangelajo/ovs-experiments/blob/master/veth_vs_intport/veth_vs_intport_speed.sh

[2] https://gist.github.com/kevinbenton/e8a323b61ac6145f9293ef90dbdec0b1

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Kevin Benton (<email address hidden>) on branch: master
Review: https://review.openstack.org/344731

Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

OK, I spent most of the day looking at how things work in Open vSwitch under the hood.

It seems that the kernel datapath assigns a port number of its own, and then userspace assigns another number (the ofport) and maps it to the first one [1].

In the middle of that, it tries to fetch the netdev details. I suspect this is again netns related: if the device has been moved into a netns, that mechanism fails (it can't find the device and finish the mapping [1]). We had code to avoid this by fetching the ofport number with retries [2]; maybe the ovsdb interface switch broke it.

Btw, the second competing port (qvo) is a veth and not in a namespace, so the whole picture still isn't clear to me.

[1] https://github.com/openvswitch/ovs/blob/branch-2.0/ofproto/ofproto.c#L1714

[2] https://github.com/openstack/neutron/blob/master/neutron/agent/common/ovs_lib.py#L230
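
As a rough illustration of the retry approach behind [2] (a simplified sketch, not the actual ovs_lib code; the function name and timings are made up), the idea is to poll OVSDB until the ofport becomes a valid positive number, rather than trusting a value read while vswitchd is still finishing the kernel-to-userspace mapping:

# Simplified, illustrative sketch of an ofport fetch with retries.
import subprocess
import time

def get_ofport_with_retries(iface, retries=10, interval=0.5):
    for _ in range(retries):
        out = subprocess.check_output(
            ['ovs-vsctl', 'get', 'Interface', iface, 'ofport']).decode().strip()
        if out not in ('[]', '-1'):   # '[]' / '-1' mean "not assigned yet"
            return int(out)
        time.sleep(interval)
    raise RuntimeError('ofport for %s never became valid' % iface)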

Changed in neutron:
assignee: Kevin Benton (kevinbenton) → Terry Wilson (otherwiseguy)
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

@Ajo: are you thinking that the switch to ovsdb native might have induced the fault?

Revision history for this message
Kevin Benton (kevinbenton) wrote :

Failures as far as the eye can see:

http://paste.openstack.org/show/539671/

Revision history for this message
Inessa Vasilevskaya (ivasilevskaya) wrote :

I observe a similar log on devstack while running the dsvm-functional job:

tox -e dsvm-functional neutron.tests.functional.agent.test_l2_ovs_agent.TestOVSAgent.test_ancillary_port_creation_and_deletion

results in multiple same-port / "No such device" errors in vswitchd.log:

2016-07-26T13:33:15.071Z|00066|bridge|INFO|bridge br-intf218d1f17: added interface int-br-phd56447 on port 1
2016-07-26T13:33:15.134Z|00067|bridge|INFO|bridge br-phys6c5ab5cf: added interface phy-br-phd56447 on port 1
2016-07-26T13:33:15.409Z|00072|bridge|INFO|bridge br-intf218d1f17: added interface tf218d1f17-patch-tun on port 2
2016-07-26T13:33:15.457Z|00073|bridge|INFO|bridge br-tunf805184a6: added interface nf805184a6-patch-int on port 1
2016-07-26T13:33:17.039Z|00074|bridge|INFO|bridge test-br81758532: added interface tap25323a581e8 on port 1
2016-07-26T13:33:17.207Z|00075|netdev_linux|INFO|ioctl(SIOCGIFHWADDR) on tap25323a581e8 device failed: No such device
2016-07-26T13:33:17.613Z|00076|bridge|INFO|bridge test-br81758532: added interface tap74091c71013 on port 2

Full log here - http://paste.openstack.org/show/542112/

Revision history for this message
Inessa Vasilevskaya (ivasilevskaya) wrote :

My bad, the ports are actually different across the bridges. It seems my observation is not relevant, sorry.

Revision history for this message
Inessa Vasilevskaya (ivasilevskaya) wrote :

@armando-migliaccio, I analysed Kevin's list of test failures, and all of them run the openvswitch agent with ovsdb_interface = vsctl.

See for yourself (not all of the test logs are still available, but still): http://paste.openstack.org/show/542548/

Changed in neutron:
assignee: Terry Wilson (otherwiseguy) → nobody
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Kevin Benton (<email address hidden>) on branch: master
Review: https://review.openstack.org/344545
Reason: abandoning for now assuming Terry's fix addressed it. Will watch for gate failures to confirm.

Revision history for this message
Brian Haley (brian-haley) wrote :

I've seen this in the multinode DVR jobs lately (VM not getting a DHCP response). Take a look at the logs from the initial report; it's a similar issue:

http://logs.openstack.org/92/320092/43/check/gate-tempest-dsvm-neutron-dvr/cf32111/logs/screen-q-dhcp.txt.gz?#_2016-07-05_22_39_17_220

RuntimeError: Exit code: 2; Stdin: ; Stdout: ; Stderr: RTNETLINK answers: Network is unreachable

This is another occurrence I saw just yesterday:

http://logs.openstack.org/51/337851/19/check/gate-tempest-dsvm-neutron-dvr-multinode-full/c944b3d/logs/screen-q-dhcp.txt.gz

So maybe there's a case where the IP isn't getting added to the DHCP namespace properly, so trying to add a route via it fails? I'm assuming the interface itself is already there.
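
A quick way to check that hypothesis on an affected node is sketched below (hedged; the namespace name and gateway are placeholders, not values from these logs). If no address covering the gateway's subnet is configured inside the qdhcp namespace yet, the route add fails with exactly this "RTNETLINK answers: Network is unreachable" error, because the gateway is not on-link:

# Hedged diagnostic sketch; NETNS and GATEWAY are illustrative placeholders.
import subprocess

NETNS = 'qdhcp-00000000-0000-0000-0000-000000000000'
GATEWAY = '10.100.0.1'

def netns_exec(*cmd):
    return subprocess.check_output(('ip', 'netns', 'exec', NETNS) + cmd).decode()

# 1) Which IPv4 addresses are actually configured inside the namespace?
print(netns_exec('ip', '-o', '-4', 'addr', 'show'))

# 2) The kind of route add the agent attempts; it can only succeed once
#    step 1 shows an on-link address covering GATEWAY.
rc = subprocess.call(['ip', 'netns', 'exec', NETNS,
                      'ip', 'route', 'replace', 'default', 'via', GATEWAY])
print('route add exit code: %d' % rc)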

Miguel Lavalle (minsel)
Changed in neutron:
assignee: nobody → Miguel Lavalle (minsel)
Revision history for this message
Miguel Lavalle (minsel) wrote :

I can still see 28 occurrences of this bug over the past 7 days. Using the following query:

message:"AssertionError: False is not true : Timed out waiting for" AND message:"in check_vm_connectivity" AND build_queue:"check" AND project:"openstack/neutron"

I traced this failure: http://paste.openstack.org/show/567665/

The network id is aef3be78-6d90-46cf-b1ea-b27e947eebc3

I couldn't find errors related to this network in the DHCP agent screen log:

http://logs.openstack.org/91/359591/2/check/gate-tempest-dsvm-neutron-dvr-multinode-full-ubuntu-xenial/0e1f116/logs/screen-q-dhcp.txt.gz#_2016-09-07_10_38_07_190

Will keep digging into it.

Revision history for this message
Miguel Lavalle (minsel) wrote :

We have a condition where the data plane is not ready when Neutron's control plane and the instance deem the port ready. As a consequence, the instance's DHCPDISCOVER messages are never received by the corresponding dnsmasq instance, because the data plane is not ready yet. However, at the end, dnsmasq receives a DHCPRELEASE for an unknown lease from the port's instance, when presumably the data plane is ready.

Out of 35 occurrences over the past 7 days, 34 are with gate-tempest-dsvm-neutron-dvr-multinode-full-ubuntu-xenial. The other one is with gate-grenade-dsvm-neutron-dvr-multinode.

This is an example that I traced all the way (patchset 367191):

1) Test case that failed:

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops

2) Tenant network tempest-network-smoke--949101106, id 51b4c899-f196-4561-839b-773303f73fd0 is created:
http://logs.openstack.org/91/367191/2/check/gate-tempest-dsvm-neutron-dvr-multinode-full-ubuntu-xenial/7fd0e33/logs/tempest.txt.gz#_2016-09-08_14_14_10_998

3) DHCP agent creates and reloads allocations including instance's port. There are no DHCP agent failures related to this network in the log file:
http://logs.openstack.org/91/367191/2/check/gate-tempest-dsvm-neutron-dvr-multinode-full-ubuntu-xenial/7fd0e33/logs/screen-q-dhcp.txt.gz#_2016-09-08_14_14_25_297

4) VM gets address:
http://logs.openstack.org/91/367191/2/check/gate-tempest-dsvm-neutron-dvr-multinode-full-ubuntu-xenial/7fd0e33/logs/tempest.txt.gz#_2016-09-08_14_14_33_449

5) Test failure:
http://logs.openstack.org/91/367191/2/check/gate-tempest-dsvm-neutron-dvr-multinode-full-ubuntu-xenial/7fd0e33/logs/tempest.txt.gz#_2016-09-08_14_16_38_313

The instance doesn't get a response to its DHCPDISCOVER:

udhcpc (v1.20.1) started
Sending discover...
Sending discover...
Sending discover...

6) The dnsmasq instance never receives a DHCPDISCOVER from the instance. However, it receives a DHCPRELEASE for an unknown lease from the instance:

http://logs.openstack.org/91/367191/2/check/gate-tempest-dsvm-neutron-dvr-multinode-full-ubuntu-xenial/7fd0e33/logs/syslog.txt.gz#_Sep_08_14_16_41

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

@Miguel: As of today the port is marked active only when both L2 and DHCP are set up properly. Your comment baffles me a bit. Digging...

Revision history for this message
Kevin Benton (kevinbenton) wrote :

Interestingly there is a kernel iptables drop message that mentions the instance's MAC on the node with the dhcp server. Could be related. It's the only drop on br-ex in the whole log.

http://logs.openstack.org/91/367191/2/check/gate-tempest-dsvm-neutron-dvr-multinode-full-ubuntu-xenial/7fd0e33/logs/syslog.txt.gz#_Sep_08_14_15_35

Revision history for this message
Kevin Benton (kevinbenton) wrote :

Actually, br-ex shouldn't even see this broadcast. It should be encapsulated in a VXLAN tunnel to br-int, since this is a tenant network.

Revision history for this message
Kevin Benton (kevinbenton) wrote :

I believe this is because the VNI being used by infra is the same as the one allocated to the tenant network. So when the network node receives the packet, it demultiplexes it out of the VNI onto the incorrect bridge, which is why we are seeing the br-ex drops. A patch to infra will confirm.
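
If that theory is right, the collision should be visible simply by comparing the VNI the CI's overlay uses with Neutron's tenant VNI ranges. A hedged sketch follows (the config path and the infra VNI value are assumptions for illustration; vni_ranges is the standard ml2_type_vxlan option):

# Hedged check: does a given infra overlay VNI fall inside Neutron's vni_ranges?
import configparser

ML2_CONF = '/etc/neutron/plugins/ml2/ml2_conf.ini'  # typical devstack location
INFRA_VNI = 1002                                    # hypothetical infra overlay VNI

cfg = configparser.ConfigParser()
cfg.read(ML2_CONF)
ranges = cfg.get('ml2_type_vxlan', 'vni_ranges', fallback='')

for chunk in filter(None, (r.strip() for r in ranges.split(','))):
    lo, hi = (int(x) for x in chunk.split(':'))
    if lo <= INFRA_VNI <= hi:
        print('Collision: infra VNI %d sits inside tenant range %s'
              % (INFRA_VNI, chunk))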

Changed in neutron:
assignee: Miguel Lavalle (minsel) → Kevin Benton (kevinbenton)
Revision history for this message
Miguel Lavalle (minsel) wrote :

The fix (https://review.openstack.org/#/c/367716/) merged 2 days ago. Using this query, I don't see any occurrences since it merged:

message:"AssertionError: False is not true : Timed out waiting for" AND message:"in check_vm_connectivity" AND project:"openstack/neutron"

Will keep an eye on it for a few more days

Miguel Lavalle (minsel)
Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
Miguel Lavalle (minsel) wrote :

Nope... occurrences went away for almost a month and have been showing up again since 10/3. Will take a look.

Changed in neutron:
status: Fix Released → In Progress
Revision history for this message
Kevin Benton (kevinbenton) wrote :

I looked into all of the failures and they are either caused by the patch they appeared on (which broke lots of other things as well), or they occurred on the linux bridge job. There may be a separate bug for linux bridge, but it's definitely not the same issue as this one. We can't keep re-opening this bug every time there is a connectivity issue.

Changed in neutron:
status: In Progress → Fix Released