Duplicate ping packets from DHCP namespace when pinging VMs across DVR-routed subnets

Bug #1358718 reported by Sarada
This bug affects 3 people
Affects: neutron
Status: Fix Released
Importance: Medium
Assigned to: Armando Migliaccio
Milestone: 2014.2

Bug Description

1. Have a multi-node devstack setup with 1 Controller, 1 Network Node (NN) and 2 Compute Nodes (CNs).
2. Create two networks, each with a subnet:
net1 - 10.1.10.0/24
net2 - 10.1.8.0/24

3. Create a distributed router (DVR) and add an interface for each subnet.
4. Spawn VM1 in net1 and host it on CN1.
5. Spawn VM2 in net2 and host it on CN2.
6. Log in to the NN and, from the net1 DHCP namespace, try to ping VM2, which is part of net2.
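
A rough command-line sketch of the steps above (network, subnet, router and VM names, the image, flavor and host names are illustrative, not taken from this report):

neutron net-create net1
neutron subnet-create --name subnet1 net1 10.1.10.0/24
neutron net-create net2
neutron subnet-create --name subnet2 net2 10.1.8.0/24

# Distributed router with one interface per subnet
neutron router-create --distributed True dvr-router
neutron router-interface-add dvr-router subnet1
neutron router-interface-add dvr-router subnet2

# Pin each VM to a specific compute node (admin-only availability-zone trick)
nova boot --flavor m1.tiny --image cirros --nic net-id=<net1-id> --availability-zone nova:CN1 VM1
nova boot --flavor m1.tiny --image cirros --nic net-id=<net2-id> --availability-zone nova:CN2 VM2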

As shown below, we can see duplicate ping replies (DUP!).

stack@NN:~/devstack$ sudo ip netns exec qdhcp-111de30b-cedf-492d-88b3-5a5fc2a92f4d ifconfig
lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:328 (328.0 B) TX bytes:328 (328.0 B)

tap68b11c40-f9 Link encap:Ethernet HWaddr fa:16:3e:87:67:20
          inet addr:10.1.10.3 Bcast:10.1.10.255 Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe87:6720/64 Scope:Link
          UP BROADCAST RUNNING MTU:1500 Metric:1
          RX packets:179 errors:0 dropped:0 overruns:0 frame:0
          TX packets:104 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:16358 (16.3 KB) TX bytes:10284 (10.2 KB)

stack@NN:~/devstack$ sudo ip netns exec qdhcp-111de30b-cedf-492d-88b3-5a5fc2a92f4d ping 10.1.8.2
PING 10.1.8.2 (10.1.8.2) 56(84) bytes of data.
64 bytes from 10.1.8.2: icmp_req=1 ttl=63 time=3.11 ms
64 bytes from 10.1.8.2: icmp_req=1 ttl=63 time=3.13 ms (DUP!)
64 bytes from 10.1.8.2: icmp_req=2 ttl=63 time=0.515 ms
64 bytes from 10.1.8.2: icmp_req=2 ttl=63 time=0.537 ms (DUP!)
64 bytes from 10.1.8.2: icmp_req=3 ttl=63 time=0.362 ms
64 bytes from 10.1.8.2: icmp_req=3 ttl=63 time=0.385 ms (DUP!)
64 bytes from 10.1.8.2: icmp_req=4 ttl=63 time=0.262 ms
64 bytes from 10.1.8.2: icmp_req=4 ttl=63 time=0.452 ms (DUP!)
^C
--- 10.1.8.2 ping statistics ---
4 packets transmitted, 4 received, +4 duplicates, 0% packet loss, time 2999ms
rtt min/avg/max/mdev = 0.262/1.094/3.132/1.174 ms
stack@qatst231:~/devstack$

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

On a side note, can you confirm that in the steps above you had first allowed ICMP in the VMs' security groups?

tags: added: l3-dvr-backlog
Revision history for this message
Sarada (sarada-a) wrote :

Yes. I used the default security group to spawn the VMs and added an ICMP rule to that security group.
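
For reference, such a rule is typically added with something like the following (illustrative; assumes the tenant's default security group):

neutron security-group-rule-create --protocol icmp --direction ingress default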

Changed in neutron:
assignee: nobody → Vivekanandan Narasimhan (vivekanandan-narasimhan)
Revision history for this message
Vivekanandan Narasimhan (vivekanandan-narasimhan) wrote :

With the current DVR architecture, sending traffic from a DHCP namespace (on one DVR-routed subnet) to a VM (on another subnet of the same DVR) results in duplicate responses to the requests.

The same happens in the reverse direction, when traffic is sent from a VM (on one DVR-routed subnet) to a DHCP server IP (on another DVR-routed subnet).

When the DHCP server initiates a ping request towards a VM on another subnet, it first transmits the packet to the default gateway port on the DVR. Since the DVR's default gateway port is replicated on all the compute nodes, the ping request reaches the DVR interfaces on all compute nodes, including the target compute node where the VM resides, because l2-pop replicates the ping request packet to all compute nodes via table 22.

All the other compute nodes (which do not host the target VM) then route the ping packet and retransmit it over the data network to the correct destination compute node. As a result, the target VM receives X ping requests (if there are X compute nodes) and responds to all of them, generating X ping replies.

So with X compute nodes, one ping request yields X ping replies. In the scenario Sarada tried, there were two compute nodes (excluding the NN that ran the DHCP server), so there were two ping replies: one for the original ping request that landed directly on the right compute node, and a second for the shadow ping request that was routed to the correct compute node by the other compute node.
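
For anyone who wants to observe this, a couple of illustrative diagnostic commands (they assume the stock OVS agent with l2-pop enabled; the tap device name is a placeholder, and table 22 is the flood table mentioned above):

# On the NN: dump the flood entries that replicate the request to all tunnels
sudo ovs-ofctl dump-flows br-tun table=22

# On CN2 (hosting VM2): watch the duplicated ICMP echo requests arrive at the VM's tap device
sudo tcpdump -n -i tap<vm2-port-id> icmp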

Since the DHCP server is not directly accessible by tenants, tenants explicitly generating traffic to DHCP servers on another subnet (and vice versa) is not a frequent scenario.

As a result, I request this to be triaged as a LOW bug.

Revision history for this message
Vivekanandan Narasimhan (vivekanandan-narasimhan) wrote :

This problem can occur for any port hosted on the NN that tries to reach an entity on another DVR-routed subnet.

The issue could be addressed by explicitly blocking traffic from DHCP servers to their respective default gateways on the NN, but that may cause problems when there is a genuine need to allow traffic between the DHCP server and other subnets.
Furthermore, DVR has no presence on NNs today (unless the NN is also a Service Node); if we need to add DVR rules there, then all the L2 agents in the cloud must be configured to be DVR-capable.
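
For illustration only, the kind of blocking rule referred to above could look like the following, run inside the DHCP namespace on the NN (the namespace ID is taken from the earlier output; the gateway address 10.1.10.1 is an assumption based on the usual .1 default, and this is not being proposed as the fix):

# Drop traffic from the DHCP namespace towards the subnet's default gateway on the DVR
sudo ip netns exec qdhcp-111de30b-cedf-492d-88b3-5a5fc2a92f4d iptables -A OUTPUT -d 10.1.10.1 -j DROP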

Changed in neutron:
importance: Undecided → Medium
Changed in neutron:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/117185

Changed in neutron:
assignee: Vivekanandan Narasimhan (vivekanandan-narasimhan) → Armando Migliaccio (armando-migliaccio)
Kyle Mestery (mestery)
Changed in neutron:
milestone: none → juno-rc1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/117185
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=bb1e89efbdbafed7275d01409bf3f55cbe187da5
Submitter: Jenkins
Branch: master

commit bb1e89efbdbafed7275d01409bf3f55cbe187da5
Author: Vivekanandan Narasimhan <email address hidden>
Date: Wed Aug 27 02:07:37 2014 -0700

    Fix DVR to service DHCP Ports

    This fix ensures that DHCP Ports that are
    available on DVR routed subnets, are serviced
    by DVR neutron infrastructure.

    Here servicing by DVR means, creation of
    DVR namespaces on such nodes holding DHCP
    Ports and also applying DVR specific OVS
    Rules to the br-int and br-tun bridges on
    such nodes, to enable traffic to be routed
    via DVR to such DHCP Ports.

    Closes-Bug: #1358718

    Change-Id: Ib6d5fbf883d6698f34f3a3b722e426e3285a5736
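
With this fix in place (and the L2 agent on the node hosting the DHCP port configured as DVR-capable), one would expect a DVR router namespace to appear next to the DHCP namespace on that node, and the ping in the original report to stop showing DUP! replies. An illustrative check (the qrouter name is a placeholder; the option shown is the OVS agent's enable_distributed_routing):

# OVS agent config on the DHCP-hosting node, [agent] section:
#   enable_distributed_routing = True
sudo ip netns                    # should now list qrouter-<router-id> alongside qdhcp-<net-id>
sudo ip netns exec qdhcp-111de30b-cedf-492d-88b3-5a5fc2a92f4d ping -c 4 10.1.8.2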

Changed in neutron:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: juno-rc1 → 2014.2