allowed-address-pairs broken with l2pop/arp responder and LinuxBridge/VXLAN
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | Undecided | Mark McClain |
Bug Description
Problem:
In Icehouse/Juno, when using ML2/LinuxBridge and VXLAN networks, allowed-
Steps to reproduce:
1. Create two instances in the same VXLAN network on two different hosts
2. Add a secondary IP address to instance #1, and add it to the port using --allowed-
3. Ping from instance #1 to instance #2 using the secondary IP address
4. On the compute node hosting instance #2, observe that the ARP request can be seen on the vxlan interface, but not the parent interface
Steps to resolve:
1. Add static ARP entry to instance #2
2. -OR- Add static ARP entry/neighbor entry to compute node hosting instance #2
The resolutions above become problematic when the allowed addresses are networks rather than single IPs, as in the cases where instances are acting as routers or NFV devices of some kind.
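To make the scaling concern concrete, here is a minimal Python sketch that lists the static entries workaround 2 would need on each compute node once the allowed address is a network. The function name `static_arp_commands` and the example range 10.10.10.0/28 are hypothetical; the command format follows the `arp -s ... -i vxlan-2` form used in the workaround further down in this report.

```python
import ipaddress


def static_arp_commands(cidr, mac, device="vxlan-2"):
    """Return one `arp -s` command per usable host address in `cidr`.

    `cidr` stands in for an allowed-address-pairs network (hypothetical
    example value), `mac` is the MAC of the port that owns the pair.
    """
    network = ipaddress.ip_network(cidr, strict=False)
    return [f"arp -s {host} {mac} -i {device}" for host in network.hosts()]


if __name__ == "__main__":
    cmds = static_arp_commands("10.10.10.0/28", "fa:16:3e:bf:b0:a1")
    print(len(cmds), "static entries needed on every compute node")
    print(cmds[0])
```

Even a small range multiplies quickly: every usable address needs its own entry, on every compute node that hosts a peer instance.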
-------------------
Example:
Create network:
neutron net-create testnet
neutron subnet-create testnet 192.168.100.0/24
Create ports, one for each instance:
neutron port-create 56c413ca-
neutron port-create 56c413ca-
Add security group and allowed-
neutron port-update 6d6796cd-
neutron port-update 0715121b-
Boot instances:
nova boot --flavor 2 --image 0af87835-
nova boot --flavor 2 --image 0af87835-
Observe that the proper iptables rules are in place on the compute nodes:
root@Compute001:~# iptables-save | grep 6d6796cd
-A neutron-
-A neutron-
-A neutron-
root@Compute002:~# iptables-save | grep 0715121b
-A neutron-
-A neutron-
-A neutron-
Verify that ARP entries exist on the compute nodes (instances can ping each other at fixed IP as expected):
root@Compute001:~# arp -an | grep 192.168.100
? (192.168.100.4) at fa:16:3e:4d:73:7b [ether] PERM on vxlan-2
? (192.168.100.6) at fa:16:3e:1c:9d:55 [ether] PERM on vxlan-2
? (192.168.100.2) at fa:16:3e:d4:53:75 [ether] PERM on vxlan-2
? (192.168.100.3) at fa:16:3e:a6:a4:03 [ether] PERM on vxlan-2
root@Compute002:~# arp -an | grep 192.168.100
? (192.168.100.3) at fa:16:3e:a6:a4:03 [ether] PERM on vxlan-2
? (192.168.100.4) at fa:16:3e:4d:73:7b [ether] PERM on vxlan-2
? (192.168.100.2) at fa:16:3e:d4:53:75 [ether] PERM on vxlan-2
? (192.168.100.5) at fa:16:3e:bf:b0:a1 [ether] PERM on vxlan-2
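The tables above contain permanent entries for fixed IPs only. As a rough illustration, the Python sketch below parses the Compute002 output and checks which addresses have an entry. The helper name `parse_arp_table` and the regular expression are assumptions written for this output format, not part of any neutron tooling.

```python
import re

# Matches lines such as:
# ? (192.168.100.3) at fa:16:3e:a6:a4:03 [ether] PERM on vxlan-2
ARP_LINE = re.compile(
    r"\((?P<ip>[\d.]+)\) at (?P<mac>[0-9a-f:]{17}) \[ether\]"
    r"(?P<perm> PERM)? on (?P<dev>\S+)"
)


def parse_arp_table(text):
    """Map IP -> (MAC, device, is_permanent) from `arp -an` output."""
    table = {}
    for line in text.splitlines():
        match = ARP_LINE.search(line)
        if match:
            table[match["ip"]] = (
                match["mac"],
                match["dev"],
                match["perm"] is not None,
            )
    return table


# Output copied from Compute002 above.
COMPUTE002 = """\
? (192.168.100.3) at fa:16:3e:a6:a4:03 [ether] PERM on vxlan-2
? (192.168.100.4) at fa:16:3e:4d:73:7b [ether] PERM on vxlan-2
? (192.168.100.2) at fa:16:3e:d4:53:75 [ether] PERM on vxlan-2
? (192.168.100.5) at fa:16:3e:bf:b0:a1 [ether] PERM on vxlan-2
"""

if __name__ == "__main__":
    table = parse_arp_table(COMPUTE002)
    for ip in ("192.168.100.5", "192.168.100.254"):
        print(ip, "->", table.get(ip, "no entry"))
```

The fixed IP of instance #1 (192.168.100.5) resolves, while the secondary address 192.168.100.254 used in the test has no entry on the remote compute node.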
!!!!! TEST !!!!!
Test: Configure 192.168.100.254 as a secondary address on INSTANCE#1 and ping out to INSTANCE#2
root@20150331-
root@20150331-
PING 192.168.100.6 (192.168.100.6) from 192.168.100.254 : 56(84) bytes of data.
^C
--- 192.168.100.6 ping statistics ---
26 packets transmitted, 0 received, 100% packet loss, time 25200ms
Result: Failure to reach destination
!!!!! TROUBLESHOOT !!!!!
Process:
1. Start ping:
root@20150331-
PING 192.168.100.6 (192.168.100.6) from 192.168.100.254 : 56(84) bytes of data.
2. Dump on vxlan interface on local compute node:
root@Compute001:~# tcpdump -i vxlan-2 -ne
tcpdump: WARNING: vxlan-2: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vxlan-2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:22:06.595700 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 28, length 64
14:22:07.603721 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 29, length 64
14:22:08.611701 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 30, length 64
14:22:09.619712 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 31, length 64
3. Dump on parent interface of local compute node:
root@Compute001:~# tcpdump -i bond1.206 -ne
tcpdump: WARNING: bond1.206: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond1.206, link-type EN10MB (Ethernet), capture size 65535 bytes
14:31:15.655396 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 4, length 64
14:31:16.663468 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 5, length 64
14:31:17.671412 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 6, length 64
14:31:18.679443 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 7, length 64
14:31:19.687445 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 8, length 64
^C
NOTE: ICMP requests are being sent to 192.168.100.6 from 192.168.100.254 with no response.
4. Dump on parent interface on remote compute node:
root@Compute002:~# tcpdump -i bond1.206 -ne
tcpdump: WARNING: bond1.206: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond1.206, link-type EN10MB (Ethernet), capture size 65535 bytes
14:27:12.889311 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 333, length 64
14:27:13.889318 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 334, length 64
14:27:14.889392 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 335, length 64
14:27:15.889315 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 336, length 64
14:27:16.889357 90:e2:ba:73:71:cd > 90:e2:ba:71:3b:1d, ethertype IPv4 (0x0800), length 148: 172.28.240.20.37449 > 172.28.240.21.8472: OTV, flags [I] (0x08), overlay 0, instance 2
fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 337, length 64
5. Dump on bridge interface on remote compute node:
root@Compute002:~# tcpdump -i brq56c413ca-6e -ne
tcpdump: WARNING: brq56c413ca-6e: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on brq56c413ca-6e, link-type EN10MB (Ethernet), capture size 65535 bytes
14:34:00.950062 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:34:00.969137 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 168, length 64
14:34:01.977167 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 169, length 64
14:34:01.977443 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:34:02.974092 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:34:02.985166 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 170, length 64
14:34:03.974131 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:34:03.993172 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 171, length 64
14:34:05.001197 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 172, length 64
14:34:05.001449 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:34:05.998187 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:34:06.009204 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1527, seq 173, length 64
6. Dump on vxlan interface on remote compute node:
root@Compute002:~# tcpdump -i vxlan-2 -ne
tcpdump: WARNING: vxlan-2: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vxlan-2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:23:04.052320 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 85, length 64
14:23:04.052704 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:23:05.049944 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:23:05.060333 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 86, length 64
14:23:06.049961 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:23:06.068312 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 87, length 64
14:23:07.076355 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 88, length 64
14:23:07.076655 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:23:08.074033 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:23:08.084299 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 89, length 64
NOTE: The remote instance is sending ARP requests for the source address (192.168.100.254) but is getting no response. The requests are visible on vxlan-2 but never appear on its parent interface, bond1.206, so they appear to be dropped at the VXLAN device.
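Captures like the ones above can be checked mechanically. The following is a minimal Python sketch, assuming `tcpdump -ne` text output in the format shown; the function name `unanswered_arp_targets` is hypothetical. It counts who-has requests per target and discards any target for which a reply was captured.

```python
import re

WHO_HAS = re.compile(r"Request who-has (?P<target>[\d.]+) tell (?P<sender>[\d.]+)")
REPLY = re.compile(r"Reply (?P<ip>[\d.]+) is-at")


def unanswered_arp_targets(capture):
    """Return {target_ip: request_count} for requests with no captured reply."""
    requested = {}
    answered = set()
    for line in capture.splitlines():
        match = WHO_HAS.search(line)
        if match:
            target = match["target"]
            requested[target] = requested.get(target, 0) + 1
            continue
        match = REPLY.search(line)
        if match:
            answered.add(match["ip"])
    return {ip: count for ip, count in requested.items() if ip not in answered}


# Three lines copied from the vxlan-2 capture on Compute002.
SAMPLE = """\
14:23:04.052320 fa:16:3e:bf:b0:a1 > fa:16:3e:1c:9d:55, ethertype IPv4 (0x0800), length 98: 192.168.100.254 > 192.168.100.6: ICMP echo request, id 1521, seq 85, length 64
14:23:04.052704 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
14:23:05.049944 fa:16:3e:1c:9d:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.100.254 tell 192.168.100.6, length 28
"""

if __name__ == "__main__":
    print(unanswered_arp_targets(SAMPLE))
```

Running the same check against the capture from the parent interface would report nothing at all, because no ARP frames were captured there.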
!!!!! GETTING IT TO WORK !!!!!
1a. Add a static ARP entry on instance #2
arp -s 192.168.100.254 fa:16:3e:bf:b0:a1
Result: Success!
root@20150331-
PING 192.168.100.6 (192.168.100.6) from 192.168.100.254 : 56(84) bytes of data.
64 bytes from 192.168.100.6: icmp_seq=455 ttl=64 time=2014 ms
64 bytes from 192.168.100.6: icmp_seq=456 ttl=64 time=1014 ms
64 bytes from 192.168.100.6: icmp_seq=457 ttl=64 time=14.9 ms
64 bytes from 192.168.100.6: icmp_seq=458 ttl=64 time=0.939 ms
1b. -OR- Add a static ARP entry on Compute002 (the compute node hosting instance #2)
arp -s 192.168.100.254 fa:16:3e:bf:b0:a1 -i vxlan-2
Result: Success!
root@20150331-
PING 192.168.100.6 (192.168.100.6) from 192.168.100.254 : 56(84) bytes of data.
64 bytes from 192.168.100.6: icmp_seq=543 ttl=64 time=1.17 ms
64 bytes from 192.168.100.6: icmp_seq=544 ttl=64 time=0.812 ms
64 bytes from 192.168.100.6: icmp_seq=545 ttl=64 time=0.819 ms
64 bytes from 192.168.100.6: icmp_seq=546 ttl=64 time=0.810 ms
64 bytes from 192.168.100.6: icmp_seq=547 ttl=64 time=0.794 ms
64 bytes from 192.168.100.6: icmp_seq=548 ttl=64 time=0.820 ms
tags: added: l2pop; removed: l2population
tags: added: l2-pop; removed: l2pop
Changed in neutron:
status: New → Confirmed
Changed in neutron:
assignee: nobody → yalei wang (yalei-wang)
tags: added: liberty-backport-potential
tags: removed: liberty-backport-potential
Yeah, broadcasts are working okay: ping -b 255.255.255.255 is seen everywhere. But the VXLAN devices are intercepting all ARP requests and, unfortunately, they don't pass them on when they don't know the answer. I can't see how to change this behavior from the ip-link man page. The only solution I can see would be to provide a new option that lets you disable proxy ARP when l2_population is enabled.
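The behavior described in this comment can be pictured with a toy model. The Python sketch below is not kernel or neutron code; `handle_arp_request` and its `proxy` flag are hypothetical names that only restate the comment: with proxying on, the device answers from its own neighbor table and drops requests it cannot answer, and with proxying off it would flood the request to the remote hosts.

```python
def handle_arp_request(target_ip, neighbors, proxy=True):
    """Toy model of what the VXLAN device does with an ARP request.

    neighbors: dict of IP -> MAC, standing in for the permanent entries
    seen in `arp -an` on the vxlan-2 device.
    proxy: hypothetical switch for the proxy ARP behavior.
    """
    if not proxy:
        return "flood to remote hosts"
    mac = neighbors.get(target_ip)
    if mac is None:
        # Unknown target: the request is neither answered nor forwarded.
        return "drop"
    return f"reply locally with {mac}"


if __name__ == "__main__":
    # Entry for the fixed IP of instance #1, taken from Compute002 above.
    neighbors = {"192.168.100.5": "fa:16:3e:bf:b0:a1"}
    print(handle_arp_request("192.168.100.5", neighbors))
    print(handle_arp_request("192.168.100.254", neighbors))
    print(handle_arp_request("192.168.100.254", neighbors, proxy=False))
```

In this model, both workarounds from the description amount to making sure the lookup for 192.168.100.254 never misses, while the option proposed here corresponds to the `proxy=False` branch.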