ebtables ARP rules don't account for floating IPs on LinuxBridge

Bug #1483315 reported by Evan Callicoat
This bug affects 4 people
Affects: neutron
Status: Fix Released
Importance: Undecided
Assigned to: Kevin Benton

Bug Description

The new ebtables ARP filtering rules don't account for floating IPs. This blocks ARP replies from the qrouter network namespace where the floating IP lives, effectively blocking traffic to the floating IP and thus to the instance. Looking at the ebtables code, rules are currently only added for ports with port security enabled (port_filter:True), and only for IPs in the fixed_ips list and IPs in the allowed-address pairs list for a given port. Floating IPs do not have port security enabled, are not in fixed_ips, and are not automatically inserted into the router gateway port's allowed-address pairs (AAPs).
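
The selection logic described above can be sketched roughly as follows. This is only an illustration of the three conditions named in the description, not neutron's actual code; the helper name allowed_arp_source_ips and the port dictionary are assumptions made up for the example, using the addresses from this report.

```python
def allowed_arp_source_ips(port):
    """Return the source IPs that would get an ARP ACCEPT rule for a port.

    Hypothetical helper, not neutron's code: it only mirrors the
    conditions listed in the bug description.
    """
    # Without port_filter no ARP chain is built for the port at all.
    if not port.get("binding:vif_details", {}).get("port_filter"):
        return None
    ips = [f["ip_address"] for f in port.get("fixed_ips", [])]
    ips += [p["ip_address"] for p in port.get("allowed_address_pairs", [])]
    return ips


# Router gateway port as described in this report (illustrative dict).
gateway_port = {
    "binding:vif_details": {"port_filter": True},
    "fixed_ips": [{"ip_address": "172.29.248.2"}],
    "allowed_address_pairs": [],
}
floating_ip = "172.29.248.8"

allowed = allowed_arp_source_ips(gateway_port)
print(allowed)                 # ['172.29.248.2']
print(floating_ip in allowed)  # False: ARP replies for the float hit DROP
```

Because the floating IP appears in neither list, nothing in this path ever produces an ACCEPT rule for it.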

This is example "ebtables -L --Lc" output for the filter table in the root namespace of the host where the router lives:
http://paste.openstack.org/show/412384/

192.168.74.0/24 is the private instance network
172.29.248.0/22 is the public network

192.168.74.1 is the router inside IP
192.168.74.2 is the DHCP server IP
192.168.74.3 is the instance IP

172.29.248.2 is the router gateway/outside IP
172.29.248.3 is the DHCP server IP (I forgot to disable DHCP on the public network)
172.29.248.8 is the floating IP

As you can see, the floating IP is not in the rules, which results in ARP replies from the qrouter namespace being dropped.

Adding the exception to ebtables results in working traffic, like this (line 18):
http://paste.openstack.org/show/412386/

For reference, here's ebtables from the compute node along with the instance information:
http://paste.openstack.org/show/412387/

Evan Callicoat (diopter) wrote :

Assuming there are no issues I'm unaware of in doing so, the easiest fix seems to be enabling port security for floating IP ports, so they get rules added. Barring that, I imagine the ebtables code needs another case that explicitly adds floating IPs.

Matt Kassawara (ionosphere80) wrote :

I confirm this issue.

Matt Kassawara (ionosphere80) wrote :

I suspect this also impacts the Open vSwitch agent.

Sean M. Collins (scollins) wrote :

Matt and Evan, this was with the Linux Bridge agent, correct?

tags: added: linuxbridge
Evan Callicoat (diopter) wrote :

Yes, but given that it's an issue which occurs in the ebtables rules, and the ebtables manager code is the same for OVS as LB, I suspect it affects OVS as well.

yalei wang (yalei-wang)
Changed in neutron:
assignee: nobody → yalei wang (yalei-wang)
yalei wang (yalei-wang) wrote :

I just tried to reproduce it with the OVS agent but failed. Is there some other special configuration? I am using an all-in-one environment.

yalei wang (yalei-wang) wrote :

Hi Callicoat, could you provide more information on how to reproduce this? I am using Linux bridge but don't see the issue.

Evan Callicoat (diopter) wrote :

Hi @yalei-wang,

All we did was set up two networks, an external (named "public") and an internal (named "private"), a single router (named "router"), hook both networks to that router, create an instance on the internal network, create a floating IP on the external network, and associate the floating IP with the instance.

From this point, you should be able to run "ebtables -L --Lc" to observe the lack of an ebtables rule for the floating IP on the host which contains the qdhcp/qrouter namespaces (not in the namespaces, just on that host). The actual issue is that when attempting to talk to that floating IP externally (*not* from inside the qrouter namespace where it lives), ARP replies are filtered due to the lack of an ACCEPT rule in ebtables, which results in not being able to communicate through the floating IP.
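
One way to make that check repeatable is to scan the "ebtables -L --Lc" listing for the source IPs that have an ACCEPT rule. The sketch below is only an illustration: the function names are made up, and the sample listing is assembled from chain and rule lines quoted elsewhere in this thread rather than captured from a live host.

```python
import re

# Patterns follow the "ebtables -L --Lc" output format shown in this thread.
CHAIN_RE = re.compile(r"^Bridge chain: (\S+), entries: \d+, policy: (\w+)")
RULE_RE = re.compile(r"--arp-ip-src (\S+) -j ACCEPT")


def arp_accept_ips(listing):
    """Map each chain name to the set of source IPs it ACCEPTs for ARP."""
    chains = {}
    current = None
    for line in listing.splitlines():
        header = CHAIN_RE.match(line.strip())
        if header:
            current = header.group(1)
            chains[current] = set()
            continue
        rule = RULE_RE.search(line)
        if rule and current is not None:
            chains[current].add(rule.group(1))
    return chains


def has_accept_rule(chains, ip):
    """True if any chain carries an ARP ACCEPT rule for this source IP."""
    return any(ip in ips for ips in chains.values())


# Illustrative listing; on a real host, feed in the command's output instead.
SAMPLE = """
Bridge chain: INPUT, entries: 0, policy: ACCEPT

Bridge chain: neutronARP-tap1e686ab0-ab, entries: 1, policy: DROP
-p ARP --arp-ip-src 172.29.248.2 -j ACCEPT , pcnt = 14 -- bcnt = 392
"""

chains = arp_accept_ips(SAMPLE)
print(has_accept_rule(chains, "172.29.248.2"))  # True: router gateway IP
print(has_accept_rule(chains, "172.29.248.8"))  # False: floating IP missing
```

A floating IP that comes back False here, on a chain whose policy is DROP, matches the symptom described above.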

yalei wang (yalei-wang) wrote :

Thanks for the details, Evan. I deployed on an all-in-one node using Linux bridge, booted a VM with fixed IP 192.168.100.2, and assigned floating IP 172.24.4.3 to it.

I can ping or arping the floating IP from the host.

stack@stack-FFF2:~/devstack/devstack$ nova list
+--------------------------------------+------+--------+------------+-------------+------------------------------------+
| ID                                   | Name | Status | Task State | Power State | Networks                           |
+--------------------------------------+------+--------+------------+-------------+------------------------------------+
| 950fd8f3-0245-40c3-9f09-c6380c2f6bc1 | tmp1 | ACTIVE | -          | Running     | ext_net1=192.168.100.2, 172.24.4.3 |
+--------------------------------------+------+--------+------------+-------------+------------------------------------+

stack@stack-FFF2:~/devstack/devstack$ neutron floatingip-list
+--------------------------------------+------------------+---------------------+--------------------------------------+
| id                                   | fixed_ip_address | floating_ip_address | port_id                              |
+--------------------------------------+------------------+---------------------+--------------------------------------+
| 1bab4413-6b59-4683-bc62-4f49e7747fd2 | 192.168.100.2    | 172.24.4.3          | 552a0fec-72c3-410d-9d23-3f1d550f0b70 |
+--------------------------------------+------------------+---------------------+--------------------------------------+

^C
--- 172.24.4.3 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.619/131.819/290.937/120.155 ms
stack@stack-FFF2:~/devstack/devstack$ sudo ebtables -L --Lc
Bridge table: filter

Bridge chain: INPUT, entries: 0, policy: ACCEPT

Bridge chain: FORWARD, entries: 1, policy: ACCEPT
-p ARP -i tap552a0fec-72 -j neutronARP-tap552a0fec-72, pcnt = 23 -- bcnt = 644

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

Bridge chain: neutronARP-tap552a0fec-72, entries: 1, policy: DROP
-p ARP --arp-ip-src 192.168.100.2 -j ACCEPT , pcnt = 23 -- bcnt = 644

stack@stack-FFF2:~/devstack/devstack$ ping 172.24.4.3
PING 172.24.4.3 (172.24.4.3) 56(84) bytes of data.
64 bytes from 172.24.4.3: icmp_seq=1 ttl=63 time=290 ms
64 bytes from 172.24.4.3: icmp_seq=2 ttl=63 time=103 ms
64 bytes from 172.24.4.3: icmp_seq=3 ttl=63 time=0.619 ms

stack@stack-FFF2:~/devstack/devstack$ arping -I brq7839d782-9f 172.24.4.3
ARPING 172.24.4.3 from 172.24.4.1 brq7839d782-9f
Unicast reply from 172.24.4.3 [FA:16:3E:E2:AE:DD] 0.630ms
Unicast reply from 172.24.4.3 [FA:16:3E:E2:AE:DD] 0.559ms
^CSent 2 probes (1 broadcast(s))
Received 2 response(s)

stack@stack-FFF2:~/devstack/devstack$ route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.140.1 0.0.0.0 UG 0 0 0 eth0
10.0.0.0 172.24.4.2 255.255.255.0 UG 0 0 0 brq7839d782-9f
172.24.4.0 0.0.0.0 255.255.255.0 U 0 0 0 brq7839d782-9f
192.168.122.0 0.0.0.0 255.255.255.0 U 0 0 0 virbr0...


yalei wang (yalei-wang) wrote :

I got it. Thanks.

I think simply applying port security to the internal port would align with the semantics. I will upload a patch.

yalei wang (yalei-wang) wrote :

Hi Evan, I am probably doing something wrong. In my environment I don't find a line like this one from yours:

Bridge chain: neutronARP-tap1e686ab0-ab, entries: 1, policy: DROP
-p ARP --arp-ip-src 172.29.248.2 -j ACCEPT , pcnt = 14 -- bcnt = 392

This line appears to be installed for the router gateway port, but it should not be there, because port security on that port should be false. I don't understand why it is present.

Evan Callicoat (diopter) wrote :

port-security (binding:vif-details = {"port_filter": True}) was definitely enabled on the router gateway port. I don't know why or how, but that is the behavior we observed.

yalei wang (yalei-wang) wrote :

Hi Evan, in my environment the router gateway port looks like the output below.

binding:vif_details is {"port_filter": true}, but port security is false. I think the key to reproducing this is to create a gateway port like the one in your environment.

Do you use the master code?

stack@stack-FFF2:~/devstack/devstack$ neutron port-show 077321de-3935-4e4f-99c5-efa49de2f0a0
+-----------------------+------------------------------------------------------------------------------------+
| Field | Value |
+-----------------------+------------------------------------------------------------------------------------+
| admin_state_up | True |
| allowed_address_pairs | |
| binding:host_id | stack-FFF2 |
| binding:profile | {} |
| binding:vif_details | {"port_filter": true} |
| binding:vif_type | bridge |
| binding:vnic_type | normal |
| device_id | e3db7e8b-8e74-46cb-9a08-9d805469f7a8 |
| device_owner | network:router_gateway |
| extra_dhcp_opts | |
| fixed_ips | {"subnet_id": "e7a2bcfa-860b-458a-b7ae-9e3dab519e45", "ip_address": "172.24.4.4"} |
| | {"subnet_id": "5ac4ab7f-a6b9-40d4-bef6-6634763607f4", "ip_address": "2001:db8::5"} |
| id | 077321de-3935-4e4f-99c5-efa49de2f0a0 |
| mac_address | fa:16:3e:e2:ae:dd |
| name | |
| network_id | 7839d782-9fdf-42e2-b774-ddd4c32d7e54 |
| port_security_enabled | False |
| security_groups | |
| status | ACTIVE |
| tenant_id | |
+-----------------------+------------------------------------------------------------------------------------+

Matt Kassawara (ionosphere80) wrote :

Running master of neutron and python-neutronclient, I do not see a 'port_security_enabled' field in the output of 'neutron port-show' for either the router gateway port or the floating IP port (or in the equivalent curl commands). Does this field only pertain to the Open vSwitch agent? In any case, I see ebtables rules for the router gateway IP, but not for the floating IP. I cannot ping the floating IP unless I flush the ebtables rules.

Router gateway port:

# neutron port-show 54775162-cd04-4bee-9bcd-579576d1d2d0
+-----------------------+--------------------------------------------------------------------------------------+
| Field | Value |
+-----------------------+--------------------------------------------------------------------------------------+
| admin_state_up | True |
| allowed_address_pairs | |
| binding:host_id | aio1_neutron_agents_container-24e2f457 |
| binding:profile | {} |
| binding:vif_details | {"port_filter": true} |
| binding:vif_type | bridge |
| binding:vnic_type | normal |
| device_id | 8adef732-ab38-4454-ae8b-3308cc1d0738 |
| device_owner | network:router_gateway |
| extra_dhcp_opts | |
| fixed_ips | {"subnet_id": "40bcafbc-46a8-4aa9-a288-ccfd6b1925fd", "ip_address": "172.29.248.11"} |
| id | 54775162-cd04-4bee-9bcd-579576d1d2d0 |
| mac_address | fa:16:3e:b1:25:19 |
| name | |
| network_id | b5459514-0d1c-461b-ad5c-ea36446a1d40 |
| security_groups | |
| status | ACTIVE |
| tenant_id | |
+-----------------------+--------------------------------------------------------------------------------------+

Floating IP port:

# neutron port-show d586edba-c933-4afd-926a-48f47a989645
+-----------------------+--------------------------------------------------------------------------------------+
| Field ...


Matt Kassawara (ionosphere80) wrote :

I found the problem with my deployment. Apparently the 'port_security_enabled' field requires the ML2 'port_security' extension. After enabling it, neutron only applies ebtables rules to VM ports. Neither the router gateway port nor the floating IP ports on the node with the L3 agent have ebtables rules. Also, I can connect to the instance through its floating IP.

At least with the Linux bridge agent, enabling the 'prevent_arp_spoofing' option without the 'port_security' extension seems to yield an indeterminate state of operation. The patch for this bug should probably address this situation rather than the original observation. I am curious if/how this impacts the OVS agent.

Router gateway port:

# neutron port-show 95df1d7a-2449-42f3-b079-cd790cc784c3
+-----------------------+--------------------------------------------------------------------------------------+
| Field | Value |
+-----------------------+--------------------------------------------------------------------------------------+
| admin_state_up | True |
| allowed_address_pairs | |
| binding:host_id | aio1_neutron_agents_container-24e2f457 |
| binding:profile | {} |
| binding:vif_details | {"port_filter": true} |
| binding:vif_type | bridge |
| binding:vnic_type | normal |
| device_id | 262fb407-ca21-444d-91dd-70b70c28fbcc |
| device_owner | network:router_gateway |
| extra_dhcp_opts | |
| fixed_ips | {"subnet_id": "50a81a17-5d7d-4bd4-a8dc-a4f175fba521", "ip_address": "172.29.248.11"} |
| id | 95df1d7a-2449-42f3-b079-cd790cc784c3 |
| mac_address | fa:16:3e:42:14:cf |
| name | |
| network_id | 0b5c8016-6865-4bd1-8d19-045108df0717 |
| port_security_enabled | False |
| security_groups | |
| status | ACTIVE |
| tenant_id | ...


OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/215491

Changed in neutron:
assignee: yalei wang (yalei-wang) → Kevin Benton (kevinbenton)
status: New → In Progress
yalei wang (yalei-wang) wrote :

Thanks, Matt. Yes, following your steps I can reproduce the router gateway port ending up with port security True.

That is: run without the port security extension, then add the extension and restart the neutron service.

I will submit another patch for this one.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/216143

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/221983

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/221983
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=051ff13771026b015c893a19a89654bf2ca4d018
Submitter: Jenkins
Branch: master

commit 051ff13771026b015c893a19a89654bf2ca4d018
Author: Kevin Benton <email address hidden>
Date: Wed Sep 2 07:04:55 2015 -0700

    Don't setup ARP protection on LB for network ports

    Skip adding ARP spoofing protection on Linux bridge ports
    with a device_owner field starting with 'network:'. This is
    already the case for the other iptables-based spoofing
    protection and is necessary for floating IPs to function
    correctly on router gateway ports.

    Change-Id: If53733fb3060e5ab44bac5388f42bdc384bcdb93
    Closes-Bug: #1483315
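
The check described in this commit message can be pictured with a short sketch. It is not the merged patch itself: the function name needs_arp_protection and the constant are assumptions for illustration, and only the 'network:' prefix rule comes from the commit message. 'compute:nova' is used as an example owner for a VM port.

```python
# Prefix named in the commit message; the constant name is hypothetical.
NETWORK_OWNER_PREFIX = "network:"


def needs_arp_protection(port):
    """Return False for ports owned by network services (routers, DHCP)."""
    owner = port.get("device_owner") or ""
    return not owner.startswith(NETWORK_OWNER_PREFIX)


gateway_port = {"device_owner": "network:router_gateway"}
vm_port = {"device_owner": "compute:nova"}

print(needs_arp_protection(gateway_port))  # False: no ARP chain installed
print(needs_arp_protection(vm_port))       # True: VM ports stay protected
```

Skipping the gateway port entirely means ARP replies for any floating IP hosted on it are no longer subject to a per-IP allow list.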

Changed in neutron:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/pecan)

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/224334

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/224357

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/pecan)

Reviewed: https://review.openstack.org/224357
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=fdc3431ccd219accf6a795079d9b67b8656eed8e
Submitter: Jenkins
Branch: feature/pecan

commit fe236bdaadb949661a0bfb9b62ddbe432b4cf5f1
Author: Miguel Angel Ajo <email address hidden>
Date: Thu Sep 3 15:40:12 2015 +0200

    No network devices on network attached qos policies

    Network devices, like internal router legs, or dhcp ports
    should not be affected by bandwidth limiting rules.

    This patch disables application of network attached policies
    to network/neutron owned ports.

    Closes-bug: #1486039
    DocImpact

    Change-Id: I75d80227f1e6c4b3f5fa7762b8dc3b0c0f1abd46

commit db4a06f7caa20a4c7879b58b20e95b223ed8eeaf
Author: Ken'ichi Ohmichi <email address hidden>
Date: Wed Sep 16 10:04:32 2015 +0000

    Use tempest-lib's token_client

    Now tempest-lib provides token_client modules as library and the
    interface is stable. So neutron repogitory doesn't need to contain
    these modules.
    This patch makes neutron use tempest-lib's token_client and removes
    the own modules for the maintenance.

    Change-Id: Ieff7eb003f6e8257d83368dbc80e332aa66a156c

commit 78aed58edbe6eb8a71339c7add491fe9de9a0546
Author: Jakub Libosvar <email address hidden>
Date: Thu Aug 13 09:08:20 2015 +0000

    Fix establishing UDP connection

    Previously, in establish_connection() for UDP protocol data were sent
    but never read on peer socket. That lead to successful read on peer side
    if this connection was filtered. Having constant testing string masked
    this issue as we can't distinguish to which test of connectivity data
    belong.

    This patch makes unique data string per test_connectivity() and
    also makes establish_connection() to create an ASSURED entry in
    conntrack table. Finally, in last test after firewall filter was
    removed, connection is re-established in order to avoid troubles with
    terminated processes or TCP continuing sending packets which weren't
    successfully delivered.

    Closes-Bug: 1478847
    Change-Id: I2920d587d8df8d96dc1c752c28f48ba495f3cf0f

commit e6292fcdd6262434a7b713ad8802db6bc8a6d3dc
Author: YAMAMOTO Takashi <email address hidden>
Date: Wed Sep 16 13:20:51 2015 +0900

    ovsdb: Fix a few docstring

    Change-Id: I53e1e21655b28fe5da60e58aeeb7cbbd103ae014

commit c22949a4449d96a67caa616290cf76b67b182917
Author: fumihiko kakuma <email address hidden>
Date: Wed Sep 16 11:52:59 2015 +0900

    Remove requirements.txt for the ofagent mechanism driver

    It is no longer used.

    Related-Blueprint: core-vendor-decomposition
    https://blueprints.launchpad.net/neutron/+spec/core-vendor-decomposition

    Change-Id: Ib31fb3febf8968e50d86dd66e1e6e1ea2313f8ac

commit d1d4de19d85f961d388c91e70f31b3bafec418c5
Author: Kevin Benton <email address hidden>
Date: Thu Sep 3 20:25:57 2015 -0700

    Always return iterables in L3 get_candidates

    The caller of this function expects iterables.

    Closes-Bug: #1494996
    Change-Id: I3d103e63f4e127a77268502415c0ddb0d804b54a

commit 1ad6ac448067306...

tags: added: in-feature-pecan
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (feature/pecan)

Change abandoned by Doug Wiegley (<email address hidden>) on branch: feature/pecan
Review: https://review.openstack.org/224334

Thierry Carrez (ttx)
Changed in neutron:
milestone: none → liberty-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-rc1 → 7.0.0
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/216143
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
