Security group filters for all ports are refreshed on any DHCP port change

Bug #1653830 reported by Mike Dorman
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: Medium
Assigned to: Kevin Benton

Bug Description

Whenever any change is made to a DHCP agent port, a refresh of all security group filters for all ports on that network is triggered. This is unnecessary as all instance ports automatically get a blanket allow rule for DHCP port numbers. So changes to DHCP ports in no way require updates to any filters.

For networks with a large number of ports, this also generates significant load against neutron-server and the backend database.

Steps to reproduce:

- Network with some number of instance ports
- Add or remove a DHCP agent from that network (constitutes a change of DHCP ports)
- A refresh for all ports on that network is triggered

See: https://github.com/openstack/neutron/blob/master/neutron/db/securitygroups_rpc_base.py#L138-L140
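
The fan-out described above can be illustrated with a toy model. This is only a sketch of the reported behaviour, not Neutron's code: `FakeServer`, `add_port`, and `dhcp_port_changed` are hypothetical names invented for the illustration.

```python
from collections import defaultdict


class FakeServer:
    """Toy model of the reported behaviour; not Neutron's real classes."""

    def __init__(self):
        self.ports_by_network = defaultdict(list)
        self.refreshed = []  # port IDs whose filters were rebuilt

    def add_port(self, network_id, port_id):
        self.ports_by_network[network_id].append(port_id)

    def dhcp_port_changed(self, network_id):
        # Assumed model: any DHCP port change marks every port on the
        # same network for a security group filter refresh.
        ports = self.ports_by_network[network_id]
        self.refreshed.extend(ports)
        return len(ports)


def refresh_cost(port_count):
    """Number of port refreshes caused by one DHCP port change."""
    server = FakeServer()
    for i in range(port_count):
        server.add_port("net-1", f"port-{i}")
    return server.dhcp_port_changed("net-1")
```

In this model the cost of a single DHCP agent add or remove grows linearly with the number of ports on the network, which matches the load pattern reported here.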

We experience this issue in Liberty, and it's still present in master.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/416380

Changed in neutron:
assignee: nobody → Mike Dorman (mdorman-m)
status: New → In Progress
Revision history for this message
Akihiro Motoki (amotoki) wrote :

This can happen in case there are many ports on a compute node.

However, neutron currently allows users to change the IP address of DHCP ports, so we cannot skip the iptables changes as the proposed review does.

Actually there is no need to define iptables rules for DHCP ports per port.
DHCP ports are prepared per network, so we can use a common iptables rule per network or a common ipset per network.
My vote is to use an ipset per network, because then we don't need to reload iptables rules when the IP address of any DHCP port changes.
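
A rough sketch of the per-network ipset idea, to show why the iptables rule would stay stable. The set naming scheme (`NDHCPv4-` plus a hash) and the helper names are assumptions made for this example, not anything Neutron defines; the generated strings use standard `ipset` and iptables `set` match syntax.

```python
import hashlib
import ipaddress

IPSET_NAME_MAX = 31  # ipset rejects set names longer than this


def dhcp_set_name(network_id):
    """Hypothetical naming scheme: fixed prefix plus a short hash."""
    digest = hashlib.sha1(network_id.encode()).hexdigest()[:16]
    name = f"NDHCPv4-{digest}"
    assert len(name) <= IPSET_NAME_MAX
    return name


def ipset_commands(network_id, dhcp_ips):
    """Commands that populate the set with the network's DHCP server IPs."""
    name = dhcp_set_name(network_id)
    cmds = [f"ipset create -exist {name} hash:ip family inet"]
    for ip in sorted(dhcp_ips, key=ipaddress.ip_address):
        cmds.append(f"ipset add -exist {name} {ip}")
    return cmds


def iptables_rule(network_id):
    """The rule references the set by name, never the member addresses."""
    name = dhcp_set_name(network_id)
    return (f"-m set --match-set {name} src "
            "-p udp -m udp --sport 67 --dport 68 -j RETURN")
```

Because `iptables_rule` depends only on the network ID, changing a DHCP port's address would only touch the set membership and leave the iptables rules untouched.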

tags: added: sg-fw
Changed in neutron:
importance: Undecided → Medium
tags: added: loadimpact
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/432506

Changed in neutron:
assignee: Mike Dorman (mdorman-m) → Kevin Benton (kevinbenton)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Mike Dorman (<email address hidden>) on branch: master
Review: https://review.openstack.org/416380
Reason: In favor of: https://review.openstack.org/#/c/432506/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/432506
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=ae9d1160bdaedc23dd6a1a0a85c09b8e65422a13
Submitter: Jenkins
Branch: master

commit ae9d1160bdaedc23dd6a1a0a85c09b8e65422a13
Author: Kevin Benton <email address hidden>
Date: Thu Feb 9 00:27:20 2017 -0800

    Stop making IP-specific provider rules in SG code

    Setting up rules to allow DHCPv6, DHCP, and RAs from specific
    IP addresses based on Neutron resources has a few issues:

    1. It violates separation of concerns. We are implementing logic to
       calculate where an IPv6 RA advertisement or DHCP advertisement
       should be coming from in the security group code. This code should
       not be trying to guess IPv6 LLAs, know about subnet modes, DHCP server
       implementations, or the type of L3 plugin being used. Currently all
       of these assumptions are baked into code that should only be
       filtering, which makes it very rigid and brittle when it comes to
       other implementations for DHCP and/or RAs.
    2. It has scaling issues on large networks. Every time one of these
       provider rules is updated, it triggers every L2 agent to refresh
       all of the security group rules for ports in that network, which puts
       significant load on the server.
    3. Its main purpose, preventing spoofing of RA[1,2] and DHCP packets,
       has long been superseded by preventing VMs from acting as DHCP/RA
       servers[3][4].

    This patch completely removes all of this logic and just returns
    static provider rules to the agents that allow all DHCP server
    and RA traffic ingress to the client. This addresses the issues
    highlighted above since the code is significantly simplified and
    the provider rules don't require refreshes on the agents.

    Now that the provider rules never change, the RPC notification
    listener on the agent-side for 'notify_provider_updated' is now
    just a NOOP that doesn't trigger any refreshes. The notification
    was left in place on the server side for older version agents
    that have stale IP-specific provider rules. The entire notification
    can be removed in the future.

    The one open concern with this approach is that VMs will now be
    able to receive DHCP offers from other DHCP servers on the same
    network that aren't being filtered (e.g. a VM with port security
    disabled or another device on a provider network). In order to
    address this for DHCP, this patch adds two rules that only allow
    DHCP offers targeted to either the broadcast or the correct client
    IP. This prevents incorrect offers from ever reaching the client.
    For RAs, this patch just allows all RAs so we may pick up
    advertisements from other v6 routers attached to a network;
    however, the instance won't actually be allowed to use bad addresses.

    1. https://bugs.launchpad.net/neutron/+bug/1262759
    2. I1d5c7aaa8e4cf057204eb746c0faab2c70409a94
    3. Ice1c9dd349864da28806c5053e38ef86f43b7771
    4. https://git.openstack.org/cgit/openstack/neutron/tree/
       neutron/agent/linux/iptables_firewall.py
       ?h=521b1074...

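
The two DHCP rules described in the commit message might look roughly like the following. This is a sketch under assumptions, not the code from the patch: the function name `dhcp_ingress_rules` and the exact rule strings are illustrative, written in ordinary iptables match syntax.

```python
import ipaddress

# DHCP server-to-client traffic: UDP from port 67 to port 68.
DHCP_OFFER = "-p udp -m udp --sport 67 --dport 68 -j RETURN"


def dhcp_ingress_rules(client_ips):
    """Build static ingress rules from the port's own IPv4 addresses.

    No DHCP server address is an input, so a DHCP port change cannot
    alter the result.
    """
    # Offers sent to the limited broadcast address.
    rules = [f"-d 255.255.255.255/32 {DHCP_OFFER}"]
    for ip in client_ips:
        addr = ipaddress.ip_address(ip)
        if addr.version == 4:
            # Offers sent directly to this client's address.
            rules.append(f"-d {addr}/32 {DHCP_OFFER}")
    return rules
```

Since the rules are computed only from the client port's own addresses, adding or removing a DHCP agent leaves them unchanged, which is why no refresh is needed on the agents.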

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/456745

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 11.0.0.0b1

This issue was fixed in the openstack/neutron 11.0.0.0b1 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ocata)

Reviewed: https://review.openstack.org/456745
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c5fa966e8fe8e770ee9a367428df4be698e3edc0
Submitter: Jenkins
Branch: stable/ocata

commit c5fa966e8fe8e770ee9a367428df4be698e3edc0
Author: Kevin Benton <email address hidden>
Date: Thu Feb 9 00:27:20 2017 -0800

    Stop making IP-specific provider rules in SG code


tags: added: in-stable-ocata
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 10.0.2

This issue was fixed in the openstack/neutron 10.0.2 release.
