neutron should validate gateway_ip is in subnet

Bug #1304181 reported by Aaron Rosen on 2014-04-08
Affects                      Importance  Assigned to
OpenStack Security Advisory  Undecided   Unassigned
neutron                      Undecided   Assaf Muller
neutron (Havana)             Undecided   Unassigned
neutron (Icehouse)           Undecided   Unassigned

Bug Description

I don't believe this is actually a valid network configuration:

arosen@arosen-MacBookPro:~/devstack$ neutron subnet-show be0a602b-ea52-4b13-8003-207be20187da
+------------------+------------------------------------------------+
| Field | Value |
+------------------+------------------------------------------------+
| allocation_pools | {"start": "10.11.12.1", "end": "10.11.12.254"} |
| cidr | 10.11.12.0/24 |
| dns_nameservers | |
| enable_dhcp | True |
| gateway_ip | 10.0.0.1 |
| host_routes | |
| id | be0a602b-ea52-4b13-8003-207be20187da |
| ip_version | 4 |
| name | private-subnet |
| network_id | 53ec3eac-9404-41d4-a899-da4f32045abd |
| tenant_id | f2d9c1726aa940d3bd5a8ee529ea2480 |
+------------------+------------------------------------------------+
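The invalid relationship between gateway_ip and cidr shown above can be checked with Python's standard ipaddress module (a minimal sketch; gateway_in_subnet is a hypothetical helper, not neutron code):

```python
import ipaddress

def gateway_in_subnet(cidr: str, gateway_ip: str) -> bool:
    """Return True if the gateway address falls inside the subnet's CIDR."""
    return ipaddress.ip_address(gateway_ip) in ipaddress.ip_network(cidr)

# The configuration shown above: gateway 10.0.0.1 for subnet 10.11.12.0/24.
print(gateway_in_subnet("10.11.12.0/24", "10.0.0.1"))    # False: the reported invalid case
print(gateway_in_subnet("10.11.12.0/24", "10.11.12.1"))  # True: a valid gateway
```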

Kevin Benton (kevinbenton) wrote :

I think this is config controlled right now with cfg.CONF.force_gateway_on_subnet.

https://github.com/openstack/neutron/blob/master/neutron/db/db_base_plugin_v2.py#L1068
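For reference, toggling that option would look roughly like this (a sketch, assuming the option lives in the [DEFAULT] section of neutron.conf; the option name is from the linked code):

```ini
# neutron.conf (sketch). When False, neutron accepts a gateway_ip
# outside the subnet's CIDR; when True, such requests are rejected.
[DEFAULT]
force_gateway_on_subnet = True
```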

Sridhar Gaddam (sridhargaddam) wrote :

A side-effect of such a configuration is mentioned in the following bug.
https://bugs.launchpad.net/neutron/+bug/1302275

Assaf Muller (amuller) on 2014-05-07
Changed in neutron:
assignee: nobody → Assaf Muller (amuller)

Fix proposed to branch: master
Review: https://review.openstack.org/92619

Changed in neutron:
status: New → In Progress

Reviewed: https://review.openstack.org/92619
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=0e44b7b5418c400af7358a5391f350f2f737929e
Submitter: Jenkins
Branch: master

commit 0e44b7b5418c400af7358a5391f350f2f737929e
Author: Assaf Muller <email address hidden>
Date: Wed May 7 18:05:42 2014 +0300

    Make sure that gateway is in CIDR range by default

    Git commit c3706fa2 introduced the force_gateway_on_subnet
    option that verified that the defined gateway is in the CIDR
    range of a newly created or updated subnet. However, the default
    value was False for backwards compatibility reasons. The default
    will change to True and the option will be marked as deprecated.

    For IPv6, the gateway must be in the CIDR only if the gateway
    is not a link local address.

    DocImpact
    Change-Id: I04fd1caec6da5dceee3f736b3f91f2468150ba2a
    Closes-Bug: #1304181
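The check the commit describes, including the IPv6 link-local exemption, can be sketched with Python's standard ipaddress module (a sketch of the logic, not the actual neutron code; validate_gateway is a hypothetical name):

```python
import ipaddress

def validate_gateway(cidr: str, gateway_ip: str) -> None:
    """Reject a gateway outside the subnet's CIDR, except IPv6 link-local."""
    gw = ipaddress.ip_address(gateway_ip)
    net = ipaddress.ip_network(cidr)
    # IPv6 link-local gateways (fe80::/10) are legal even outside the CIDR.
    if gw.version == 6 and gw.is_link_local:
        return
    if gw not in net:
        raise ValueError("gateway_ip %s is not in subnet %s" % (gateway_ip, cidr))

validate_gateway("2001:db8::/64", "fe80::1")  # accepted: link-local gateway
```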

Changed in neutron:
status: In Progress → Fix Committed
Thierry Carrez (ttx) wrote :

Backport reviews claim there is a DoS here to justify bypassing stable branch rules. Adding a security task to investigate that.

information type: Public → Public Security
Assaf Muller (amuller) wrote :

Clarification:
If tenant A creates a network and a subnet with a gateway not in the subnet (on master, this also requires explicitly disabling force_gateway_on_subnet), the server will happily accept the request, yet all instances on that network will fail to get an IP address.

As for how this affects other networks and tenants: the DHCP agent responds to RPC events when a network, subnet or port is created/updated/deleted, and it also performs a full sync every so often. The full sync gets all networks from the server and configures them one by one. This full sync will now fail: the DHCP agent raises an exception when trying to configure the problematic network and won't attempt to configure the remaining networks. When the exception is raised, a full sync is flagged again, so the DHCP agent will never leave this infinite full-sync loop.

What does this mean? If a network by another tenant was created but its RPC notification to the server was lost, the next full sync should have configured that network; but if it comes after the problematic network during the full sync, it will never be configured. To summarize, one bad network ruins the entire bunch.

The bug has a short-term and a long-term solution. The short-term solution (was) to force all gateways to be in the subnet, i.e., fail fast: the server should never have accepted the creation of such a network knowing that the agents can't handle it. This is the fix I'm trying to backport. The long-term solution is to fix the agents (this affects both the DHCP agent and the L3 agent).
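The full-sync failure mode described above can be sketched as follows (hypothetical names, not the actual agent code; the real agent loops indefinitely rather than capping attempts):

```python
def full_sync_loop(networks, configure, max_attempts=3):
    """Naive full sync: one exception aborts the pass and flags another
    full sync, so a single bad network starves all the others."""
    attempts = 0
    while attempts < max_attempts:  # the real agent has no attempt cap
        attempts += 1
        try:
            for net in networks:
                configure(net)      # raises on the problematic network
            return attempts         # sync completed
        except Exception:
            continue                # flag a full resync and start over
    return attempts
```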

Thierry Carrez (ttx) wrote :

That sounds like a valid DoS avenue to me. Let's see what my fellow members at the VMT think of it.

Changed in ossa:
status: New → Incomplete
Jeremy Stanley (fungi) wrote :

This report looks like a duplicate of bug 1333134 which Aaron confirmed "does not break dhcp from continuing from working."

Assaf Muller (amuller) wrote :

It is a duplicate, but the issue remains as detailed in comment 8.

Jeremy Stanley (fungi) wrote :

Assaf, you imply in comment 8 that the denial of service condition only affects the master branch... or are you suggesting that it always affects stable releases but only affects master when force_gateway_on_subnet is explicitly disabled?

Assaf Muller (amuller) wrote :

Master has force_gateway_on_subnet enabled, so you'd have to explicitly turn it off to be affected. Havana and Icehouse don't have it enabled by default (yet).

Thierry Carrez (ttx) on 2014-09-04
Changed in neutron:
milestone: none → juno-3
status: Fix Committed → Fix Released
Aaron Rosen (arosen) wrote :

Assaf - sorry, why doesn't the next resync fix it? We call self.safe_configure_dhcp_for_network() (https://github.com/openstack/neutron/blob/stable/icehouse/neutron/agent/dhcp_agent.py#L167), which catches the exception so syncing can continue. I agree that it sucks that we get into a state where we keep resyncing, and I think we might as well backport this patch to avoid that. Though I don't see how this causes a DoS.
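The per-network error isolation Aaron refers to can be sketched like this (hypothetical names; the real method is safe_configure_dhcp_for_network in the DHCP agent):

```python
def sync_all_networks(networks, configure):
    """Configure each network, isolating failures so one bad network
    cannot block the rest of the full sync."""
    failed = []
    for net in networks:
        try:
            configure(net)
        except Exception:
            # Flag only this network for resync; keep going.
            failed.append(net)
    return failed
```

With this pattern, the error domain of a bad network is limited to that network, which is why the constant resyncing no longer starves other tenants.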

Assaf Muller (amuller) wrote :

Aaron - I see now that we backported a fix for that. I was still under the mindset that one rotten apple ruins the bunch. In this case the constant resyncing only affects the problematic network itself, and not other networks. It's hard to classify this as a DoS as the error domain is limited to the tenant performing the 'attack'.

Given that, this should not be considered a security issue. I still think there's significant value in backporting the fix, as I've supported multiple users battling this issue, and it's a painful one.

Jeremy Stanley (fungi) wrote :

Thanks for the follow-up confirmation. I've switched our security advisory task to "won't fix" since this sounds like it's not an issue for which we need to publish an advisory.

Changed in ossa:
status: Incomplete → Won't Fix
Alan Pevec (apevec) wrote :

Too late for Havana; Ihar has provided a release note: https://wiki.openstack.org/wiki/ReleaseNotes/2013.2.4#Neutron

Change abandoned by Alan Pevec (<email address hidden>) on branch: stable/havana
Review: https://review.openstack.org/113129
Reason: Final Havana release 2013.2.4 has been cut and stable/havana is going to be removed in a week.

Change abandoned by Ihar Hrachyshka (<email address hidden>) on branch: stable/icehouse
Review: https://review.openstack.org/113130
Reason: Resolution was not to backport this.

Thierry Carrez (ttx) on 2014-10-16
Changed in neutron:
milestone: juno-3 → 2014.2