L3 agent allows multiple gateway ports in fip namespace

Bug #1597561 reported by Carl Baldwin on 2016-06-30
This bug affects 1 person
Affects: neutron
Importance: High
Assigned to: Brian Haley

Bug Description

At the end of deleting a GW port for a router, l3_dvr_db.py will look
for any more router gw ports on the external network. If there are
none, then it calls delete_floatingip_agent_gateway_port [1]. This
should fan out to all l3 agents on all compute nodes [2]. Each agent
should then delete the port [3].

In some cases, the fip namespace and the gateway port are not deleted.
I don't know where things are going wrong; this seems pretty
straightforward. Do some agents miss the fanout? We know that at least
some of them receive it, so it is definitely being sent.

When I checked, the port had been deleted from the database. The fact
that a new one is created supports this: if a port already existed in
the DB, it would have been returned instead of a new one being created.
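
To make the sequence above concrete, here is a toy in-memory model of it. This is only a sketch, not neutron code: the Agent and Server classes, the receives_fanout flag, and the fg-N port names are all hypothetical, and the model assumes (as the open question above suggests, but does not confirm) that an agent can miss the fanout.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    host: str
    # Hypothetical knob: models an agent that misses the fanout.
    receives_fanout: bool = True
    # fg devices currently present in this host's fip namespace.
    fg_devices: list = field(default_factory=list)

    def add_gateway(self, port_name):
        # Behaviour described in this report: nothing prevents a second fg device.
        self.fg_devices.append(port_name)

    def on_delete_notification(self):
        # Step [3]: the agent deletes its gateway port (and fip namespace).
        self.fg_devices.clear()


class Server:
    def __init__(self, agents):
        self.agents = agents
        self.router_gw_ports = set()  # routers with a gateway on the external network
        self.agent_gw_ports = {}      # host -> agent gateway port name (the "DB")
        self._next_id = 0

    def get_or_create_agent_gw_port(self, host):
        # An existing DB row is returned; otherwise a new port is created.
        if host not in self.agent_gw_ports:
            self._next_id += 1
            self.agent_gw_ports[host] = f"fg-{self._next_id}"
        return self.agent_gw_ports[host]

    def add_router_gateway(self, router_id):
        self.router_gw_ports.add(router_id)
        for agent in self.agents:
            port = self.get_or_create_agent_gw_port(agent.host)
            if port not in agent.fg_devices:
                agent.add_gateway(port)

    def delete_router_gateway(self, router_id):
        self.router_gw_ports.discard(router_id)
        if self.router_gw_ports:
            return  # other router gateway ports remain on the external network
        self.agent_gw_ports.clear()   # step [1]: ports removed from the DB
        for agent in self.agents:     # step [2]: fan out to every agent
            if agent.receives_fanout:
                agent.on_delete_notification()


good = Agent("compute-1")
bad = Agent("compute-2", receives_fanout=False)
server = Server([good, bad])
server.add_router_gateway("r1")
server.delete_router_gateway("r1")   # last gateway: DB cleaned, fanout sent
server.add_router_gateway("r2")      # DB is empty, so new ports are created
print(good.fg_devices, bad.fg_devices)
```

In this model the agent that missed the fanout ends up with two fg devices, because the server no longer has a record of the old port and creates a fresh one.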

[1] https://github.com/openstack/neutron/blob/d3cd20151a67289f023875de682a6d3c4ccee645/neutron/db/l3_dvr_db.py#L179
[2] https://github.com/openstack/neutron/blob/d3cd20151a67289f023875de682a6d3c4ccee645/neutron/api/rpc/agentnotifiers/l3_rpc_agent_api.py#L166
[3] https://github.com/openstack/neutron/blob/d3cd20151a67289f023875de682a6d3c4ccee645/neutron/agent/l3/dvr.py#L73

Changed in neutron:
importance: Undecided → High
status: New → Confirmed
tags: added: l3-dvr-backlog l3-ipam-dhcp
description: updated
Carl Baldwin (carl-baldwin) wrote:

I'm marking this High because of what happens when there are multiple fg ports in the fip namespace. Because DVR uses proxy_arp on the fg port, having two of them with the same route to the external network makes the host essentially reply to any arp request on the subnet, receive the traffic, and then spit it right back out the other fg interface.

This happens because proxy_arp works by responding to any arp request for an IP address it thinks it can route to on another interface. With two fg interfaces with the same route, it thinks it can always route the packet to another interface, regardless of the IP address.
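
The rule described above can be sketched as a small predicate. This is a deliberate simplification of the description in this comment, not the kernel's actual algorithm; the function name, the device names (fg-1, fg-2, fpr-r1), and the 203.0.113.0/24 addresses are made up for illustration.

```python
import ipaddress


def would_proxy_arp(in_iface, target_ip, routes):
    """Return True if an ARP request for target_ip arriving on in_iface
    would be answered under the simplified rule: some matching route
    leaves through a different interface.

    routes is a list of (cidr, out_iface) pairs.
    """
    target = ipaddress.ip_address(target_ip)
    return any(
        target in ipaddress.ip_network(cidr) and out_iface != in_iface
        for cidr, out_iface in routes
    )


# One fg device: only the floating IP routed toward a router is answered.
one_fg = [
    ("203.0.113.0/24", "fg-1"),     # on-link route to the external subnet
    ("203.0.113.10/32", "fpr-r1"),  # floating IP routed to a router (hypothetical)
]
print(would_proxy_arp("fg-1", "203.0.113.10", one_fg))
print(would_proxy_arp("fg-1", "203.0.113.77", one_fg))

# Two fg devices with the same subnet route: every address on the subnet
# looks routable through "another interface".
two_fg = one_fg + [("203.0.113.0/24", "fg-2")]
print(would_proxy_arp("fg-1", "203.0.113.77", two_fg))
```

With a single fg device the host answers only for the floating IP it hosts; once a second fg device carries the same subnet route, the predicate is true for any address on the subnet.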

With one of these fip namespaces on the network, it manifests as a performance degradation because traffic passes through an extra host. With two or three, things get really ugly. These hosts can form a routing loop and packets go round and round until TTL expires. Yikes!

Fix proposed to branch: master
Review: https://review.openstack.org/335755

Changed in neutron:
assignee: nobody → Carl Baldwin (carl-baldwin)
status: Confirmed → In Progress
Changed in neutron:
milestone: none → newton-2
Changed in neutron:
assignee: Carl Baldwin (carl-baldwin) → Brian Haley (brian-haley)

Reviewed: https://review.openstack.org/335755
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=537e2f540a5c4cedc657bd1b996a367ac9c3ec65
Submitter: Jenkins
Branch: master

commit 537e2f540a5c4cedc657bd1b996a367ac9c3ec65
Author: Carl Baldwin <email address hidden>
Date: Thu Jun 30 01:19:38 2016 +0000

    DVR: Ensure that only one fg device can exist at a time in fip ns

    Change-Id: I3e78c8d497f9187ba64dcce317f8c709067a11e6
    Closes-Bug: #1597561
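
As an illustration of the invariant named in the commit title, here is a minimal sketch of one way to guarantee a single fg device. It is not the code from the review above; the helper names and the device lists are hypothetical, and it works on plain lists of device names rather than on a real namespace.

```python
FG_PREFIX = "fg-"


def stale_fg_devices(existing_devices, wanted_device):
    """Return fg devices other than the one that should exist."""
    return [
        dev for dev in existing_devices
        if dev.startswith(FG_PREFIX) and dev != wanted_device
    ]


def plug_gateway(existing_devices, wanted_device):
    """Return the device list after removing stale fg devices and
    making sure wanted_device is present exactly once."""
    stale = set(stale_fg_devices(existing_devices, wanted_device))
    kept = [dev for dev in existing_devices if dev not in stale]
    if wanted_device not in kept:
        kept.append(wanted_device)
    return kept


devices = ["lo", "fg-old", "fpr-r1"]
print(plug_gateway(devices, "fg-new"))
```

Removing leftovers before plugging the new device means a missed delete notification can no longer leave two fg devices side by side in this model.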

Changed in neutron:
status: In Progress → Fix Released
tags: added: mitaka-backport-potential

Reviewed: https://review.openstack.org/338442
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c72a2404f6aa3d5d42ce2b66abeaa16bda5c0348
Submitter: Jenkins
Branch: stable/mitaka

commit c72a2404f6aa3d5d42ce2b66abeaa16bda5c0348
Author: Carl Baldwin <email address hidden>
Date: Thu Jun 30 01:19:38 2016 +0000

    DVR: Ensure that only one fg device can exist at a time in fip ns

    Change-Id: I3e78c8d497f9187ba64dcce317f8c709067a11e6
    Closes-Bug: #1597561
    (cherry picked from commit 537e2f540a5c4cedc657bd1b996a367ac9c3ec65)

tags: added: in-stable-mitaka

This issue was fixed in the openstack/neutron 9.0.0.0b2 development milestone.

tags: added: neutron-proactive-backport-potential

Reviewed: https://review.openstack.org/341779
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9d4f497827351745483363b696376c457d0c2281
Submitter: Jenkins
Branch: stable/liberty

commit 9d4f497827351745483363b696376c457d0c2281
Author: Carl Baldwin <email address hidden>
Date: Thu Jun 30 01:19:38 2016 +0000

    DVR: Ensure that only one fg device can exist at a time in fip ns

    Closes-Bug: #1597561
    (cherry picked from commit 537e2f540a5c4cedc657bd1b996a367ac9c3ec65)

    Conflicts:
     neutron/tests/functional/agent/l3/test_dvr_router.py
    Change-Id: I3e78c8d497f9187ba64dcce317f8c709067a11e6

tags: added: in-stable-liberty

This issue was fixed in the openstack/neutron 7.1.2 release.

This issue was fixed in the openstack/neutron 8.2.0 release.

tags: removed: mitaka-backport-potential neutron-proactive-backport-potential