DVR: floating IPs not working if initially associated with non-bound port

Bug #1447034 reported by Oleg Bondarev
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
neutron
Fix Released
High
Lynn
Kilo
New
Undecided
Unassigned

Bug Description

Floating agent gw port is only created for compute host when floating ip is associated with a VM resided on this host [1].
If associate neutron port with floating ip before booting a VM with that port, floating agent gw port won't be created (in case this is the first VM scheduled to a compute host).
In that case l3 agent on compute host will receive router info with floating ip but no floating agent gw port: it will subscribe the router for fip namespace [2] but namespace itself won't be created [3]:
 [dvr_router.py]

    def create_dvr_fip_interfaces(self, ex_gw_port):
        floating_ips = self.get_floating_ips()
        fip_agent_port = self.get_floating_agent_gw_interface(
            ex_gw_port['network_id'])
        LOG.debug("FloatingIP agent gateway port received from the plugin: "
                  "%s", fip_agent_port)
        if floating_ips:
            is_first = self.fip_ns.subscribe(self.router_id)
            if is_first and fip_agent_port:
                if 'subnets' not in fip_agent_port:
                    LOG.error(_LE('Missing subnet/agent_gateway_port'))
                else:
                    self.fip_ns.create_gateway_port(fip_agent_port)
        ...

Since l3 agent already subscribed the router for fip_ns it won't ever create fip namespace for that router - this results in floating ips not working anymore for ANY subsequent VMs on that compute host, no matter if floating ip was associated with a VM or with a non-binded port (later associated with a VM).

I see two possible fixes:
 - add callback for PORT UDATE event to dvr server code to react on port with floating ip being associated with a VM.
This seems not optimal given lots of checks needed in the callback which will be called fairly often.

 - l3 agent on a compute host should request floating agent gw creation by rpc in case it receives router info with floating ips but no floating agent gateway. There is already such a method in agent to plugin rpc interface which now seems not used anywhere except tests. I'm not seeing any cons here so that's what I'm going to propose.

[1] https://github.com/openstack/neutron/blob/master/neutron/db/l3_dvr_db.py#L214-L225
[2] https://github.com/openstack/neutron/blob/master/neutron/agent/l3/dvr_router.py#L502
[3] https://github.com/openstack/neutron/blob/master/neutron/agent/l3/dvr_router.py#L503-L507

summary: - DVR: floating IPs not working if initially associated with non-binded
+ DVR: floating IPs not working if initially associated with non-bound
port
Revision history for this message
Swaminathan Vasudevan (swaminathan-vasudevan) wrote :

This bug seems to be a duplicate of #1445255

Revision history for this message
Swaminathan Vasudevan (swaminathan-vasudevan) wrote :

Callback is already there for port-update.
As mentioned by you, the second option to call the "rpc" call to create the floatingip-agent-gateway port will be the best option in case if there is fip association that happens for an unbound port.

Again this will only work for the port that is bound to a host.

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

I don't think this is a duplicate of bug #1445255, let's treat them differently and at different priorities.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/177507

Changed in neutron:
assignee: Oleg Bondarev (obondarev) → Swaminathan Vasudevan (swaminathan-vasudevan)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/177882

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

from what I can tell this is a regression of:

https://review.openstack.org/#/c/153735

Revision history for this message
Swaminathan Vasudevan (swaminathan-vasudevan) wrote :

Armando yes we made changes to address the timing issue with respect to the agent gateway port getting created.
But with the case of late-binding we need to address this with the RPC call again.

Changed in neutron:
assignee: Swaminathan Vasudevan (swaminathan-vasudevan) → Lynn (lynn-li)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/181599

Changed in neutron:
assignee: Lynn (lynn-li) → Swaminathan Vasudevan (swaminathan-vasudevan)
Revision history for this message
Adolfo Duarte (adolfo-duarte) wrote :

Quick question.

Are you saying that
https://review.openstack.org/181599
migth fix this on it's own, or do you need:
https://review.openstack.org/177882

as well?

Changed in neutron:
assignee: Swaminathan Vasudevan (swaminathan-vasudevan) → Lynn (lynn-li)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (stable/kilo)

Change abandoned by Ihar Hrachyshka (<email address hidden>) on branch: stable/kilo
Review: https://review.openstack.org/177882
Reason: Abandoning since there is no way the patch will merge with wrong Change-Id.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/196572

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/pecan)

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/196701

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (feature/pecan)

Change abandoned by Kyle Mestery (<email address hidden>) on branch: feature/pecan
Review: https://review.openstack.org/196701
Reason: This is lacking the functional fix [1], so I'll propose a new merge commit which includes that one.

[1] https://review.openstack.org/#/c/196711/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/pecan)

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/196920

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/pecan)
Download full text (171.5 KiB)

Reviewed: https://review.openstack.org/196920
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7f759c077f8f860c13db92d2ea6b353ef6b70900
Submitter: Jenkins
Branch: feature/pecan

commit 8123144fadd7c5d5e6e56a76ea860512619a2cf6
Author: Moshe Levi <email address hidden>
Date: Sun Jun 28 14:37:14 2015 +0300

    Fix Consolidate sriov agent and driver code

    This patch add mising __init to mech_sriov/mech_driver/
    and update the setup.cfg to the new agent entrypoint

    Trivial Fix

    Change-Id: I53a527081feb78472f496675bbb3c5121d38a14a

commit 8942fccf02e6e179d47582fdb2792a1ca972da21
Author: Assaf Muller <email address hidden>
Date: Mon Jun 29 11:38:51 2015 -0400

    Remove failing SafeFixture tests

    The fixtures 1.3 release attempted to fix the fixtures resource
    leak issue, but failed to do so completely. Our own SafeFixture
    is still needed: The 1.3 release broke our SafeFixture tests,
    but not the usage of SafeFixture itself. This patch removes
    those failing tests for now to unbreak the gate. Jakub reported
    a bug on fixtures 1.3:
    https://bugs.launchpad.net/python-fixtures/+bug/1469759

    We will continue to use SafeFixture until that bug is fixed
    in fixtures, at which point we will be able to require
    fixtures > 1.3.

    Change-Id: I59457c3bb198ff86d5ad55a1e623d008f0034b8f
    Closes-Bug: #1469734

commit 71dffb0a2c1720cd8233a329d32958a0160dd6f5
Author: Kevin Benton <email address hidden>
Date: Mon Jun 29 08:27:41 2015 +0000

    Revert "Removed test_lib module"

    This reverts commit 9a6536de6e1a7fe9b2552adc142e254426b82b6f.

    We pulled all of the plugins out of the tree, many of which still inherit
    from neutron test classes. This change then stated that we no longer
    support testing other plugins. I think this is a bit premature and should
    have been discussed under the subject
    "Neutron plugins can't use neutron plugin unit tests" or something
    similar.

    Change-Id: I68318589f010b731574ea3bfa8df98492bab31fc

commit b20fd81dbd497e058384a0af065dd0f1fdc4c728
Author: Jakub Libosvar <email address hidden>
Date: Fri Jun 5 14:32:51 2015 +0000

    Refactor NetcatTester class

    Following capabilities were added:
       - used transport protocol is passed as a constant instead of bool
       - src port for testing was added
       - connection can be established explicitly
       - change constructor parameters of NetcatTester

    As a part of removing bool for protocol definition
    get_free_namespace_port() was also modified to match the behavior.

    Change-Id: Id2ec322e7f731c05a3754a65411c9a5d8b258126

commit 83e37980dcd0b2bad6d64dd2cb23bcd2891cafca
Author: jingliuqing <email address hidden>
Date: Sat Jun 27 13:41:54 2015 +0800

    Use REST rather than ReST

    Change-Id: I06c9deaab58c5ec13bfeec39fb8fd4b1fe21f42d

commit 1b60df85ba3ad442c2e4e7e52538e1b9a1bf9378
Author: Kevin Benton <email address hidden>
Date: Thu Jun 25 18:34:38 2015 -0700

    Add a double-mock guard to the base test case

    Use mock to patch mock with a check to prevent multiple active
    patches to the...

tags: added: in-feature-pecan
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/qos)

Fix proposed to branch: feature/qos
Review: https://review.openstack.org/197751

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/qos)
Download full text (19.6 KiB)

Reviewed: https://review.openstack.org/197751
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d7e60d59a34a7415a0b0185dbaa99bdc84723608
Submitter: Jenkins
Branch: feature/qos

commit 3da491cf5fe629559281507f65f12a0e34eaedf7
Author: Assaf Muller <email address hidden>
Date: Tue Jun 30 13:22:17 2015 -0400

    Disable pylint job

    Disabling pylint until it gets unbroken. Pylint 1.4.1 is using
    logilab-common, which had a release on the 30th, breaking pylint.
    Pylint developers are planning a logilab-common release tomorrow
    which should unbreak pylint once again, at which point I'll
    re-enable pylint.

    Change-Id: I5d8aaab8192168946c2a0b74abc1a56848ca51a2
    Related-Bug: #1470186

commit 2bbfe6f8253659ebf6951b6426ffc446baacd420
Author: Russell Bryant <email address hidden>
Date: Tue May 26 17:07:37 2015 -0400

    Move windows requirements to requirements.txt

    Commit 276028cca26af573c14938255e40c58358eabd4a added these
    requirements to setup.py from a custom build hook. These requirements
    can now be expressed in requirements.txt. We need to move them there
    so that the global requirements sync job can continue to keep setup.py
    in sync with the global version.

    Depends-on: I2369971d306c10dc39a1b89698cec95cf7551d07
    Change-Id: I3c07c279d33f6aed46c3a97dd9ba81251e51429a

commit 21ff82d9d33313bb88e5970c7b1829a65f195d33
Author: Rossella Sblendido <email address hidden>
Date: Fri Dec 5 17:34:23 2014 +0100

    Adds base in-tree functional testing of the ovs_neutron_agent

    Base setup and utility methods for functional testing of the
    OVS L2 agent.

    Partially-Implements: blueprint restructure-l2-agent
    Co-Authored-By: Rossella Sblendido <email address hidden>

    Change-Id: I5b3149b2b8502b9b9a36d3e20d909872cc17f8e8

commit 1ac7581c6b7d343d2ee22e6c562871c0465d9735
Author: Livnat Peer <email address hidden>
Date: Tue Jun 30 16:25:57 2015 +0300

    fix spelling mistakes

    Change-Id: If063f111fa42a6644a1dadc7f0c0b9bbfb359294

commit 9b23617111706ef6a89e8ba45457238acaea26e2
Author: Kevin Benton <email address hidden>
Date: Mon Jun 29 22:24:22 2015 -0700

    Increase ping count on ARP spoof test

    The other IPv4 tests all have a count of 2 to tolerate
    ping failures due to slow ARP response/interface setup/etc.
    This patch increases test_arp_spoof_allowed_address_pairs_0cidr
    to 2 to match.

    Closes-Bug: #1470234
    Change-Id: I82bd8397672194f6162eef5392d4f19d57450552

commit 8123144fadd7c5d5e6e56a76ea860512619a2cf6
Author: Moshe Levi <email address hidden>
Date: Sun Jun 28 14:37:14 2015 +0300

    Fix Consolidate sriov agent and driver code

    This patch add mising __init to mech_sriov/mech_driver/
    and update the setup.cfg to the new agent entrypoint

    Trivial Fix

    Change-Id: I53a527081feb78472f496675bbb3c5121d38a14a

commit 8942fccf02e6e179d47582fdb2792a1ca972da21
Author: Assaf Muller <email address hidden>
Date: Mon Jun 29 11:38:51 2015 -0400

    Remove failing SafeFixture tests

    The fixtures 1.3 release attempted to fix the fixtures resource
    leak issue, but failed to do so ...

tags: added: in-feature-qos
Revision history for this message
Oleg Bondarev (obondarev) wrote :
Changed in neutron:
status: In Progress → Fix Committed
tags: removed: kilo-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/kilo)

Reviewed: https://review.openstack.org/196572
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4e2148e68cd733b9010bd1886049c298f68cc648
Submitter: Jenkins
Branch: stable/kilo

commit 4e2148e68cd733b9010bd1886049c298f68cc648
Author: Swaminathan Vasudevan <email address hidden>
Date: Fri Apr 24 16:58:48 2015 -0700

    Fix FloatingIP Namespace creation in DVR for Late Binding

    DVR has dependency on the portbinding host to determine
    where to start the FloatingIP Namespace when floatingip
    is configured. But when we assign a floatingip to a port
    that is not bound, even though the API will succeed, the
    FloatingIP Namespace will not be created by the Agent and
    so the FloatingIP will not be functional.

    This patch addresses the issue by creating the Namespace
    and configuring the rules when the late binding happens.

    The agent will be requesting the FIP agent gateway port,
    if required and then proceed to configure the FloatingIP
    Namespace.

    Change-Id: I9b9158bddb626c2bb535acd709452560546fd184
    Closes-Bug: #1447034
    Closes-Bug: #1460408
    (cherry picked from commit 0a82b8ae1951073ff5f9b096485b3acf1a541428)

tags: added: in-stable-kilo
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Carl Baldwin (<email address hidden>) on branch: master
Review: https://review.openstack.org/181599
Reason: Restore if necessary.

Changed in neutron:
milestone: none → liberty-2
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Kyle Mestery (<email address hidden>) on branch: master
Review: https://review.openstack.org/181599
Reason: This review is > 4 weeks without comment and currently blocked by a core reviewer with a -2. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and contacting the reviewer with the -2 on this review to ensure you address their concerns.

Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-2 → 7.0.0
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/179439
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.