DVR floatingip to rule priority association lost on Agent restart

Bug #1414779 reported by Rajeev Grover
This bug affects 1 person
Affects    Status          Importance    Assigned to    Milestone
neutron    Fix Released    High          Ryan Moats

Bug Description

The rule priority associated with a floatingip is saved in the ri.floatingips_dict dictionary. This dictionary lives in the agent, so if the agent is restarted the associations are lost, and subsequent operations can then use rule priorities inconsistently.
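
The failure mode can be illustrated with a minimal sketch. This is not the neutron code: `InMemoryPriorityAllocator`, the pool bounds, and the addresses are assumptions chosen for illustration (the addresses and the 32768 value mirror output quoted later in this report). The only point is that an allocator whose state lives in process memory hands out the same priority again after a restart.

```python
class InMemoryPriorityAllocator:
    """Hypothetical allocator keeping floating IP -> priority in memory only."""

    def __init__(self, start=32768, end=32778):
        # Free priorities, handed out lowest first (an assumption of this sketch).
        self.pool = list(range(start, end))
        # Stand-in for ri.floatingips_dict: lost whenever the process exits.
        self.floating_ips_dict = {}

    def allocate(self, floating_ip):
        if floating_ip not in self.floating_ips_dict:
            self.floating_ips_dict[floating_ip] = self.pool.pop(0)
        return self.floating_ips_dict[floating_ip]


agent = InMemoryPriorityAllocator()
first = agent.allocate("152.2.0.10")

# Simulated agent restart: a fresh object with an empty dictionary and a
# full pool, even though the rule for 152.2.0.10 still exists in the kernel.
agent = InMemoryPriorityAllocator()
second = agent.allocate("152.2.0.21")

print(first, second)  # both floating IPs end up with the same priority
```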

Changed in neutron:
assignee: nobody → Rajeev Grover (rajeev-grover)
tags: added: l3-dvr-backlog
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/150194

Changed in neutron:
status: New → In Progress
Carl Baldwin (carl-baldwin) wrote :

For the benefit of the casual bug reader, could you describe the ultimate effect of using inconsistent rule priorities?

Rajeev Grover (rajeev-grover) wrote :

One of the failures I have seen is:

1. Spawn a VM and associate a floatingip1
2. Restart the Agent
3. Spawn a second VM with floatingip2
4. Disassociate the floatingip2

   --> Floatingip1 stops working sometimes

ZongKai LI (zongkai) wrote :

Hi, Rajeev. I followed your steps and found another issue, which I reported at https://bugs.launchpad.net/neutron/+bug/1434824.
After I applied my patch in my environment, I tested your bug again and could not reproduce it.
You may be interested in testing that.

Rajeev Grover (rajeev-grover) wrote :

ZongKai LI,

The issue here is that the rule priority associated with a floatingip is not preserved across an agent restart. The fix I am providing preserves such associations across agent restarts so that future allocations of rule priorities do not collide with previous allocations.

1. Spawn a VM and associate a floatingip1
2. Restart the Agent
3. Spawn a second VM with floatingip2

If both floatingips are on the same external network, then without the fix both get the same ip rule priority; with the fix they do not.

I had one router. I created one VM and associated a floating ip with it, then restarted the agent, created another VM, and associated a floating ip with that one. This is what appears:

sdn@rg-oscv-cn2:~/devstack$ sudo ip netns exec qrouter-b5fb2ae4-ae83-4b46-b0b2-9960c853e61f ip rule s
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
32768: from 152.2.0.10 lookup 16
32768: from 152.2.0.21 lookup 16
2550267905: from 152.2.0.1/16 lookup 2550267905

After the fix that has been submitted for this bug report:

sdn@rg-oscv-cn2:~/devstack$ sudo ip netns exec qrouter-006f497f-86db-4685-b305-c0dcb203dd00 ip rule s
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
65937: from 152.2.0.10 lookup 16
71986: from 152.2.0.21 lookup 16
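
A quick way to check a router namespace for this symptom is to look for priorities that appear more than once in the rule listing. The sketch below is only an illustration: `find_duplicate_priorities` is a hypothetical helper, and it works on text shaped like the two listings above rather than running `ip rule` itself.

```python
from collections import defaultdict


def find_duplicate_priorities(ip_rule_output):
    """Return {priority: [rules]} for priorities used by more than one rule."""
    rules_by_priority = defaultdict(list)
    for line in ip_rule_output.splitlines():
        # Each rule line looks like "<priority>: <selector> lookup <table>".
        priority, sep, rule = line.partition(":")
        if not sep or not priority.strip().isdigit():
            continue  # skip prompts, blank lines, and anything unexpected
        rules_by_priority[int(priority)].append(rule.strip())
    return {p: r for p, r in rules_by_priority.items() if len(r) > 1}


BEFORE_FIX = """\
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
32768: from 152.2.0.10 lookup 16
32768: from 152.2.0.21 lookup 16
2550267905: from 152.2.0.1/16 lookup 2550267905
"""

AFTER_FIX = """\
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
65937: from 152.2.0.10 lookup 16
71986: from 152.2.0.21 lookup 16
"""

print(find_duplicate_priorities(BEFORE_FIX))
print(find_duplicate_priorities(AFTER_FIX))
```

With the first listing the helper reports priority 32768 as shared by both floating IPs; with the second it reports nothing.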

ZongKai LI (zongkai) wrote :

Hi, Rajeev, thanks for your comments. I understand this issue now.

Changed in neutron:
importance: Undecided → Medium
Changed in neutron:
assignee: Rajeev Grover (rajeev-grover) → Ryan Moats (rmoats)
Ryan Moats (rmoats) wrote :

Since the automatic update didn't catch this...

Fix proposed to branch: master
Review: https://review.openstack.org/193711

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/210128

Ryan Moats (rmoats) wrote :

210128 factors out the changes to LinkLocalAllocator so that 193711 can be rebased on top of it and involve only the FIP side of that mechanism.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/210128
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2093d8727070829064b1a6005a9b411a383d246d
Submitter: Jenkins
Branch: master

commit 2093d8727070829064b1a6005a9b411a383d246d
Author: Ryan Moats <email address hidden>
Date: Thu Aug 6 16:24:55 2015 -0500

    Introduce ItemAllocator class

    The ItemAllocator class is used as the base class for
    LinkLocalAllocator in preparation for adding
    FipRulePriorityAllocator as a child class.

    Change-Id: I2c77e5a895f750845b46d3e8a2326e01ea87ee78
    Partial-Bug: #1414779
    Signed-off-by: Ryan Moats <email address hidden>
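
The idea behind an allocator base class that survives restarts can be sketched as follows. This is a simplified illustration and not the ItemAllocator implementation from the commit: `FileBackedAllocator`, the comma-separated state file, and the lowest-first allocation order are all assumptions made for the example.

```python
import os
import tempfile


class FileBackedAllocator:
    """Hypothetical allocator that persists key -> item pairs to a state file."""

    def __init__(self, state_file, pool):
        self.state_file = state_file
        self.pool = set(pool)
        self.allocations = {}
        # Reload earlier allocations so their items are not handed out again.
        if os.path.exists(state_file):
            with open(state_file) as f:
                for line in f:
                    key, _, item = line.strip().partition(",")
                    if key and item:
                        self.allocations[key] = int(item)
                        self.pool.discard(int(item))

    def allocate(self, key):
        if key not in self.allocations:
            item = min(self.pool)  # deterministic choice, for the example only
            self.pool.remove(item)
            self.allocations[key] = item
            self._write()
        return self.allocations[key]

    def release(self, key):
        item = self.allocations.pop(key, None)
        if item is not None:
            self.pool.add(item)
            self._write()

    def _write(self):
        with open(self.state_file, "w") as f:
            for key, item in sorted(self.allocations.items()):
                f.write(f"{key},{item}\n")


state_file = os.path.join(tempfile.mkdtemp(), "fip-priorities")
pool = range(32768, 32778)

agent = FileBackedAllocator(state_file, pool)
first = agent.allocate("152.2.0.10")

# Simulated agent restart: the new object reloads the state file first.
agent = FileBackedAllocator(state_file, pool)
second = agent.allocate("152.2.0.21")

print(first, second)  # two distinct priorities
```

Because the second allocator reloads the file before allocating, the priority already held by 152.2.0.10 is removed from the pool and cannot be given to 152.2.0.21.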

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/pecan)

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/211492

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/pecan)

Reviewed: https://review.openstack.org/211492
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=a7b91632fc65ab9d2687298c68b1d715866d0356
Submitter: Jenkins
Branch: feature/pecan

commit 966203f89dee8fe61fb2dce654e36e510e80380f
Author: Sukhdev Kapur <email address hidden>
Date: Wed Jul 1 16:30:44 2015 -0700

    Neutron-Ironic integration patch

    This patch is in preparation for the integration
    of Ironic and Neutron. A new vnic_type is being
    added so that ML2 drivers can filter for all
    Ironic ports based upon match for 'baremetal'.
    Nova/Ironic will set this vnic_type when issuing
    port-create request to neutron.
    (e.g. binding:vnic_type = 'baremetal' )

    Change-Id: I25dc9472b31db052719db503a10c1fb1a55572ef
    Partial-Implements: blueprint neutron-ironic-integration

commit 236e408272bcb9b8e957524864e571b5afdc4623
Author: Oleg Bondarev <email address hidden>
Date: Tue Jul 7 12:02:58 2015 +0300

    DVR: fix router scheduling

    Fix scheduling of DVR routers to not stop scheduling once
    csnat portion was scheduled. See bug report for failing
    scenario.

    This partially reverts
    commit 3794b4a83e68041e24b715135f0ccf09a5631178
    and fixes bug 1374473 by moving csnat scheduling
    after general dvr router scheduling, so double binding does
    not happen.

    Closes-Bug: #1472163
    Related-Bug: #1374473
    Change-Id: I57c06e2be732e47b6cce7c724f6b255ea2d8fa32

commit e152f93878b9bb6af7cfedc9e045892fcf7d0615
Author: Assaf Muller <email address hidden>
Date: Sat Aug 8 21:15:03 2015 +0300

    TESTING.rst love

    Change-Id: I64b569048f8f87ea2fe63d861302b4020d36493d

commit 633c52cca1b383af2c900e1663c8682114acd177
Author: sridhargaddam <email address hidden>
Date: Wed Aug 5 10:49:33 2015 +0000

    Avoid dhcp_release for ipv6 addresses

    dhcp_release is only supported for IPv4 addresses [1] and not for
    IPv6 addresses [2]. There will be no effect when it is called with
    IPv6 address. This patch adds a corresponding note and avoids calling
    dhcp_release for IPv6 addresses.

    [1] http://manpages.ubuntu.com/manpages/trusty/man1/dhcp_release.1.html
    [2] http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2013q2/007084.html

    Change-Id: I8b8316c9d3d011c2a687a3a1e2a4da5cf1b5d604

commit 2de8fad17402f38bbc30204ee2f4f99cf21cb69d
Author: OpenStack Proposal Bot <email address hidden>
Date: Mon Aug 10 06:11:06 2015 +0000

    Imported Translations from Transifex

    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure

    Change-Id: I2b423e83a7d0ac8b23239f81fe33dd8382c6fff6

commit fef79dc7b9162e03c8891645494c115b52d4d014
Author: Henry Gessau <email address hidden>
Date: Mon Aug 3 23:30:34 2015 -0400

    Consistent layout and headings for devref

    The lack of convention for heading levels among the independently
    written devref documents was starting to make the Table of Contents
    look rather messy when rendered in HTML.

    This patch does not cover the "Neutron Internals" section since its
    layo...

tags: added: in-feature-pecan
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/193711
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=de81ab8385e8490b2320e23ee7dd86b43e22fd32
Submitter: Jenkins
Branch: master

commit de81ab8385e8490b2320e23ee7dd86b43e22fd32
Author: Adolfo Duarte <email address hidden>
Date: Thu Jun 18 19:50:13 2015 -0700

    Preserve DVR FIP rule priority over Agent restarts

    IP rule priorities assigned to DVR floating IPs need
    to be preserved over L3 agent restarts. Reuse
    the ItemAllocator class decomposed from Link Local IP
    address allocation. Also move common unit tests to
    ItemAllocator class.

    Closes-Bug: #1414779
    Change-Id: I6a75aa8ad612ee80b391f0a27a8a7e29519c3f8d
    Co-Authored-By: Rajeev Grover <email address hidden>
    Co-Authored-By: Ryan Moats <email address hidden>

Changed in neutron:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/pecan)

Fix proposed to branch: feature/pecan
Review: https://review.openstack.org/218710

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/pecan)

Reviewed: https://review.openstack.org/218710
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2c5f44e1b3bd4ed8a0b7232fd293b576cc8c1c87
Submitter: Jenkins
Branch: feature/pecan

commit f35d1c5c50dccbef1a2e079f967b82f0df0e22e9
Author: Adelina Tuvenie <email address hidden>
Date: Thu Aug 27 02:27:28 2015 -0700

    Fixes wrong neutron Hyper-V Agent name in constants

    Change Id03fb147e11541be309c1cd22ce27e70fadc28b5 moved the
    AGENT_TYPE_HYPERV constant from common.constants to
    plugins.ml2.drivers.hyperv.constants but change the value of the
    constant from 'HyperV agent' to 'hyperv'. This patch changes
    the name back to 'HyperV agent'

    Change-Id: If74b4b2a84811e266c8b12e70bf6bfe74ed4ea21
    Partial-Bug: #1487598

commit de604de334854e2eb6b4312ff57920564cbd4459
Author: OpenStack Proposal Bot <email address hidden>
Date: Sun Aug 30 01:39:06 2015 +0000

    Updated from global requirements

    Change-Id: Ie52aa3b59784722806726e4046bd07f4a4d97328

commit f0415ac20eaf5ab4abb9bd4839bf6d04ceee85d0
Author: armando-migliaccio <email address hidden>
Date: Fri Aug 28 13:53:04 2015 -0700

    Revert "Add support for unaddressed port"

    This implementation may expose a vulnerability where a malicious
    user can sieze the opportunity of a time window where a port
    may land unaddressed on a shared network, thus allowing him/her
    to suck up all the tenant traffic he/she wants....oh the shivers.

    This reverts commit d4c52b7f5a36a103a92bf9dcda7f371959112292.

    Change-Id: I7ebdaa8d3defa80eab90e460fde541a5bdd8864c

commit 013fdcd2a6d45dbe4de5d6e7077e5e9b60985ef9
Author: Assaf Muller <email address hidden>
Date: Fri Aug 28 16:41:07 2015 -0400

    Improve logging upon failure in iptables functional tests

    This will help us nail down a more accurate and efficient logstash
    query.

    Change-Id: Iee4238e358f7b056e373c7be8d6aa3202117a680
    Related-Bug: #1478847

commit 622dea818d851224a43d5276a81d5ce8a6eebb76
Author: Ivar Lazzaro <email address hidden>
Date: Mon Aug 17 17:17:42 2015 -0700

    handle gw_info outside of the db transaction on router creation

    Move the gateway interface creation outside the DB transaction
    to avoid lock timeout.

    Change-Id: I5a78d7f32e8ca912016978105221d5f34618af19
    Closes-bug: 1485809

commit 5b27d290a0a95f6247fc5a0fe6da1e7d905e6b2d
Author: Assaf Muller <email address hidden>
Date: Wed Aug 26 10:07:03 2015 -0400

    Remove ml2 resource extension success logging

    This is the cause of a tremendous amount of logs, for no
    perceivable gain. A normal dvr run in the gate shows this debug
    message around 120K times, which is way too much.

    Closes-Bug: #1489952

    Change-Id: I26fca8515d866a7cc1638d07fa33bc04479ae221

commit 8d3faf549cba2f58c872ef4121b2481e73464010
Author: huangpengtao <email address hidden>
Date: Fri Aug 28 23:20:46 2015 +0800

    Replace "prt" variable by "port"

    the local variable prt is meaningless,
    and port is used popular.

    Change-Id: I20849102cf5b4d84433c46791b4b1e2a22dc4739

commit ee374e7a5f4dea538fcd942f5...

Thierry Carrez (ttx)
Changed in neutron:
milestone: none → liberty-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-3 → 7.0.0
Kevin Benton (kevinbenton) wrote :

Upgraded priority of this to High because it makes floating IPs randomly break after an agent restarts in a busy system.

Changed in neutron:
importance: Medium → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/312253

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/312254

OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (stable/kilo)

Change abandoned by Dave Walker (<email address hidden>) on branch: stable/kilo
Review: https://review.openstack.org/312254
Reason:
stable/kilo closed for 2015.1.4

This release is now pending its final release and no freeze exception has
been seen for this changeset. Therefore, I am now abandoning this change.

If this is not correct, please urgently raise a thread on openstack-dev.

More details at: https://wiki.openstack.org/wiki/StableBranch

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Dave Walker (<email address hidden>) on branch: stable/kilo
Review: https://review.openstack.org/312253
Reason:
stable/kilo closed for 2015.1.4

This release is now pending its final release and no freeze exception has
been seen for this changeset. Therefore, I am now abandoning this change.

If this is not correct, please urgently raise a thread on openstack-dev.

More details at: https://wiki.openstack.org/wiki/StableBranch
