L3 agent can't handle updates that change floating ip id

Bug #1209011 reported by Carl Baldwin
This bug affects 1 person
Affects   Status         Importance   Assigned to    Milestone
neutron   Fix Released   High         Carl Baldwin
Havana    Fix Released   High         Carl Baldwin

Bug Description

The problem occurs when an update arrives in which a new floating ip id carries the same (reused) IP address as an old floating ip. In short: same address, different floating ip id. We have seen this in testing when the floating ip free pool has become small and creates and deletes arrive in quick succession.

The agent skips calling "ip addr add" for the address because the address is already present on the device. It then calls "ip addr del" and removes the address from the qrouter's gateway interface. It should not do this: the address is still in use, and the floating ip is left in a non-working state.

Later, when the floating ip is disassociated from the port, the agent attempts to remove the address from the device. This raises an exception, which is caught further up the call stack. The exception prevents the iptables code from removing the DNAT entry for the floating ip.

2013-07-23 09:20:06.094 3109 DEBUG quantum.agent.linux.utils [-] Running command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-2b75022a-3721-443f-af99-ec648819d080', 'ip', '-4', 'addr', 'del', '15.184.103.155/32', 'dev', 'qg-c847c5a7-62'] execute /usr/lib/python2.7/dist-packages/quantum/agent/linux/utils.py:42
2013-07-23 09:20:06.179 3109 DEBUG quantum.agent.linux.utils [-]
Command: ['sudo', 'quantum-rootwrap', '/etc/quantum/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-2b75022a-3721-443f-af99-ec648819d080', 'ip', '-4', 'addr', 'del', '15.184.103.155/32', 'dev', 'qg-c847c5a7-62']
Exit code: 2
Stdout: ''
Stderr: 'RTNETLINK answers: Cannot assign requested address\n' execute /usr/lib/python2.7/dist-packages/quantum/agent/linux/utils.py:59

From this point on, the DNAT entries in iptables stay in a bad state, which sometimes prevents other floating ip addresses from being attached to the same instance.
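
The sequence described above can be sketched as a small standalone simulation. This is not the agent's code: `FakeDevice`, `buggy_sync`, and the id-keyed bookkeeping are hypothetical names and structure, assumed here only to show how skipping the add and then performing the delete leaves the address missing, and why the later delete fails.

```python
class FakeDevice:
    """Hypothetical stand-in for the qrouter gateway interface."""

    def __init__(self):
        self.addresses = set()

    def add(self, cidr):
        self.addresses.add(cidr)

    def delete(self, cidr):
        if cidr not in self.addresses:
            # Stands in for the "ip addr del" failure seen in the log.
            raise RuntimeError("Cannot assign requested address")
        self.addresses.remove(cidr)


def buggy_sync(device, known, updated):
    """Sync keyed on floating ip id (assumed logic, for illustration).

    known and updated map floating ip id -> address in CIDR form.
    Returns the new bookkeeping dict.
    """
    for fip_id, cidr in updated.items():
        # The add is skipped when the address is already on the device.
        if fip_id not in known and cidr not in device.addresses:
            device.add(cidr)
    for fip_id, cidr in known.items():
        # The old id is gone, so its address is removed, even though
        # a new id now uses the same address.
        if fip_id not in updated:
            device.delete(cidr)
    return dict(updated)


cidr = "15.184.103.155/32"
device = FakeDevice()
device.add(cidr)

# Same address, different floating ip id.
known = buggy_sync(device, {"fip-old": cidr}, {"fip-new": cidr})
address_missing = cidr not in device.addresses

# Disassociating the floating ip now tries to delete a missing address.
try:
    buggy_sync(device, known, {})
    delete_failed = False
except RuntimeError:
    delete_failed = True
```

In this sketch the bookkeeping still lists the new floating ip id after the first update, while the device no longer carries its address, so the second update fails in the same way as the logged "ip addr del" command.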

I have a fix for this that is currently in testing. I will submit it for review when it is ready.

Tags: l3-ipam-dhcp
Changed in neutron:
assignee: nobody → Carl Baldwin (carl-baldwin)
Changed in neutron:
importance: Undecided → High
status: New → Triaged
tags: added: l3-ipam-dhcp
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/40797

Changed in neutron:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/41584

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/41584
Committed: http://github.com/openstack/neutron/commit/9382ee659212285a203550cf60476dd146d27a29
Submitter: Jenkins
Branch: master

commit 9382ee659212285a203550cf60476dd146d27a29
Author: Carl Baldwin <email address hidden>
Date: Tue Aug 13 00:11:29 2013 +0000

    Refactor configuring of floating ips on a router.

    This approach to configuring floating ips is stateless and idempotent.
    This allows it to handle corner cases, such as reusing a floating ip
    address with a different floating ip id in a way that is easier to
    understand.

    The concept is to wipe the floating ips clean and rebuild them each
    time with the following optimizations. To avoid bad performance in
    manipulating iptables, it is called in the context of a call to
    defer_apply_on. To avoid a disruption in network flow a set
    difference is used to determine the set of addresses that no longer
    belong on the interface rather than removing them all blindly.

    Change-Id: I0cfb58d487b1925e0a0db2a701c5ea3c56a0b2b5
    Fixes: Bug #1209011
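
The set-difference idea in the commit message can be illustrated with a minimal sketch. It is not the code from the commit: `plan_address_changes` is a hypothetical helper, the `floating_ip_address` key and the `/32` suffix are assumptions about the data shape, and the sketch assumes `existing_cidrs` holds only floating ip addresses. The extra addresses ending in .156 and .157 are made up for the example.

```python
def plan_address_changes(existing_cidrs, floating_ips):
    """Return (to_add, to_remove) for the gateway device.

    The plan depends only on what is on the device now and what should
    be there, not on floating ip ids or remembered state.
    """
    desired = {fip["floating_ip_address"] + "/32" for fip in floating_ips}
    existing = set(existing_cidrs)
    # Addresses in both sets are left alone, so traffic is not disrupted.
    return desired - existing, existing - desired


existing = {"15.184.103.155/32", "15.184.103.156/32"}
floating_ips = [
    # Reused address under a new floating ip id: nothing to add or remove.
    {"id": "fip-new", "floating_ip_address": "15.184.103.155"},
    {"id": "fip-other", "floating_ip_address": "15.184.103.157"},
]

to_add, to_remove = plan_address_changes(existing, floating_ips)

# Applying the plan and planning again yields no further changes.
after = (existing | to_add) - to_remove
again_add, again_remove = plan_address_changes(after, floating_ips)
```

Because the plan is computed from addresses rather than ids, a reused address under a new floating ip id falls into neither set and stays on the interface.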

Changed in neutron:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/42412

Changed in neutron:
milestone: none → havana-3
Carl Baldwin (carl-baldwin) wrote :

Mark, note that the fix committed status is wrong since the initial fix was reverted.

Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Carl Baldwin (carl-baldwin) wrote :

Please reopen. The fix has not been released. The first fix was reverted due to a gate breakage and the follow-on fix has suffered from lack of reviews.

Changed in neutron:
milestone: havana-3 → havana-rc1
status: Fix Released → In Progress
tags: added: havana-rc-potential
Changed in neutron:
milestone: havana-rc1 → none
Thierry Carrez (ttx)
tags: added: havana-backport-potential
removed: havana-rc-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/havana)

Fix proposed to branch: stable/havana
Review: https://review.openstack.org/53689

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/42412
Committed: http://github.com/openstack/neutron/commit/a65188fab01f29d095031abbc8d1d194548cd8be
Submitter: Jenkins
Branch: master

commit a65188fab01f29d095031abbc8d1d194548cd8be
Author: Carl Baldwin <email address hidden>
Date: Fri Sep 27 04:04:31 2013 +0000

    Refactor configuring of floating ips on a router

    This approach to configuring floating ips is stateless and idempotent.
    This allows it to handle corner cases, such as reusing a floating ip
    address with a different floating ip id in a way that is easier to
    understand.

    The concept is to wipe the floating ips clean and rebuild them each
    time with the following optimizations. To avoid bad performance in
    manipulating iptables, it is called in the context of a call to
    defer_apply_on. To avoid a disruption in network flow a set
    difference is used to determine the set of addresses that no longer
    belong on the interface rather than removing them all blindly.

    Change-Id: I98aacbbb52b35688036990961d02e0b273504a77
    Fixes: Bug #1209011

Changed in neutron:
status: In Progress → Fix Committed
Changed in neutron:
milestone: none → icehouse-1
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/havana)

Reviewed: https://review.openstack.org/53689
Committed: http://github.com/openstack/neutron/commit/38211a2f4167ca12850f6e771899fffce6da72f2
Submitter: Jenkins
Branch: stable/havana

commit 38211a2f4167ca12850f6e771899fffce6da72f2
Author: Carl Baldwin <email address hidden>
Date: Fri Sep 27 04:04:31 2013 +0000

    Refactor configuring of floating ips on a router

    This approach to configuring floating ips is stateless and idempotent.
    This allows it to handle corner cases, such as reusing a floating ip
    address with a different floating ip id in a way that is easier to
    understand.

    The concept is to wipe the floating ips clean and rebuild them each
    time with the following optimizations. To avoid bad performance in
    manipulating iptables, it is called in the context of a call to
    defer_apply_on. To avoid a disruption in network flow a set
    difference is used to determine the set of addresses that no longer
    belong on the interface rather than removing them all blindly.

    Change-Id: I0cfb58d487b1925e0a0db2a701c5ea3c56a0b2b5
    Fixes: Bug #1209011

tags: added: in-stable-havana
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Alan Pevec (apevec)
tags: removed: havana-backport-potential in-stable-havana
Thierry Carrez (ttx)
Changed in neutron:
milestone: icehouse-1 → 2014.1