Associating Floating IP with a port can result in duplicate Floating IPs, due to the original FIP not being removed from the SNAT namespace.

Bug #1821299 reported by piotrrr
This bug affects 1 person
Affects: neutron
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Associating Floating IP with a port can result in duplicate Floating IPs, due to the original FIP not being removed from the SNAT namespace. This is likely specific to using DVR.

We're creating a Heat stack containing, among other things, a Floating IP and a Port.

  head_a_floating_ip:
    properties:
      floating_network_id: cf0a6df9-b533-457b-8cd7-0336f0649213
      port_id:
        get_resource: head_a_external_port
    type: OS::Neutron::FloatingIP
  head_a_external_port:
    properties:
      network_id:
        get_resource: external_net
      port_security_enabled: true
      replacement_policy: AUTO
      security_groups:
      - get_resource: head_sec_group
    type: OS::Neutron::Port

During this initial stack creation, we are not creating any VMs. So, the port is not attached to any device.

It looks like, because of these two lines in the floating IP definition:
      port_id:
        get_resource: head_a_external_port
the Floating IP gets allocated in the SNAT namespace of one of the hypervisors after the initial stack creation, and starts to respond to ARP requests.

However, as soon as we update the stack to add a VM and attach the above-mentioned port to it, something unexpected happens. Since we're running DVR, Neutron allocates the FIP on the hypervisor hosting the VM, as expected. However, Neutron fails to remove the FIP it created in the SNAT namespace during the initial stack creation.

This results in the FIP being present on two different hypervisors, causing duplicate ARP replies (one MAC in the SNAT namespace, the other in the floating IP namespace) and obvious connectivity issues.
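For illustration, the stack update that attaches the port to a VM could look roughly like the fragment below. This is only a sketch: the resource name head_a_server and the image and flavor values are hypothetical placeholders, and the enclosing `resources:` key is assumed from the usual HOT layout. Only head_a_external_port comes from the template above.

```yaml
resources:
  # Hypothetical server resource added during the stack update.
  # The name, image, and flavor are placeholders, not values from this report.
  head_a_server:
    type: OS::Nova::Server
    properties:
      image: example-image      # placeholder
      flavor: example-flavor    # placeholder
      networks:
        # Attaching the existing port makes it part of the VM. At this point
        # the FIP is expected to be brought up on the VM's hypervisor and
        # removed from the SNAT namespace.
        - port:
            get_resource: head_a_external_port
```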

Note that the issue does not appear if the initial FIP happens to land in the SNAT namespace of the same hypervisor that later (after the stack update) also hosts the VM.

A simple, confirmed workaround is to NOT include these two lines during the initial Heat stack creation, and to include them only in the stack update in which we add the VM:
      port_id:
        get_resource: head_a_external_port
When these lines are left out of the initial stack, Neutron does not allocate the FIP anywhere.
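The workaround can be sketched as two versions of the floating IP resource, based on the template fragment earlier in this description. The enclosing `resources:` key is assumed from the usual HOT layout. The rest of the stack is omitted.

```yaml
# Initial stack creation: port_id is left out, so the FIP is created
# without being associated with the port.
resources:
  head_a_floating_ip:
    type: OS::Neutron::FloatingIP
    properties:
      floating_network_id: cf0a6df9-b533-457b-8cd7-0336f0649213
```

In the stack update that also adds the VM, the association is added back:

```yaml
# Stack update: port_id is now included, at the same time the port
# becomes part of the VM.
resources:
  head_a_floating_ip:
    type: OS::Neutron::FloatingIP
    properties:
      floating_network_id: cf0a6df9-b533-457b-8cd7-0336f0649213
      port_id:
        get_resource: head_a_external_port
```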

Environment: Neutron Pike (11.0.5), with DVR, OVS, VLAN-based isolation.

piotrrr (piotrrr)
description: updated
Miguel Lavalle (minsel) wrote :

Hi Piotrrr,

The behaviour of creating the FIP in a snat namespace is expected. What is not expected is that the FIP is not migrated to the corresponding compute when the port is associated with a VM. There is a previously filed bug that describes the same situation as yours (without the heat templates): https://bugs.launchpad.net/neutron/+bug/1718788. It was fixed with this patch https://review.openstack.org/#/q/I6b1f3ffc3c3336035632f6a82d3a87b3be57b403 during the Queens cycle. As you can see, it was backported to Pike. Do you have this fix?

Marking incomplete until we hear back from submitter

Changed in neutron:
status: New → Incomplete
piotrrr (piotrrr) wrote :

Hi Miguel,

Thanks for your reply.

We have confirmed that we do have the patch you mentioned.

Additionally, please note that the bug you mentioned describes a slightly different situation.

In the bug you mentioned the FIP was never migrated to the host where the VM resides.

This is *not* what we're seeing.

In our case the FIP *is* properly brought up in the host where the VM resides (as expected), but it is never removed from the snat namespace (not expected).

So, that does look like a slightly different bug.

For what it's worth, we have several hypervisors. Most of them run with 'agent_mode=dvr', but two of them run with 'agent_mode=dvr_snat'.

Is there anything specific we can check on our end to help with troubleshooting this?

Thanks!

Miguel Lavalle (minsel)
Changed in neutron:
status: Incomplete → New