[L3] floating IP failed to bind due to no agent gateway port(fip-ns)
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Ubuntu Cloud Archive | Fix Released | Undecided | Unassigned | | |
Ussuri | Fix Released | Undecided | Unassigned | | |
Victoria | Fix Released | Undecided | Unassigned | | |
neutron | Fix Released | Medium | Unassigned | | |
neutron (Ubuntu) | Fix Released | Undecided | Unassigned | | |
Focal | Fix Released | Undecided | Hemanth Nakkina | | |
Groovy | Fix Released | Undecided | Unassigned | | |
Hirsute | Fix Released | Undecided | Unassigned | | |
Impish | Fix Released | Undecided | Unassigned | | |
Bug Description
Patch [1] introduced a DB unique constraint on the binding for the L3
agent gateway. In some extreme cases the DvrFipGatewayPo
is in the DB while the gateway port is not. The current code path only checks
that the binding exists, so it passes a "None" port to the following
code path, which results in an AttributeError.
[1] https:/
Exception log:
2020-06-11 15:39:28.361 1285214 INFO neutron.
2020-06-11 15:39:28.370 1285214 DEBUG neutron.
2020-06-11 15:39:28.390 1285214 DEBUG neutron.
2020-06-11 15:39:28.391 1285214 ERROR oslo_messaging.
-------
[SRU]
[Impact]
In some cases the DvrFipGatewayPo
This resulted in connectivity issues to FIP for the new VMs launched on that compute node.
The fix creates the gateway port if it does not exist.
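The failure mode and the fix can be illustrated with a minimal sketch. This is not the neutron code: `FakePortStore`, `get_gateway_port_buggy` and `get_or_create_gateway_port` are hypothetical names, and the in-memory dictionaries only stand in for the binding and port tables. It assumes a binding row can outlive its port row, as described above.

```python
class FakePortStore:
    """Hypothetical in-memory stand-in for the binding and port tables."""

    def __init__(self):
        self.bindings = {}  # (network_id, host) -> port_id
        self.ports = {}     # port_id -> port dict
        self._next_id = 0

    def create_gateway_port(self, network_id, host):
        # Create the port row and (re)point the binding at it.
        self._next_id += 1
        port_id = f"port-{self._next_id}"
        port = {"id": port_id, "network_id": network_id, "host": host}
        self.ports[port_id] = port
        self.bindings[(network_id, host)] = port_id
        return port


def get_gateway_port_buggy(store, network_id, host):
    # Only the binding is checked. If the binding exists but the port row
    # is gone, None is returned and the caller fails on attribute access.
    port_id = store.bindings.get((network_id, host))
    if port_id is None:
        return store.create_gateway_port(network_id, host)
    return store.ports.get(port_id)


def get_or_create_gateway_port(store, network_id, host):
    # Check the port itself, not just the binding, and create the port
    # when it is missing (stale binding or no binding at all).
    port_id = store.bindings.get((network_id, host))
    port = store.ports.get(port_id) if port_id is not None else None
    if port is None:
        port = store.create_gateway_port(network_id, host)
    return port
```

With a stale binding, the first function returns `None` while the second returns a freshly created port and updates the binding to match.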
[Test Plan]
This is a race condition and difficult to reproduce. The test case simulated the error condition to verify the fix.
* Deploy OpenStack with DVR, L3 HA and centralised SNAT on the neutron nodes
* Deploy instances and delete them. This step ensures that FIP agent gateways are created on the compute nodes
Run the following command to see the FIP agent gateway information
openstack port list --network ext_net -c id -c device_id -c binding_host_id -c device_owner -c fixed_ips | grep floatingip_
* Pick one of the compute nodes that has no instances and delete its FIP agent gateway port (the port ID can be determined from the command above)
openstack port delete <port id>
* Launch an instance on the compute node
openstack server create --wait --image cirros --flavor m1.cirros --nic net-id=<network id> --availability-zone nova:<hostname> cirros-test1
* Check the neutron-server logs for the error
ERROR oslo_messaging.
* Assign a floating IP and try to ping it; the ping fails
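When checking many compute nodes, the port listing step can be scripted. The sketch below is only an illustration: it assumes the port list output has already been parsed into dictionaries whose keys match the column names used in the command above (`id`, `binding_host_id`, `device_owner`), and the helper names are hypothetical. The `device_owner` value is the one neutron uses for FIP agent gateway ports.

```python
# device_owner value of FIP agent gateway ports.
AGENT_GW_OWNER = "network:floatingip_agent_gateway"


def agent_gateway_ports_by_host(ports):
    """Group FIP agent gateway port IDs by the host they are bound to."""
    by_host = {}
    for port in ports:
        if port.get("device_owner") == AGENT_GW_OWNER:
            host = port.get("binding_host_id")
            by_host.setdefault(host, []).append(port["id"])
    return by_host


def hosts_missing_gateway(ports, hosts):
    """Return the hosts that have no FIP agent gateway port, sorted."""
    present = agent_gateway_ports_by_host(ports)
    return sorted(h for h in hosts if h not in present)
```

After deleting the gateway port in the test plan, the chosen compute node should show up in the result of `hosts_missing_gateway`.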
[Where problems could occur]
The fix itself adds an extra check to determine when the gateway port needs to be created,
so it is not expected to cause any regression.
Changed in neutron: | |
assignee: | nobody → LIU Yulong (dragon889) |
tags: | added: l3-dvr-backlog |
Changed in neutron (Ubuntu Impish): | |
status: | New → Fix Released |
Changed in neutron (Ubuntu Hirsute): | |
status: | New → Fix Released |
Changed in neutron (Ubuntu Groovy): | |
status: | New → Fix Released |
tags: | added: sts |
description: | updated |
Changed in cloud-archive: | |
status: | New → Fix Released |
Fix proposed to branch: master
Review: https://review.opendev.org/735432