[L3 HA] Inconsistent state of neutron database after running rally create_and_list_routers

Bug #1528208 reported by Eugene Nikanorov
This bug affects 1 person
Affects             Status        Importance  Assigned to  Milestone
Mirantis OpenStack  Fix Released  High        MOS Neutron
8.0.x               Fix Released  High        MOS Neutron
9.x                 Invalid       High        MOS Neutron

Bug Description

Probably due to the issue described in bug 1528207, the neutron database may get into an inconsistent state:
a port gets created with device_id = <router-id>, while the router itself has already been deleted.

As a result, such a port can't be deleted, and the corresponding HA network can't be deleted either without cleaning the DB manually.
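
The inconsistent state described above can be illustrated with a small query that looks for router-owned ports whose device_id no longer matches any router. This is only a sketch under stated assumptions: the two tables below are heavily simplified stand-ins for neutron's `routers` and `ports` tables, it runs against an in-memory SQLite database rather than a real deployment, and the helper names (`find_orphaned_router_ports`, `build_demo_db`) are hypothetical.

```python
import sqlite3

# Assumption: simplified stand-ins for neutron's routers and ports tables.
# The real schema has many more columns and constraints.
SCHEMA = """
CREATE TABLE routers (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE ports (
    id TEXT PRIMARY KEY,
    network_id TEXT,
    device_id TEXT,
    device_owner TEXT
);
"""

# Router-owned ports whose device_id points at a router that no longer exists.
ORPHANED_ROUTER_PORTS = """
SELECT p.id, p.network_id, p.device_id
FROM ports AS p
LEFT JOIN routers AS r ON r.id = p.device_id
WHERE p.device_owner LIKE 'network:router%'
  AND r.id IS NULL
ORDER BY p.id
"""


def find_orphaned_router_ports(conn):
    """Return (port_id, network_id, device_id) rows for orphaned router ports."""
    return conn.execute(ORPHANED_ROUTER_PORTS).fetchall()


def build_demo_db():
    """Build a tiny in-memory database that reproduces the orphaned-port state."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO routers VALUES ('router-1', 'alive')")
    conn.executemany(
        "INSERT INTO ports VALUES (?, ?, ?, ?)",
        [
            # Healthy HA port: its router still exists.
            ("port-1", "ha-net-1", "router-1", "network:router_ha_interface"),
            # Orphaned HA port: router-2 is not in the routers table.
            ("port-2", "ha-net-2", "router-2", "network:router_ha_interface"),
            # Not a router port, so the query ignores it.
            ("port-3", "net-3", "vm-1", "compute:nova"),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in find_orphaned_router_ports(build_demo_db()):
        print(row)
```

In the demo data only `port-2` is reported, because its device_id refers to a router that is absent from the `routers` table.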

Eugene Nikanorov (enikanorov) wrote :

Setting to medium priority since this issue was discovered after I fixed the API part of the router creation issue.
It might be a problem with that fix.
However, to close this bug we'll need to verify that after rally tests we are able to get a clean neutron database without orphaned records.
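
A post-run check along these lines could be sketched as follows. It assumes the port and router records have already been fetched (for example from the API as an admin user) into plain lists of dicts with `id`, `device_id` and `device_owner` keys; the function names `orphaned_ports` and `check_clean` are hypothetical and not part of rally or neutron.

```python
def orphaned_ports(ports, routers):
    """Return router-owned ports whose device_id matches no existing router."""
    router_ids = {r["id"] for r in routers}
    return [
        p for p in ports
        # device_owner may be missing or None, so fall back to an empty string.
        if (p.get("device_owner") or "").startswith("network:router")
        and p.get("device_id") not in router_ids
    ]


def check_clean(ports, routers):
    """Raise AssertionError listing the orphaned port IDs, if there are any."""
    leftovers = orphaned_ports(ports, routers)
    if leftovers:
        ids = ", ".join(sorted(p["id"] for p in leftovers))
        raise AssertionError("orphaned router ports: " + ids)


if __name__ == "__main__":
    routers = [{"id": "router-1"}]
    ports = [
        {"id": "port-1", "device_id": "router-1",
         "device_owner": "network:router_ha_interface"},
        {"id": "port-2", "device_id": "vm-1", "device_owner": "compute:nova"},
    ]
    check_clean(ports, routers)  # passes: every router port has a router
```

Running such a check after the rally scenario has cleaned up would fail loudly if any router port outlived its router.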

tags: added: scale
Changed in mos:
assignee: nobody → MOS Neutron (mos-neutron)
milestone: none → 8.0
status: New → Confirmed
Alexander Ignatov (aignatov) wrote :

Moved to High priority since it's a scale bug.

Changed in mos:
importance: Medium → High
tags: added: area-neutron
removed: neutron
Alexander Ignatov (aignatov) wrote :

Possible candidate to be moved to mu-1; we need another run on the scale lab to verify and debug.

Kevin Benton (kevinbenton) wrote :

If we can get https://review.openstack.org/#/c/274570/ merged, this will at least make it much easier to clean up if it still happens.

Kevin Benton (kevinbenton) wrote :

I also found another code path that could conceivably leave behind orphaned ports during a delete_router call.

The bug is here: https://bugs.launchpad.net/neutron/+bug/1540271

The fix to prevent that one is here: https://review.openstack.org/#/c/274571/

Kevin Benton (kevinbenton) wrote :

The fix for 1528207 (the likely cause of this) is here: https://review.openstack.org/#/c/267173/

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/neutron (openstack-ci/fuel-8.0/liberty)

Fix proposed to branch: openstack-ci/fuel-8.0/liberty
Change author: Oleg Bondarev <email address hidden>
Review: https://review.fuel-infra.org/16602

tags: added: hit-hcf
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/neutron (openstack-ci/fuel-8.0/liberty)

Reviewed: https://review.fuel-infra.org/16602
Submitter: Pkgs Jenkins <email address hidden>
Branch: openstack-ci/fuel-8.0/liberty

Commit: f82049048bb2eb959000f531ad8bd91dd2c3ed24
Author: Oleg Bondarev <email address hidden>
Date: Tue Feb 2 07:55:16 2016

Merge the tip of origin/stable/liberty into origin/openstack-ci/fuel-8.0/liberty

Note: commit ed7ad25 Revert "Revert "Revert "Remove TEMPEST_CONFIG_DIR in the api tox env"""
sets min tox version to 2.3.1 while we currently use 1.9.2.
This patch sets it back to 1.9.2 in order for tests to pass.
Will be reverted back to 2.3.1 once https://bugs.launchpad.net/fuel/+bug/1540516
is fixed.

8476f6f Add relationship between port and floating ip
0bd401c DVR: optimize check_ports_exist_on_l3_agent()
9246cff Change check_ports_exist_on_l3agent to pass the subnet_ids
a133de3 Keep reading stdout/stderr until after kill
ed7ad25 Revert "Revert "Revert "Remove TEMPEST_CONFIG_DIR in the api tox env"""
05f8099 Ensure that tunnels are fully reset on ovs restart
b908c55 Update HA router state if agent is not active
aebd27f Resync L3, DHCP and OVS/LB agents upon revival
8e685c8 Fix floatingip status for an HA router
80c9e84 DVR:Fix _notify_l3_agent_new_port for proper arp update
c12bf81 Fix L3 HA with IPv6
2298566 Make object creation methods in l3_hamode_db atomic
0cc889f Cache the ARP entries in L3 Agent for DVR
8bde9c4 Cleanup veth-pairs in default netns for functional tests
2b96f42 Do not prohibit VXLAN over IPv6
1ab1e58 Fix get_subnet_for_dvr() to return correct gateway mac
3b42dee Imported Translations from Zanata
ca193d0 Revert "Change function call order in ovs_neutron_agent."
96d4ab3 Remove check on dhcp enabled subnets while scheduling dvr
f5299d3 Check gateway ip when update subnet
0d5d7c7 Add tests that constrain db query count
8fb3f9d Don't call add_ha_port inside a transaction
a370fa3 Log INFO message when setting admin state up flag to False for OVS port
bf92dbb DVR: notify specific agent when deleting floating ip
99d1c0d Call _allocate_vr_id outside of transaction
2468b3d Move notifications before DB retry decorator
1b609d2 DVR: handle dvr serviceable port's host change
ad75ccc Imported Translations from Zanata
2e6e135 Run functional gate jobs in a constrained environment
6902c87 DVR: notify specific agent when creating floating ip
00b800d Tox: Remove fullstack env, keep only dsvm-fullstack
d11e9cb Force L3 agent to resync router it could not configure
42f4332 Support migrating of legacy routers to HA and back
4d85fa1 Updated from global requirements
f175cd7 ML2: Add tests to validate quota usage tracking
79d4a08 test_migrations: Avoid returning a filter object for python3
745b546 Do not autoreschedule routers if l3 agent is back online
430892a Avoid full_sync in l3_agent for router updates
fa9fba2 In port_dead, handle case when port already deleted
1d8aff3 Add compatibility with iproute2 >= 4.0

Conflicts:
 neutron/db/l3_dvr_db.py
 neutron/db/l3_hamode_db.py
 neutron/tests/functional/services/l3_router/test_l3_dvr_router_plugin.py
 neutron/tests/unit/agent/l3/test_agent.py

Closes-Bug: #1496341
Closes-Bug: #1531244
Closes-Bug: #1527581
Closes-Bug: #1528201
Closes-Bug: #1528207
Closes...


Roman Podoliaka (rpodolyaka) wrote :

Fix merged ^

Oleg Bondarev (obondarev) wrote :

Marking as invalid for 9.0, as the fix is already in Mitaka.

Ivan Lozgachev (ilozgachev) wrote :

Verified on ENV13 build 518
