HA network remains even if there is no more HA router

Bug #1367157 reported by Sylvain Afchain
This bug affects 10 people
Affects   Status         Importance   Assigned to   Milestone
neutron   Fix Released   Medium       Ann Taraday
tempest   Won't Fix      Undecided    Unassigned

Bug Description

Currently, when the last HA router of a tenant is deleted, the HA network belonging to this tenant is not removed. This happens both in the rollback of a failed router creation and in delete_router itself.

Revision history for this message
Carl Baldwin (carl-baldwin) wrote :
Changed in neutron:
importance: Undecided → Medium
tags: added: api
Changed in neutron:
status: New → Confirmed
Changed in neutron:
assignee: nobody → Eugene Nikanorov (enikanorov)
Revision history for this message
Assaf Muller (amuller) wrote :

Removed assignee. Eugene, if you'd like to have a crack at this, please take ownership again.

Changed in neutron:
assignee: Eugene Nikanorov (enikanorov) → nobody
Revision history for this message
Yair Fried (yfried) wrote :

This causes resource leakage in Tempest. HA networks aren't cleaned up, and when a limited VNI range or VLAN range is used, tests will start failing once the limit is reached.
Suggestion:
Introduce a cleaner to the tenant (not router!) cleanup code:

1. Search for the HA network of this tenant (neutron net-list --tenant-id " "). Note that the network has an empty tenant-id, but the tenant it belongs to is part of the network name.
2. If it exists, log a warning.
3. If CONF.network.cleanup_l3_ha is set, delete the network.

This has to be logged and has to be turned off by default.
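
The three steps above could be sketched roughly as follows. This is only an illustration, not Tempest code: the function name, the `delete_network` callback, and the dict layout of a network are hypothetical, and the name pattern `HA network tenant <tenant-id>` is an assumption about how the HA network is named.

```python
import logging
from typing import Callable, Iterable, List, Mapping

LOG = logging.getLogger(__name__)

# Assumption: the HA network name embeds the owning tenant ID in this form.
HA_NETWORK_NAME = "HA network tenant %s"


def cleanup_ha_networks(
    networks: Iterable[Mapping[str, str]],
    tenant_id: str,
    delete_network: Callable[[str], None],
    cleanup_l3_ha: bool = False,
) -> List[str]:
    """Warn about leaked HA networks of a tenant; delete them only if enabled.

    Returns the IDs of the networks that were deleted.
    """
    expected_name = HA_NETWORK_NAME % tenant_id
    deleted = []
    for net in networks:
        # Step 1: the network has an empty tenant_id, so match on its name.
        if net.get("tenant_id") or net.get("name") != expected_name:
            continue
        # Step 2: always log what was found.
        LOG.warning("Leaked HA network %s for tenant %s", net["id"], tenant_id)
        # Step 3: delete only when the (off by default) option is set.
        if cleanup_l3_ha:
            delete_network(net["id"])
            deleted.append(net["id"])
    return deleted
```

With the flag left at its default the function only logs, which matches the requirement that deletion be opt-in.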

Changed in neutron:
assignee: nobody → Sridhar Gaddam (sridhargaddam)
Revision history for this message
yong sheng gong (gongysh) wrote :

Agree with Yair Fried.
We should not try to delete the HA network when the last HA router is deleted; it risks a race condition with the creation of another HA router.

We can provide a tool to remove a tenant's resources.

Changed in neutron:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/207395

Vincent Hou (houshengbo)
Changed in tempest:
assignee: nobody → Vincent Hou (houshengbo)
assignee: Vincent Hou (houshengbo) → nobody
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Kyle Mestery (<email address hidden>) on branch: master
Review: https://review.openstack.org/207395
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Ryan Moats (rmoats)
tags: added: kilo-backport-potential liberty-backport-potential
Changed in neutron:
assignee: Sridhar Gaddam (sridhargaddam) → Ann Kamyshnikova (akamyshnikova)
Changed in neutron:
assignee: Ann Kamyshnikova (akamyshnikova) → Assaf Muller (amuller)
Assaf Muller (amuller)
Changed in neutron:
assignee: Assaf Muller (amuller) → Ann Kamyshnikova (akamyshnikova)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/207395
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=853f7d7a74a2281b0cd34c337362e73635e501a2
Submitter: Jenkins
Branch: master

commit 853f7d7a74a2281b0cd34c337362e73635e501a2
Author: sridhargaddam <email address hidden>
Date: Thu Jul 30 10:54:39 2015 +0000

    Delete HA network when last HA router is deleted

    Currently, when the last HA router of a tenant is deleted, the HA network
    belonging to this tenant is not removed. While running tempest against
    an OpenStack setup where tenant VLANs (with a small VLAN range) are
    used, we hit the limits and tempest tests start to fail as we cannot
    create new networks. This patch addresses this issue by deleting the HA
    network when the last HA router is deleted for the tenant.

    Closes-Bug: #1367157

    Co-Authored-By: Ann Kamyshnikova <email address hidden>

    Change-Id: I1d50b973aed4148857ac3d2bbee0d38e2e199783
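
The behaviour described in this commit message can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the Neutron implementation: the `HAStore` class, its fields, and the generated network IDs are hypothetical, and it ignores the concurrency concerns raised earlier in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class HAStore:
    """Toy model: one HA network per tenant, shared by its HA routers."""

    # tenant_id -> IDs of that tenant's HA routers
    routers: Dict[str, Set[str]] = field(default_factory=dict)
    # tenant_id -> ID of that tenant's HA network
    ha_networks: Dict[str, str] = field(default_factory=dict)

    def create_router(self, tenant_id: str, router_id: str) -> None:
        # The HA network is created lazily with the tenant's first HA router.
        self.ha_networks.setdefault(tenant_id, f"ha-net-{tenant_id}")
        self.routers.setdefault(tenant_id, set()).add(router_id)

    def delete_router(self, tenant_id: str, router_id: str) -> None:
        remaining = self.routers.get(tenant_id, set())
        remaining.discard(router_id)
        if not remaining:
            # Last HA router is gone: release the HA network as well,
            # so its VLAN/VNI segment becomes available again.
            self.routers.pop(tenant_id, None)
            self.ha_networks.pop(tenant_id, None)
```

In this model the network outlives every router deletion except the last one, which is the point at which the segment is returned to the pool.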

Changed in neutron:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/250266

Revision history for this message
Ihar Hrachyshka (ihar-hrachyshka) wrote :

Removed the Kilo tag since this does not seem especially critical; the networks are still available for explicit cleanup.

tags: removed: kilo-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/250266
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=478012a397029077e8444fb2872a78988d492f74
Submitter: Jenkins
Branch: stable/liberty

commit 478012a397029077e8444fb2872a78988d492f74
Author: sridhargaddam <email address hidden>
Date: Thu Jul 30 10:54:39 2015 +0000

    Delete HA network when last HA router is deleted

    Currently, when the last HA router of a tenant is deleted, the HA network
    belonging to this tenant is not removed. While running tempest against
    an OpenStack setup where tenant VLANs (with a small VLAN range) are
    used, we hit the limits and tempest tests start to fail as we cannot
    create new networks. This patch addresses this issue by deleting the HA
    network when the last HA router is deleted for the tenant.

    Closes-Bug: #1367157

    Co-Authored-By: Ann Kamyshnikova <email address hidden>

    Change-Id: I1d50b973aed4148857ac3d2bbee0d38e2e199783
    (cherry picked from commit 853f7d7a74a2281b0cd34c337362e73635e501a2)

tags: added: in-stable-liberty
Revision history for this message
Thierry Carrez (ttx) wrote : Fix included in openstack/neutron 8.0.0.0b1

This issue was fixed in the openstack/neutron 8.0.0.0b1 development milestone.

Changed in neutron:
status: Fix Committed → Fix Released
Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 7.0.1

This issue was fixed in the openstack/neutron 7.0.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/254586

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/254586
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f54cba053556a43d51ccd895cdf8232c51210299
Submitter: Jenkins
Branch: master

commit f54cba053556a43d51ccd895cdf8232c51210299
Author: LIU Yulong <email address hidden>
Date: Tue Dec 8 14:13:44 2015 +0800

    Catch known exceptions during deleting last HA router

    In some scenarios, for instance the rally test create_and_delete_routers,
    the router delete API call raises exceptions, such as the network in
    use exception, even though the router has actually been deleted. There
    is a race between HA router creation and deletion, and the more API and
    RPC workers are configured, the more frequently the race raises an
    exception. Because the inconsistent error message is not useful to the
    user, this patch catches the known exceptions ObjectDeletedError and
    NetworkInUse when the user deletes the last HA router.

    In addition, when a user creates the first HA router and HA network
    creation fails, the router is deleted, and the subsequent HA network
    deletion raises AttributeError. This patch therefore also moves the HA
    network deletion under the check that ha_network exists.

    Change-Id: I8cda00c1e7caffc4dfb20a817a11c60736855bb5
    Closes-Bug: #1523780
    Related-Bug: #1367157
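
A minimal sketch of the pattern this commit message describes: guard the deletion behind an existence check and swallow the two known race-related exceptions. The exception classes below are local stand-ins so the example is self-contained, and `safe_delete_ha_network` and its `delete_network` callback are hypothetical names, not the functions touched by the patch.

```python
import logging
from typing import Callable, Mapping, Optional

LOG = logging.getLogger(__name__)


class NetworkInUse(Exception):
    """Stand-in: another HA router still has ports on the network."""


class ObjectDeletedError(Exception):
    """Stand-in: a concurrent request already removed the network row."""


def safe_delete_ha_network(
    ha_network: Optional[Mapping[str, str]],
    delete_network: Callable[[str], None],
) -> bool:
    """Try to delete the HA network; return True only if it was deleted."""
    # Existence check: a failed HA network creation leaves nothing to delete,
    # and dereferencing None here would raise AttributeError/TypeError.
    if ha_network is None:
        return False
    try:
        delete_network(ha_network["id"])
    except NetworkInUse:
        # A concurrent create started using the network; keep it.
        LOG.debug("HA network %s is in use, not deleting", ha_network["id"])
        return False
    except ObjectDeletedError:
        # A concurrent delete won the race; nothing left to do.
        LOG.debug("HA network %s was already deleted", ha_network["id"])
        return False
    return True
```

Any other exception still propagates, so only the outcomes known to be harmless are hidden from the user.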

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/liberty)

Related fix proposed to branch: stable/liberty
Review: https://review.openstack.org/259580

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/259580
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=ff021e7e4c7cc28bf4524313d85f521d6ed4c3eb
Submitter: Jenkins
Branch: stable/liberty

commit ff021e7e4c7cc28bf4524313d85f521d6ed4c3eb
Author: LIU Yulong <email address hidden>
Date: Tue Dec 8 14:13:44 2015 +0800

    Catch known exceptions during deleting last HA router

    In some scenarios, for instance the rally test create_and_delete_routers,
    the router delete API call raises exceptions, such as the network in
    use exception, even though the router has actually been deleted. There
    is a race between HA router creation and deletion, and the more API and
    RPC workers are configured, the more frequently the race raises an
    exception. Because the inconsistent error message is not useful to the
    user, this patch catches the known exceptions ObjectDeletedError and
    NetworkInUse when the user deletes the last HA router.

    In addition, when a user creates the first HA router and HA network
    creation fails, the router is deleted, and the subsequent HA network
    deletion raises AttributeError. This patch therefore also moves the HA
    network deletion under the check that ha_network exists.

    Change-Id: I8cda00c1e7caffc4dfb20a817a11c60736855bb5
    Closes-Bug: #1523780
    Related-Bug: #1367157
    (cherry picked from commit f54cba053556a43d51ccd895cdf8232c51210299)

tags: added: kilo-backport-potential
tags: removed: liberty-backport-potential
Revision history for this message
David Paterson (davpat2112) wrote :

From what I am seeing, the HA networks are still being left behind, even though I verified that the patched code from https://review.openstack.org/#/c/259580/ is present. After running tempest there are many HA networks left behind.

Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

This bug seems to be fixed on the Neutron side, so it makes sense to close the tempest bug at this time.

Changed in tempest:
status: New → Won't Fix