[ovn-octavia-provider] loadbalancer stuck in PENDING_X if delete_vip_port fails

Bug #1965732 reported by Luis Tomas Bolivar
This bug affects 1 person
Affects   Status         Importance   Assigned to          Milestone
neutron   Fix Released   Medium       Luis Tomas Bolivar

Bug Description

Load balancers get stuck in PENDING_X status if the delete_vip_port function fails
with an error other than PortNotFound when:
- deleting a loadbalancer
- handling a failed loadbalancer creation

The problem is that the proper status update is not sent back to Octavia.
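
For illustration, the failure mode described above can be sketched as follows. This is a minimal sketch, not the ovn-octavia-provider code: the PortNotFound class, the lb_delete function, and the status dictionary layout are hypothetical stand-ins. It only shows the general idea of treating PortNotFound as harmless and turning any other failure into an ERROR status that is still returned.

```python
# Illustrative sketch only; every name here is a hypothetical stand-in,
# not the actual ovn-octavia-provider implementation.


class PortNotFound(Exception):
    """Stand-in for the Neutron 'port not found' error."""


def lb_delete(lb_id, delete_vip_port):
    """Delete the VIP port and always return a status to report back."""
    status = {"id": lb_id, "provisioning_status": "DELETED"}
    try:
        delete_vip_port(lb_id)
    except PortNotFound:
        # The VIP port is already gone, so the deletion still succeeded.
        pass
    except Exception:
        # Any other failure (for instance, the Neutron API is unreachable):
        # report ERROR instead of letting the exception escape, so the
        # load balancer does not remain in a PENDING_X state.
        status["provisioning_status"] = "ERROR"
    return status
```

Without the final except branch, the exception would propagate and no status would be returned at all, which matches the stuck PENDING_X behaviour reported here.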

Changed in neutron:
status: New → In Progress
Changed in neutron:
assignee: nobody → Luis Tomas Bolivar (ltomasbo)
Revision history for this message
Bence Romsics (bence-romsics) wrote :

Hi Tomas,

Thanks for the report. I only found a delete_vip_port function in ovn-octavia-provider, so I tagged the ticket accordingly. Please correct me if I'm misunderstanding something.

Is there a way to reproduce this problem? If yes, how? What should the correct behavior be: going to an error state? When the error happens, do we fail to free any resources other than the db record of the load balancer?

tags: added: ovn-octavia-provider
summary: - loadbalancer stuck in PENDING_X if delete_vip_port fails
+ [ovn-octavia-provider] loadbalancer stuck in PENDING_X if
+ delete_vip_port fails
Revision history for this message
Luis Tomas Bolivar (ltomasbo) wrote :

Hi Bence,

Yes, only ovn-octavia-provider.

Regarding how to reproduce it, I am not completely sure. It can probably be done by stopping the neutron server after the loadbalancer is deleted but before the VIP is deleted. So, perhaps putting a long sleep there and then stopping q-svc could do it.

I assumed the right way is to go to ERROR status, as the loadbalancer is actually gone on the OVN side, but the VIP is still there. This way the loadbalancer deletion can be re-triggered.
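
As a rough alternative to stopping q-svc by hand, the same condition could be simulated with a mock. The sketch below is only an assumption about how one might exercise this path: delete_loadbalancer is a hypothetical simplified handler, not the provider's real function, and the fake delete_vip_port fails once (as if neutron were unreachable) and then succeeds when the deletion is re-triggered.

```python
from unittest import mock


def delete_loadbalancer(lb_id, delete_vip_port):
    """Hypothetical simplified handler: returns the status to report."""
    try:
        delete_vip_port(lb_id)
    except Exception:
        # Report ERROR so that the deletion can be retried later.
        return "ERROR"
    return "DELETED"


# The first call raises as if the neutron API were down; the second succeeds.
fake_delete_vip_port = mock.Mock(
    side_effect=[ConnectionError("neutron unreachable"), None]
)

first = delete_loadbalancer("lb-1", fake_delete_vip_port)
second = delete_loadbalancer("lb-1", fake_delete_vip_port)
print(first, second)
```

The second call stands for the user re-triggering the deletion once the load balancer has reached ERROR, which is what gives the VIP port another chance to be removed.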

Revision history for this message
Luis Tomas Bolivar (ltomasbo) wrote :

On second thought, and taking a look at the current code, the VIP may be leaked.

Revision history for this message
Luis Tomas Bolivar (ltomasbo) wrote :

The new patch set should account for that and avoid leaks.

Revision history for this message
Bence Romsics (bence-romsics) wrote :
Changed in neutron:
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (master)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/834335
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/28c50d9147fed8e9757ba073929ce5a547e90f03
Submitter: "Zuul (22348)"
Branch: master

commit 28c50d9147fed8e9757ba073929ce5a547e90f03
Author: Luis Tomas Bolivar <email address hidden>
Date: Fri Mar 18 17:52:21 2022 +0100

    Avoid loadbalancer stuck in PENDING_X if delete_vip_port fails

    If either on a failed loadbalancer creation action, or in a loadbalancer
    deletion, the delete_vip_port fails with a different exception than
    PortNotFound (for instance, if neutron API is unreachable at that time),
    the loadbalancer will remain stuck in PENDING_X, as the ERROR status
    is not returned

    Closes-Bug: #1965732
    Change-Id: I7701ac3ab7358d4ab061645a7221a541cd183aa7

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/yoga)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/835654

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/xena)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/835656

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/835660

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/835653
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/2b0dc5bd4b4f39cb0783fb77c7e344e11428cbdd
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 2b0dc5bd4b4f39cb0783fb77c7e344e11428cbdd
Author: Luis Tomas Bolivar <email address hidden>
Date: Fri Mar 18 17:52:21 2022 +0100

    Avoid loadbalancer stuck in PENDING_X if delete_vip_port fails

    If either on a failed loadbalancer creation action, or in a loadbalancer
    deletion, the delete_vip_port fails with a different exception than
    PortNotFound (for instance, if neutron API is unreachable at that time),
    the loadbalancer will remain stuck in PENDING_X, as the ERROR status
    is not returned

    Closes-Bug: #1965732
    Change-Id: I7701ac3ab7358d4ab061645a7221a541cd183aa7
    (cherry picked from commit 28c50d9147fed8e9757ba073929ce5a547e90f03)

tags: added: in-stable-yoga
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/835654
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/0da5150b40fd62941e24659bd979c4f84245a6a0
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 0da5150b40fd62941e24659bd979c4f84245a6a0
Author: Luis Tomas Bolivar <email address hidden>
Date: Fri Mar 18 17:52:21 2022 +0100

    Avoid loadbalancer stuck in PENDING_X if delete_vip_port fails

    If either on a failed loadbalancer creation action, or in a loadbalancer
    deletion, the delete_vip_port fails with a different exception than
    PortNotFound (for instance, if neutron API is unreachable at that time),
    the loadbalancer will remain stuck in PENDING_X, as the ERROR status
    is not returned

    Also modified coverage to 88% just to allow backport on wallaby stable
    branch. Following patch [1] includes several tests over the code base
    of this current patch and will restore the threshold again to 90 when
    it is be backported.

    [1] https://review.opendev.org/c/openstack/ovn-octavia-provider/+/836033

    Closes-Bug: #1965732
    Change-Id: I7701ac3ab7358d4ab061645a7221a541cd183aa7
    (cherry picked from commit 28c50d9147fed8e9757ba073929ce5a547e90f03)

tags: added: in-stable-wallaby
tags: added: in-stable-victoria
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/835656
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/e96ad123808bc2bb121c7e4a997c1426a7ce7e07
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit e96ad123808bc2bb121c7e4a997c1426a7ce7e07
Author: Luis Tomas Bolivar <email address hidden>
Date: Fri Mar 18 17:52:21 2022 +0100

    Avoid loadbalancer stuck in PENDING_X if delete_vip_port fails

    If either on a failed loadbalancer creation action, or in a loadbalancer
    deletion, the delete_vip_port fails with a different exception than
    PortNotFound (for instance, if neutron API is unreachable at that time),
    the loadbalancer will remain stuck in PENDING_X, as the ERROR status
    is not returned

    Also modified coverage to 88% just to allow backport on victoria stable
    branch. Following patch [1] includes several tests over the code base
    of this current patch and will restore the threshold again to 90 when
    it is be backported.

    [1] https://review.opendev.org/c/openstack/ovn-octavia-provider/+/836033

    Closes-Bug: #1965732
    Change-Id: I7701ac3ab7358d4ab061645a7221a541cd183aa7
    (cherry picked from commit 28c50d9147fed8e9757ba073929ce5a547e90f03)

tags: added: in-stable-ussuri
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/835660
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/43423d6dafaa1939413d40aea24670d9746c5f4e
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 43423d6dafaa1939413d40aea24670d9746c5f4e
Author: Fernando Royo <email address hidden>
Date: Tue Mar 29 18:04:02 2022 +0200

    Avoid loadbalancer stuck in PENDING_X if delete_vip_port fails

    If either on a failed loadbalancer creation action, or in a loadbalancer
    deletion, the delete_vip_port fails with a different exception than
    PortNotFound (for instance, if neutron API is unreachable at that time),
    the loadbalancer will remain stuck in PENDING_X, as the ERROR status
    is not returned

    Closes-Bug: #1965732
    (manually cherry picked from commit 28c50d9147fed8e9757ba073929ce5a547e90f03)

    Change-Id: I7701ac3ab7358d4ab061645a7221a541cd183aa7

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/835655
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/8861f951ca682be62642dd7a6f5ad7073900bba3
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 8861f951ca682be62642dd7a6f5ad7073900bba3
Author: Luis Tomas Bolivar <email address hidden>
Date: Fri Mar 18 17:52:21 2022 +0100

    Avoid loadbalancer stuck in PENDING_X if delete_vip_port fails

    If either on a failed loadbalancer creation action, or in a loadbalancer
    deletion, the delete_vip_port fails with a different exception than
    PortNotFound (for instance, if neutron API is unreachable at that time),
    the loadbalancer will remain stuck in PENDING_X, as the ERROR status
    is not returned

    Also modified coverage to 88% just to allow backport on xena stable
    branch. Following patch [1] includes several tests over the code base
    of this current patch and will restore the threshold again to 90 when
    it is be backported.

    [1] https://review.opendev.org/c/openstack/ovn-octavia-provider/+/836033

    Closes-Bug: #1965732
    Change-Id: I7701ac3ab7358d4ab061645a7221a541cd183aa7
    (cherry picked from commit 28c50d9147fed8e9757ba073929ce5a547e90f03)

tags: added: in-stable-xena
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 1.0.1

This issue was fixed in the openstack/ovn-octavia-provider 1.0.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 3.0.0.0rc1

This issue was fixed in the openstack/ovn-octavia-provider 3.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn train-eol

This issue was fixed in the openstack/networking-ovn train-eol release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 1.2.0

This issue was fixed in the openstack/ovn-octavia-provider 1.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 2.1.0

This issue was fixed in the openstack/ovn-octavia-provider 2.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider ussuri-eol

This issue was fixed in the openstack/ovn-octavia-provider ussuri-eol release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider victoria-eom

This issue was fixed in the openstack/ovn-octavia-provider victoria-eom release.
