kuryr-controller pod restarts continuously due to lb in ERROR status

Bug #1815880 reported by Luis Tomas Bolivar
This bug affects 1 person
Affects: kuryr-kubernetes
Status: Fix Released
Importance: Undecided
Assigned to: Luis Tomas Bolivar
Milestone: (none)

Bug Description

When a service is created and kuryr starts creating the Octavia lb for it, if that lb goes into ERROR status, kuryr fails after waiting for it to become ACTIVE. The health checks then eventually trigger a kuryr-controller restart. After the restart, however, the load balancer already exists, but in ERROR status. Creating a new one therefore fails because the lb already exists, and the kuryr-controller ends up waiting for it to transition to ACTIVE, which never happens. As a result, the kuryr-controller is restarted over and over.

Changed in kuryr-kubernetes:
assignee: nobody → Luis Tomas Bolivar (ltomasbo)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kuryr-kubernetes (master)

Fix proposed to branch: master
Review: https://review.openstack.org/636890

Changed in kuryr-kubernetes:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to kuryr-kubernetes (master)

Reviewed: https://review.openstack.org/636890
Committed: https://git.openstack.org/cgit/openstack/kuryr-kubernetes/commit/?id=33594b870297aa88f15d73c221b6aa04709afe6d
Submitter: Zuul
Branch: master

commit 33594b870297aa88f15d73c221b6aa04709afe6d
Author: Luis Tomas Bolivar <email address hidden>
Date: Thu Feb 14 10:57:21 2019 +0100

    Ensure kuryr-controller recover from lb in ERROR status

    This patch ensures kuryr controller can recover from the situation
    when the created lb goes into ERROR status (instead of ACTIVE). Now,
    when the kuryr-controller finds the created lb, it checks its
    provisioning_status and if in ERROR status it will delete it and
    ensure a ResourceNotReady exception is triggered so that a new
    creation action is triggered after the deletion of the lb in ERROR
    status.

    Closes-Bug: 1815880
    Change-Id: I3f5de710a5ff37dee05f5f8826cb37c343141a08
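
The recovery logic described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `FakeOctavia`, `ensure_loadbalancer`, and the dictionary layout are hypothetical stand-ins, and `ResourceNotReady` is redefined locally instead of being imported from kuryr-kubernetes. Only the `provisioning_status` field, the ERROR and ACTIVE states, and the exception name come from the report.

```python
class ResourceNotReady(Exception):
    """Local stand-in: tells the caller to retry the event later."""


class FakeOctavia:
    """Hypothetical in-memory replacement for the load balancer API."""

    def __init__(self):
        self.lbs = {}

    def find(self, name):
        return self.lbs.get(name)

    def create(self, name, status="PENDING_CREATE"):
        lb = {"name": name, "provisioning_status": status}
        self.lbs[name] = lb
        return lb

    def delete(self, name):
        self.lbs.pop(name, None)


def ensure_loadbalancer(client, name):
    """Return the lb for `name`, replacing one that is stuck in ERROR."""
    lb = client.find(name)
    if lb is None:
        # Nothing exists yet, so a normal creation is requested.
        return client.create(name)
    if lb["provisioning_status"] == "ERROR":
        # An lb in ERROR never becomes ACTIVE: delete it and ask for a
        # retry, so that the next attempt creates a fresh one.
        client.delete(name)
        raise ResourceNotReady(name)
    return lb
```

Under these assumptions, the first call after a restart removes the broken lb and raises `ResourceNotReady`; the retried call then finds nothing and creates a new lb instead of waiting indefinitely on the old one.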

Changed in kuryr-kubernetes:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kuryr-kubernetes (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/637724

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kuryr-kubernetes (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/637725

OpenStack Infra (hudson-openstack) wrote : Fix merged to kuryr-kubernetes (stable/rocky)

Reviewed: https://review.openstack.org/637724
Committed: https://git.openstack.org/cgit/openstack/kuryr-kubernetes/commit/?id=17d26351c96a3e6e7e02191bbff43f4f5de5debe
Submitter: Zuul
Branch: stable/rocky

commit 17d26351c96a3e6e7e02191bbff43f4f5de5debe
Author: Luis Tomas Bolivar <email address hidden>
Date: Thu Feb 14 10:57:21 2019 +0100

    Ensure kuryr-controller recover from lb in ERROR status

    This patch ensures kuryr controller can recover from the situation
    when the created lb goes into ERROR status (instead of ACTIVE). Now,
    when the kuryr-controller finds the created lb, it checks its
    provisioning_status and if in ERROR status it will delete it and
    ensure a ResourceNotReady exception is triggered so that a new
    creation action is triggered after the deletion of the lb in ERROR
    status.

    Closes-Bug: 1815880
    Change-Id: I3f5de710a5ff37dee05f5f8826cb37c343141a08
    (cherry picked from commit 33594b870297aa88f15d73c221b6aa04709afe6d)

tags: added: in-stable-rocky
tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix merged to kuryr-kubernetes (stable/queens)

Reviewed: https://review.openstack.org/637725
Committed: https://git.openstack.org/cgit/openstack/kuryr-kubernetes/commit/?id=84538d7627aa366ee72735e78fe44ddecd3895f6
Submitter: Zuul
Branch: stable/queens

commit 84538d7627aa366ee72735e78fe44ddecd3895f6
Author: Luis Tomas Bolivar <email address hidden>
Date: Thu Feb 14 10:57:21 2019 +0100

    Ensure kuryr-controller recover from lb in ERROR status

    This patch ensures kuryr controller can recover from the situation
    when the created lb goes into ERROR status (instead of ACTIVE). Now,
    when the kuryr-controller finds the created lb, it checks its
    provisioning_status and if in ERROR status it will delete it and
    ensure a ResourceNotReady exception is triggered so that a new
    creation action is triggered after the deletion of the lb in ERROR
    status.

    Closes-Bug: 1815880
    Change-Id: I3f5de710a5ff37dee05f5f8826cb37c343141a08
    (cherry picked from commit 33594b870297aa88f15d73c221b6aa04709afe6d)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kuryr-kubernetes 1.0.0

This issue was fixed in the openstack/kuryr-kubernetes 1.0.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kuryr-kubernetes 0.4.7

This issue was fixed in the openstack/kuryr-kubernetes 0.4.7 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kuryr-kubernetes 0.5.4

This issue was fixed in the openstack/kuryr-kubernetes 0.5.4 release.
