Comment 14 for bug 1498130

Michael Johnson (johnsom) wrote :

Hi folks,

Sorry, we were not aware that this conversation was continuing. When a bug is in a closed state (such as Invalid), Launchpad removes it from our dashboards and stops sending us notifications of new comments.

Also note that the OpenStack Foundation has migrated Octavia off of Launchpad and onto Storyboard (https://storyboard.openstack.org/). All OpenStack projects are being migrated. Because of that, I will re-close this bug (if I can, as Launchpad bugs were disabled for this project as part of the migration).

That said, I am sorry my comments were not clear. Let me try to clarify.

All objects in Octavia should and will end in a consistent state: ACTIVE or ERROR. The only time an object should be in a PENDING_* state is when a controller has ownership of the object and is actively managing it. How long a controller waits before it gives up and moves the object to ERROR is configurable in the octavia.conf file. The defaults are quite long to accommodate the low performance of some development systems (VirtualBox, for one).

At no time should an object be "stuck" in PENDING_*. A controller should have ownership of the object and be actively managing it. This is why you should never interrupt or circumvent these states. Doing so will likely push the system into alternate recovery paths, such as failover, that may not be the desired outcome. PENDING_* means a controller has locked the object and is actively working on it, so that nothing else can change the object in the meantime.

If you look at the Octavia code, all paths lead back to either ACTIVE or ERROR (example: https://docs.openstack.org/octavia/latest/_images/ListenerFlows-get_create_listener_flow.svg). In the ERROR state, users or operators can recover by deleting the object and recreating it.
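
For anyone scripting against this, here is a minimal Python sketch of the "wait, do not interrupt" approach described above. It is not Octavia code. The `get_status` callable is a hypothetical stand-in for however you read an object's provisioning status (API client, CLI wrapper, and so on), and the timeout and interval values are arbitrary assumptions, not Octavia defaults. The terminal set holds only the two states named in this comment; extend it if your workflow expects others.

```python
import time

# The two consistent end states named in this comment (assumption: the
# caller extends this set if other terminal states apply to its workflow).
TERMINAL_STATES = frozenset({"ACTIVE", "ERROR"})


def wait_for_terminal_status(get_status, timeout=300.0, interval=5.0,
                             sleep=time.sleep, clock=time.monotonic,
                             terminal_states=TERMINAL_STATES):
    """Poll until the object leaves PENDING_* and reaches a terminal state.

    get_status is a hypothetical zero-argument callable that returns the
    object's current provisioning status as a string. This helper only
    reads the status; it never modifies or force-deletes the object.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in terminal_states:
            return status
        if clock() >= deadline:
            # Still PENDING_* after a long wait: per the comment above,
            # this is worth a bug report, not a database edit.
            raise TimeoutError(f"object still in {status} after {timeout}s")
        sleep(interval)
```

If the helper returns "ERROR", the recovery path is the one described above: delete the object and recreate it. If it times out, that is the situation we want a Storyboard bug for.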

In the past, there were a few bugs in the code that could lead to an abandoned object in a PENDING_* state. We have worked aggressively to resolve those bugs and do not have any outstanding bugs describing an object stuck in PENDING_*. The only other path that could leave an object in PENDING_* would be an unsafe shutdown (kill -9 or hard power off) of a controller that had ownership of the object.

You should never need to access or change the database, but "stuff" does happen. Editing the database should only ever be a last resort.

We have evaluated an admin tool to "force" a delete on objects in PENDING_*, but found that it was abused because the ramifications were not clear to the people using it. It became the "universal" screwdriver (also known as a sledgehammer to install a thumb tack). This led to issues in other services and very unhappy customers, because they lost their VIP addresses or still had quota in use for abandoned resources in the other services.

Because we were no longer seeing this issue or getting bug reports of objects stuck in "PENDING_*", we opted not to "do more harm than good" and decided not to enable a very dangerous "force" option.

As I requested in the original response, if you are seeing objects stuck in "PENDING_*", please open a bug in Storyboard for us. We need to understand how it got there and the rate of occurrence. If you are seeing objects stuck in "PENDING_*" we want to know about it.

I am sorry my comment was interpreted as "this is normal, just hack the db". That is 100% the opposite of my intention with those comments.