Deleting stack with octavia member fails with ValueError when Pool/Member already deleted outside heat.

Bug #1747836 reported by liyi
This bug affects 1 person
Affects: OpenStack Heat
Status: In Progress
Importance: High
Assigned to: Rico Lin
Milestone: (none listed)

Bug Description

For more information, please refer to http://paste.openstack.org/show/664555/

As a result, the stack can never be deleted.

Rico Lin (rico-lin)
Changed in heat:
assignee: nobody → Rico Lin (rico-lin)
status: New → Triaged
importance: Undecided → High
milestone: none → queens-rc1
Rabi Mishra (rabi) wrote :

That's because you deleted the resources using the Octavia API before deleting the stack. The pool property translation wraps the HttpNotFound as a ValueError. We should handle this, though.
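
The failure mode described above can be illustrated with a minimal, self-contained sketch. Every name in it (`NotFound`, `FakeOctaviaClient`, `resolve_pool`, `delete_member`) is a hypothetical stand-in rather than Heat or Octavia code; the only assumption taken from this comment is that the lookup used during translation re-raises a not-found error as a ValueError.

```python
class NotFound(Exception):
    """Hypothetical stand-in for the client's HTTP 404 error."""


class FakeOctaviaClient:
    """Hypothetical in-memory client; not the real Octavia API."""

    def __init__(self, pools):
        self.pools = dict(pools)  # maps pool name -> pool ID

    def find_pool(self, name_or_id):
        if name_or_id in self.pools:
            return self.pools[name_or_id]
        if name_or_id in self.pools.values():
            return name_or_id
        raise NotFound(name_or_id)


def resolve_pool(client, name_or_id):
    """Translate a pool name to an ID, wrapping not-found as ValueError."""
    try:
        return client.find_pool(name_or_id)
    except NotFound as exc:
        raise ValueError("pool %s not found" % name_or_id) from exc


def delete_member(client, properties):
    """Delete path that resolves the pool from live data first."""
    pool_id = resolve_pool(client, properties["pool"])  # raises if pool is gone
    return "deleted member from %s" % pool_id


client = FakeOctaviaClient({"web-pool": "pool-123"})
properties = {"pool": "web-pool"}
result_before = delete_member(client, properties)

del client.pools["web-pool"]  # pool removed outside the stack
try:
    delete_member(client, properties)
    delete_failed = False
except ValueError:
    delete_failed = True
```

In this sketch the delete succeeds while the pool exists and fails with ValueError once the pool has been removed behind the stack's back, which is the behaviour reported here.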

Rabi Mishra (rabi) wrote :

Also please add details in the bug description rather than using paste.

summary: - delete stack with non existed pool error
+ Deleting stack with octavia member fails with ValueError when
+ Pool/Member already deleted outside heat.
OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/541558

Changed in heat:
status: Triaged → In Progress
Zane Bitter (zaneb) wrote :

Wait, why are we trying to resolve property values at delete time, when we should be using the cached value?

What is in the magic paste link? I can't access it.

Zane Bitter (zaneb) wrote :

I don't think this bug is unique to Octavia. Since I953a52e9b165d3ea4fb2fc57ceea8083c7f8f30c we try to translate properties *after* loading their stored values from the database while deleting a resource: http://git.openstack.org/cgit/openstack/heat/tree/heat/engine/resource.py#n1995

This is bad: we should never do anything that relies on live data during a delete (for exactly this reason). I think we should fix that by storing the translated properties in the DB, if they aren't already.

Zane Bitter (zaneb) wrote :

OK, a generic fix is very tricky.

Say, for example, that a user creates a Nova server passing an image name, and then updates the image that that name points to. We shouldn't replace the server on the next update, because the property passed by the user (the image name) hasn't changed. (This was the longstanding behaviour prior to the existence of translations, and also what the user might expect with a rudimentary knowledge of how Heat compares templates.) This works right now, but for horrible reasons: we translate the image name to an image ID in both the old and new properties, and then compare them. Barring race conditions, they will be the same so we are (usually) OK. This is horrible and super inefficient.

But to change this by storing the resolved image ID instead (simple enough to do by just calling properties.user_value() in _update_stored_properties()) would mean this case would start failing, because the new image ID would not match the old one. Arguably the image ID is a special case, but there may be others and we'd have to go through and audit every RESOLVE translation as well as figure out how to actually handle it.
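
The trade-off between the two comparison strategies can be sketched as follows. This is a toy model under stated assumptions, not Heat's implementation: `translate`, `needs_replace_translate_both`, `needs_replace_stored_id`, and the `name_to_id` mapping are all hypothetical names, and the image service is reduced to a dictionary.

```python
def translate(image_ref, name_to_id):
    """Resolve an image name to its current ID (IDs pass through unchanged)."""
    return name_to_id.get(image_ref, image_ref)


def needs_replace_translate_both(old_props, new_props, name_to_id):
    """Translate both old and new properties at update time, then compare."""
    return (translate(old_props["image"], name_to_id)
            != translate(new_props["image"], name_to_id))


def needs_replace_stored_id(stored_id, new_props, name_to_id):
    """Compare the ID stored at create time against a fresh lookup."""
    return stored_id != translate(new_props["image"], name_to_id)


name_to_id = {"fedora": "img-1"}
old_props = {"image": "fedora"}
stored_id = translate(old_props["image"], name_to_id)  # resolved at create time

# The name is re-pointed at a different image outside the template.
name_to_id["fedora"] = "img-2"
new_props = {"image": "fedora"}

replace_now = needs_replace_translate_both(old_props, new_props, name_to_id)
replace_if_stored = needs_replace_stored_id(stored_id, new_props, name_to_id)
```

Translating both sides yields no replacement because both lookups see the same live mapping, whereas comparing against the stored ID reports a change even though the user's template did not change.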

In addition, we'd still have to deal with old resources that have pre-translation property values stored.

It may actually be the case that this Octavia resource is the only one referencing a translated property during handle_delete() - most resources need only their resource_id, after all - in which case a hack to resolve this particular case is probably the best solution in the short term.

Rico Lin (rico-lin)
Changed in heat:
milestone: queens-rc1 → rocky-1
Rico Lin (rico-lin)
Changed in heat:
milestone: rocky-1 → rocky-2
Gauvain Pocentek (gpocentek) wrote :

Hi!

I hit this problem today on a rocky platform. I had to patch the PoolMember resource to make the stack deletion successful. Simple try/except on ValueError, as is done in other methods. Should I push this simple patch, even if it doesn't fix the root problem (more complex to fix)?
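
A workaround of the kind described above might look roughly like the sketch below. It is not the actual patch: `FakeMember`, `missing_pool`, and the `pool_lookup` callable are hypothetical, and the only assumption carried over from the thread is that the pool lookup raises ValueError when the pool no longer exists.

```python
class FakeMember:
    """Hypothetical resource; mirrors none of Heat's real classes."""

    def __init__(self, resource_id, pool_lookup):
        self.resource_id = resource_id
        self._pool_lookup = pool_lookup  # callable that may raise ValueError

    def handle_delete(self):
        try:
            pool_id = self._pool_lookup()
        except ValueError:
            # The pool is already gone, so there is nothing left to delete.
            return None
        # Otherwise report what would be deleted (pool ID, member ID).
        return (pool_id, self.resource_id)


def missing_pool():
    raise ValueError("pool not found")


gone = FakeMember("member-1", missing_pool).handle_delete()
present = FakeMember("member-1", lambda: "pool-123").handle_delete()
```

Treating the ValueError as "already deleted" lets the delete complete, but as noted in the thread it only masks the underlying reliance on live data at delete time.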

Rabi Mishra (rabi) wrote : Re: [Bug 1747836] Re: Deleting stack with octavia member fails with ValueError when Pool/Member already deleted outside heat.

On Tue, Apr 30, 2019 at 1:00 PM Gauvain Pocentek <email address hidden> wrote:

> Hi!
>
> I hit this problem today on a rocky platform. I had to patch the
> PoolMember resource to make the stack deletion successful. Simple
> try/except on ValueError, as is done in other methods. Should I push
> this simple patch, even if it doesn't fix the root problem (more complex
> to fix)?
>
We don't track bugs in Launchpad any more. You can see the status of this bug in StoryBoard:
https://storyboard.openstack.org/#!/story/1747836

I think the issue was fixed in Rocky with https://review.opendev.org/#/c/586474/. I'm not sure why you're seeing it again in Rocky. Please update the story with details of the issue you're seeing, along with the Heat logs/tracebacks.


--
Regards,
Rabi Mishra
