Cannot delete the stack after replacing image during stack-update

Bug #1356084 reported by Jacques Uber on 2014-08-12
This bug affects 1 person
Affects        | Status       | Importance | Assigned to | Milestone
OpenStack Heat | Fix Released | High       | Deliang Fan |
Kilo           | Invalid      | High       | Steve Baker |

Bug Description

Steps to Reproduce:
    * create a stack with a single server (see attached example template)
    * update the stack and replace the image the instance is using
        - This will fail with the error "Conflict: Port <uuid> is still in use. (HTTP 409)"
    * Try to delete the stack
        - This will fail with the error "Failed to DELETE : Error deleting backup resources: Resource DELETE failed: Forbidden: You are not authorized to perform the requested action, identity:delete_user. (HTTP 403)"

Jacques Uber (uberj) wrote :
summary: - Cannot delete the stack after repalcing image during stack-update
+ Cannot delete the stack after replacing image during stack-update
Jacques Uber (uberj) wrote :

+---------------------+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------+----------------------+
| resource_name | id | resource_status_reason | resource_status | event_time |
+---------------------+--------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------+----------------------+
| router | fbd01003-7d48-4e4b-942b-7707b84094e8 | state changed | CREATE_IN_PROGRESS | 2014-08-12T23:10:43Z |
| private_net | 6b8f66d6-b5e5-459a-8e88-5e493828aa63 | state changed | CREATE_IN_PROGRESS | 2014-08-12T23:10:43Z |
| private_net | e7981326-8294-428c-a140-975b392f8478 | state changed | CREATE_COMPLETE | 2014-08-12T23:10:43Z |
| router | 90c40cf0-8195-4cc6-b678-0fc9b375335a | state changed | CREATE_COMPLETE | 2014-08-12T23:10:43Z |
| private_subnet | dc7212aa-d0fe-4b7e-bfe1-7b0e20f46c07 | state changed | CREATE_IN_PROGRESS | 2014-08-12T23:10:43Z |
| private_subnet | e891cc34-0905-4a6d-9ad0-accff83066c7 | state changed | CREATE_COMPLETE | 2014-08-12T23:10:45Z |
| server1_port | bc01b808-5ec6-4332-b7fd-a43b1f83b007 | state changed | CREATE_IN_PROGRESS | 2014-08-12T23:10:45Z |
| router_interface | 0a3da608-a2aa-4a54-bf6f-dba722128d8d | state changed | CREATE_IN_PROGRESS | 2014-08-12T23:10:45Z |
| server1_port | d4dd1714-2ed3-4d47-b0b5-cbf4d5b889a6 | state changed | CREATE_COMPLETE | 2014-08-12T23:10:46Z |
| router_interface | 4bb83d47-8453-483b-87f4-e8b19f1c766f | state changed | CREATE_COMPLETE | 2014-08-12T23:10:46Z |
| server1 | 51ee...


Changed in heat:
assignee: nobody → Nikunj Aggarwal (nikunj2512)
Zane Bitter (zaneb) wrote :

Could this be a duplicate of bug 1334514? It looks very similar, and the fix for that went in only a week ago.

Zane Bitter (zaneb) wrote :

Looking closer, there are two parts here: the initial failure and then the failure to delete. The second part could be due to bug 1334514, but the first obviously isn't.

Sergey Kraynev (skraynev) wrote :

Here are some brief ideas about this bug, based on the last IRC meeting:

The cause of the problem:
 <skraynev>>during replace we create second server and try to use old port
 <skraynev>>but according to neutron model it's not possible
 <skraynev>>to have two servers with same port

Some possible ways to fix it:
......
 <shardy>>--skraynev: should you be using the rebuild image update policy?
 <shardy>>--instead of replacing the server?

.....
 <stevebaker>>--we should be able to detach the port from the old server at some point in the update process
 <mspreitz>>stevebaker: but timing is wrong, update makes new before deleting old, right?
 <stevebaker>>--mspreitz: the port would have to be attached to the new post-create
 <pas-ha>>--may be smth like post/pre-replace

.....
 <stevebaker>>--or we deprecate the port resource and make it possible to specify all port details in the server networks list
 <mspreitz>>yes, +1 on stevebaker's suggestion
 <zaneb>>---stevebaker: that might also eliminate the opposite problem we also have, of Nova deleting the port from under us
 <stevebaker>>--zaneb: it gets better, currently nova *doesn't* always delete ports that it creates, resulting in undeletable stacks!
 <stevebaker>>--I was going to suggest a summit session on a rich server networks property
 <zaneb>>---I'd like to have a summit session on a sensible API for neutron
 <stevebaker>>--zaneb: and now they're talking managing port lifecycle in novaclient ;)
 <zaneb>>---when it's easier to store state in the client(s) than your server, you know you've taken a wrong turn
 <stevebaker>>--anyhoo, arosen has a change to fix the port deleting which is still in review
 <zaneb>>---because nobody ever uses clients on different machines...

As you can see, there are a few possible approaches: workarounds that detach the port and attach it to the new server, or a much larger change that deprecates the port resource entirely.

Unfortunately we do not have a final solution yet, but it may be discussed at the mid-cycle meetup.
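
To make the port problem from the IRC discussion concrete, here is a minimal toy model, not Heat or Neutron code. The Port class, the Conflict exception and replace_server() are all hypothetical names invented for this sketch; the only behaviour taken from the discussion is that a port cannot be used by two servers at once and that an update creates the new server before deleting the old one.

```python
class Conflict(Exception):
    """Stands in for the HTTP 409 returned when a port is already in use."""


class Port:
    """Toy port that can be bound to at most one server at a time."""

    def __init__(self, port_id):
        self.port_id = port_id
        self.device_id = None  # ID of the server currently using the port

    def attach(self, server_id):
        if self.device_id is not None and self.device_id != server_id:
            raise Conflict("Port %s is still in use." % self.port_id)
        self.device_id = server_id

    def detach(self, server_id):
        if self.device_id == server_id:
            self.device_id = None


def replace_server(port, old_server, new_server, detach_first):
    """Create-before-delete replacement of the server that owns `port`."""
    if detach_first:
        port.detach(old_server)  # workaround idea: free the port first
    port.attach(new_server)      # "create" the replacement on the same port
    port.detach(old_server)      # "delete" the old server afterwards
    return port.device_id


# Without detaching first, the replacement hits the conflict.
port = Port("server1_port")
port.attach("old-server")
try:
    replace_server(port, "old-server", "new-server", detach_first=False)
    outcome = "replaced"
except Conflict as exc:
    outcome = str(exc)

# Detaching the port before creating the new server avoids it.
port_ok = Port("server1_port")
port_ok.attach("old-server")
owner = replace_server(port_ok, "old-server", "new-server", detach_first=True)
```

In this model the first attempt leaves the port on the old server, which matches the "still in use" error in the bug description; where exactly a real detach step would fit in the update process is the open question raised in the meeting.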

Zane Bitter (zaneb) on 2014-08-26
Changed in heat:
importance: Undecided → Medium
status: New → Triaged
Anant Patil (ananta) on 2014-09-17
Changed in heat:
assignee: Nikunj Aggarwal (nikunj2512) → Anant Patil (ananta)
Steve Baker (steve-stevebaker) wrote :

The description of this bug includes:
  Resource DELETE failed: Forbidden: You are not authorized to perform the requested action, identity:delete_user

which has nothing to do with the ports issue.

I believe the issue occurs in the following scenario:
- A resource subclass of StackUser (e.g. Server, SoftwareDeployment) which has created a user is replaced on update
- For some reason a Forbidden is raised when an attempt is made to delete the resource user for the old replaced resource
- The replaced resource is marked CREATE_COMPLETE, but the stack goes to UPDATE_FAILED
- The stack becomes undeletable, always going to DELETE_FAILED with the same error:

  Resource DELETE failed: Forbidden: You are not authorized to perform the requested action, identity:delete_user

I'll mark this as High and come up with a simple template.

Steve Baker (steve-stevebaker) wrote :

heat stack-create -f bug-1356084.yaml bug-1356084
heat stack-show bug-1356084 # CREATE_COMPLETE
heat stack-update -f bug-1356084.yaml -P user_data=two bug-1356084
heat stack-show bug-1356084 # UPDATE_FAILED
heat stack-delete bug-1356084
heat stack-show bug-1356084 # DELETE_FAILED

Changed in heat:
importance: Medium → High
milestone: none → kilo-2
Steve Baker (steve-stevebaker) wrote :

Full stack trace

2014-12-30 10:05:05.729 INFO heat.engine.resource [-] DELETE: Server "server" [78692c53-a417-4756-9ae9-72676a7d4a8d] Stack "bug-1356084*" [b62d44e5-ee62-4373-8cb6-fd6a75fbcc13]
2014-12-30 10:05:05.729 TRACE heat.engine.resource Traceback (most recent call last):
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/heat/heat/engine/resource.py", line 465, in _action_recorder
2014-12-30 10:05:05.729 TRACE heat.engine.resource yield
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/heat/heat/engine/resource.py", line 885, in delete
2014-12-30 10:05:05.729 TRACE heat.engine.resource yield self.action_handler_task(action, *action_args)
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/heat/heat/engine/scheduler.py", line 295, in wrapper
2014-12-30 10:05:05.729 TRACE heat.engine.resource step = next(subtask)
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/heat/heat/engine/resource.py", line 506, in action_handler_task
2014-12-30 10:05:05.729 TRACE heat.engine.resource handler_data = handler(*args)
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/heat/heat/engine/resources/server.py", line 1070, in handle_delete
2014-12-30 10:05:05.729 TRACE heat.engine.resource self._delete_user()
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/heat/heat/engine/stack_user.py", line 101, in _delete_user
2014-12-30 10:05:05.729 TRACE heat.engine.resource self.keystone().delete_stack_user(user_id)
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/heat/heat/common/heat_keystoneclient.py", line 495, in delete_stack_user
2014-12-30 10:05:05.729 TRACE heat.engine.resource self.client.users.delete(user=user_id)
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/python-keystoneclient/keystoneclient/v3/users.py", line 189, in delete
2014-12-30 10:05:05.729 TRACE heat.engine.resource user_id=base.getid(user))
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/python-keystoneclient/keystoneclient/base.py", line 72, in func
2014-12-30 10:05:05.729 TRACE heat.engine.resource return f(*args, **new_kwargs)
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/python-keystoneclient/keystoneclient/base.py", line 381, in delete
2014-12-30 10:05:05.729 TRACE heat.engine.resource self.build_url(dict_args_in_out=kwargs))
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/python-keystoneclient/keystoneclient/base.py", line 210, in _delete
2014-12-30 10:05:05.729 TRACE heat.engine.resource return self.client.delete(url, **kwargs)
2014-12-30 10:05:05.729 TRACE heat.engine.resource File "/home/steveb/dev/localstack/python-keystoneclient/keystoneclient/httpclient.py", line 644, in delete
2014-12-30 10:05:05.729 TRACE heat.engine.resource return self._cs_request(url, 'DELETE', **kwargs)
2014-12-30 10:05:05.729 TRACE heat.engine.res...


tags: added: juno-backport-potential
Steve Baker (steve-stevebaker) wrote :

The stacks can be deleted by switching to an admin user for that tenant.

Changed in heat:
assignee: Anant Patil (ananta) → Deliang Fan (vanderliang)
Angus Salkeld (asalkeld) on 2015-02-03
Changed in heat:
milestone: kilo-2 → kilo-3
Angus Salkeld (asalkeld) on 2015-03-17
Changed in heat:
milestone: kilo-3 → kilo-rc1
Changed in heat:
status: Triaged → In Progress
Deliang Fan (vanderliang) wrote :

@Steve Baker (steve-stevebaker) Hi, could you please tell me which auth method you use: trusts or user/password? Thank you!

Steve Baker (steve-stevebaker) wrote :

devstack defaults (deferred_auth_method = trusts) with demo/demo account

Deliang Fan (vanderliang) wrote :

1. heat stack-create -f bug-1356084.yaml bug-1356084
    An instance and a first user are created.

2. heat stack-update -f bug-1356084.yaml -P user_data=two bug-1356084
    After the new instance and the second user have been created, the original instance and the first user are deleted. However, keystone returns a 403 while the first user is being deleted, which causes the stack update to fail.

The root cause of the failure when updating the instance's user data during a stack update is that stack.stack_user_project_id is None when the original stack domain user is deleted. See heat/engine/resources/stack_user.py:

    def _delete_user(self):
        user_id = self._get_user_id()
        if user_id is None:
            return
        try:
            self.keystone().delete_stack_domain_user(
                user_id=user_id, project_id=self.stack.stack_user_project_id)
        except kc_exception.NotFound:
            pass
        except ValueError:
            # FIXME(shardy): This is a legacy delete path for backwards
            # compatibility with resources created before the migration
            # to stack_user.StackUser domain users. After an appropriate
            # transitional period, this should be removed.
            LOG.warn(_LW('Reverting to legacy user delete path'))
            try:
                self.keystone().delete_stack_user(user_id)
            except kc_exception.NotFound:
                pass

Because self.stack.stack_user_project_id is None, delete_stack_domain_user fails and the legacy self.keystone().delete_stack_user(user_id) is called instead. The demo user only has the member role, so keystone rejects the user deletion with a 403.

The fix is to pass stack_user_project_id to updated_stack, oldstack and backup_stack; with that change the update succeeds.
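
The failure path described above can be reproduced in isolation with a small simulation. This is only a sketch: FakeKeystone, the local NotFound/Forbidden exceptions and try_delete() are hypothetical stand-ins, and modelling the failure as a ValueError raised for a missing project ID is an assumption made so that the control flow matches the `except ValueError` branch in the excerpt.

```python
class NotFound(Exception):
    """Stand-in for kc_exception.NotFound."""


class Forbidden(Exception):
    """Stand-in for the HTTP 403 returned by keystone."""


class FakeKeystone:
    """Hypothetical replacement for Heat's keystone client wrapper."""

    def __init__(self, caller_is_admin):
        self.caller_is_admin = caller_is_admin
        self.deleted = []

    def delete_stack_domain_user(self, user_id, project_id):
        if project_id is None:
            # Assumption: a missing project ID surfaces as ValueError.
            raise ValueError("stack_user_project_id is not set")
        self.deleted.append(("domain", user_id))

    def delete_stack_user(self, user_id):
        # Legacy path: runs with the caller's own credentials.
        if not self.caller_is_admin:
            raise Forbidden("You are not authorized to perform the "
                            "requested action, identity:delete_user. "
                            "(HTTP 403)")
        self.deleted.append(("legacy", user_id))


def delete_user(keystone, user_id, stack_user_project_id):
    """Same control flow as the _delete_user excerpt, minus logging."""
    try:
        keystone.delete_stack_domain_user(
            user_id=user_id, project_id=stack_user_project_id)
    except NotFound:
        pass
    except ValueError:
        try:
            keystone.delete_stack_user(user_id)
        except NotFound:
            pass


def try_delete(project_id, caller_is_admin=False):
    keystone = FakeKeystone(caller_is_admin)
    try:
        delete_user(keystone, "user-1", project_id)
    except Forbidden:
        return "403"
    return keystone.deleted


before_fix = try_delete(None)                     # project ID lost on the copy
after_fix = try_delete("stack-user-project")      # project ID passed through
as_admin = try_delete(None, caller_is_admin=True)  # admin workaround
```

In this model the non-admin caller only reaches the 403 when the project ID is missing, and an admin caller gets through the legacy path, which is consistent with the earlier observation that switching to an admin user lets the stack be deleted.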

Deliang Fan (vanderliang) wrote :

@Steve Baker (steve-stevebaker) I think I have fixed this bug after some testing. Would you please give it a try too?

Fix proposed to branch: master
Review: https://review.openstack.org/168801

Change abandoned by Deliang Fan (<email address hidden>) on branch: master
Review: https://review.openstack.org/168801
Reason: duplicated

Reviewed: https://review.openstack.org/168267
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=a6d65581f60a7405cf118e3ac876159f7e33ab4e
Submitter: Jenkins
Branch: master

commit a6d65581f60a7405cf118e3ac876159f7e33ab4e
Author: Deliang Fan <email address hidden>
Date: Mon Mar 30 14:33:15 2015 +0800

    Correctly initialize copies of stack during updating stack

    Pass stack_user_project_id to updated_stack, backup_stack and
    oldstack to make sure the success when deleting stack domain user.

    Create a common method to get the kwargs to create a stack from
    an existing stack.

    Co-Authored-By: Angus Salkeld <email address hidden>

    Change-Id: Ieb7726ed738d5ae8046184f312379b9132b6c4a9
    Closes-Bug: #1356084
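
As a rough illustration of the idea in the commit message, the sketch below builds every copy of a stack from one shared set of constructor kwargs, so that stack_user_project_id cannot be dropped from one of the copies. The Stack class, clone_kwargs() and copy_as() are hypothetical names for this example and are not taken from the actual change; only the "*" suffix on the backup stack name follows the stack trace earlier in this report.

```python
class Stack:
    """Toy stand-in for a Heat stack; the attribute set is illustrative."""

    def __init__(self, name, template, owner=None,
                 stack_user_project_id=None):
        self.name = name
        self.template = template
        self.owner = owner
        self.stack_user_project_id = stack_user_project_id

    def clone_kwargs(self):
        """Hypothetical helper: kwargs that every copy must inherit."""
        return {
            "owner": self.owner,
            "stack_user_project_id": self.stack_user_project_id,
        }

    def copy_as(self, name, template=None):
        """Build a copy, optionally with a different template."""
        if template is None:
            template = self.template
        return Stack(name, template, **self.clone_kwargs())


current = Stack("bug-1356084", {"version": 1}, owner="demo",
                stack_user_project_id="stack-user-project")

# The copies used during an update all go through the same helper.
updated_stack = current.copy_as(current.name, template={"version": 2})
backup_stack = current.copy_as(current.name + "*")
```

Keeping the shared attributes in one place means a newly added attribute only has to be listed once, rather than at every point where a copy of the stack is constructed.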

Changed in heat:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2015-04-07
Changed in heat:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2015-04-30
Changed in heat:
milestone: kilo-rc1 → 2015.1.0
tags: added: kilo-backport-potential
Angus Salkeld (asalkeld) on 2015-09-17
tags: removed: juno-backport-potential kilo-backport-potential
Angus Salkeld (asalkeld) wrote :

Note: this patch is already in Kilo.

Thomas Herve (therve) on 2016-05-25
no longer affects: heat/juno