Stack status IN_PROGRESS does not change when the heat service is restarted

Bug #1382320 reported by Ethan Lynn
This bug affects 1 person
Affects: OpenStack Heat
Status: Fix Released
Importance: High
Assigned to: Ethan Lynn
Milestone: 2015.1.0

Bug Description

Restart the heat services during stack creation.
The stack then stays in CREATE_IN_PROGRESS and never changes automatically to FAILED or any other status.
The IN_PROGRESS status confuses customers: they think the stack operation is still ongoing, when it has actually stopped.

I think it would be better to change IN_PROGRESS to FAILED when the heat service is restarted.

Workaround:
When the heat services are restarted, check the stack status. If the status is IN_PROGRESS, change it to FAILED.

I would like some feedback before I start working on it.

Ethan Lynn (ethanlynn)
Changed in heat:
assignee: nobody → Ethan Lynn (ethanlynn)
Sergey Kraynev (skraynev) wrote :

I think your suggestion is good, but might convergence solve this problem better?

Ethan Lynn (ethanlynn) wrote :

I just noticed this blueprint, it sounds great!
But it is still in design. Can I submit a patch for this in the meantime?

Sergey Kraynev (skraynev) wrote :

Sure. I run into this problem periodically, so it sounds reasonable to me.

Steven Hardy (shardy) wrote :

You can't just flip the status on engine restart, because there may be multiple heat-engine processes running.

Since the engine_id is generated each time an engine starts, an engine that is starting up has no way of knowing whether it is the one that held the lock on the stack before restarting.
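
A minimal sketch of the problem described above, not Heat's actual code: the `Engine` class and the `stack_locks` dictionary are hypothetical stand-ins for a heat-engine process and the stack lock table. It only illustrates that an ID regenerated at startup cannot be matched against a lock taken before the restart.

```python
import uuid


class Engine:
    """Hypothetical stand-in for one heat-engine process."""

    def __init__(self):
        # A fresh ID is generated on every start, as described above.
        self.engine_id = str(uuid.uuid4())


# Hypothetical lock table: stack ID -> engine_id of the lock holder.
stack_locks = {}

engine_before = Engine()
stack_locks["stack-1"] = engine_before.engine_id  # lock taken during create

# The process restarts: same host, same role, but a new engine_id.
engine_after = Engine()

# The restarted engine cannot tell that the lock used to be its own.
recognizes_own_lock = stack_locks["stack-1"] == engine_after.engine_id
print(recognizes_own_lock)
```

With several engines running, the lock could equally belong to a different engine that is still alive, so the ID comparison alone cannot justify flipping the status.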

Ethan Lynn (ethanlynn) wrote :

Hi Steven,
   Is there a better way to fix this?

huangtianhua (huangtianhua) wrote :

Or maybe we can change the status after stealing the lock from the old, inactive engine, if the stack status is IN_PROGRESS :)
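
The idea above could be sketched roughly as follows. This is an illustration under assumptions, not Heat's implementation: the `Stack` dataclass, the `locks` dictionary, the `alive_engines` set, and the function `steal_lock_and_fail` are all hypothetical names, and how an engine is judged "inactive" is left outside the sketch.

```python
from dataclasses import dataclass

IN_PROGRESS = "IN_PROGRESS"
FAILED = "FAILED"


@dataclass
class Stack:
    """Hypothetical minimal stack record."""
    id: str
    action: str
    status: str
    status_reason: str = ""


def steal_lock_and_fail(stack, locks, alive_engines, my_engine_id):
    """Steal the lock from an inactive engine, then fail an IN_PROGRESS stack.

    Returns True if the lock was stolen, False if there was nothing to steal
    (no lock at all, or the holder is still alive).
    """
    holder = locks.get(stack.id)
    if holder is None or holder in alive_engines:
        return False

    # The previous holder is gone, so take over its lock.
    locks[stack.id] = my_engine_id

    # Only a stack left mid-operation needs its status corrected.
    if stack.status == IN_PROGRESS:
        stack.status = FAILED
        stack.status_reason = "Engine went down during stack %s" % stack.action
    return True


locks = {"stack-1": "engine-dead"}
stack = Stack(id="stack-1", action="CREATE", status=IN_PROGRESS)
stolen = steal_lock_and_fail(stack, locks, {"engine-new"}, "engine-new")
print(stolen, stack.status)
```

Tying the status change to a successful lock steal keeps a live engine's stack untouched, which addresses the multi-engine concern raised earlier in the thread.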

Angus Salkeld (asalkeld)
Changed in heat:
status: New → Triaged
importance: Undecided → Medium
Angus Salkeld (asalkeld) wrote :

Logically this is a very important bug - really important for HA setups.

This might be easier with our new "heat-manage service list".
We can now potentially re-trigger actions when we see that a service is unreachable:
https://review.openstack.org/#/c/165713/6/heat/engine/service.py (at the end of service_manage_cleanup()).

Changed in heat:
importance: Medium → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/169160

Changed in heat:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to heat (master)

Reviewed: https://review.openstack.org/169160
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=6bc753582b0d5b7226a093cc96dca2bdc1e66257
Submitter: Jenkins
Branch: master

commit 6bc753582b0d5b7226a093cc96dca2bdc1e66257
Author: Ethan Lynn <email address hidden>
Date: Tue Mar 31 10:08:45 2015 +0800

    Set stack status to FAILED when engine is down

    When stack is in status IN_PROGRESS and engine service went down,
    the status of stack will forever remain in IN_PROGRESS. This patch
    adds a db api to get engine_id from stacklock and tries to reset the
    stack status to FAILED when engine is back.

    Closes-Bug: #1382320
    Change-Id: Ica856bb0d56c23a4423fb9476c1986aaacf24108
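
A rough sketch of the approach the commit message describes, using an in-memory SQLite database. The table layout, the function names `get_lock_engine_id` and `reset_stale_stacks`, and the `alive_engine_ids` set are assumptions made for illustration; they are not taken from the merged patch.

```python
import sqlite3


def get_lock_engine_id(conn, stack_id):
    """Hypothetical DB helper: return the engine_id holding the stack lock."""
    row = conn.execute(
        "SELECT engine_id FROM stack_lock WHERE stack_id = ?", (stack_id,)
    ).fetchone()
    return row[0] if row else None


def reset_stale_stacks(conn, alive_engine_ids):
    """Mark IN_PROGRESS stacks as FAILED when their lock holder is not alive."""
    rows = conn.execute(
        "SELECT id FROM stack WHERE status = 'IN_PROGRESS' ORDER BY id"
    ).fetchall()
    reset = []
    for (stack_id,) in rows:
        engine_id = get_lock_engine_id(conn, stack_id)
        # Skip unlocked stacks and stacks whose engine is still running.
        if engine_id is None or engine_id in alive_engine_ids:
            continue
        conn.execute(
            "UPDATE stack SET status = 'FAILED', status_reason = ? WHERE id = ?",
            ("Engine went down during stack operation", stack_id),
        )
        conn.execute("DELETE FROM stack_lock WHERE stack_id = ?", (stack_id,))
        reset.append(stack_id)
    conn.commit()
    return reset


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stack (id TEXT PRIMARY KEY, status TEXT, status_reason TEXT)")
conn.execute("CREATE TABLE stack_lock (stack_id TEXT PRIMARY KEY, engine_id TEXT)")
conn.executemany(
    "INSERT INTO stack VALUES (?, ?, ?)",
    [("stack-1", "IN_PROGRESS", ""), ("stack-2", "IN_PROGRESS", "")],
)
conn.executemany(
    "INSERT INTO stack_lock VALUES (?, ?)",
    [("stack-1", "engine-dead"), ("stack-2", "engine-live")],
)
reset_ids = reset_stale_stacks(conn, {"engine-live"})
print(reset_ids)
```

In this sketch only the stack locked by the engine that is no longer alive is reset; the stack held by a running engine keeps its IN_PROGRESS status.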

Changed in heat:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in heat:
milestone: none → kilo-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in heat:
milestone: kilo-rc1 → 2015.1.0