Node stuck on DELETING/CLEANING state(s)

Bug #1651092 reported by Zhenguo Niu
This bug affects 1 person
Affects: Ironic
Status: Incomplete
Importance: High
Assigned to: Dmitry Tantsur

Bug Description

When a conductor managing a node dies abruptly mid-cleaning or mid-deletion,
the node gets stuck in the CLEANING or DELETING state.

Changed in ironic:
assignee: nobody → Zhenguo Niu (niu-zglinux)
status: New → In Progress
Jay Faulkner (jason-oldos) wrote :

Logic in the conductor should handle this in normal failover cases; can you provide logs around when this happens so we can more directly pinpoint the bug?

Until then, marking this as incomplete.

Changed in ironic:
status: In Progress → Incomplete
Ruby Loo (rloo) wrote :

This bug has been open for over a year, and there isn't enough information here to know what to do. I'm going to set the status to "Invalid"; feel free to change it if you have more information. Thanks.

Changed in ironic:
status: Incomplete → Invalid
Dmitry Tantsur (divius) wrote :

Ruby, Jay: we only clean up the DEPLOYING state on startup; there is no "normal failover" for CLEANING or DELETING.

Changed in ironic:
status: Invalid → In Progress
importance: Undecided → High
assignee: Zhenguo Niu (niu-zglinux) → Dmitry Tantsur (divius)
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/349971
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=2921fe685d8f096717f8795494c1032025407fe4
Submitter: Zuul
Branch: master

commit 2921fe685d8f096717f8795494c1032025407fe4
Author: Zhenguo Niu <email address hidden>
Date: Tue Aug 2 20:24:54 2016 +0800

    Clean nodes stuck in CLEANING state when ir-cond restarts

    When a conductor managing a node dies abruptly mid cleaning, the
    node will get stuck in the CLEANING state.

    This also moves _start_service() before creating CLEANING nodes
    in tests. Finally, it adds autospec to a few places where the tests
    fail in a mysterious way otherwise.

    Change-Id: Ia7bce4dff57569707de4fcf3002eac241a5aa85b
    Co-Authored-By: Dmitry Tantsur <email address hidden>
    Partial-Bug: #1651092
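
For readers following along, the kind of startup recovery this commit message describes can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the review: the `Node` class, the `fail_stuck_nodes` helper, the state strings and the failure-state mapping are all hypothetical names made up for the example. DELETING is left out because this thread does not say how it is handled (the commit is marked Partial-Bug).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Illustrative state names (assumed for this sketch).
DEPLOYING = "deploying"
CLEANING = "cleaning"
DEPLOYFAIL = "deploy failed"
CLEANFAIL = "clean failed"

# Transient state -> failure state to fall back to when the conductor starts.
STUCK_STATE_TO_FAILURE = {DEPLOYING: DEPLOYFAIL, CLEANING: CLEANFAIL}


@dataclass
class Node:
    """Hypothetical stand-in for a node record."""
    uuid: str
    provision_state: str
    reservation: Optional[str] = None  # host of the conductor holding the lock
    last_error: Optional[str] = None


def fail_stuck_nodes(nodes: Iterable[Node], this_host: str) -> List[Node]:
    """Fail nodes left in a transient state by a previous run of this host.

    Only nodes still reserved by ``this_host`` are touched, since nodes
    reserved by other conductors may be legitimately in progress.
    """
    recovered = []
    for node in nodes:
        failure_state = STUCK_STATE_TO_FAILURE.get(node.provision_state)
        if failure_state is None or node.reservation != this_host:
            continue
        node.last_error = (
            f"Conductor {this_host} restarted while the node was in "
            f"the {node.provision_state} state"
        )
        node.provision_state = failure_state
        node.reservation = None  # release the stale lock
        recovered.append(node)
    return recovered
```

Calling `fail_stuck_nodes(all_nodes, "conductor-1")` once at service start would, under these assumptions, move any node that "conductor-1" left in CLEANING to the clean-failed state so an operator can retry it.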

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/545893

OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (stable/queens)

Reviewed: https://review.openstack.org/545893
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=f132e1b761ed5fb9bf8b51b4d58e787551e36746
Submitter: Zuul
Branch: stable/queens

commit f132e1b761ed5fb9bf8b51b4d58e787551e36746
Author: Zhenguo Niu <email address hidden>
Date: Tue Aug 2 20:24:54 2016 +0800

    Clean nodes stuck in CLEANING state when ir-cond restarts

    When a conductor managing a node dies abruptly mid cleaning, the
    node will get stuck in the CLEANING state.

    This also moves _start_service() before creating CLEANING nodes
    in tests. Finally, it adds autospec to a few places where the tests
    fail in a mysterious way otherwise.

    Change-Id: Ia7bce4dff57569707de4fcf3002eac241a5aa85b
    Co-Authored-By: Dmitry Tantsur <email address hidden>
    Partial-Bug: #1651092
    (cherry picked from commit 2921fe685d8f096717f8795494c1032025407fe4)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/546083

OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (stable/pike)

Reviewed: https://review.openstack.org/546083
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=8ad8c874d208e2c80be05bc64afe67d9a9c7a9ec
Submitter: Zuul
Branch: stable/pike

commit 8ad8c874d208e2c80be05bc64afe67d9a9c7a9ec
Author: Zhenguo Niu <email address hidden>
Date: Tue Aug 2 20:24:54 2016 +0800

    Clean nodes stuck in CLEANING state when ir-cond restarts

    When a conductor managing a node dies abruptly mid cleaning, the
    node will get stuck in the CLEANING state.

    This also moves _start_service() before creating CLEANING nodes
    in tests. Finally, it adds autospec to a few places where the tests
    fail in a mysterious way otherwise.

    Change-Id: Ia7bce4dff57569707de4fcf3002eac241a5aa85b
    Co-Authored-By: Dmitry Tantsur <email address hidden>
    Partial-Bug: #1651092
    (cherry picked from commit 2921fe685d8f096717f8795494c1032025407fe4)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Change abandoned on ironic (master)

Change abandoned by Zhenguo Niu (<email address hidden>) on branch: master
Review: https://review.opendev.org/350439

Jay Faulkner (jason-oldos) wrote :

Please update the bug if it is still an issue with Ironic now.

Changed in ironic:
status: In Progress → Incomplete