SM service dependency incorrect rule of disable and go-standby actions

Bug #2012570 reported by Bin Qian
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
StarlingX | Fix Released | Medium | Bin Qian |

Bug Description

Brief Description
-----------------
It is observed that SM does not enforce that a service is disabled only after its dependent has become disabled or standby. As a result, services can be shut down in the wrong order.

The sequence below shows one instance in which platform-fs became disabled only after drbd-platform had already started disabling and going standby.
| 2023-02-06T17:13:40.670 | 777 | service-scn | platform-fs | enabled-active | disabling | disable state requested
...
| 2023-02-06T17:13:40.673 | 781 | service-scn | drbd-platform | enabled-active | enabled-go-standby | enabled-standby state requested
| 2023-02-06T17:13:40.674 | 782 | service-scn | drbd-rabbit | enabled-active | enabled-go-standby | enabled-standby state requested
...
| 2023-02-06T17:19:40.842 | 852 | service-scn | platform-fs | disabling | disabled | disable success

With the dependencies correctly defined, the correct order of winding down services should be enforced.
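
As a minimal sketch (not part of SM), the ordering in the excerpt above can be checked mechanically. The Python below assumes the pipe-delimited layout shown in the log lines, assumes the entries are in chronological order, and assumes that platform-fs must finish winding down before drbd-platform leaves enabled-active. The helper names `parse_scn` and `wound_down_in_order` are hypothetical.

```python
from datetime import datetime

# Log lines copied from the excerpt above (the "..." lines are omitted).
LOG = """\
| 2023-02-06T17:13:40.670 | 777 | service-scn | platform-fs | enabled-active | disabling | disable state requested
| 2023-02-06T17:13:40.673 | 781 | service-scn | drbd-platform | enabled-active | enabled-go-standby | enabled-standby state requested
| 2023-02-06T17:13:40.674 | 782 | service-scn | drbd-rabbit | enabled-active | enabled-go-standby | enabled-standby state requested
| 2023-02-06T17:19:40.842 | 852 | service-scn | platform-fs | disabling | disabled | disable success
"""

# States in which a service counts as wound down (assumption for this sketch).
DONE_STATES = {"disabled", "enabled-standby"}


def parse_scn(line):
    """Split one log line into (time, kind, service, prev_state, new_state)."""
    fields = [f.strip() for f in line.strip().lstrip("|").split("|")]
    timestamp, _seq, kind, service, prev_state, new_state, _reason = fields
    when = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%f")
    return when, kind, service, prev_state, new_state


def wound_down_in_order(events, first, then):
    """True if `first` was wound down before `then` left enabled-active."""
    first_done = None
    for when, kind, service, prev_state, new_state in events:
        if kind != "service-scn":
            continue
        if service == first and new_state in DONE_STATES and first_done is None:
            first_done = when
        if service == then and prev_state == "enabled-active":
            return first_done is not None
    return True  # `then` never left enabled-active, so nothing was violated


events = [parse_scn(line) for line in LOG.splitlines() if line.strip()]
print(wound_down_in_order(events, "platform-fs", "drbd-platform"))  # prints False
```

For the excerpt, the check fails because drbd-platform leaves enabled-active at 17:13:40.673, while platform-fs does not reach disabled until 17:19:40.842.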

Severity
--------
Major: this can cause unexpected behavior or even failure

Steps to Reproduce
------------------
No specific steps to reproduce. It was observed during a host swact action.

Expected Behavior
------------------
The configured dependency order is enforced.

Reproducibility
---------------
Intermittent; the behavior is not seen often.

System Configuration
--------------------
This could happen in any DX system

Tags: stx.9.0 stx.ha
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ha (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/ha/+/878285

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ha (master)

Reviewed: https://review.opendev.org/c/starlingx/ha/+/878285
Committed: https://opendev.org/starlingx/ha/commit/c81032a5727892f0107089c5a9d735f147770e10
Submitter: "Zuul (22348)"
Branch: master

commit c81032a5727892f0107089c5a9d735f147770e10
Author: Bin Qian <email address hidden>
Date: Wed Mar 22 20:54:35 2023 +0000

    Update rule of disable & standby dependency

    This change updates the dependency check for the service disable and
    go-standby actions.
    The two specific rules are:
    1. "service a" has a disable action dependency on "service b", with a
       targeted "service b" state of disabled. The disable action of
       "service a" is considered "dependency met" only when "service b"
       is in the disabled state or the enabled-standby state.
    2. "service a" has a go-standby action dependency on "service b", with a
       targeted "service b" state of disabled. The go-standby action of
       "service a" is considered "dependency met" only when "service b"
       is in the disabled state or the enabled-standby state.

    TCs:
       passed: Repeatedly performed host-swact operations with a long
               delay added to the disable action of the xxx-fs OCF script;
               observed that all xxx-fs services are disabled before the
               drbd-xxx services start disabling.

    Closes-Bug: 2012570

    Signed-off-by: Bin Qian <email address hidden>
    Change-Id: Ie9717d3b2b73dc7d623e1b980b3387c6c4e6d991
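
The two rules in the commit message can be sketched as a small predicate. This is an illustrative Python model only, not the SM implementation; the function name `dependency_met` and the string state names are assumptions made for the example.

```python
# States of "service b" that satisfy a dependency whose targeted state is
# "disabled", per the two rules in the commit message.
SATISFYING_STATES = {"disabled", "enabled-standby"}

# Actions of "service a" covered by the two rules.
COVERED_ACTIONS = {"disable", "go-standby"}


def dependency_met(action, targeted_state, service_b_state):
    """Return True if "service a" may proceed with `action`.

    Hypothetical helper: models only the disable and go-standby rules with
    a targeted "service b" state of disabled.
    """
    if action not in COVERED_ACTIONS or targeted_state != "disabled":
        raise ValueError("this sketch only models the two rules in the commit")
    return service_b_state in SATISFYING_STATES


# Example: drbd-platform ("service a") waiting on platform-fs ("service b").
for state in ("enabled-active", "disabling", "enabled-standby", "disabled"):
    print(state, dependency_met("go-standby", "disabled", state))
```

Under this model, a service b that is still in the disabling state keeps service a waiting, which matches the ordering the test case in the commit message reports.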

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.9.0 stx.ha
Changed in starlingx:
assignee: nobody → Bin Qian (bqian20)