SPG is not going out-of-sync after updating subclouds in the SPG

Bug #2054123 reported by Li Zhu
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

Brief Description
-----------------
SPG is not changing 'sync_status' from 'in-sync' to 'out-of-sync' after running any of the following operations:
Adding subcloud(s) to the SPG.
Removing subcloud(s) from the SPG.
Updating subcloud(s) in the SPG.

Severity
--------
Major

Steps to Reproduce
------------------
1. Create the system peer from Site A to Site B
2. Create the system peer from Site B to Site A
3. Create the subcloud peer group on Site A
4. Add subcloud(s) to the peer group
5. Create peer group association to associate system peer and subcloud peer group - Site A
6. Check current sync status on Sites A and B. Verify they are 'in-sync'
dcmanager peer-group-association list
7. Remove a subcloud from the SPG (also test adding and updating a subcloud)
Note: Do not trigger a manual sync.
8. After more than an hour, verify the status is still 'in-sync'

Expected Behavior
------------------
'sync_status' should be 'out-of-sync' after making changes in a subcloud that is part of the SPG.

Actual Behavior
----------------
'sync_status' remains 'in-sync'.

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Distributed Cloud

Branch/Pull Time/Commit
-----------------------
22.12 MR3 iteration 11

Last Pass
---------
New test scenario

Timestamp/Logs
--------------
// Subcloud 3 and 4 in the SPG

$ dcmanager subcloud-peer-group list-subclouds 1
--------+----------+---------------+----------------------------+----------------------------+---------------+-----------------+
| 11 | subcloud4 | None | None | 22.12 | managed | online | complete | fdff:719a:bf60:1109::/64 | fdff:719a:bf60:1109::2 | fdff:719a:bf60:1109::ffff | fdff:719a:bf60:1109::1 | fdff:719a:bf60:1096::1 | 1 | 1 | 2024-02-02 21:58:33.703653 | 2024-02-06 20:17:32.238938 | None | None |
| 12 | subcloud3 | None | None | 22.12 | managed | online | complete | fdff:719a:bf60:1099::/64 | fdff:719a:bf60:1099::2 | fdff:719a:bf60:1099::ffff | fdff:719a:bf60:1099::1 | fdff:719a:bf60:1096::1 | 1 | 1 | 2024-02-07 11:10:44.657089 | 2024-02-07 13:15:26.553700 | None | None |
+----+-----------+-------------+----------+------------------+------------+--------------+---------------+--------------------------+------------------------+---------------------------+------------------------+---------------------
// subcloud3 removed from the SPG

$ dcmanager subcloud update subcloud3 --peer-group none
+-----------------------------+----------------------------+
| Field | Value |
+-----------------------------+----------------------------+
| id | 12 |
| name | subcloud3 |
| description | None |
| location | None |
| software_version | 22.12 |
| management | managed |
| availability | online |
| deploy_status | complete |
| management_subnet | fdff:719a:bf60:1099::/64 |
| management_start_ip | fdff:719a:bf60:1099::2 |
| management_end_ip | fdff:719a:bf60:1099::ffff |
| management_gateway_ip | fdff:719a:bf60:1099::1 |
| systemcontroller_gateway_ip | fdff:719a:bf60:1096::1 |
| group_id | 1 |
| peer_group_id | None |
| created_at | 2024-02-07T11:10:44.657089 |
| updated_at | 2024-02-08T11:27:08.919357 |
| backup_status | None |
| backup_datetime | None |
+-----------------------------+----------------------------+
// Still showing as 'in-sync' on both sites.

$ dcmanager peer-group-association list
+----+---------------+----------------+---------+-------------+---------------------+
| id | peer_group_id | system_peer_id | type | sync_status | peer_group_priority |
+----+---------------+----------------+---------+-------------+---------------------+
| 1 | 1 | 1 | primary | in-sync | 1 |
+----+---------------+----------------+---------+-------------+---------------------+
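To make the check in step 8 repeatable, the table printed by `dcmanager peer-group-association list` can be parsed and inspected in a script. The following is a minimal sketch, assuming the output keeps the bordered layout shown above; the helper names `parse_table` and `stale_associations` are hypothetical and not part of dcmanager.

```python
def parse_table(text):
    """Parse a '+---+' bordered ASCII table into a list of dicts keyed by header."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and blanks
        rows.append([cell.strip() for cell in line.strip("|").split("|")])
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def stale_associations(text, expected="out-of-sync"):
    """Return the rows whose sync_status differs from the expected value."""
    return [row for row in parse_table(text) if row.get("sync_status") != expected]


# Sample copied from the log above (captured after subcloud3 left the SPG).
SAMPLE = """\
+----+---------------+----------------+---------+-------------+---------------------+
| id | peer_group_id | system_peer_id | type | sync_status | peer_group_priority |
+----+---------------+----------------+---------+-------------+---------------------+
| 1 | 1 | 1 | primary | in-sync | 1 |
+----+---------------+----------------+---------+-------------+---------------------+
"""
```

With the sample above, `stale_associations(SAMPLE)` returns the single association row, which is the symptom reported in this bug: the row should have read 'out-of-sync'.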

Test Activity
-------------
Feature Testing

Workaround
----------
None

Li Zhu (lzhu1)
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to distcloud (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/distcloud/+/909277

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to distcloud (master)

Reviewed: https://review.opendev.org/c/starlingx/distcloud/+/909277
Committed: https://opendev.org/starlingx/distcloud/commit/5f132a43aba15e360e011ec5bc07789e8e1c6c75
Submitter: "Zuul (22348)"
Branch: master

commit 5f132a43aba15e360e011ec5bc07789e8e1c6c75
Author: Li Zhu <email address hidden>
Date: Fri Feb 16 18:56:55 2024 -0500

    Set PGA status to out-of-sync after certain operations

    Fix the reported issue:
    The PGA fails to transition from 'in-sync' to 'out-of-sync' after
    updating subclouds.

    This commit includes the following changes:
    1. Updates the SPG sync_status to 'out-of-sync' after performing any of
       the following operations and provides an informative message to
       the operator:
       a) Adding a subcloud to the SPG
       b) Removing a subcloud from the SPG
       c) Updating a subcloud in the SPG
    2. Ensures that updates on SPG attributes, such as name,
       max_subcloud_rehoming, and group_state, are automatically
       propagated to the peer site.

    Test plan:
    Pre-Steps: 1. Create the system peer from Site A to Site B
               2. Create System peer from Site B to Site A
               3. Create the subcloud peer group in the Site A
               4. Add subcloud(s) to the peer group
               5. Create peer group association to associate system peer
                  and subcloud peer group - Site A
               6. Check current sync status on Sites A and B. Verify
                  they are 'in-sync'.
    PASS: Verify 'out-of-sync' on both sites after running any of
          the following operations on site A, the primary and leader site:
          1. Adding subcloud to the SPG.
          2. Removing subcloud from the SPG.
          3. Updating any field of a subcloud in the SPG, such as bootstrap
             address, bootstrap values, install values, etc.
    PASS: Repeat the above operations while site B is down and verify that
          PGA sync_status is set to "failed".
    PASS: Verify that SPG attribute updates are accepted if peer site is up
          and the updates are successfully synced to the peer site.
    PASS: Verify that SPG attribute updates are rejected if the peer site
          is down.
    PASS: Verify that if the subcloud does not belong to any peer group, or
          if it is part of a peer group but the peer group is not associated
          with any peer yet, updating the subcloud would not result in an
          "informative" message and no attempt to update the PGA sync_status
          should occur.

    Closes-Bug: 2054123
    Closes-Bug: 2054124

    Change-Id: I9f0e44e34c7db5d60d211b70e839606d0361cf83
    Signed-off-by: lzhu1 <email address hidden>
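
The status transitions described in the test plan above can be summarized as a small decision function. This is only an illustrative model under stated assumptions, not the distcloud implementation: the `Association` class and `status_after_subcloud_change` function are hypothetical names, and the status strings are the ones quoted in the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Association:
    """Hypothetical stand-in for a peer group association (PGA) record."""
    sync_status: str = "in-sync"


def status_after_subcloud_change(
    association: Optional[Association], peer_reachable: bool
) -> Optional[str]:
    """Model the PGA status after a subcloud is added to, removed from,
    or updated in an SPG.

    Returns None when there is nothing to update: the subcloud is in no
    peer group, or its peer group has no association yet.
    """
    if association is None:
        return None  # no message to the operator, no status update attempted
    # Peer site up: mark the association stale. Peer site down: mark it failed.
    association.sync_status = "out-of-sync" if peer_reachable else "failed"
    return association.sync_status
```

For example, `status_after_subcloud_change(Association(), peer_reachable=True)` yields 'out-of-sync', while the same call with `peer_reachable=False` yields 'failed', matching the first two PASS cases.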

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.10.0 stx.distcloud