Incorrect handling of SCALE_IN action in clusters with desired_capacity = min_size

Bug #2048100 reported by Nguyen Ngoc Hieu
This bug affects 2 people
Affects: senlin
Status: Fix Released
Importance: Undecided
Assigned to: Nguyen Ngoc Hieu
Milestone: (none)

Bug Description

Behavior: When a cluster has desired_capacity = min_size, a SCALE_IN action fails because the cluster is already at its minimum capacity (this is correct). However, by that point the lb_policy pre_op has already removed the node from the LB, even though it should not be removed in this case. This can cause downtime: all IPs may be removed from the LB even though min_size has been set.

Reproduce this bug:
- Create a cluster with desired_capacity = min_size (desired_capacity > 0).
- Attach lb_policy.
- Perform SCALE_IN with count = 1.
=> The node remains in the cluster, but its IP has been removed from the LB.
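
The ordering problem behind these steps can be shown with a toy model. This is only an illustrative sketch, not Senlin code: `ToyCluster` and `buggy_scale_in` are hypothetical names, and the LB is reduced to a set of member names.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical stand-in for a cluster with an attached LB policy."""
    min_size: int
    desired_capacity: int
    nodes: list = field(default_factory=list)
    lb_members: set = field(default_factory=set)


def buggy_scale_in(cluster, count=1):
    """Mimic the reported order: LB pre_op first, size check second."""
    victims = cluster.nodes[:count]
    # pre_op: candidate nodes are removed from the LB unconditionally.
    cluster.lb_members -= set(victims)
    # The size check happens only afterwards and rejects the action.
    if cluster.desired_capacity - count < cluster.min_size:
        return "FAILED"
    cluster.nodes = cluster.nodes[count:]
    cluster.desired_capacity -= count
    return "SUCCEEDED"


cluster = ToyCluster(min_size=1, desired_capacity=1,
                     nodes=["node-1"], lb_members={"node-1"})
result = buggy_scale_in(cluster)
print(result, cluster.nodes, cluster.lb_members)
```

In this model the action fails and the node stays in the cluster, but the LB member set ends up empty, which matches the symptom described above.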

Root cause: pre_op runs before every SCALE_IN action and removes the IP from the LB without checking whether the action is valid. In this case, pre_op should check the condition up front and do nothing if desired_capacity = min_size.
https://github.com/openstack/senlin/blob/541498724bc0c1cf730d9f31c2ca7cf218d07bc3/senlin/policies/lb_policy.py#L655
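
A minimal sketch of the kind of guard suggested here, under stated assumptions: `should_skip_lb_pre_op` is a hypothetical helper, not a function in senlin/policies/lb_policy.py, and it takes plain values instead of Senlin's action and cluster objects. The CLUSTER_REPLACE_NODES exception follows the commit message of the merged fix.

```python
def should_skip_lb_pre_op(action_name, desired_capacity, min_size):
    """Return True when the LB pre_op should leave LB members untouched.

    Hypothetical helper for illustration; the arguments are plain values
    rather than Senlin objects.
    """
    # Node replacement is the stated exception: it still needs the pre_op
    # even when the cluster sits at its minimum size.
    if action_name == "CLUSTER_REPLACE_NODES":
        return False
    # At minimum size the node-reducing action is going to be rejected,
    # so no member should be removed from the LB beforehand.
    return desired_capacity == min_size


# Cluster already at min_size: skip the pre_op for a scale-in.
print(should_skip_lb_pre_op("CLUSTER_SCALE_IN", 1, 1))
# Cluster above min_size: pre_op proceeds as before.
print(should_skip_lb_pre_op("CLUSTER_SCALE_IN", 3, 1))
```

Checking before touching the LB keeps a rejected action free of side effects, which is the property the original pre_op lacked.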

Changed in senlin:
assignee: nobody → Nguyen Ngoc Hieu (zitechdev201)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to senlin (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/senlin/+/904796

Changed in senlin:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to senlin (master)

Reviewed: https://review.opendev.org/c/openstack/senlin/+/904796
Committed: https://opendev.org/openstack/senlin/commit/23f3bd708b5c7026051c2635f34ee69b5c366a2a
Submitter: "Zuul (22348)"
Branch: master

commit 23f3bd708b5c7026051c2635f34ee69b5c366a2a
Author: Nguyen Ngoc Hieu <email address hidden>
Date: Fri Jan 5 01:52:00 2024 +0700

    Skip pre_op LB if cluster is already at min size

    In cases where the cluster's desired capacity equals
    the minimum size, executing an action causing node
    (exception: CLUSTER_REPLACE_NODES) reduction led to
    premature removal of the node's IP from the load
    balancer during the pre_op step.

    The commit addresses this issue by introducing a check
    to skip the pre_op step if the cluster is already at
    its minimum size. Now, when desired_capacity equals min_size,
    the pre_op step is bypassed, preventing unnecessary
    removal of IPs from the load balancer.

    Closes-Bug: #2048100
    Change-Id: Ia7389e8c555497cfa5ccbdca77258f4165dfc62d

Changed in senlin:
status: In Progress → Fix Released