VIM orchestrator fails to remove node taint from inactive controller on host swact.

Bug #2046273 reported by Vanathi Selvaraju
Affects    Status        Importance  Assigned to        Milestone
StarlingX  Fix Released  Low         Vanathi Selvaraju

Bug Description

Brief Description
-----------------
The VIM orchestrator fails to remove the node taint from the
inactive controller on host swact.
Checks were added to verify that the taints VIM adds
to a node are actually removed from that node.

Severity
--------
Minor

Steps to Reproduce
------------------
On a duplex system:
1. Lock the inactive controller:
   system host-lock controller-1

2. Unlock controller-1:
   system host-unlock controller-1

Expected Behavior
------------------
The 'services=disabled:NoExecute' taint is removed from controller-1 after the unlock.

Actual Behavior
----------------
The 'services=disabled:NoExecute' taint remains on controller-1.
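
One way to confirm the actual behavior is to inspect the node's taint list once the unlock has completed. The sketch below assumes the taints are available as a list of dictionaries in the shape Kubernetes uses for a node's spec.taints (keys "key", "value", "effect"). The function name has_services_taint and the sample data are hypothetical and are not taken from the VIM code.

```python
def has_services_taint(taints, key="services", effect="NoExecute"):
    """Return True if any taint in the list matches the key and effect."""
    # spec.taints is absent (None) when a node carries no taints at all.
    return any(
        t.get("key") == key and t.get("effect") == effect
        for t in (taints or [])
    )


# Hypothetical sample data mirroring the taint reported in this bug.
stuck_node_taints = [
    {"key": "services", "value": "disabled", "effect": "NoExecute"},
]
clean_node_taints = []

print(has_services_taint(stuck_node_taints))
print(has_services_taint(clean_node_taints))
```

The comparison is case-sensitive on purpose, because Kubernetes treats taint keys as case-sensitive strings.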

Reproducibility
---------------
Seen once

System Configuration
--------------------
Duplex system (AIO-DX)

Timestamp/Logs
--------------
2023-10-18T20:09:49.619 controller-1 VIM_Infrastructure-Worker-0_Thread[109262] INFO kubernetes_client.py.130 Removing services:NoExecute taint from node worker-0

Taints present on the system, from /var/extra/containerization_kube.info:

Name:               worker-0
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    install=sco
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=worker-0
                    kubernetes.io/os=linux
Taints:             services=disabled:NoExecute
Unschedulable:      false
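
The Taints line above uses the key=value:effect notation. When scanning collected logs for leftover taints, it can help to split that string into its parts. The following is a minimal sketch under that assumption; parse_taint is a hypothetical helper, not part of VIM or of any Kubernetes client.

```python
def parse_taint(text):
    """Split 'key[=value]:effect' into a (key, value, effect) tuple."""
    # The effect follows the last colon; the value part is optional.
    key_value, sep, effect = text.strip().rpartition(":")
    if not sep or not key_value or not effect:
        raise ValueError("not a taint string: %r" % text)
    key, _, value = key_value.partition("=")
    return key, value or None, effect


key, value, effect = parse_taint("services=disabled:NoExecute")
print(key, value, effect)
```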

Test Activity
-------------
system testing

Changed in starlingx:
status: New → In Progress
summary: - VIM orchestrator to check if the node taints added are removed.
+ VIM orchestrator fails to remove node taint from inactive controller on
+ host swact.
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nfv (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/907130

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nfv (master)

Change abandoned by "Vanathi Selvaraju <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/907130

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nfv (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/907131

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nfv (master)

Change abandoned by "Vanathi Selvaraju <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/907131

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nfv (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/907484

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nfv (master)

Change abandoned by "Vanathi Selvaraju <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/907484

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nfv (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/907537

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nfv (master)

Change abandoned by "Vanathi Selvaraju <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/907537

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nfv (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/908188

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nfv (master)

Change abandoned by "Vanathi Selvaraju <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/908188

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nfv (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/908209

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nfv (master)

Change abandoned by "Vanathi Selvaraju <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/nfv/+/908209

OpenStack Infra (hudson-openstack) wrote : Fix merged to fault (master)

Reviewed: https://review.opendev.org/c/starlingx/fault/+/904788
Committed: https://opendev.org/starlingx/fault/commit/14468b9d2194f625e659d6d21cb2eeebcde4b9a1
Submitter: "Zuul (22348)"
Branch: master

commit 14468b9d2194f625e659d6d21cb2eeebcde4b9a1
Author: Vanathi.Selvaraju <email address hidden>
Date: Thu Jan 4 13:30:37 2024 -0500

    Adding new alarm definition for node taint.

    Currently there is no alarm for node taint.
    This new alarm 900.701 describes the attributes
    of the node taint.

    Test Plan:
    PASSED: Verified the details of the alarm
    using fm alarm-list.

    Partial-Bug: 2046273

    Change-Id: I929ddb45b75f1e4b097b84919f703d458d8fa39e
    Signed-off-by: Vanathi.Selvaraju <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to nfv (master)

Reviewed: https://review.opendev.org/c/starlingx/nfv/+/903527
Committed: https://opendev.org/starlingx/nfv/commit/e230abb543cf639fe25bc170b325edddd38e6132
Submitter: "Zuul (22348)"
Branch: master

commit e230abb543cf639fe25bc170b325edddd38e6132
Author: Vanathi.Selvaraju <email address hidden>
Date: Tue Dec 12 12:49:39 2023 -0500

    Alarm 900.701 raised on failing to remove node taint.

    Additional checks added to ensure that alarm is
    raised by VIM on failing to remove node taint.

    Test Plan:
    PASSED: On a DX system, locked and unlocked one of
    the controller to check if taints are removed.
    PASSED: On a DX system, tweaked the code to
    fail untainting of node.
    Alarm 900.701 is raised as there is node taint.
    PASSED: On a DX system, check if the alarm 900.701
    is removed on locking the node.
    PASSED: Deployed a DX system with ISO that has the
    changes. No trace of alarm 900.701 after
    bootstrapping.
    PASSED: On a DX system Node taint alarm exists with
    multiple taints, node was locked followed by unlock.
    On successful untaint, the alarm is cleared.
    PASSED: On a DX system Node taint alarm exists, node was
    locked followed by unlock. On successful untaint, the
    alarm 900.701 is cleared.

    Closes-Bug: 2046273
    Depends-On: https://review.opendev.org/c/starlingx/fault/+/904788

    Change-Id: I4206b336cbe0021f2b45e3b3cd24b42ca43bc60e
    Signed-off-by: Vanathi.Selvaraju <email address hidden>
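
The behavior described in the commit message above can be pictured as a small reconciliation step that runs after an untaint attempt: if the taint is still on the node, raise alarm 900.701, otherwise clear it. The sketch below is an illustration only. The alarm ID comes from the commit message, while the function name and the raise_alarm and clear_alarm callbacks are hypothetical stand-ins and do not reflect the actual nfv implementation or the fault management API.

```python
NODE_TAINT_ALARM_ID = "900.701"  # alarm ID taken from the commit message


def reconcile_taint_alarm(node_name, taints, raise_alarm, clear_alarm):
    """Raise the alarm if the services NoExecute taint is still present
    after an untaint attempt; otherwise clear it.

    raise_alarm and clear_alarm are hypothetical callbacks that take
    (alarm_id, node_name); they stand in for the real fault interface.
    """
    still_tainted = any(
        t.get("key") == "services" and t.get("effect") == "NoExecute"
        for t in (taints or [])
    )
    if still_tainted:
        raise_alarm(NODE_TAINT_ALARM_ID, node_name)
    else:
        clear_alarm(NODE_TAINT_ALARM_ID, node_name)
    return still_tainted


# Usage with in-memory callbacks that only record what would happen.
events = []
leftover = [{"key": "services", "value": "disabled", "effect": "NoExecute"}]
reconcile_taint_alarm(
    "controller-1",
    leftover,
    lambda alarm_id, node: events.append(("raise", alarm_id, node)),
    lambda alarm_id, node: events.append(("clear", alarm_id, node)),
)
print(events)
```

Passing the alarm actions in as callbacks keeps the decision logic testable without a running system. The test plan also mentions the alarm being removed when the node is locked, which this sketch does not model.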

Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.9.0 stx.nfv
Changed in starlingx:
assignee: nobody → Vanathi Selvaraju (vselvara)