During the upgrade, OAM connectivity is lost on the active controller

Bug #2038306 reported by Andre Kantek
This bug affects 1 person
Affects   | Status       | Importance | Assigned to  | Milestone
StarlingX | Fix Released | Medium     | Andre Kantek |

Bug Description

Brief Description
-----------------
During the upgrade, loss of OAM connectivity to controller-0 was observed after the newly upgraded controller-1 was unlocked.

Severity
--------
<Critical: System/Feature is not usable due to the defect>

Steps to Reproduce
------------------
Execute the upgrade and, after controller-1 is unlocked, check ICMP connectivity to controller-0 over the OAM network.

Expected Behavior
------------------
controller-0 remains accessible over the OAM network after controller-1 is unlocked

Actual Behavior
----------------
controller-0 cannot be accessed

Reproducibility
---------------
Reproducible

System Configuration
--------------------
AIO-DX

Test Activity
-------------
Regression Testing

Workaround
----------
After controller-1 is unlocked, edit the OAM GlobalNetworkPolicy:
kubectl edit globalnetworkpolicies.crd.projectcalico.org controller-oam-if-gnp

and change the selector so that it also accepts the old "notetype" label:
selector: ((has(nodetype) && nodetype == 'controller') || (has(notetype) && notetype == 'controller')) && has(iftype) && iftype contains 'oam'
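
To illustrate why the changed selector helps, here is a minimal Python sketch that hand-codes the selector logic against host endpoint labels. It is not Calico's selector parser. The pre-fix selector (checking only "nodetype") and the example label sets for the two controllers are assumptions inferred from this report, not taken from a live system.

```python
# Simplified, hand-coded model of the OAM policy selector.
# Assumption: before the workaround, the selector checked only 'nodetype'.

def selector_before_fix(labels: dict) -> bool:
    """has(nodetype) && nodetype == 'controller' && has(iftype) && iftype contains 'oam'"""
    return labels.get("nodetype") == "controller" and "oam" in labels.get("iftype", "")


def selector_workaround(labels: dict) -> bool:
    """Selector from the workaround: accepts either 'nodetype' or the old 'notetype'."""
    is_controller = (
        labels.get("nodetype") == "controller"
        or labels.get("notetype") == "controller"
    )
    return is_controller and "oam" in labels.get("iftype", "")


# Hypothetical host endpoint labels for each controller during the upgrade.
controller_0_old_release = {"notetype": "controller", "iftype": "oam"}
controller_1_new_release = {"nodetype": "controller", "iftype": "oam"}

for name, labels in [
    ("controller-0", controller_0_old_release),
    ("controller-1", controller_1_new_release),
]:
    print(name, selector_before_fix(labels), selector_workaround(labels))
```

Under these assumptions, the pre-fix selector does not select the controller-0 endpoint that still carries the old label, while the workaround selector selects both controllers.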

Andre Kantek (akantek)
Changed in starlingx:
assignee: nobody → Andre Kantek (akantek)
Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/896510
Committed: https://opendev.org/starlingx/config/commit/7b251b4c99b3395cf62c87a067be53666164471e
Submitter: "Zuul (22348)"
Branch: master

commit 7b251b4c99b3395cf62c87a067be53666164471e
Author: Andre Kantek <email address hidden>
Date: Tue Sep 26 08:50:44 2023 -0300

    For OAM firewall support old host-endpoint label notetype

    It was observed during upgrade that the new OAM firewall,
    generated by controller-1 already running the new version, does not
    contain the host endpoint label "notetype" used in previous
    versions. This label was renamed to "nodetype", as it only contains
    the values 'controller' or 'worker'.

    This change adds "notetype" as a supported label in the OAM Global
    Network Policy, to allow the firewall to be applied during upgrade on
    controller-0 while it is still running the old version. The other
    network types did not have firewalls attached to them in previous
    versions.

    Test Plan
    [PASS] Execute upgrade on an AIO-DX system and validate that after the
            unlock of controller-1 the OAM access is operational in
            controller-0 and controller-1
    [PASS] Execute bootstrap install for AIO-DX system and validate that
            the OAM access is operational in both controllers

    Closes-Bug: 2038306

    Change-Id: Ib6bc95db9a2ec22ec63f693bd4f85dad89023a2b
    Signed-off-by: Andre Kantek <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.9.0 stx.networking stx.update