Comment 4 for bug 1897334

OpenStack Infra (hudson-openstack) wrote : Fix merged to metal (master)

Reviewed: https://review.opendev.org/c/starlingx/metal/+/787398
Committed: https://opendev.org/starlingx/metal/commit/7539d36c3f01a338acfa449204c6034dc43f45df
Submitter: "Zuul (22348)"
Branch: master

commit 7539d36c3f01a338acfa449204c6034dc43f45df
Author: Eric MacDonald <email address hidden>
Date: Wed Apr 21 10:12:30 2021 -0400

    Prevent mtcClient from sending to uninitialized socket in AIO SX

    The mtcClient will perform a socket reinit if it detects a socket
    failure. The mtcClient also avoids setting up its controller-1
    cluster network socket for the AIO SX system type because there
    is no controller-1 provisioned.

    Most AIO SX systems have the management/cluster networks set to
    the 'loopback' interface. However, when an AIO SX system is set up
    with its management and cluster networks on physical interfaces,
    with or without vlan, the mtcAlive send message utility will try
    to send to the uninitialized controller-1 cluster socket. This
    leads to a socket error that triggers a socket reinitialization
    loop which causes log flooding.

    This update adds a check to the mtcAlive send utility to avoid
    sending mtcAlive to controller-1 for AIO SX system type where
    there is no controller-1 provisioned; no send, no error, no flood.

    Since this update needed to add a system type check, this update
    also implemented a system type definition rename from CPE to AIO.
    Other related definitions and comments were also changed to make
    the code base more understandable and maintainable.

    Test Plan:

    PASS: Verify AIO SX with mgmnt/clstr on physical (failure mode)
    PASS: Verify AIO SX Install with mgmnt/clstr on 'lo'
    PASS: Verify AIO SX Lock msg and ack over mgmnt and clstr
    PASS: Verify AIO SX locked-disabled-online state
    PASS: Verify mtcClient clstr socket error detect/auto-recovery (fit)
    PASS: Verify mtcClient mgmnt socket error detect/auto-recovery (fit)

    Regression:

    PASS: Verify AIO SX Lock and Unlock (lazy reboot)
    PASS: Verify AIO DX and DC install with pv regression and sanity
    PASS: Verify Standard system install with pv regression and sanity

    Change-Id: I658d33a677febda6c0e3fcb1d7c18e5b76cb3762
    Closes-Bug: 1897334
    Signed-off-by: Eric MacDonald <email address hidden>
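
For illustration only, the guard described in the commit message above might look roughly like the following C++ sketch. It is not taken from the commit: every name here (SystemType, Peer, should_send_mtcalive, send_mtcalive) is hypothetical, and the socket write is reduced to a return code so that the control flow is visible.

```cpp
#include <string>

// All identifiers below are hypothetical; they are NOT the names used in
// starlingx/metal. This only models the decision logic, not real sockets.
enum class SystemType { Standard, AioSx, AioDx };

enum class SendResult { Sent, Skipped, SocketError };

struct Peer {
    std::string name;   // e.g. "controller-0" or "controller-1"
    bool socket_ready;  // true once the socket for this peer was set up
};

// Returns true when an mtcAlive message should be attempted for this peer.
bool should_send_mtcalive(SystemType type, const Peer& peer)
{
    // AIO SX has no controller-1 provisioned, so its cluster socket is
    // never set up. Skipping the send avoids the socket error that would
    // otherwise trigger the reinitialization loop.
    if (type == SystemType::AioSx && peer.name == "controller-1")
        return false;
    return true;
}

// Models one send attempt: skip, fail on an unready socket, or succeed.
SendResult send_mtcalive(SystemType type, const Peer& peer)
{
    if (!should_send_mtcalive(type, peer))
        return SendResult::Skipped;      // no send, no error, no flood
    if (!peer.socket_ready)
        return SendResult::SocketError;  // would lead to a socket reinit
    return SendResult::Sent;             // real code would write to the socket here
}
```

In this sketch, a send to controller-1 on an AIO SX system is skipped before the socket is touched, while other system types still report a socket error, so the existing error detection and auto-recovery path stays in effect for them.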