Comment 8 for bug 1858110

OpenStack Infra (hudson-openstack) wrote : Fix merged to metal (r/stx.3.0)

Reviewed: https://review.opendev.org/702534
Committed: https://git.openstack.org/cgit/starlingx/metal/commit/?id=eaf1f0c93db8e1219914ee0ae4b81a99549cff18
Submitter: Zuul
Branch: r/stx.3.0

commit eaf1f0c93db8e1219914ee0ae4b81a99549cff18
Author: Eric MacDonald <email address hidden>
Date: Fri Jan 3 09:34:37 2020 -0500

    Fix BMC access loss handling

    A recent refactoring of the BMC handler FSM introduced a regression: the
    BMC Access alarm is no longer raised when BMC access is lost after having
    been established initially.

    This update ensures BMC access alarm management is working properly.

    This update also implements ping failure debounce so that a single ping
    failure no longer triggers full reconnection handling; that now requires
    3 ping failures in a row. The debounce adds a minute to ping failure
    action handling before the usual 2 minute BMC access failure alarm is
    raised. Ping failure logging is also reduced and improved.

    Test Plan: for both hwmond and mtcAgent

    PASS: Verify BMC access alarm due to bad provisioning (username, password,
          IP address, type)
    PASS: Verify BMC ping failure debounce handling, recovery and logging
    PASS: Verify BMC ping persistent failure handling
    PASS: Verify BMC ping periodic miss handling
    PASS: Verify BMC ping and access failure recovery timing
    PASS: Verify BMC ping failure and recovery handling over BMC link pull/plug
    PASS: Verify BMC sensor monitoring stops/resumes over ping failure/recovery

    Regression:

    PASS: Verify IPv6 System Install using provisioned BMCs (wp8-12)
    PASS: Verify BMC power-off request handling with BMC ping failing & recovering
    PASS: Verify BMC power-on request handling with BMC ping failing & recovering
    PASS: Verify BMC reset request handling with BMC ping failing & recovering
    PASS: Verify BMC sensor group read failure handling & recovery
    PASS: Verify sensor monitoring after ping failure handling & recovery

    Change-Id: I74870816930ef6cdb11f987424ffed300ff8affe
    Closes-Bug: 1858110
    Signed-off-by: Eric MacDonald <email address hidden>
    (cherry picked from commit 9bf231a2866c0ff737064755d0106198d4df7d7d)
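
The ping failure debounce described in the commit message can be illustrated with a minimal sketch. This is not the mtcAgent or hwmond code: the class name `PingDebouncer`, the method `record`, and the constant `PING_FAILURE_THRESHOLD` are hypothetical names chosen for illustration. Only the threshold of 3 consecutive failures comes from the commit message; ping intervals and alarm timing are deliberately left out.

```python
from dataclasses import dataclass

# From the commit message: reconnection handling requires 3 ping failures in a row.
PING_FAILURE_THRESHOLD = 3


@dataclass
class PingDebouncer:
    """Hypothetical helper that counts consecutive BMC ping failures."""

    threshold: int = PING_FAILURE_THRESHOLD
    consecutive_failures: int = 0

    def record(self, ping_ok: bool) -> bool:
        """Record one ping result.

        Returns True when the number of consecutive failures has reached
        the threshold (or gone past it), meaning reconnection handling
        should run. Any successful ping resets the count.
        """
        if ping_ok:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold


# Usage: a periodic miss (two failures, then a success) does not trigger
# reconnection handling; three failures in a row do.
debouncer = PingDebouncer()
results = [debouncer.record(ok) for ok in (False, False, True, False, False, False)]
```

In this sketch only the last entry of `results` is `True`, because the successful ping in the middle resets the failure count. That reset is what separates the "periodic miss" case from the "persistent failure" case listed in the test plan.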