Comment 6 for bug 2053149

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/914142
Committed: https://opendev.org/starlingx/config/commit/933d3a3a73e923efc86d7ac8b8a059a598e6fbe1
Submitter: "Zuul (22348)"
Branch: master

commit 933d3a3a73e923efc86d7ac8b8a059a598e6fbe1
Author: Tara Subedi <email address hidden>
Date: Mon Mar 25 13:51:26 2024 -0400

    Report port and device inventory after the worker manifest

    This is an incremental fix for bug 2053149.
    Upon network boot (first boot) of a worker node, the agent manager is
    supposed to report ports/devices without waiting for the worker
    manifest, as the manifest never runs on first boot. Without this
    report, the compute node cannot be unlocked after a system restore
    due to the sriov config update.

    Kickstart records first boot as "/etc/platform/.first_boot". The agent
    manager deletes this file. If the agent manager crashes, it starts
    again. This time the agent manager does not see the .first_boot file,
    does not know this is still the first boot, and does not report
    inventory for the worker node.

    This commit fixes the issue by creating the volatile file
    "/var/run/.first_boot" before deleting "/etc/platform/.first_boot". The
    agent relies on both files to determine whether it is the first boot.
    The same logic holds across multiple crashes/restarts of the agent
    manager.
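
    The two-file check described above can be sketched roughly as follows.
    This is an illustration only, not the code merged in this commit: the
    function names is_first_boot and consume_first_boot_flag are
    hypothetical, and only the two file paths come from the message above.
    It assumes /var/run is cleared on reboot, so the volatile flag survives
    an agent restart but not a second boot.

```python
import os
import tempfile

# Paths taken from the commit message; everything else is illustrative.
PERSISTENT_FLAG = "/etc/platform/.first_boot"
VOLATILE_FLAG = "/var/run/.first_boot"


def is_first_boot(persistent=PERSISTENT_FLAG, volatile=VOLATILE_FLAG):
    """Return True if either flag file exists.

    The persistent flag means the agent has not yet handled first boot.
    The volatile flag means it has, but the node has not rebooted since.
    """
    return os.path.exists(persistent) or os.path.exists(volatile)


def consume_first_boot_flag(persistent=PERSISTENT_FLAG,
                            volatile=VOLATILE_FLAG):
    """Create the volatile flag first, then delete the persistent one.

    Ordering matters: if the agent crashes between the two steps, at
    least one flag still exists and first boot is still detected.
    """
    if os.path.exists(persistent):
        open(volatile, "a").close()
        os.remove(persistent)


if __name__ == "__main__":
    # Demo against temporary files instead of the real system paths.
    with tempfile.TemporaryDirectory() as tmp:
        persistent = os.path.join(tmp, "etc_first_boot")
        volatile = os.path.join(tmp, "run_first_boot")
        open(persistent, "w").close()        # kickstart marks first boot
        consume_first_boot_flag(persistent, volatile)
        # A restarted agent still sees first boot via the volatile flag.
        print(is_first_boot(persistent, volatile))
```

    Calling consume_first_boot_flag again after a simulated restart is a
    no-op, so repeated crashes/restarts give the same answer until the
    volatile file disappears.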

    TEST PLAN:
    PASS: AIO-DX bootstrap has no issues. lock/unlock has no issues.
    PASS: Network-boot worker node, before doing unlock, restart agent
          manager (sysinv-agent), check sysinv.log to see ports are reported.

    Closes-Bug: 2053149
    Change-Id: Iace5576575388a6ed3403590dbeec545c25fc0e0
    Signed-off-by: Tara Nath Subedi <email address hidden>