host attributes need to reflect that initial inventory has been collected

Bug #1837097 reported by Allain Legacy
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: John Kung

Bug Description

Brief Description
-----------------
The system inventory agent needs to explicitly indicate that inventory collection has finished for each host. Our current method for determining whether a host has been inventoried successfully is to wait for the disk list to be non-empty. That worked well until recently, when the host file system feature was merged. The system inventory agent now collects/creates host file systems after the disk list is populated, so a provisioning system waiting on the disk list will unlock the node prematurely, before the host file systems have been created and reported to system inventory. This can lead to undefined behavior on either the system being provisioned or the provisioning system that is configuring the target system.

If we do not fix this properly with an explicit/deterministic flag then we will trip over this issue each time someone adds a new inventory collection step to the end of the system inventory agent's initial process loop.
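
A minimal sketch of why an explicit flag avoids this, assuming a simplified agent loop. The function names, the step list, and the `initial_inventory_complete` key are hypothetical illustrations, not the actual sysinv agent code:

```python
from typing import Callable, Dict, List

# Hypothetical collection steps; the real agent's steps and data differ.
def collect_disks(host: Dict) -> None:
    host["disks"] = ["/dev/sda"]

def collect_host_filesystems(host: Dict) -> None:
    host["host_fs"] = ["backup", "docker"]

def run_initial_inventory(steps: List[Callable[[Dict], None]], host: Dict) -> Dict:
    """Run every collection step, then set the completion flag last."""
    host["initial_inventory_complete"] = False
    for step in steps:
        step(host)
    # The flag is set only after the final step, whatever that step is,
    # so appending a new step never changes what callers must wait for.
    host["initial_inventory_complete"] = True
    return host

host = run_initial_inventory([collect_disks, collect_host_filesystems], {})
```

With this shape, a consumer that waits on the flag keeps working when another step is appended to the list, whereas a consumer that waits on `disks` would proceed before `host_fs` exists.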

Severity
--------
Major: the system may not be configured properly if it is unlocked before inventory collection has finished.

Steps to Reproduce
------------------
This is a timing issue so there are no reliable steps to reproduce the issue.

Expected Behavior
------------------
There should be a host attribute that reports whether the system inventory agent has completed the initial inventory collection (e.g., something like initial-inventory-complete=true).
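
As a rough illustration of how a provisioning system might consume such an attribute, the sketch below polls for it with a timeout. The `get_host` callable, the `initial_inventory_complete` key, and the timing defaults are assumptions made for this example and do not reflect the real system inventory API:

```python
import time
from typing import Any, Callable, Mapping

def wait_for_initial_inventory(
    get_host: Callable[[], Mapping[str, Any]],  # hypothetical host lookup
    attribute: str = "initial_inventory_complete",  # hypothetical attribute name
    timeout: float = 600.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the host reports the flag, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        host = get_host()
        # Only the explicit flag counts; a non-empty disk list is not enough.
        if host.get(attribute) is True:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

The caller would unlock the node only when this returns True, and treat False as a provisioning failure rather than proceeding.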

Actual Behavior
----------------
There is no deterministic indicator today. We need to rely on knowing which step the agent will run last and poll for its completion.

Reproducibility
---------------
unknown

System Configuration
--------------------
Any

Branch/Pull Time/Commit
-----------------------
20190718T013000Z

Last Pass
---------
unknown

Timestamp/Logs
--------------
n/a

Test Activity
-------------
Developer testing

Ghada Khalil (gkhalil) wrote :

Marking as stx.2.0 as the current approach is not deterministic

Changed in starlingx:
importance: Undecided → High
tags: added: stx.2.0 stx.config
Changed in starlingx:
status: New → Triaged
assignee: nobody → John Kung (john-kung)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/672772

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master)

Fix proposed to branch: master
Review: https://review.opendev.org/672801

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/672772
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=168442b2e17fd66797485b9e593533b1ca35ca85
Submitter: Zuul
Branch: master

commit 168442b2e17fd66797485b9e593533b1ca35ca85
Author: John Kung <email address hidden>
Date: Thu Jul 25 10:38:55 2019 -0400

    Create host state for determining initial inventory complete

    Add host inv_state attribute to allow determination of when the
    initial inventory collection has been completed.

    Update references which were using disks/pvs as proxy for inventory
    completion to reference the host inv_state attribute.

    Description of issue (from Bug 1837097):
    The system inventory agent needs to explicitly indicate that inventory
    collection has finished for each host. The current method for
    determining whether a host has been inventoried successfully is to
    wait for the disk/pv list to be non-empty.

    That worked well until recently when the host file system feature
    was merged. The system inventory agent now collects/creates host file
    systems after the disk list is populated so a provisioning system
    waiting on the disk list will move ahead to unlock the node
    prematurely before the host file systems have been created and reported
    to system inventory. This can lead to undefined behavior either on
    the system being provisioned or the provisioning system that is
    configuring the target system.

    If we do not fix this properly with an explicit/deterministic flag then
    we will trip over this issue each time someone adds a new inventory
    collection step to the end of the system inventory agent's
    initial process loop.

    Change-Id: Ifdb8871a892414ee4c433cf7a6ec7e79390c6420
    Closes-bug: 1837097
    Signed-off-by: John Kung <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/672801
Committed: https://git.openstack.org/cgit/starlingx/ansible-playbooks/commit/?id=e5178771888bd5a13c0c50dd7cd9fd960d52163d
Submitter: Zuul
Branch: master

commit e5178771888bd5a13c0c50dd7cd9fd960d52163d
Author: John Kung <email address hidden>
Date: Thu Jul 25 10:35:22 2019 -0400

    Create host state for determing initial inventory complete

    Add host inv_state attribute to allow determination of when the
    initial inventory collection has been completed.

    Update references which were using disks/pvs as proxy for inventory
    completion to reference the host inv_state attribute.

    Description of issue (from Bug 1837097):
    The system inventory agent needs to explicitly indicate that inventory
    collection has finished for each host. The current method for
    determining whether a host has been inventoried successfully is to
    wait for the disk/pv list to be non-empty.

    That worked well until recently when the host file system feature
    was merged. The system inventory agent now collects/creates host file
    systems after the disk list is populated so a provisioning system
    waiting on the disk list will move ahead to unlock the node
    prematurely before the host file systems have been created and reported
    to system inventory. This can lead to undefined behavior either on
    the system being provisioned or the provisioning system that is
    configuring the target system.

    If we do not fix this properly with an explicit/deterministic flag then
    we will trip over this issue each time someone adds a new inventory
    collection step to the end of the system inventory agent's
    initial process loop.

    Change-Id: Ibe590055a3e2c6b312a92c1afc445f7c14d87e2f
    Closes-bug: 1837097
    Depends-On: https://review.opendev.org/#/c/672772/
    Signed-off-by: John Kung <email address hidden>
