Sanity: After system initialization, all compute nodes got 250.001 alarms "Configuration is out-of-date"

Bug #1970809 reported by Iago Filipe
This bug affects 1 person
Affects    Status        Importance  Assigned to  Milestone
StarlingX  Fix Released  High        Iago Filipe

Bug Description

Brief Description
-----------------
After the DX/Plus installation completes, all compute nodes raise 250.001 alarms, for example:
compute-* Configuration is out-of-date

Severity
--------
Major

Steps to Reproduce
------------------
Install DX/Plus.

TC-name:

Expected Behavior
------------------
No 250.001 alarms are present after installation.

Actual Behavior
----------------
250.001 alarms are raised on all compute nodes after installation.

Reproducibility
---------------
This is the first time this issue has been seen.

System Configuration
--------------------
Multi-node system

Branch/Pull Time/Commit
-----------------------
master

Last Pass
---------
2022-04-26

Timestamp/Logs
--------------
[2022-04-28 06:14:17,753] 348 DEBUG MainThread ssh.send :: Send 'fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne alarm-list --nowrap --uuid'
[2022-04-28 06:14:17,803] 546 DEBUG MainThread ssh.exec_cmd:: Expecting [.@controller-[01] .(keystone_admin)]\$ in prompt
[2022-04-28 06:14:18,796] 471 DEBUG MainThread ssh.expect :: Output:
+------+----------+-------------+-----------+----------+------------+
| UUID | Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------+----------+-------------+-----------+----------+------------+
| 1b545d54-34f2-44c3-acf9-10a15f65a6ab | 100.114 | NTP address 2601:603:b7f:fec0:1111:1111:1111:1112 is not a valid or a reachable NTP server. | host=controller-0=2601:603:b7f:fec0:1111:1111:1111:1112 | minor | 2022-04-28T05:38:04.672319 |
| a9f62463-bdde-4209-bd6d-fc7f5072b503 | 100.114 | NTP address 2600:3c01::f03c:91ff:fe96:13c5 is not a valid or a reachable NTP server. | host=controller-1=2600:3c01::f03c:91ff:fe96:13c5 | minor | 2022-04-28T05:37:56.850085 |
| 74b7ab40-df5a-4c17-95f3-5119fd928f94 | 250.001 | compute-2 Configuration is out-of-date. (applied: 0dc16a85-4048-4cf6-93eb-84986e8f0be9 target: 6bcc8479-a73e-4bfe-8234-d9cb2b591dd1) | host=compute-2 | major | 2022-04-28T05:28:16.419533 |
| 1a19c243-bc51-4a35-b61f-3297ce6da52e | 250.001 | compute-1 Configuration is out-of-date. (applied: 49f94412-28b5-4df3-9eb7-530b07fef094 target: 7504d596-4a6b-4906-a371-124b25a19ac5) | host=compute-1 | major | 2022-04-28T05:27:52.450515 |
| abd2c3dd-113e-4c11-a071-66824d449120 | 250.001 | compute-0 Configuration is out-of-date. (applied: 4d0830d9-7e4a-43c5-b47b-eb2a611f6993 target: 466cde4b-b037-4e37-aced-6316e02ee3fe) | host=compute-0 | major | 2022-04-28T05:26:58.629163 |
+------+----------+-------------+-----------+----------+------------+
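
Each 250.001 reason text above carries an "applied" and a "target" config UUID, and the alarm is raised because the two differ. A minimal sketch of pulling those pairs out of captured `fm alarm-list` output is shown below. The function name `out_of_date_hosts` and the regular expressions are illustrative assumptions, not part of the StarlingX test framework.

```python
import re

# Matches the "(applied: <uuid> target: <uuid>)" fragment of a 250.001 reason text.
CONFIG_RE = re.compile(
    r"\(applied: (?P<applied>[0-9a-f-]{36}) target: (?P<target>[0-9a-f-]{36})\)"
)
# Matches the entity ID field, e.g. "host=compute-2".
HOST_RE = re.compile(r"host=([\w.-]+)")


def out_of_date_hosts(alarm_lines):
    """Return {hostname: (applied_uuid, target_uuid)} for 250.001 alarms
    whose applied and target config UUIDs differ. (Hypothetical helper.)"""
    result = {}
    for line in alarm_lines:
        if " 250.001 " not in line:
            continue  # skip other alarm IDs, headers, and separator rows
        config = CONFIG_RE.search(line)
        host = HOST_RE.search(line)
        if config and host and config["applied"] != config["target"]:
            result[host.group(1)] = (config["applied"], config["target"])
    return result
```

Running this over the five alarm rows above would be expected to report compute-0, compute-1, and compute-2 and to skip the two 100.114 NTP alarms.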

Iago Filipe (ifest1)
Changed in starlingx:
assignee: nobody → Iago Filipe (ifest1)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/839827

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/839827
Committed: https://opendev.org/starlingx/config/commit/bdb6b40fe214a12731d19b2cf2b6a768e504b200
Submitter: "Zuul (22348)"
Branch: master

commit bdb6b40fe214a12731d19b2cf2b6a768e504b200
Author: Iago Estrela <email address hidden>
Date: Thu Apr 28 19:09:51 2022 -0300

    Fix runtime manifest triggered multiple times

    When max_cpu_frequency is set, the API triggers the
    ['platform::compute::config::runtime'] manifest several times due to
    an additional RPC call to update_max_cpu_frequency in the conductor.
    This commit fixes this behavior by calling the conductor RPC
    method only if host configuration is not required; if it is required,
    the full manifest will be executed and there is no need to call
    update_max_cpu_frequency.

    Closes-Bug: 1970809

    Test plan:
    PASS: Bootstrap system and verify alarms.
    PASS: Host lock and unlock and verify alarms.

    Regression:
    PASS: Host-update max_cpu_frequency.
    PASS: Host-update max_cpu_frequency=max_cpu_default.

    Regression logs:
    https://paste.opendev.org/show/b1mRTZaNGS0bkAgpIsA5/

    Relates-To: https://storyboard.openstack.org/#!/story/2009886
    Task: 44882

    Signed-off-by: Iago Estrela <email address hidden>
    Change-Id: I2ff1c8e61a3189ea156cdeccda7c088620c80f41
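
The control flow described in the commit message can be sketched roughly as follows. This is only an illustration of the "call the targeted RPC only when a full host configure is not required" idea. `FakeConductorRPC`, `configure_host`, `apply_host_update`, and the `configure_required` flag are hypothetical stand-ins and do not reproduce the actual sysinv API code; only the name `update_max_cpu_frequency` comes from the commit message.

```python
class FakeConductorRPC:
    """Hypothetical stand-in for the conductor RPC client.
    It records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def configure_host(self, host):
        # Assumed to apply the full manifest, which already covers max_cpu_frequency.
        self.calls.append(("configure_host", host["hostname"]))

    def update_max_cpu_frequency(self, host):
        # Assumed to trigger only the compute runtime manifest.
        self.calls.append(("update_max_cpu_frequency", host["hostname"]))


def apply_host_update(rpc, host, configure_required, max_cpu_frequency_changed):
    """Issue at most one configuration request per host update."""
    if configure_required:
        # The full manifest will run, so the extra runtime RPC is skipped.
        rpc.configure_host(host)
    elif max_cpu_frequency_changed:
        rpc.update_max_cpu_frequency(host)


rpc = FakeConductorRPC()
apply_host_update(rpc, {"hostname": "compute-0"}, True, True)
print(rpc.calls)
```

In this sketch a host that needs a full configure produces a single `configure_host` call, which mirrors the intent of avoiding a second runtime manifest being queued for the same change.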

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → High
tags: added: stx.7.0 stx.config