controller-1 install fails at "Query Power Status" from controller-0

Bug #1892052 reported by Yang Liu on 2020-08-18
This bug affects 1 person
Affects: StarlingX
Importance: Low
Assigned to: Eric MacDonald

Bug Description

Brief Description
-----------------
  DC-3 system install failed due to controller-1 Query Power Status Command failure

Severity
--------
  Major: system install failed

Steps to Reproduce
------------------
  System install

Expected Behavior
------------------
  System installs correctly. All BMC commands succeed.

Actual Behavior
----------------
  System install failed on controller-1 when the Query Power Status command, issued from controller-0, failed for an unknown reason.

Reproducibility
---------------
Intermittent: fails in more than 50% of recent attempts

System Configuration
--------------------
  DC-3 - IPV6

Branch/Pull Time/Commit
-----------------------
  BUILD_DATE="2020-05-07_21-11-18"

Last Pass
---------
Did this test scenario pass previously? Yes; the failure is intermittent.

Timestamp/Logs
--------------

2020-08-17T21:40:45.266 [124388.00232] controller-0 mtcAgent inv mtcInvApi.cpp ( 334) mtcInvApi_update_task : Info : controller-1 Task: Reinstalling (seq:7)
2020-08-17T21:40:50.276 [124388.00233] controller-0 mtcAgent --- mtcBmcUtil.cpp ( 214) bmc_command_recv :Error : controller-1 bmc redfish Query Power Status command failed (redfishtool) (data:) (rc:108:108:system call failed)
2020-08-17T21:40:50.276 [124388.00234] controller-0 mtcAgent hdl mtcNodeHdlrs.cpp (4248) reinstall_handler : Warn : controller-1 'Query BMC Root' failed receive (rc:108)
2020-08-17T21:40:50.276 [124388.00235] controller-0 mtcAgent inv mtcInvApi.cpp ( 334) mtcInvApi_update_task : Info : controller-1 Task: Reinstall Failed ; could not query power state (seq:8)
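
When triaging a larger mtcAgent.log, it can help to pull out only the BMC command failures like the one above. The following is a minimal sketch, not part of StarlingX: the line layout in LINE_RE and the message layout in BMC_FAIL_RE are assumptions inferred solely from the excerpts in this report, and find_bmc_failures is a hypothetical helper name.

```python
import re

# Assumed mtcAgent line layout, inferred only from the excerpts in this report:
# <timestamp> [<pid>.<seq>] <host> <process> <tag> <file> (<line>) <function> :<level> : <message>
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+"
    r"\[(?P<pid>\d+)\.(?P<seq>\d+)\]\s+"
    r"(?P<host>\S+)\s+(?P<process>\S+)\s+(?P<tag>\S+)\s+"
    r"(?P<file>\S+)\s+\(\s*(?P<line>\d+)\)\s+"
    r"(?P<function>\S+)\s*:\s*(?P<level>\w+)\s*:\s*"
    r"(?P<message>.*)$"
)

# Assumed message layout for a failed BMC command, again taken from the excerpt:
# <node> bmc <protocol> <command> command failed ... (rc:<code>...)
BMC_FAIL_RE = re.compile(
    r"^(?P<node>\S+) bmc (?P<protocol>\S+) (?P<command>.+?) command failed"
    r".*\(rc:(?P<rc>\d+)"
)


def find_bmc_failures(lines):
    """Return one dict per Error-level BMC command failure found in lines."""
    failures = []
    for raw in lines:
        parsed = LINE_RE.match(raw.strip())
        if not parsed or parsed.group("level") != "Error":
            continue  # not an mtcAgent line in the assumed layout, or not an error
        failed = BMC_FAIL_RE.match(parsed.group("message"))
        if not failed:
            continue  # an error, but not a BMC command failure
        failures.append({
            "timestamp": parsed.group("timestamp"),
            "reported_by": parsed.group("host"),
            "node": failed.group("node"),
            "protocol": failed.group("protocol"),
            "command": failed.group("command"),
            "rc": int(failed.group("rc")),
        })
    return failures


# The four log lines quoted in this report.
SAMPLE_LOG = """
2020-08-17T21:40:45.266 [124388.00232] controller-0 mtcAgent inv mtcInvApi.cpp ( 334) mtcInvApi_update_task : Info : controller-1 Task: Reinstalling (seq:7)
2020-08-17T21:40:50.276 [124388.00233] controller-0 mtcAgent --- mtcBmcUtil.cpp ( 214) bmc_command_recv :Error : controller-1 bmc redfish Query Power Status command failed (redfishtool) (data:) (rc:108:108:system call failed)
2020-08-17T21:40:50.276 [124388.00234] controller-0 mtcAgent hdl mtcNodeHdlrs.cpp (4248) reinstall_handler : Warn : controller-1 'Query BMC Root' failed receive (rc:108)
2020-08-17T21:40:50.276 [124388.00235] controller-0 mtcAgent inv mtcInvApi.cpp ( 334) mtcInvApi_update_task : Info : controller-1 Task: Reinstall Failed ; could not query power state (seq:8)
"""

for failure in find_bmc_failures(SAMPLE_LOG.splitlines()):
    print(failure)
```

Grouping the returned entries by node and command makes it easier to compare failure signatures across collected logs from different labs or dates.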

Test Activity
-------------
Lab install

Ghada Khalil (gkhalil) on 2020-09-09
tags: added: stx.metal
Ghada Khalil (gkhalil) wrote :

This scenario may be addressed by the robustness fix planned for https://bugs.launchpad.net/starlingx/+bug/1880578

Assigning to Eric MacDonald to review and mark as a duplicate if applicable.

Ghada Khalil (gkhalil) wrote :

oops! Forgot to assign to Eric MacDonald.
As noted above, I believe this can be marked as a duplicate of https://bugs.launchpad.net/starlingx/+bug/1880578

Changed in starlingx:
importance: Undecided → Low
status: New → Incomplete
status: Incomplete → Triaged
assignee: nobody → Eric MacDonald (rocksolidmtce)
Ghada Khalil (gkhalil) wrote :

Note: 1880578 reports the same issue on the same lab.

Eric MacDonald (rocksolidmtce) wrote :

Yes, this is a duplicate of https://bugs.launchpad.net/starlingx/+bug/1880578

The logs attached to that LP show the same failure signature:

1880578/ALL_NODES_20200713.171117/controller-0_20200713.171117/var/log/mtcAgent.log:2020-07-12T22:59:03.544 [107777.00264] controller-0 mtcAgent --- mtcBmcUtil.cpp ( 214) bmc_command_recv :Error : worker-0 bmc redfish Query Power Status command failed (redfishtool) (data:) (rc:108:108:system call failed)
