controller-1 install fails at "Query Power Status" from controller-0

Bug #1892052 reported by Yang Liu
Affects    Status   Importance  Assigned to     Milestone
StarlingX  Invalid  Low         Eric MacDonald

Bug Description

Brief Description
-----------------
  DC-3 system install failed due to controller-1 Query Power Status Command failure

Severity
--------
  Major: system install failed

Steps to Reproduce
------------------
  System install

Expected Behavior
------------------
  System installs correctly. All bmc commands succeed.

Actual Behavior
----------------
  System install failed for controller-1 when the Query Power Status command failed for an unknown reason.

Reproducibility
---------------
Intermittent - >50% recently

System Configuration
--------------------
  DC-3 - IPV6

Branch/Pull Time/Commit
-----------------------
  BUILD_DATE="2020-05-07_21-11-18"

Last Pass
---------
Did this test scenario pass previously? Yes; the failure is intermittent.

Timestamp/Logs
--------------

2020-08-17T21:40:45.266 [124388.00232] controller-0 mtcAgent inv mtcInvApi.cpp ( 334) mtcInvApi_update_task : Info : controller-1 Task: Reinstalling (seq:7)
2020-08-17T21:40:50.276 [124388.00233] controller-0 mtcAgent --- mtcBmcUtil.cpp ( 214) bmc_command_recv :Error : controller-1 bmc redfish Query Power Status command failed (redfishtool) (data:) (rc:108:108:system call failed)
2020-08-17T21:40:50.276 [124388.00234] controller-0 mtcAgent hdl mtcNodeHdlrs.cpp (4248) reinstall_handler : Warn : controller-1 'Query BMC Root' failed receive (rc:108)
2020-08-17T21:40:50.276 [124388.00235] controller-0 mtcAgent inv mtcInvApi.cpp ( 334) mtcInvApi_update_task : Info : controller-1 Task: Reinstall Failed ; could not query power state (seq:8)
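
The failure sequence above can be pulled out of a longer mtcAgent.log with a small script. The following is a minimal sketch, not a StarlingX tool: the line layout is inferred only from the four log lines quoted in this report, and the names LINE_RE, RC_RE, SAMPLE and find_failures are hypothetical.

```python
import re

# Layout inferred from the log lines quoted in this report (an assumption);
# lines that do not match are skipped rather than treated as errors.
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+\[\d+\.\d+\]\s+(?P<host>\S+)\s+(?P<proc>\S+)\s+\S+\s+"
    r"(?P<file>\S+)\s+\(\s*(?P<lineno>\d+)\)\s+(?P<func>\S+)\s*:\s*"
    r"(?P<level>\w+)\s*:\s*(?P<msg>.*)$"
)
# First numeric return code inside "(rc:...)", e.g. 108 in "(rc:108:108:...)".
RC_RE = re.compile(r"\(rc:(\d+)")

# The four lines from the Timestamp/Logs section, copied verbatim.
SAMPLE = [
    "2020-08-17T21:40:45.266 [124388.00232] controller-0 mtcAgent inv mtcInvApi.cpp ( 334) mtcInvApi_update_task : Info : controller-1 Task: Reinstalling (seq:7)",
    "2020-08-17T21:40:50.276 [124388.00233] controller-0 mtcAgent --- mtcBmcUtil.cpp ( 214) bmc_command_recv :Error : controller-1 bmc redfish Query Power Status command failed (redfishtool) (data:) (rc:108:108:system call failed)",
    "2020-08-17T21:40:50.276 [124388.00234] controller-0 mtcAgent hdl mtcNodeHdlrs.cpp (4248) reinstall_handler : Warn : controller-1 'Query BMC Root' failed receive (rc:108)",
    "2020-08-17T21:40:50.276 [124388.00235] controller-0 mtcAgent inv mtcInvApi.cpp ( 334) mtcInvApi_update_task : Info : controller-1 Task: Reinstall Failed ; could not query power state (seq:8)",
]


def find_failures(lines):
    """Return one dict per Error/Warn line, with target node and return code."""
    found = []
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if m is None or m.group("level") not in ("Error", "Warn"):
            continue
        # In the quoted lines the message starts with the target node name.
        node, _, text = m.group("msg").partition(" ")
        rc = RC_RE.search(text)
        found.append({
            "ts": m.group("ts"),
            "level": m.group("level"),
            "func": m.group("func"),
            "node": node,
            "rc": int(rc.group(1)) if rc else None,
            "text": text,
        })
    return found


if __name__ == "__main__":
    for entry in find_failures(SAMPLE):
        print(entry["ts"], entry["level"], entry["node"], entry["func"], entry["rc"])
```

Run against the four lines above, this keeps only the Error from bmc_command_recv and the Warn from reinstall_handler, both for controller-1 with rc 108; the two Info task updates are dropped.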

Test Activity
-------------
Lab install

Tags: stx.metal
Ghada Khalil (gkhalil)
tags: added: stx.metal
Ghada Khalil (gkhalil) wrote :

This scenario may be addressed by the robustness fix planned for https://bugs.launchpad.net/starlingx/+bug/1880578

Assigning to Eric MacDonald to review and mark as a duplicate if applicable.

Ghada Khalil (gkhalil) wrote :

oops! Forgot to assign to Eric MacDonald.
As noted above, I believe this can be marked as a duplicate of https://bugs.launchpad.net/starlingx/+bug/1880578

Changed in starlingx:
importance: Undecided → Low
status: New → Incomplete
status: Incomplete → Triaged
assignee: nobody → Eric MacDonald (rocksolidmtce)
Ghada Khalil (gkhalil) wrote :

Note: 1880578 reports the same issue on the same lab.

Eric MacDonald (rocksolidmtce) wrote :

Yes, this is a duplicate of https://bugs.launchpad.net/starlingx/+bug/1880578

From the issue reported in that LP:

1880578/ALL_NODES_20200713.171117/controller-0_20200713.171117/var/log/mtcAgent.log:2020-07-12T22:59:03.544 [107777.00264] controller-0 mtcAgent --- mtcBmcUtil.cpp ( 214) bmc_command_recv :Error : worker-0 bmc redfish Query Power Status command failed (redfishtool) (data:) (rc:108:108:system call failed)
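
The duplicate call rests on both logs showing the same error text for different nodes. A minimal sketch of that comparison follows, assuming the ":Error :" layout seen in the two quoted lines; SIG_RE, signature, THIS_BUG and OTHER_BUG are hypothetical names, and the sketch only compares message text, not root cause.

```python
import re

# Matches the level marker, the target node, and the rest of the message.
# The layout is an assumption taken from the two lines quoted in this report.
SIG_RE = re.compile(r":\s*(Error|Warn|Info)\s*:\s*(\S+)\s+(.*)$")

# Error line from this bug (1892052), copied verbatim.
THIS_BUG = (
    "2020-08-17T21:40:50.276 [124388.00233] controller-0 mtcAgent --- "
    "mtcBmcUtil.cpp ( 214) bmc_command_recv :Error : controller-1 bmc redfish "
    "Query Power Status command failed (redfishtool) (data:) "
    "(rc:108:108:system call failed)"
)
# Error line quoted from the logs attached to 1880578, copied verbatim.
OTHER_BUG = (
    "1880578/ALL_NODES_20200713.171117/controller-0_20200713.171117/var/log/"
    "mtcAgent.log:2020-07-12T22:59:03.544 [107777.00264] controller-0 mtcAgent "
    "--- mtcBmcUtil.cpp ( 214) bmc_command_recv :Error : worker-0 bmc redfish "
    "Query Power Status command failed (redfishtool) (data:) "
    "(rc:108:108:system call failed)"
)


def signature(line):
    """Return (level, message) with timestamp, path prefix and node removed."""
    m = SIG_RE.search(line.strip())
    if m is None:
        return None
    level, _node, text = m.groups()
    return (level, text)


if __name__ == "__main__":
    print(signature(THIS_BUG) == signature(OTHER_BUG))
```

With the node name dropped (controller-1 here, worker-0 in 1880578), the two lines reduce to the same level and message, including the rc:108 "system call failed" suffix.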

Eric MacDonald (rocksolidmtce) wrote :

Marked as 'invalid' duplicate of ...

https://bugs.launchpad.net/starlingx/+bug/1880578

... which has an update that is already merged by ...

https://review.opendev.org/c/starlingx/metal/+/761760/

Changed in starlingx:
status: Triaged → Invalid