Comment 11 for bug 1864906

Difu Hu (difuhu) wrote :

Hit a similar issue during upgrade: controller-1 stayed in the power-off state.
| 4 | controller-1 | controller | locked | disabled | power-off |

log at: https://files.starlingx.kube.cengn.ca/launchpad/1864906/hp380_ALL_NODES_20201103.202315.tar

2020-11-03T17:43:01.429 [96966.00615] controller-0 mtcAgent |-| mtcNodeHdlrs.cpp (4936) power_handler : Info : controller-1 Power-Off Completed
2020-11-03T17:43:01.429 fmAPI.cpp(490): Enqueue raise alarm request: UUID (13c50889-19e9-4e76-9791-ca787f4bcac3) alarm id (200.021) instant id (host=controller-1.action=power-off)
2020-11-03T17:43:01.429 [96966.00616] controller-0 mtcAgent inv mtcInvApi.cpp (1119) mtcInvApi_update_state : Info : controller-1 power-off (seq:309)
2020-11-03T17:43:01.438 fmAlarmUtils.cpp(624): Sending FM raise alarm request: alarm_id (200.021), entity_id (host=controller-1.action=power-off)
2020-11-03T17:43:01.483 fmAlarmUtils.cpp(658): FM Response for raise alarm: (0), alarm_id (200.021), entity_id (host=controller-1.action=power-off)
2020-11-03T17:43:06.434 [96966.00617] controller-0 mtcAgent --- threadUtil.cpp ( 344) thread_launch : Warn : controller-1 bmc not in IDLE stage (in Monitor stage)
2020-11-03T17:43:06.434 [96966.00618] controller-0 mtcAgent --- mtcBmcUtil.cpp ( 144) bmc_command_send :Error : controller-1 failed to launch power control thread (rc:72)
2020-11-03T17:43:06.434 [96966.00619] controller-0 mtcAgent hdl mtcNodeHdlrs.cpp (4340) reinstall_handler :Error : controller-1 Reinstall netboot request failed (rc:72)
2020-11-03T17:43:06.434 [96966.00620] controller-0 mtcAgent inv mtcInvApi.cpp ( 334) mtcInvApi_update_task : Info : controller-1 Task: Reinstall Failed ; netboot request (seq:310)
2020-11-03T17:43:06.440 [96966.00621] controller-0 mtcAgent --- threadUtil.cpp ( 763) thread_handler : Warn : controller-1 bmc thread kill req (rc:0)
2020-11-03T17:43:06.440 fmAPI.cpp(490): Enqueue raise alarm request: UUID (62769211-0f8b-4c17-8812-0e5b575c4009) alarm id (200.022) instant id (host=controller-1.status=reinstall-failed)
2020-11-03T17:43:06.442 fmAlarmUtils.cpp(624): Sending FM raise alarm request: alarm_id (200.022), entity_id (host=controller-1.status=reinstall-failed)
2020-11-03T17:43:06.495 fmAlarmUtils.cpp(658): FM Response for raise alarm: (0), alarm_id (200.022), entity_id (host=controller-1.status=reinstall-failed)
2020-11-03T17:43:08.739 [96966.00622] controller-0 mtcAgent --- threadUtil.cpp ( 805) pthread_signal_handler : Info : controller-1 bmc thread SIGKILL ; exiting ...
2020-11-03T17:43:36.445 [96966.00623] controller-0 mtcAgent |-| mtcNodeHdlrs.cpp (4600) reinstall_handler : Info : controller-1 Reinstall complete ; operation failure
2020-11-03T17:43:36.445 [96966.00624] controller-0 mtcAgent inv mtcInvApi.cpp ( 437) mtcInvApi_force_task : Info : controller-1 task clear (seq:311) (was:Reinstall Fa
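
For anyone going through the full tarball, a rough way to pull only the Warn/Error mtcAgent entries out of a log like the one above is sketched below. This is not a StarlingX tool: the regex is inferred purely from the lines pasted in this comment, and the group names (`ident`, `tag`, and so on) and the `problems` helper are my own hypothetical labels. The fmAPI/fmAlarmUtils lines use a different layout and are simply skipped.

```python
import re

# Layout inferred from the pasted mtcAgent lines only (an assumption, not an
# official format): timestamp [ident] host daemon tag file (line) func : level : msg
MTC_LINE = re.compile(
    r"^(?P<ts>\S+)\s+\[(?P<ident>[\d.]+)\]\s+(?P<host>\S+)\s+(?P<daemon>\S+)\s+"
    r"(?P<tag>\S+)\s+(?P<file>\S+)\s+\(\s*(?P<lineno>\d+)\)\s+"
    r"(?P<func>\S+)\s*:\s*(?P<level>Info|Warn|Error)\s*:\s*(?P<msg>.*)$"
)


def problems(lines):
    """Return (timestamp, level, function, message) for Warn/Error lines."""
    found = []
    for line in lines:
        match = MTC_LINE.match(line.strip())
        if match and match.group("level") in ("Warn", "Error"):
            found.append(
                (match.group("ts"), match.group("level"),
                 match.group("func"), match.group("msg"))
            )
    return found


# Lines copied from the excerpt above.
SAMPLE = [
    "2020-11-03T17:43:01.429 [96966.00616] controller-0 mtcAgent inv mtcInvApi.cpp (1119) mtcInvApi_update_state : Info : controller-1 power-off (seq:309)",
    "2020-11-03T17:43:01.438 fmAlarmUtils.cpp(624): Sending FM raise alarm request: alarm_id (200.021), entity_id (host=controller-1.action=power-off)",
    "2020-11-03T17:43:06.434 [96966.00617] controller-0 mtcAgent --- threadUtil.cpp ( 344) thread_launch : Warn : controller-1 bmc not in IDLE stage (in Monitor stage)",
    "2020-11-03T17:43:06.434 [96966.00618] controller-0 mtcAgent --- mtcBmcUtil.cpp ( 144) bmc_command_send :Error : controller-1 failed to launch power control thread (rc:72)",
    "2020-11-03T17:43:06.434 [96966.00619] controller-0 mtcAgent hdl mtcNodeHdlrs.cpp (4340) reinstall_handler :Error : controller-1 Reinstall netboot request failed (rc:72)",
]

for row in problems(SAMPLE):
    print(" | ".join(row))
```

Run against the excerpt, this leaves the thread_launch warning followed by the two rc:72 errors, which is the sequence right after "Power-Off Completed" where the reinstall netboot request fails.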