Validate fm manager socket fd before sending a message
When fm manager is restarted, the fm api client has no mechanism
to detect it. As a result, when a subcloud delete sends a clear
alarm request after fm manager has restarted, the fm api client
hits a broken pipe, the clear alarm request is never received,
and the alarm stays raised.
This fix checks the socket fd state before send/receive in the
fm api client. If a broken pipe is detected, the client tries to
reconnect to fm manager.
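The check-then-reconnect behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the code from fmAPI.cpp or fmSocket.cpp: the names fd_is_open, send_all, and send_with_reconnect are hypothetical, the reconnect step is passed in as a callback, and MSG_NOSIGNAL is assumed to be available (as it is on Linux) so that a broken pipe surfaces as EPIPE instead of raising SIGPIPE.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <functional>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: true if fd still refers to an open descriptor.
bool fd_is_open(int fd) {
    return fd >= 0 && fcntl(fd, F_GETFD) != -1;
}

// Hypothetical helper: send the whole buffer. If the peer is gone
// (EPIPE or ECONNRESET), close the descriptor and mark it invalid (-1).
bool send_all(int &fd, const char *buf, size_t len) {
    assert(buf != nullptr || len == 0);
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, retry
            if (errno == EPIPE || errno == ECONNRESET) {
                close(fd);
                fd = -1;  // the next caller sees an invalid descriptor
            }
            return false;
        }
        sent += static_cast<size_t>(n);
    }
    return true;
}

// Hypothetical wrapper: validate the descriptor before sending and
// retry once on a fresh connection if a broken pipe is detected.
// reconnect() is assumed to return a connected fd, or -1 on failure.
bool send_with_reconnect(int &fd, const char *buf, size_t len,
                         const std::function<int()> &reconnect) {
    if (!fd_is_open(fd)) {
        fd = reconnect();
        if (fd < 0) return false;
    }
    if (send_all(fd, buf, len)) return true;
    if (fd_is_open(fd)) return false;  // failed for another reason
    fd = reconnect();
    return fd >= 0 && send_all(fd, buf, len);
}
```

Retrying only once keeps the caller from looping forever if fm manager is still down; a real client would also need the same validation on the receive path.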
Closes-bug: 2039684
Test Plan:
PASS: Restart fm manager and confirm that the broken pipe detection
and reconnect messages appear in /var/log. For example,
-----
sm: err fmSocket.cpp(270): A broken pipe error occurred
sm: warning fmAPI.cpp(116): Invalid file descriptor. Attempting to reconnect...
sm: info fmAPI.cpp(149): Connected to FM Manager.
-----
PASS: Delete offline subcloud and confirm the alarm is
removed.
Reviewed: https://review.opendev.org/c/starlingx/fault/+/898144
Committed: https://opendev.org/starlingx/fault/commit/8bd6e5b92d50e15235f2777ea715aaea75c2886e
Submitter: "Zuul (22348)"
Branch: master
commit 8bd6e5b92d50e15235f2777ea715aaea75c2886e
Author: Takamasa Takenaka <email address hidden>
Date: Thu Oct 12 19:21:44 2023 -0300
Change-Id: Ibc0f4d96b5c0a385d8fedbc1acd23898f1cbea46
Signed-off-by: Takamasa Takenaka <email address hidden>