Comment 9 for bug 1918703

Eric MacDonald (rocksolidmtce) wrote :

The user is modifying the maintenance service parameter 'mnfa_timeout' from the active controller, controller-1, which is normal and is the only place it can be modified from. This change triggers the mtc in-service manifest to run and apply the change to /etc/mtc.ini.

When a change like that is made, the mtcAgent on the active controller needs its configuration reloaded so that it learns of the change. This is fine on the active controller. However, the in-service manifest is also run on the standby controller so its /etc/mtc.ini is also updated.

The mtcAgent does not run on the standby controller, though, so the manifest does not need to, and should not try to, SIGHUP the mtcAgent there. That is why the manifest is reporting an error on the standby controller.
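
To illustrate the behavior the manifest arguably should have, here is a minimal Python sketch of "signal only if the process is running". It is not the actual manifest code. The helper names find_pid and reload_if_running are hypothetical, and the sketch assumes the pidof utility is available on the host.

```python
import os
import signal
import subprocess
from typing import Callable, Optional


def find_pid(process_name: str) -> Optional[int]:
    """Return the PID of a running process with this name, or None.

    Assumption: the 'pidof' utility exists on the host. If it is missing,
    or finds no matching process, this returns None.
    """
    try:
        result = subprocess.run(
            ["pidof", process_name],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return None
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # pidof may print several PIDs; take the first one.
    return int(result.stdout.split()[0])


def reload_if_running(
    process_name: str,
    pid_lookup: Callable[[str], Optional[int]] = find_pid,
    send_signal: Callable[[int, int], None] = os.kill,
) -> bool:
    """Send SIGHUP to the process only if it is running.

    Returns True if a signal was sent, False if the process was not found.
    A missing process is treated as a normal case, not as an error, which
    is what the standby controller needs.
    """
    pid = pid_lookup(process_name)
    if pid is None:
        return False
    send_signal(pid, signal.SIGHUP)
    return True
```

On the standby controller, reload_if_running("mtcAgent") would return False and exit cleanly instead of reporting a failure. The lookup and signal functions are passed in as parameters so the logic can be exercised without a real mtcAgent process.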

Here are the relevant logs:

2021-03-11T03:50:25.561 [2178865.00460] controller-1 mtcAgent sig daemon_signal.cpp ( 147) daemon_signal_hdlr : Info : Received SIGHUP ; Reloading Config

[cut out other config reload logs]

# Here are the config reload logs that produce a customer log

2021-03-11T03:50:25.578 fmAlarmUtils.cpp(624): Sending FM raise alarm request: alarm_id (200.021), entity_id (host=controller-1.config=mnfa_timeout)
2021-03-11T03:50:25.625 fmAlarmUtils.cpp(658): FM Response for raise alarm: (0), alarm_id (200.021), entity_id (host=controller-1.config=mnfa_timeout)

| 2021-03-11T03:50:25.619752| log | 200.021 | controller-1 platform maintenance service parameter 'mnfa_timeout' changed from 120 to 0 | host=controller-1.config=mnfa_timeout | not-applicable | :50:25.619752 |

Next Steps:
1. Determine if this manifest has always behaved this way.
2. Determine if this potentially silent error has any bearing on this issue.