100.114 "NTP configuration does not contain any valid or reachable NTP servers." major alarm not issued when no NTP sources
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Low | Takamasa Takenaka |
Bug Description
Brief Description
-----------------
In a standard system, the 100.114 "NTP configuration does not contain any valid or reachable NTP servers." major alarm is not issued when there are no NTP sources.
Severity
--------
Major
Steps to Reproduce
------------------
1. On the standby controller (controller-1), block the CIDR for one of the NTP servers.
2. A minor reachability alarm for that NTP server is generated (as expected).
3. On the standby controller, block the CIDRs for the two other NTP servers (no external sources remain).
4. Minor reachability alarms for those NTP servers are generated (as expected).
5. A minor alarm for no external sources (syncing with peer controller) is issued (as expected).
6. Swact to controller-1.
7. Lock and power off controller-0.
8. ntpq shows no selected NTP sources, but the major NTP alarm is not issued (==> Issue-1).
9. ntpq shows no reachability to the mate source; still no major NTP alarm is issued.
10. The system sat in this state overnight and still neither cleared the "syncing with peer" alarm (==> Issue-2) nor issued the major NTP alarm for no reachable servers.
Expected Behavior
------------------
When there is no reachable NTP server:
1. Major alarm "NTP configuration does not contain any valid or reachable NTP servers." should be raised.
2. Minor alarm "NTP cannot reach external time source; syncing with peer controller only" should be suppressed.
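The expected behavior can be sketched as a small decision function. This is only an illustration of the rule stated above, not the StarlingX implementation: the function name `expected_ntp_alarms` and its two parameters are hypothetical, and it models only the two system-wide alarms (the per-server "NTP address ... is not a valid or a reachable NTP server." alarms are left out).

```python
from typing import List, Tuple

# Reason texts quoted from this report (both are raised under alarm ID 100.114).
NO_SERVERS = "NTP configuration does not contain any valid or reachable NTP servers."
PEER_ONLY = "NTP cannot reach external time source; syncing with peer controller only"


def expected_ntp_alarms(reachable_external: int, peer_reachable: bool) -> List[Tuple[str, str]]:
    """Return the (severity, reason_text) pairs that should be active.

    Hypothetical helper: the inputs are assumed to come from whatever
    mechanism audits NTP reachability on the controller.
    """
    if reachable_external > 0:
        # At least one external source works: neither system-wide alarm applies.
        return []
    if peer_reachable:
        # No external source, but the peer controller still answers.
        return [("minor", PEER_ONLY)]
    # Nothing reachable at all: major alarm only, peer-only alarm suppressed.
    return [("major", NO_SERVERS)]


print(expected_ntp_alarms(0, True))
print(expected_ntp_alarms(0, False))
```

In the scenario reported here, the last branch is the one that never takes effect: with controller-0 powered off and all external servers blocked, the minor peer-only alarm stays instead of being replaced by the major alarm.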
Actual Behavior
----------------
When there is no reachable NTP server:
1. Major alarm "NTP configuration does not contain any valid or reachable NTP servers." is not raised.
2. Minor alarm "NTP cannot reach external time source; syncing with peer controller only" stays.
Reproducibility
---------------
Reproducible
System Configuration
-------
Two node system
Branch/Pull Time/Commit
-------
stx4 as of 2020-06-27 18:37:38 -0400
Last Pass
---------
None known
Timestamp/Logs
--------------
There are two issues.
Issue-1:
The following alarm did not clear for 10+ hours even though no peer controller was available.
2021-03-
{ "event_log_id" : "100.114", "reason_text" : "NTP cannot reach external time source; syncing with peer controller only", "entity_
Issue-2:
Even though all NTP sources were unavailable as of 2021-03-
2021-03-
{ "event_log_id" : "100.114", "reason_text" : "NTP address 2607:f160:
2021-03-
{ "event_log_id" : "100.114", "reason_text" : "NTP address 2607:f160:
2021-03-
{ "event_log_id" : "100.114", "reason_text" : "NTP address 2607:f160:
Timestamp when failure occurred:
2021-03-
Test Activity
-------------
Evaluation
Workaround
----------
N/A
Changed in starlingx: | |
importance: | Undecided → Low |
tags: | added: stx.config |
This bug is reproducible.
1. Create the state in which the peer is selected and the external servers are unreachable:
controller-1:~$ ntpq -np
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.168.204.2   206.108.0.132    2 u    4   64  377    0.090   -9.015   1.158
 172.217.13.142  .INIT.          16 u    -   64    0    0.000    0.000   0.000
 74.6.143.26     .INIT.          16 u    -   64    0    0.000    0.000   0.000
2. Confirm we have alarm "NTP cannot reach external time source; syncing with peer controller only" and two "NTP address [ip] is not a valid or a reachable NTP server."
[sysadmin@controller-0 ~(keystone_admin)]$ fm alarm-list
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
| 100.114 | NTP address 74.6.143.26 is not a valid or a reachable NTP server. | host=controller-1.ntp=74.6.143.26 | minor | 2021-05-06T13:30:06.911250 |
| 100.114 | NTP address 172.217.13.142 is not a valid or a reachable NTP server. | host=controller-1.ntp=172.217.13.142 | minor | 2021-05-06T13:30:06.908394 |
| 100.114 | NTP cannot reach external time source; syncing with peer controller only | host=controller-1.ntp | minor | 2021-05-06T13:25:06.967746 |
3. Swact controller-0, lock controller-0, power off controller-0, and wait until no NTP server is selected:
[sysadmin@controller-1 ~(keystone_admin)]$ ntpq -np
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 192.168.204.2   206.108.0.132    2 u  758   64    0    0.001   -8.771   0.000
 172.217.13.142  .INIT.          16 u    - 1024    0    0.000    0.000   0.000
 74.6.143.26     .INIT.          16 u    - 1024    0    0.000    0.000   0.000
[sysadmin@controller-1 ~(keystone_admin)]$ fm alarm-list | grep 100.114
| 100.114 | NTP cannot reach external time source; syncing with peer controller only | host=controller-1.ntp | minor | 2021-05-06T13:55:06... |
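The condition observed in step 3 (no selected source and a reach value of 0 for every server) can be checked mechanically from `ntpq -np` output. The following is a minimal sketch under stated assumptions, not the code StarlingX uses: `Peer`, `parse_ntpq`, and `has_selected_source` are hypothetical names, and the two sample outputs are rebuilt from the values shown in steps 1 and 3 of this reproduction. It relies on standard ntpq behavior: the first column of each peer row is a tally code ('*' for the selected system peer, 'o' for a PPS peer), and the reach column is printed in octal.

```python
from typing import List, NamedTuple

# Tally codes ntpq may print in the first column of a peer row.
TALLY_CODES = " x.-+#*o"


class Peer(NamedTuple):
    tally: str   # '*' marks the currently selected system peer
    remote: str
    refid: str
    reach: int   # reachability shift register (printed in octal by ntpq)


def parse_ntpq(output: str) -> List[Peer]:
    """Parse the peer rows of `ntpq -np` output, skipping header lines."""
    peers = []
    for line in output.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("remote", "=")):
            continue
        tally = " "
        if line[0] in TALLY_CODES:
            tally, line = line[0], line[1:]
        fields = line.split()
        if len(fields) < 10:
            continue  # not a complete peer row (e.g. a shell prompt line)
        peers.append(Peer(tally, fields[0], fields[1], int(fields[6], 8)))
    return peers


def has_selected_source(peers: List[Peer]) -> bool:
    """True if any peer is selected for synchronization and still reachable."""
    return any(p.tally in "*o" and p.reach != 0 for p in peers)


HEADER = [
    "     remote           refid      st t when poll reach   delay   offset  jitter",
    "==============================================================================",
]
# Step 1: the peer controller is selected, external servers are unreachable.
STEP1 = "\n".join(HEADER + [
    "*192.168.204.2   206.108.0.132    2 u    4   64  377    0.090   -9.015   1.158",
    " 172.217.13.142  .INIT.          16 u    -   64    0    0.000    0.000   0.000",
    " 74.6.143.26     .INIT.          16 u    -   64    0    0.000    0.000   0.000",
])
# Step 3: controller-0 is powered off, so nothing is selected or reachable.
STEP3 = "\n".join(HEADER + [
    " 192.168.204.2   206.108.0.132    2 u  758   64    0    0.001   -8.771   0.000",
    " 172.217.13.142  .INIT.          16 u    - 1024    0    0.000    0.000   0.000",
    " 74.6.143.26     .INIT.          16 u    - 1024    0    0.000    0.000   0.000",
])

print(has_selected_source(parse_ntpq(STEP1)))
print(has_selected_source(parse_ntpq(STEP3)))
```

With the step 3 output, no peer passes the check, which is the state in which the major "NTP configuration does not contain any valid or reachable NTP servers." alarm would be expected instead of the minor peer-only alarm that `fm alarm-list` still shows.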