DC After both system controller nodes power off/on ssh connection lost for 50 mins
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | High | Bart Wensley |
Bug Description
Brief Description
-----------------
In Distributed Cloud, after powering both system controller nodes off and on, the SSH connection was lost for 50 minutes.
Severity
--------
Major
Steps to Reproduce
------------------
In Distributed Cloud, power both system controller nodes off and then on,
then check the SSH connection.
Expected Behavior
------------------
The SSH connection should resume shortly after the nodes boot up, e.g. within 5 minutes.
Actual Behavior
----------------
SSH reconnected only after 50 minutes.
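The check in the reproduction steps can be scripted so that the recovery time is measured rather than estimated. The following is a minimal sketch, not the sanity framework's own code: `wait_for_ssh` is a hypothetical helper, the 300-second default comes from the "within 5 mins" expectation above, and the 10-second retry interval mirrors the "Retry in 10 seconds" line in the logs. It only tests that the TCP port accepts connections, which is a weaker condition than a successful login.

```python
import socket
import time


def wait_for_ssh(host, port=22, timeout=300.0, interval=10.0,
                 connect=socket.create_connection,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll host:port until a TCP connection succeeds.

    Returns the number of seconds that elapsed before the first
    successful connection, or None if `timeout` seconds pass first.
    `connect`, `clock` and `sleep` are injectable so the loop can be
    exercised without a real network (an assumption of this sketch).
    """
    start = clock()
    while True:
        try:
            # Each attempt is bounded by `interval` seconds.
            conn = connect((host, port), timeout=interval)
        except OSError:
            pass  # host still down or sshd not listening yet
        else:
            conn.close()
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)
```

Calling `wait_for_ssh(address)` right after the power-on, with the system controller's OAM floating address, would return the recovery time in seconds, or `None` if the five-minute bound is exceeded.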
Reproducibility
---------------
Unknown. This is the first time it has been seen in sanity; will monitor.
System Configuration
-------
DC system
Lab-name: DC-3
Branch/Pull Time/Commit
-------
2020-03-20_00-10-00
Last Pass
---------
unknown
Timestamp/Logs
--------------
[2020-03-21 09:06:21,007] 314 DEBUG MainThread ssh.send :: Send '/folk/
[2020-03-21 09:06:21,078] 479 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2020-03-21 09:06:21,078] 314 DEBUG MainThread ssh.send :: Send '/folk/
[2020-03-21 09:06:22,837] 436 DEBUG MainThread ssh.expect :: Output:
1
[2020-03-21 09:07:24,367] 314 DEBUG MainThread ssh.send :: Send '/folk/
[2020-03-21 09:07:26,625] 436 DEBUG MainThread ssh.expect :: Output:
1
[2020-03-21 09:07:27,985] 314 DEBUG MainThread ssh.send :: Send '/folk/
[2020-03-21 09:07:30,693] 436 DEBUG MainThread ssh.expect :: Output:
1
[2020-03-21 09:10:04,339] 314 DEBUG MainThread ssh.send :: Send '/usr/bin/ssh -o RSAAuthenticati
[2020-03-21 09:11:04,464] 407 WARNING MainThread ssh.expect :: No match found for ['.*controller\
expect timeout.
[2020-03-21 09:11:04,464] 1294 INFO MainThread ssh.connect :: Unable to ssh to 2620:10a:
[2020-03-21 09:56:18,396] 314 DEBUG MainThread ssh.send :: Send '/usr/bin/ssh -o RSAAuthenticati
[2020-03-21 09:56:21,568] 436 DEBUG MainThread ssh.expect :: Output:
command-line line 0: Unsupported option "rsaauthentication"
ssh: connect to host 2620:10a:
svc-cgcsauto@
[2020-03-21 09:56:21,569] 1294 INFO MainThread ssh.connect :: Unable to ssh to 2620:10a:
[2020-03-21 09:56:21,569] 1310 INFO MainThread ssh.connect :: Retry in 10 seconds
[2020-03-21 09:56:31,579] 314 DEBUG MainThread ssh.send :: Send '/usr/bin/ssh -o RSAAuthenticati
[2020-03-21 09:56:31,761] 436 DEBUG MainThread ssh.expect :: Output:
command-line line 0: Unsupported option "rsaauthentication"
Warning: Permanently added '2620:10a:
Release 20.01
-------
W A R N I N G *** W A R N I N G *** W A R N I N G *** W A R N I N G ***
-------
THIS IS A PRIVATE COMPUTER SYSTEM.
This computer system including all related equipment, network devices
(specifically including Internet access), are provided only for authorized use.
All computer systems may be monitored for all lawful purposes, including to
ensure that their use is authorized, for management of the system, to
facilitate protection against unauthorized access, and to verify security
procedures, survivability and operational security. Monitoring includes active
attacks by authorized personnel and their entities to test or verify the
security of the system. During monitoring, information may be examined,
recorded, copied and used for authorized purposes. All information including
personal information, placed on or sent over this system may be monitored. Uses
of this system, authorized or unauthorized, constitutes consent to monitoring
of this system. Unauthorized use may subject you to criminal prosecution.
Evidence of any such unauthorized use collected during monitoring may be used
for administrative, criminal or other adverse action. Use of this system
constitutes consent to monitoring for these purposes.
sysadmin@
[2020-03-21 09:56:31,761] 314 DEBUG MainThread ssh.send :: Send 'Li69nux*'
[2020-03-21 09:56:32,034] 436 DEBUG MainThread ssh.expect :: Output:
Last login: Sat Mar 21 08:33:54 2020 from fd01:11::5
/etc/motd.
[H[2J
WARNING: Unauthorized access to this system is forbidden and will be
prosecuted by law. By accessing this system, you agree that your
actions may be monitored if unauthorized usage is suspected.
[?1034hcontrol
[2020-03-21 09:56:32,034] 1288 INFO MainThread ssh.connect :: Successfully connected
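To locate the outage window in a log like the one above, the bracketed timestamps can be parsed and the largest gap between consecutive entries reported. This is a rough sketch that assumes every relevant line starts with a `[YYYY-MM-DD HH:MM:SS,mmm]` prefix as in this excerpt; `parse_ts` and `largest_gap` are illustrative names, not part of any test framework. In the excerpt above, the longest silence runs from 09:11:04 to 09:56:18, roughly 45 minutes.

```python
import re
from datetime import datetime

# Matches the "[2020-03-21 09:11:04,464]" prefix used in the log above.
TS_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})\]")


def parse_ts(line):
    """Return the datetime at the start of a log line, or None."""
    m = TS_RE.match(line)
    if m is None:
        return None  # continuation line (banner text, command output, ...)
    base = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    return base.replace(microsecond=int(m.group(2)) * 1000)


def largest_gap(lines):
    """Return (start, end, delta) for the biggest gap between
    consecutive timestamped lines, or None if fewer than two exist."""
    stamps = [t for t in map(parse_ts, lines) if t is not None]
    best = None
    for prev, cur in zip(stamps, stamps[1:]):
        delta = cur - prev
        if best is None or delta > best[2]:
            best = (prev, cur, delta)
    return best
```

Feeding the function the lines of the saved log file (for example `largest_gap(open(path))`, with `path` chosen by the reader) yields the two timestamps that bracket the period in which no SSH attempt was logged.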
Test Activity
-------------
Sanity
tags: | added: stx.retestneeded |
After DOR, the following alarms appeared:
| f6a850a0-2dfd-411b-a2b3-1d9a68befffa | 250.001 | controller-1 Configuration is out-of-date. | host=controller-1 | major | 2020-03-21T09:57:46.659447 |
| b0fc-4bd8-9923-e0b16714a232 | 200.006 | controller-0 critical 'dockerd' process has failed and could not be auto-recovered gracefully. Auto-recovery progression by host reboot is required and in progress. Manual Lock and Unlock may be required if auto-recovery is unsuccessful. | host=controller-0.process=dockerd | critical | 2020-03-21T09:20:19.229356 |
| ee80d1af-