Activity log for bug #1884704

Date Who What changed Old value New value Message
2020-06-23 03:45:49 Yang Liu bug added bug
2020-06-23 03:53:44 Yang Liu description
Brief Description
-----------------
A DC system was having stability issue when bootstraping subclouds (bootstrap of subcloud fails premature and stuck at bootstrapping even after it's done, etc). It was then noticed mgr-restful-plugin has been restarting every few minutes. It also causes a number of other services to restart. Even though those services recover fast, it causes instability of the system.

Following services are likely affected according to Gerry Kopec.
| 2020-06-22T20:16:57.174 | 12980 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T20:16:57.552 | 12981 | service-scn | ceph-manager | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:57.553 | 12982 | service-scn | sysinv-conductor | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:57.554 | 12983 | service-scn | sysinv-inv | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.056 | 12984 | service-scn | dcorch-sysinv-api-proxy | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.057 | 12985 | service-scn | dcmanager-manager | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.058 | 12986 | service-scn | dnsmasq | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.058 | 12987 | service-scn | mtc-agent | enabled-active | disabling | disable state requested

Severity
--------
Major

Steps to Reproduce
------------------
Check sm-customer.log and observe that mgr-restful-plugin is going from enabled-active to disabling due to audit failed every few minutes
Not sure about steps to reproduce. Automated regression was running on when issue started to happen, but the events feels unrelated. - See details in timestamp and logs.
Expected Behavior ------------------ system is stable Actual Behavior ---------------- mgr-restful-plugin restarts every few minutes along with a number of other services Reproducibility --------------- Intermittent System Configuration -------------------- DC system controller Lab-name: DC-4 Branch/Pull Time/Commit ----------------------- 2020-06-20_20-00-00 Last Pass --------- 2020-06-18_20-00-00 - but this likely to be intermittent, so not sure about exact last pass. Timestamp/Logs -------------- Here's what was done before the first restart: # Modify timezone on system controller and subcloud7 [2020-06-22 14:04:15,729] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:81::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne modify --timezone="Europe/Berlin"' [2020-06-22 14:04:51,483] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 modify --timezone="Canada/Central"' # Time zone reverted [2020-06-22 14:24:02,950] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:81::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne modify --timezone="UTC"' [2020-06-22 14:24:38,673] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 modify --timezone="UTC"' # lock/unlock subcloud7 host 
[2020-06-22 14:26:00,195] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 host-lock controller-0' [2020-06-22 14:27:20,656] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 host-unlock controller-0' # subcloud7 was just recovered at 14:35: [2020-06-22 14:35:16,714] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 show' Authorization failed: Unable to establish connection to http://[fd01:88::2]:5000/v3/auth/tokens controller-0:~$ [2020-06-22 14:35:27,685] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 show' +------------------------+--------------------------------------+ | Property | Value | +------------------------+--------------------------------------+ | contact | None | | created_at | 2020-06-21T06:59:44.523373+00:00 | | description | None | | distributed_cloud_role | subcloud | | https_enabled | False | | location | None | | name | dc-subcloud7 | | region_name | subcloud7 | | sdn_enabled | False | | security_feature | spectre_meltdown_v1 | | service_project_name | services | | shared_services | ['identity', ] | | software_version | 20.06 | | 
system_mode | simplex | | system_type | All-in-one | | timezone | UTC | | updated_at | 2020-06-22T14:24:40.061605+00:00 | | uuid | aa2c9682-a3c0-4a41-9b4a-c303914c6a89 | | vswitch_type | none | +------------------------+--------------------------------------+ controller-0:~$ # There were no explicit operations on system controller at 14:35, where the issue started. | 2020-06-22T14:35:49.783 | 6069 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T14:37:05.359 | 6140 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T14:47:01.390 | 6222 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T14:56:56.845 | 6295 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T15:02:12.548 | 6367 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T15:04:48.142 | 6439 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T15:07:24.001 | 6510 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T15:08:39.621 | 6582 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T15:11:56.083 | 6658 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed Test Activity ------------- Regression Testing, Developer Testing Brief Description ----------------- A DC system was having stability issue when bootstraping subclouds (bootstrap of subcloud fails premature and stuck at bootstrapping even after it's done, etc). It was then noticed mgr-restful-plugin has been restarting every few minutes. It also causes a number of other services to restart. Even though those services recover fast, it causes instability of the system. Following services are likely affected according to Gerry Kopec. 
| 2020-06-22T20:16:57.174 | 12980 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T20:16:57.552 | 12981 | service-scn | ceph-manager | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:57.553 | 12982 | service-scn | sysinv-conductor | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:57.554 | 12983 | service-scn | sysinv-inv | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.056 | 12984 | service-scn | dcorch-sysinv-api-proxy | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.057 | 12985 | service-scn | dcmanager-manager | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.058 | 12986 | service-scn | dnsmasq | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.058 | 12987 | service-scn | mtc-agent | enabled-active | disabling | disable state requested

Severity
--------
Major

Steps to Reproduce
------------------
Check sm-customer.log and observe that mgr-restful-plugin is going from enabled-active to disabling due to audit failed every few minutes
Not sure about steps to reproduce. Automated regression was running on when issue started to happen, but the events feels unrelated. - See details in timestamp and logs.

Expected Behavior
------------------
system is stable

Actual Behavior
----------------
mgr-restful-plugin restarts every few minutes along with a number of other services

Reproducibility
---------------
Intermittent

System Configuration
--------------------
DC system controller
Lab-name: DC-4

Branch/Pull Time/Commit
-----------------------
2020-06-20_20-00-00

Last Pass
---------
2020-06-18_20-00-00 - but this likely to be intermittent, so not sure about exact last pass.
Timestamp/Logs -------------- https://files.starlingx.kube.cengn.ca/launchpad/1884704 Here's what was done before the first restart: # Modify timezone on system controller and subcloud7 [2020-06-22 14:04:15,729] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:81::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne modify --timezone="Europe/Berlin"' [2020-06-22 14:04:51,483] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 modify --timezone="Canada/Central"' # Time zone reverted [2020-06-22 14:24:02,950] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:81::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne modify --timezone="UTC"' [2020-06-22 14:24:38,673] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 modify --timezone="UTC"' # lock/unlock subcloud7 host [2020-06-22 14:26:00,195] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 host-lock controller-0' [2020-06-22 14:27:20,656] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 
'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 host-unlock controller-0' # subcloud7 was just recovered at 14:35: [2020-06-22 14:35:16,714] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 show' Authorization failed: Unable to establish connection to http://[fd01:88::2]:5000/v3/auth/tokens controller-0:~$ [2020-06-22 14:35:27,685] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 show' +------------------------+--------------------------------------+ | Property | Value | +------------------------+--------------------------------------+ | contact | None | | created_at | 2020-06-21T06:59:44.523373+00:00 | | description | None | | distributed_cloud_role | subcloud | | https_enabled | False | | location | None | | name | dc-subcloud7 | | region_name | subcloud7 | | sdn_enabled | False | | security_feature | spectre_meltdown_v1 | | service_project_name | services | | shared_services | ['identity', ] | | software_version | 20.06 | | system_mode | simplex | | system_type | All-in-one | | timezone | UTC | | updated_at | 2020-06-22T14:24:40.061605+00:00 | | uuid | aa2c9682-a3c0-4a41-9b4a-c303914c6a89 | | vswitch_type | none | +------------------------+--------------------------------------+ controller-0:~$ # There were no explicit operations on system controller at 14:35, where the issue started. 
| 2020-06-22T14:35:49.783 | 6069 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T14:37:05.359 | 6140 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T14:47:01.390 | 6222 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T14:56:56.845 | 6295 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T15:02:12.548 | 6367 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T15:04:48.142 | 6439 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T15:07:24.001 | 6510 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T15:08:39.621 | 6582 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T15:11:56.083 | 6658 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed

Test Activity
-------------
Regression Testing, Developer Testing
2020-06-23 11:43:52 Bart Wensley tags stx.storage
2020-06-23 18:37:45 Ghada Khalil bug added subscriber Daniel Badea
2020-06-23 18:38:00 Ghada Khalil starlingx: status New Triaged
2020-06-23 18:38:13 Ghada Khalil starlingx: assignee Stefan Dinescu (stefandinescu)
2020-06-23 18:38:20 Ghada Khalil starlingx: importance Undecided Medium
2020-06-23 18:38:30 Ghada Khalil tags stx.storage stx.4.0 stx.storage
2020-06-24 15:55:25 Frank Miller description
Brief Description
-----------------
A DC system was having stability issue when bootstraping subclouds (bootstrap of subcloud fails premature and stuck at bootstrapping even after it's done, etc). It was then noticed mgr-restful-plugin has been restarting every few minutes. It also causes a number of other services to restart. Even though those services recover fast, it causes instability of the system.

Following services are likely affected according to Gerry Kopec.
| 2020-06-22T20:16:57.174 | 12980 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T20:16:57.552 | 12981 | service-scn | ceph-manager | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:57.553 | 12982 | service-scn | sysinv-conductor | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:57.554 | 12983 | service-scn | sysinv-inv | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.056 | 12984 | service-scn | dcorch-sysinv-api-proxy | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.057 | 12985 | service-scn | dcmanager-manager | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.058 | 12986 | service-scn | dnsmasq | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.058 | 12987 | service-scn | mtc-agent | enabled-active | disabling | disable state requested

Severity
--------
Major

Steps to Reproduce
------------------
Check sm-customer.log and observe that mgr-restful-plugin is going from enabled-active to disabling due to audit failed every few minutes
Not sure about steps to reproduce. Automated regression was running on when issue started to happen, but the events feels unrelated. - See details in timestamp and logs.
Expected Behavior ------------------ system is stable Actual Behavior ----------------  mgr-restful-plugin restarts every few minutes along with a number of other services Reproducibility --------------- Intermittent System Configuration -------------------- DC system controller Lab-name: DC-4 Branch/Pull Time/Commit ----------------------- 2020-06-20_20-00-00 Last Pass --------- 2020-06-18_20-00-00 - but this likely to be intermittent, so not sure about exact last pass. Timestamp/Logs -------------- https://files.starlingx.kube.cengn.ca/launchpad/1884704 Here's what was done before the first restart: # Modify timezone on system controller and subcloud7 [2020-06-22 14:04:15,729] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:81::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne modify --timezone="Europe/Berlin"' [2020-06-22 14:04:51,483] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 modify --timezone="Canada/Central"' # Time zone reverted [2020-06-22 14:24:02,950] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:81::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne modify --timezone="UTC"' [2020-06-22 14:24:38,673] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name 
subcloud7 modify --timezone="UTC"' # lock/unlock subcloud7 host [2020-06-22 14:26:00,195] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 host-lock controller-0' [2020-06-22 14:27:20,656] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 host-unlock controller-0' # subcloud7 was just recovered at 14:35: [2020-06-22 14:35:16,714] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 show' Authorization failed: Unable to establish connection to http://[fd01:88::2]:5000/v3/auth/tokens controller-0:~$ [2020-06-22 14:35:27,685] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 show' +------------------------+--------------------------------------+ | Property | Value | +------------------------+--------------------------------------+ | contact | None | | created_at | 2020-06-21T06:59:44.523373+00:00 | | description | None | | distributed_cloud_role | subcloud | | https_enabled | False | | location | None | | name | dc-subcloud7 | | region_name | subcloud7 | | sdn_enabled | False | | security_feature | spectre_meltdown_v1 | | service_project_name | services | | 
shared_services | ['identity', ] | | software_version | 20.06 | | system_mode | simplex | | system_type | All-in-one | | timezone | UTC | | updated_at | 2020-06-22T14:24:40.061605+00:00 | | uuid | aa2c9682-a3c0-4a41-9b4a-c303914c6a89 | | vswitch_type | none | +------------------------+--------------------------------------+ controller-0:~$ # There were no explicit operations on system controller at 14:35, where the issue started. | 2020-06-22T14:35:49.783 | 6069 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T14:37:05.359 | 6140 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T14:47:01.390 | 6222 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T14:56:56.845 | 6295 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T15:02:12.548 | 6367 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T15:04:48.142 | 6439 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T15:07:24.001 | 6510 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T15:08:39.621 | 6582 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed | 2020-06-22T15:11:56.083 | 6658 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed Test Activity ------------- Regression Testing, Developer Testing Brief Description ----------------- A DC system was having stability issue when bootstraping subclouds (bootstrap of subcloud fails premature and stuck at bootstrapping even after it's done, etc). It was then noticed mgr-restful-plugin has been restarting every few minutes. It also causes a number of other services to restart. Even though those services recover fast, it causes instability of the system. Following services are likely affected according to Gerry Kopec. 
| 2020-06-22T20:16:57.174 | 12980 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T20:16:57.552 | 12981 | service-scn | ceph-manager | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:57.553 | 12982 | service-scn | sysinv-conductor | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:57.554 | 12983 | service-scn | sysinv-inv | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.056 | 12984 | service-scn | dcorch-sysinv-api-proxy | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.057 | 12985 | service-scn | dcmanager-manager | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.058 | 12986 | service-scn | dnsmasq | enabled-active | disabling | disable state requested
| 2020-06-22T20:16:58.058 | 12987 | service-scn | mtc-agent | enabled-active | disabling | disable state requested

Severity
--------
Major

Steps to Reproduce
------------------
Check sm-customer.log and observe that mgr-restful-plugin is going from enabled-active to disabling due to audit failed every few minutes
Not sure about steps to reproduce. Automated regression was running on when issue started to happen, but the events feels unrelated. - See details in timestamp and logs.

Expected Behavior
------------------
system is stable

Actual Behavior
----------------
mgr-restful-plugin restarts every few minutes along with a number of other services

Reproducibility
---------------
Intermittent

System Configuration
--------------------
DC system controller
Lab-name: DC-4

Branch/Pull Time/Commit
-----------------------
2020-06-20_20-00-00

Last Pass
---------
2020-06-18_20-00-00 - but this likely to be intermittent, so not sure about exact last pass.
Timestamp/Logs -------------- https://files.starlingx.kube.cengn.ca/launchpad/1884704 Here's what was done before the first restart: # Modify timezone on system controller and subcloud7 [2020-06-22 14:04:15,729] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:81::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne modify --timezone="Europe/Berlin"' [2020-06-22 14:04:51,483] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 modify --timezone="Canada/Central"' # Time zone reverted [2020-06-22 14:24:02,950] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:81::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne modify --timezone="UTC"' [2020-06-22 14:24:38,673] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 modify --timezone="UTC"' # lock/unlock subcloud7 host [2020-06-22 14:26:00,195] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 host-lock controller-0' [2020-06-22 14:27:20,656] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 
'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 host-unlock controller-0' # subcloud7 was just recovered at 14:35: [2020-06-22 14:35:16,714] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 show' Authorization failed: Unable to establish connection to http://[fd01:88::2]:5000/v3/auth/tokens controller-0:~$ [2020-06-22 14:35:27,685] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[fd01:88::2]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name subcloud7 show' +------------------------+--------------------------------------+ | Property | Value | +------------------------+--------------------------------------+ | contact | None | | created_at | 2020-06-21T06:59:44.523373+00:00 | | description | None | | distributed_cloud_role | subcloud | | https_enabled | False | | location | None | | name | dc-subcloud7 | | region_name | subcloud7 | | sdn_enabled | False | | security_feature | spectre_meltdown_v1 | | service_project_name | services | | shared_services | ['identity', ] | | software_version | 20.06 | | system_mode | simplex | | system_type | All-in-one | | timezone | UTC | | updated_at | 2020-06-22T14:24:40.061605+00:00 | | uuid | aa2c9682-a3c0-4a41-9b4a-c303914c6a89 | | vswitch_type | none | +------------------------+--------------------------------------+ controller-0:~$ # There were no explicit operations on system controller at 14:35, where the issue started. 
| 2020-06-22T14:35:49.783 | 6069 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T14:37:05.359 | 6140 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T14:47:01.390 | 6222 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T14:56:56.845 | 6295 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T15:02:12.548 | 6367 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T15:04:48.142 | 6439 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T15:07:24.001 | 6510 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T15:08:39.621 | 6582 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T15:11:56.083 | 6658 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed

Test Activity
-------------
Regression Testing, Developer Testing

Workaround
----------
Performing a controller swact appears to fix the issue
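
The check described under Steps to Reproduce (looking in sm-customer.log for mgr-restful-plugin going from enabled-active to disabling with reason "audit failed") can be scripted. The following is a minimal sketch, assuming the log rows have the pipe-delimited layout quoted in the description (timestamp, sequence number, log type, service, old state, new state, reason). The function names are hypothetical and are not part of StarlingX; the sample rows are copied from the description.

```python
from datetime import datetime

# Sample rows copied from the bug description; a real run would read
# the contents of sm-customer.log instead.
SAMPLE = """\
| 2020-06-22T14:35:49.783 | 6069 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T14:37:05.359 | 6140 | service-scn | mgr-restful-plugin | enabled-active | disabling | audit failed
| 2020-06-22T20:16:57.552 | 12981 | service-scn | ceph-manager | enabled-active | disabling | disable state requested
"""


def parse_scn_rows(text):
    """Yield (timestamp, service, old_state, new_state, reason) per service-scn row.

    Assumes exactly seven pipe-separated fields per row, as in the rows
    quoted in the description; anything else is skipped.
    """
    for line in text.splitlines():
        fields = [f.strip() for f in line.strip().strip("|").split("|")]
        if len(fields) != 7 or fields[2] != "service-scn":
            continue
        ts = datetime.strptime(fields[0], "%Y-%m-%dT%H:%M:%S.%f")
        yield ts, fields[3], fields[4], fields[5], fields[6]


def audit_failures(text, service="mgr-restful-plugin"):
    """Return timestamps where `service` left enabled-active due to 'audit failed'."""
    return [
        ts
        for ts, svc, old, new, reason in parse_scn_rows(text)
        if svc == service
        and old == "enabled-active"
        and new == "disabling"
        and reason == "audit failed"
    ]


def gaps_in_seconds(timestamps):
    """Return the time between consecutive events, in seconds."""
    return [(b - a).total_seconds() for a, b in zip(timestamps, timestamps[1:])]


if __name__ == "__main__":
    events = audit_failures(SAMPLE)
    print(len(events), "audit failures; gaps (s):", gaps_in_seconds(events))
```

Rows for other services, such as the ceph-manager row with reason "disable state requested", are filtered out, so the gaps reflect only how often mgr-restful-plugin itself failed its audit.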
2020-06-24 17:46:45 Ghada Khalil removed subscriber Daniel Badea
2020-06-24 17:47:04 Ghada Khalil tags stx.4.0 stx.storage stx.5.0 stx.storage
2020-06-24 17:47:25 Ghada Khalil bug added subscriber Allain Legacy
2020-06-28 01:25:16 Ghada Khalil tags stx.5.0 stx.storage stx.5.0 stx.retestneeded stx.storage
2020-07-01 12:44:59 Stefan Dinescu marked as duplicate 1885582