Activity log for bug #1954333

Date Who What changed Old value New value Message
2021-12-09 21:19:48 Chris Friesen bug added bug
2021-12-09 21:36:09 OpenStack Infra starlingx: status New In Progress
2021-12-10 00:18:33 OpenStack Infra starlingx: status In Progress Fix Released
2021-12-10 16:41:15 Ghada Khalil description
Old value:
Brief Description
DC Central cloud upgrade activation failed. There was a swact soon after the activation, and the activation was in a failed state. The information below was captured during the investigation.

Code related to the activation failure:
https://review.opendev.org/c/starlingx/stx-puppet/+/820418

This fails the puppet manifest as Kubernetes isn’t up. I assume it won’t come up until etcd is restarted, which would normally be the next action in the puppet manifest.
http://bitbucket.wrs.com/projects/CGCS/repos/opendev.org.starlingx.stx-puppet/browse/puppet-manifests/src/modules/platform/manifests/etcd.pp?at=refs%2Fheads%2FWRCP_21.12#184

Swacting between the controllers restarts etcd.

**
2021-12-08 19:37:44,664 p=601519 u=root | changed: [localhost] => (item=apiserver-etcd-client.crt)
2021-12-08 19:37:44,750 p=601519 u=root | changed: [localhost] => (item=apiserver-etcd-client.key)
2021-12-08 19:37:44,821 p=601519 u=root | TASK [Create list of etcd classes to pass to puppet] ***************************
2021-12-08 19:37:44,822 p=601519 u=root | Wednesday 08 December 2021 19:37:44 +0000 (0:00:00.273) 0:00:09.706 ****
2021-12-08 19:37:45,111 p=601519 u=root | changed: [localhost]
2021-12-08 19:37:45,180 p=601519 u=root | TASK [Applying puppet for enabling etcd security] ******************************
2021-12-08 19:37:45,180 p=601519 u=root | Wednesday 08 December 2021 19:37:45 +0000 (0:00:00.358) 0:00:10.064 ****
2021-12-08 19:38:06,749 p=601519 u=root | fatal: [localhost]: FAILED! => changed=true
  cmd:
  - /usr/local/bin/puppet-manifest-apply.sh
  - /opt/platform/puppet/21.12/hieradata/
  - fd01:1::3
  - controller
  - runtime
  - /tmp/etcd.yml
  delta: '0:00:21.453324'
  end: '2021-12-08 19:38:06.732159'
  msg: non-zero return code
  rc: 1
  start: '2021-12-08 19:37:45.278835'
  stderr: ''
  stderr_lines: []
  stdout: |-
    Applying puppet runtime manifest...
    [WARNING]
    Warnings found.
    See /var/log/puppet/2021-12-08-19-37-45_runtime/puppet.log for details
  stdout_lines: <omitted>

2021-12-08T19:38:06.510 /usr/share/ruby/vendor_ruby/puppet/util/command_line.rb:72:in `execute'
2021-12-08T19:38:06.511 /usr/bin/puppet:5:in `<main>'
2021-12-08T19:38:06.513 Error: 2021-12-08 19:38:06 +0000 /Stage[main]/Platform::Kubernetes::Master::Change_apiserver_parameters/Exec[wait_for_kube_api_server]/returns: change from notrun to 0 failed: Command exceeded timeout
2021-12-08T19:38:06.515 Debug: 2021-12-08 19:38:06 +0000 Class[Platform::Kubernetes::Master::Change_apiserver_parameters]: Resource is being skipped, unscheduling all events
2021-12-08T19:38:06.516 Info: 2021-12-08 19:38:06 +0000 Class[Platform::Kubernetes::Master::Change_apiserver_parameters]: Unscheduling all events on Class[Platform::Kubernetes::Master::Change_apiserver_parameters]
2021-12-08T19:38:06.518 Debug: 2021-12-08 19:38:06 +0000 Platform::Sm::Restart[etcd]: Resource is being skipped, unscheduling all events
2021-12-08T19:38:06.519 Notice: 2021-12-08 19:38:06 +0000 /Stage[main]/Platform::Etcd::Upgrade::Runtime/Platform::Sm::Restart[etcd]/Exec[sm-restart-etcd]: Dependency Exec[wait_for_kube_api_server] has failures: true
2021-12-08T19:38:06.521 Warning: 2021-12-08 19:38:06 +0000 /Stage[main]/Platform::Etcd::Upgrade::Runtime/Platform::Sm::Restart[etcd]/Exec[sm-restart-etcd]: Skipping because of failed dependencies
**

Severity
Major

Steps to Reproduce
Follow the upgrade procedure to upgrade the DC central cloud from 21.05 to 21.12. The failure was seen during the upgrade activation step.

Expected Behavior
Upgrade activation success

Actual Behavior
As per description, upgrade activation failed

Reproducibility
**

System Configuration
DC-1 Distributed system

Branch/Pull Time/Commit
2021-12-06_23-00-09

Last Pass
"2021-12-04_23-00-07"

New value:
Same text as the old value, except that the bitbucket.wrs.com link to etcd.pp is removed.
2021-12-10 16:41:26 Ghada Khalil starlingx: assignee Chris Friesen (cbf123)
2021-12-10 16:41:31 Ghada Khalil starlingx: importance Undecided Medium
2021-12-10 16:43:09 Ghada Khalil tags stx.containers
2021-12-10 17:13:32 Ghada Khalil tags stx.containers stx.6.0 stx.cherrypickneeded stx.containers
2021-12-10 17:46:08 Ghada Khalil tags stx.6.0 stx.cherrypickneeded stx.containers in-r-stx60 stx.6.0 stx.containers