Brief Description
-----------------
Restore failed during the bootstrap bringup-essential-services step. The backup was taken from controller-1.
Severity
--------
<Major: System/Feature is usable but degraded>
Steps to Reproduce
------------------
Install duplex system with stx master
Swact from controller-0 to controller-1
Run the Backup Ansible playbook
Install a clean stx image on the system
Run the restore Ansible playbook with the backup file saved above
Expected Behavior
-----------------
The restore Ansible playbook runs successfully.
Actual Behavior
----------------
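The restore Ansible playbook fails in bootstrap/bringup-essential-services: the calico-kube-controllers, armada-api, and coredns deployments time out waiting to become Available (see the log below).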
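The log in the Timestamp/Logs section records the exact `kubectl wait` invocation the playbook used for each deployment. The following is a minimal Python sketch for re-running that check by hand on the controller. The helper names `wait_command` and `check_deployments` are hypothetical, and the deployment list is copied from the failures in the log, not from the playbook source.

```python
import subprocess

KUBECONFIG = "/etc/kubernetes/admin.conf"  # path as it appears in the log

# (namespace, deployment) pairs reported as not ready in this report's log
FAILED_DEPLOYMENTS = [
    ("kube-system", "calico-kube-controllers"),
    ("armada", "armada-api"),
    ("kube-system", "coredns"),
]


def wait_command(namespace, deployment, timeout_s=120):
    """Build the kubectl wait argv seen in the log for one deployment."""
    return [
        "kubectl",
        f"--kubeconfig={KUBECONFIG}",
        "wait",
        f"--namespace={namespace}",
        "--for=condition=Available",
        "deployment",
        deployment,
        f"--timeout={timeout_s}s",
    ]


def check_deployments(pairs=FAILED_DEPLOYMENTS):
    """Run kubectl wait for each pair and return {deployment: return code}."""
    results = {}
    for namespace, deployment in pairs:
        proc = subprocess.run(
            wait_command(namespace, deployment),
            capture_output=True,
            text=True,
        )
        results[deployment] = proc.returncode
    return results
```

A non-zero return code for a deployment corresponds to the "timed out waiting for the condition" error in the log; calling `check_deployments()` requires kubectl and root access to the kubeconfig on the restored controller.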
Reproducibility
---------------
Reproducible
System Configuration
--------------------
Duplex system
Default configuration.
Branch/Pull Time/Commit
--------------------
SW_VERSION="21.12"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="2021-10-21_00-00-06"
Last Pass
--------------------
This is the first time this test case has been run on the latest release.
It was last verified in a DC lab using build 2021-05-22_23-32-17, with the backup taken from controller-1 and restored on controller-0.
Timestamp/Logs
--------------------
TASK [bootstrap/bringup-essential-services : Get wait tasks results] ***************************************************************************
changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label': u'k8s-app=kube-proxy', u'ansible_job_id': u'443674968421.142067', 'failed': False, u'started': 1, 'changed': True, 'item': u'k8s-app=kube-proxy', u'finished': 0, u'results_file': u'/root/.ansible_async/443674968421.142067', '_ansible_ignore_errors': None, '_ansible_no_log': False})
changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label': u'app=multus', u'ansible_job_id': u'464206363239.142123', 'failed': False, u'started': 1, 'changed': True, 'item': u'app=multus', u'finished': 0, u'results_file': u'/root/.ansible_async/464206363239.142123', '_ansible_ignore_errors': None, '_ansible_no_log': False})
changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label': u'app=sriov-cni', u'ansible_job_id': u'941260772048.142247', 'failed': False, u'started': 1, 'changed': True, 'item': u'app=sriov-cni', u'finished': 0, u'results_file': u'/root/.ansible_async/941260772048.142247', '_ansible_ignore_errors': None, '_ansible_no_log': False})
changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label': u'component=kube-apiserver', u'ansible_job_id': u'887621538031.142316', 'failed': False, u'started': 1, 'changed': True, 'item': u'component=kube-apiserver', u'finished': 0, u'results_file': u'/root/.ansible_async/887621538031.142316', '_ansible_ignore_errors': None, '_ansible_no_log': False})
changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label': u'component=kube-controller-manager', u'ansible_job_id': u'27963140421.142391', 'failed': False, u'started': 1, 'changed': True, 'item': u'component=kube-controller-manager', u'finished': 0, u'results_file': u'/root/.ansible_async/27963140421.142391', '_ansible_ignore_errors': None, '_ansible_no_log': False})
changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label': u'component=kube-scheduler', u'ansible_job_id': u'685428121851.142449', 'failed': False, u'started': 1, 'changed': True, 'item': u'component=kube-scheduler', u'finished': 0, u'results_file': u'/root/.ansible_async/685428121851.142449', '_ansible_ignore_errors': None, '_ansible_no_log': False})
FAILED - RETRYING: Get wait tasks results (40 retries left).
FAILED - RETRYING: Get wait tasks results (39 retries left).
FAILED - RETRYING: Get wait tasks results (38 retries left).
FAILED - RETRYING: Get wait tasks results (37 retries left).
FAILED - RETRYING: Get wait tasks results (36 retries left).
FAILED - RETRYING: Get wait tasks results (35 retries left).
FAILED - RETRYING: Get wait tasks results (34 retries left).
FAILED - RETRYING: Get wait tasks results (33 retries left).
FAILED - RETRYING: Get wait tasks results (32 retries left).
FAILED - RETRYING: Get wait tasks results (31 retries left).
FAILED - RETRYING: Get wait tasks results (30 retries left).
FAILED - RETRYING: Get wait tasks results (29 retries left).
FAILED - RETRYING: Get wait tasks results (28 retries left).
FAILED - RETRYING: Get wait tasks results (27 retries left).
FAILED - RETRYING: Get wait tasks results (26 retries left).
FAILED - RETRYING: Get wait tasks results (25 retries left).
FAILED - RETRYING: Get wait tasks results (24 retries left).
FAILED - RETRYING: Get wait tasks results (23 retries left).
FAILED - RETRYING: Get wait tasks results (22 retries left).
changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label':
{u'namespace': u'kube-system', u'deployment': u'calico-kube-controllers'}
, u'ansible_job_id': u'947219856717.142512', 'failed': False, u'started': 1, 'changed': True, 'item': {u'namespace': u'kube-system', u'deployment': u'calico-kube-controllers'}, u'finished': 0, u'results_file': u'/root/.ansible_async/947219856717.142512', '_ansible_ignore_errors': None, '_ansible_no_log': False})
changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label':
{u'namespace': u'flux-helm', u'deployment': u'helm-controller'}
, u'ansible_job_id': u'124310910849.142684', 'failed': False, u'started': 1, 'changed': True, 'item': {u'namespace': u'flux-helm', u'deployment': u'helm-controller'}, u'finished': 0, u'results_file': u'/root/.ansible_async/124310910849.142684', '_ansible_ignore_errors': None, '_ansible_no_log': False})
changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label':
{u'namespace': u'flux-helm', u'deployment': u'source-controller'}
, u'ansible_job_id': u'233493705004.142800', 'failed': False, u'started': 1, 'changed': True, 'item': {u'namespace': u'flux-helm', u'deployment': u'source-controller'}, u'finished': 0, u'results_file': u'/root/.ansible_async/233493705004.142800', '_ansible_ignore_errors': None, '_ansible_no_log': False})
changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label':
{u'namespace': u'armada', u'deployment': u'armada-api'}
, u'ansible_job_id': u'998341509452.142878', 'failed': False, u'started': 1, 'changed': True, 'item': {u'namespace': u'armada', u'deployment': u'armada-api'}, u'finished': 0, u'results_file': u'/root/.ansible_async/998341509452.142878', '_ansible_ignore_errors': None, '_ansible_no_log': False})
FAILED - RETRYING: Get wait tasks results (40 retries left).
changed: [localhost] => (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label':
{u'namespace': u'kube-system', u'deployment': u'coredns'}
, u'ansible_job_id': u'915942839727.142954', 'failed': False, u'started': 1, 'changed': True, 'item': {u'namespace': u'kube-system', u'deployment': u'coredns'}, u'finished': 0, u'results_file': u'/root/.ansible_async/915942839727.142954', '_ansible_ignore_errors': None, '_ansible_no_log': False})
TASK [bootstrap/bringup-essential-services : Fail if any of the Kubernetes component, Networking or Armada pods are not ready by this time] ****
failed: [localhost] (item={'_ansible_parsed': True, 'stderr_lines': [u'error: timed out waiting for the condition on deployments/calico-kube-controllers'], u'changed': True, u'stderr': u'error: timed out waiting for the condition on deployments/calico-kube-controllers', u'ansible_job_id': u'947219856717.142512', u'stdout': u'', '_ansible_item_result': True, u'invocation': {u'module_args': {u'creates': None, u'executable': None, u'_uses_shell': False, u'_raw_params': u'kubectl --kubeconfig=/etc/kubernetes/admin.conf wait --namespace=kube-system --for=condition=Available deployment calico-kube-controllers --timeout=120s', u'removes': None, u'argv': None, u'warn': True, u'chdir': None, u'stdin': None}}, 'attempts': 20, u'delta': u'0:02:00.080321', 'stdout_lines': [], 'failed_when_result': False, '_ansible_no_log': False, u'end': u'2021-10-27 13:54:44.320076', '_ansible_item_label': {'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label':
{u'namespace': u'kube-system', u'deployment': u'calico-kube-controllers'}
, u'ansible_job_id': u'947219856717.142512', 'item': {u'namespace': u'kube-system', u'deployment': u'calico-kube-controllers'}, u'started': 1, 'changed': True, 'failed': False, u'finished': 0, u'results_file': u'/root/.ansible_async/947219856717.142512', '_ansible_ignore_errors': None, '_ansible_no_log': False}, u'start': u'2021-10-27 13:52:44.239755', u'cmd': [u'kubectl', u'--kubeconfig=/etc/kubernetes/admin.conf', u'wait', u'--namespace=kube-system', u'--for=condition=Available', u'deployment', u'calico-kube-controllers', u'--timeout=120s'], u'finished': 1, u'failed': False, 'item': {'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_no_log': False, u'ansible_job_id': u'947219856717.142512', 'item':
{u'namespace': u'kube-system', u'deployment': u'calico-kube-controllers'}
, u'started': 1, 'changed': True, 'failed': False, u'finished': 0, u'results_file': u'/root/.ansible_async/947219856717.142512', '_ansible_ignore_errors': None, '_ansible_item_label': {u'namespace': u'kube-system', u'deployment': u'calico-kube-controllers'}}, u'rc': 1, u'msg': u'non-zero return code', '_ansible_ignore_errors': None}) => {"changed": false, "item": {"ansible_job_id": "947219856717.142512", "attempts": 20, "changed": true, "cmd": ["kubectl", "--kubeconfig=/etc/kubernetes/admin.conf", "wait", "--namespace=kube-system", "--for=condition=Available", "deployment", "calico-kube-controllers", "--timeout=120s"], "delta": "0:02:00.080321", "end": "2021-10-27 13:54:44.320076", "failed": false, "failed_when_result": false, "finished": 1, "invocation": {"module_args": {"_raw_params": "kubectl --kubeconfig=/etc/kubernetes/admin.conf wait --namespace=kube-system --for=condition=Available deployment calico-kube-controllers --timeout=120s", "_uses_shell": false, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": {"ansible_job_id": "947219856717.142512", "changed": true, "failed": false, "finished": 0, "item":
{"deployment": "calico-kube-controllers", "namespace": "kube-system"}
, "results_file": "/root/.ansible_async/947219856717.142512", "started": 1}, "msg": "non-zero return code", "rc": 1, "start": "2021-10-27 13:52:44.239755", "stderr": "error: timed out waiting for the condition on deployments/calico-kube-controllers", "stderr_lines": ["error: timed out waiting for the condition on deployments/calico-kube-controllers"], "stdout": "", "stdout_lines": []}, "msg": "Pod {u'namespace': u'kube-system', u'deployment': u'calico-kube-controllers'} is still not ready."}
failed: [localhost] (item={'_ansible_parsed': True, 'stderr_lines': [u'error: timed out waiting for the condition on deployments/armada-api'], u'changed': True, u'stderr': u'error: timed out waiting for the condition on deployments/armada-api', u'ansible_job_id': u'998341509452.142878', u'stdout': u'', '_ansible_item_result': True, u'invocation': {u'module_args': {u'creates': None, u'executable': None, u'_uses_shell': False, u'_raw_params': u'kubectl --kubeconfig=/etc/kubernetes/admin.conf wait --namespace=armada --for=condition=Available deployment armada-api --timeout=120s', u'removes': None, u'argv': None, u'warn': True, u'chdir': None, u'stdin': None}}, 'attempts': 1, u'delta': u'0:02:00.072974', 'stdout_lines': [], 'failed_when_result': False, '_ansible_no_log': False, u'end': u'2021-10-27 13:54:47.596154', '_ansible_item_label': {'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label':
{u'namespace': u'armada', u'deployment': u'armada-api'}
, u'ansible_job_id': u'998341509452.142878', 'item': {u'namespace': u'armada', u'deployment': u'armada-api'}, u'started': 1, 'changed': True, 'failed': False, u'finished': 0, u'results_file': u'/root/.ansible_async/998341509452.142878', '_ansible_ignore_errors': None, '_ansible_no_log': False}, u'start': u'2021-10-27 13:52:47.523180', u'cmd': [u'kubectl', u'--kubeconfig=/etc/kubernetes/admin.conf', u'wait', u'--namespace=armada', u'--for=condition=Available', u'deployment', u'armada-api', u'--timeout=120s'], u'finished': 1, u'failed': False, 'item': {'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_no_log': False, u'ansible_job_id': u'998341509452.142878', 'item':
{u'namespace': u'armada', u'deployment': u'armada-api'}
, u'started': 1, 'changed': True, 'failed': False, u'finished': 0, u'results_file': u'/root/.ansible_async/998341509452.142878', '_ansible_ignore_errors': None, '_ansible_item_label': {u'namespace': u'armada', u'deployment': u'armada-api'}}, u'rc': 1, u'msg': u'non-zero return code', '_ansible_ignore_errors': None}) => {"changed": false, "item": {"ansible_job_id": "998341509452.142878", "attempts": 1, "changed": true, "cmd": ["kubectl", "--kubeconfig=/etc/kubernetes/admin.conf", "wait", "--namespace=armada", "--for=condition=Available", "deployment", "armada-api", "--timeout=120s"], "delta": "0:02:00.072974", "end": "2021-10-27 13:54:47.596154", "failed": false, "failed_when_result": false, "finished": 1, "invocation": {"module_args": {"_raw_params": "kubectl --kubeconfig=/etc/kubernetes/admin.conf wait --namespace=armada --for=condition=Available deployment armada-api --timeout=120s", "_uses_shell": false, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": {"ansible_job_id": "998341509452.142878", "changed": true, "failed": false, "finished": 0, "item":
{"deployment": "armada-api", "namespace": "armada"}
, "results_file": "/root/.ansible_async/998341509452.142878", "started": 1}, "msg": "non-zero return code", "rc": 1, "start": "2021-10-27 13:52:47.523180", "stderr": "error: timed out waiting for the condition on deployments/armada-api", "stderr_lines": ["error: timed out waiting for the condition on deployments/armada-api"], "stdout": "", "stdout_lines": []}, "msg": "Pod {u'namespace': u'armada', u'deployment': u'armada-api'} is still not ready."}
failed: [localhost] (item={'_ansible_parsed': True, 'stderr_lines': [u'error: timed out waiting for the condition on deployments/coredns'], u'changed': True, u'stderr': u'error: timed out waiting for the condition on deployments/coredns', u'ansible_job_id': u'915942839727.142954', u'stdout': u'', '_ansible_item_result': True, u'invocation': {u'module_args': {u'creates': None, u'executable': None, u'_uses_shell': False, u'_raw_params': u'kubectl --kubeconfig=/etc/kubernetes/admin.conf wait --namespace=kube-system --for=condition=Available deployment coredns --timeout=120s', u'removes': None, u'argv': None, u'warn': True, u'chdir': None, u'stdin': None}}, 'attempts': 2, u'delta': u'0:02:00.069320', 'stdout_lines': [], 'failed_when_result': False, '_ansible_no_log': False, u'end': u'2021-10-27 13:54:48.689142', '_ansible_item_label': {'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_item_label':
{u'namespace': u'kube-system', u'deployment': u'coredns'}
, u'ansible_job_id': u'915942839727.142954', 'item': {u'namespace': u'kube-system', u'deployment': u'coredns'}, u'started': 1, 'changed': True, 'failed': False, u'finished': 0, u'results_file': u'/root/.ansible_async/915942839727.142954', '_ansible_ignore_errors': None, '_ansible_no_log': False}, u'start': u'2021-10-27 13:52:48.619822', u'cmd': [u'kubectl', u'--kubeconfig=/etc/kubernetes/admin.conf', u'wait', u'--namespace=kube-system', u'--for=condition=Available', u'deployment', u'coredns', u'--timeout=120s'], u'finished': 1, u'failed': False, 'item': {'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_no_log': False, u'ansible_job_id': u'915942839727.142954', 'item':
{u'namespace': u'kube-system', u'deployment': u'coredns'}
, u'started': 1, 'changed': True, 'failed': False, u'finished': 0, u'results_file': u'/root/.ansible_async/915942839727.142954', '_ansible_ignore_errors': None, '_ansible_item_label': {u'namespace': u'kube-system', u'deployment': u'coredns'}}, u'rc': 1, u'msg': u'non-zero return code', '_ansible_ignore_errors': None}) => {"changed": false, "item": {"ansible_job_id": "915942839727.142954", "attempts": 2, "changed": true, "cmd": ["kubectl", "--kubeconfig=/etc/kubernetes/admin.conf", "wait", "--namespace=kube-system", "--for=condition=Available", "deployment", "coredns", "--timeout=120s"], "delta": "0:02:00.069320", "end": "2021-10-27 13:54:48.689142", "failed": false, "failed_when_result": false, "finished": 1, "invocation": {"module_args": {"_raw_params": "kubectl --kubeconfig=/etc/kubernetes/admin.conf wait --namespace=kube-system --for=condition=Available deployment coredns --timeout=120s", "_uses_shell": false, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": {"ansible_job_id": "915942839727.142954", "changed": true, "failed": false, "finished": 0, "item":
{"deployment": "coredns", "namespace": "kube-system"}
, "results_file": "/root/.ansible_async/915942839727.142954", "started": 1}, "msg": "non-zero return code", "rc": 1, "start": "2021-10-27 13:52:48.619822", "stderr": "error: timed out waiting for the condition on deployments/coredns", "stderr_lines": ["error: timed out waiting for the condition on deployments/coredns"], "stdout": "", "stdout_lines": []}, "msg": "Pod {u'namespace': u'kube-system', u'deployment': u'coredns'} is still not ready."}
PLAY RECAP *************************************************************************************************************************************
localhost : ok=410 changed=223 unreachable=0 failed=1
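Because the failure records above are long, a short script can help pull out which deployments actually timed out. This is a minimal sketch, assuming the playbook output has been saved to a text file; the function name `timed_out_deployments` and the regular expression are illustrative and match only the stderr message shown in this log.

```python
import re

# Matches the kubectl stderr message that appears in the failed items above.
TIMEOUT_RE = re.compile(
    r"timed out waiting for the condition on deployments/([A-Za-z0-9-]+)"
)


def timed_out_deployments(log_text):
    """Return the deployment names that timed out, unique, in first-seen order."""
    seen = []
    for name in TIMEOUT_RE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen
```

Each failed item repeats the message several times (in `stderr`, `stderr_lines`, and the JSON copy), which is why the function de-duplicates the matches.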
Alarms
[sysadmin@controller-0 ~(keystone_admin)]$ fm alarm-list
+----------+---------------------------------------------------------------------+-------------------+----------+----------------------------+
| Alarm ID | Reason Text                                                         | Entity ID         | Severity | Time Stamp                 |
+----------+---------------------------------------------------------------------+-------------------+----------+----------------------------+
| 200.001  | controller-0 was administratively locked to take it out-of-service. | host=controller-0 | warning  | 2021-10-27T13:49:55.015118 |
+----------+---------------------------------------------------------------------+-------------------+----------+----------------------------+
Test Activity
--------------------
Testing
Workaround
--------------------
no
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/ansible-playbooks/+/822130