2020-02-25 22:51:51 |
Yosief Gebremariam |
bug |
|
|
added bug |
2020-02-26 14:55:00 |
Yang Liu |
summary |
Distributed Cloud: Replay on subcloud failed after initial deployment failure |
Distributed Cloud: Delete and re-add subcloud failed at bootstrap after initial deployment failure |
|
2020-02-26 15:00:32 |
Yosief Gebremariam |
summary |
Distributed Cloud: Delete and re-add subcloud failed at bootstrap after initial deployment failure |
Distributed Cloud: Delete and re-add subcloud failed at bootstrap after initial configuration failure on controller-0 |
|
2020-02-26 15:03:57 |
Yosief Gebremariam |
description |
Brief Description
-----------------
Initially, the "dcmanager subcloud add subcloud4" command failed during subcloud deployment because the ceph-cluster specification was missing from the subcloud deployment configuration yaml file. After adding the missing data and removing the subcloud from the DC system, I attempted to re-add the subcloud. The replay failed early while bootstrapping the subcloud, with the error message below:
failed: [subcloud4] (item={'_ansible_parsed': True, 'stderr_lines': [u'RTNETLINK answers: Cannot assign requested address'], u'changed': True, u'stdout': u'', '_ansible_item_result': True, u'msg': u'non-zero return code', u'delta': u'0:00:00.002213', 'stdout_lines': [], 'failed_when_result': False, '_ansible_item_label': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'end': u'2020-02-25 18:21:44.547733', '_ansible_no_log': False, 'item': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'cmd': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'failed': False, u'stderr': u'RTNETLINK answers: Cannot assign requested address', u'rc': 2, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'removes': None, u'argv': None, u'creates': None, u'chdir': None, u'stdin': None}}, u'start': u'2020-02-25 18:21:44.545520', '_ansible_ignore_errors': None}) => {"changed": false, "item": {"changed": true, "cmd": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "delta": "0:00:00.002213", "end": "2020-02-25 18:21:44.547733", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "_uses_shell": true, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "msg": "non-zero return code", "rc": 2, "start": "2020-02-25 18:21:44.545520", "stderr": "RTNETLINK answers: Cannot assign requested address", "stderr_lines": ["RTNETLINK answers: Cannot assign requested address"], "stdout": "", "stdout_lines": []}, "msg": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 
scope host failed for reason: RTNETLINK answers: Cannot assign requested address."}
PLAY RECAP *********************************************************************
subcloud4 : ok=147 changed=41 unreachable=0 failed=1
Preliminary assessment from Tao Lui:
During the first deployment, the mgmt/cluster interfaces had already been reconfigured prior to unlock (no longer on lo).
The bootstrap replay failed when removing the cluster IP from the lo interface.
Severity
--------
Major
Steps to Reproduce
------------------
1) Set up a DC System Controller
2) Boot a subcloud active controller node
3) Add the subcloud to the DC system: "dcmanager subcloud add subcloud4 ...."
4) The subcloud failed at deployment because ceph-cluster was missing from the deployment config yaml file
5) Fixed the subcloud deployment config yaml
6) Deleted the subcloud from the DC system (dcmanager subcloud delete subcloud4)
7) Re-added the subcloud with the updated deployment config (dcmanager subcloud add subcloud4 ....)
8) The replay failed early in bootstrapping with the above error message
TC-name:
Expected Behavior
------------------
Subcloud added to DC system successfully on replay
Actual Behavior
----------------
Subcloud add failed early on bootstrapping
Reproducibility
---------------
Tested once
System Configuration
--------------------
DC system
Lab-name: wcp_80-91
subcloud4: wcp_85_86
Branch/Pull Time/Commit
-----------------------
2020-02-24_20-23-53
Last Pass
---------
unknown
Timestamp/Logs
--------------
2020-02-25-18-21-02
+----+-----------+------------+--------------+------------------+---------+
| id | name | management | availability | deploy status | sync |
+----+-----------+------------+--------------+------------------+---------+
| 1 | subcloud1 | unmanaged | online | complete | unknown |
| 2 | subcloud5 | managed | online | complete | in-sync |
| 4 | subcloud4 | unmanaged | offline | bootstrap-failed | unknown | |
Brief Description
-----------------
Initially, the "dcmanager subcloud add subcloud4" command failed on the subcloud because of a missing ceph-cluster backend. After removing the subcloud from the DC system, I attempted to re-add the subcloud. The replay failed early while bootstrapping the subcloud, with the error message below:
failed: [subcloud4] (item={'_ansible_parsed': True, 'stderr_lines': [u'RTNETLINK answers: Cannot assign requested address'], u'changed': True, u'stdout': u'', '_ansible_item_result': True, u'msg': u'non-zero return code', u'delta': u'0:00:00.002213', 'stdout_lines': [], 'failed_when_result': False, '_ansible_item_label': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'end': u'2020-02-25 18:21:44.547733', '_ansible_no_log': False, 'item': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'cmd': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'failed': False, u'stderr': u'RTNETLINK answers: Cannot assign requested address', u'rc': 2, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'removes': None, u'argv': None, u'creates': None, u'chdir': None, u'stdin': None}}, u'start': u'2020-02-25 18:21:44.545520', '_ansible_ignore_errors': None}) => {"changed": false, "item": {"changed": true, "cmd": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "delta": "0:00:00.002213", "end": "2020-02-25 18:21:44.547733", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "_uses_shell": true, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "msg": "non-zero return code", "rc": 2, "start": "2020-02-25 18:21:44.545520", "stderr": "RTNETLINK answers: Cannot assign requested address", "stderr_lines": ["RTNETLINK answers: Cannot assign requested address"], "stdout": "", "stdout_lines": []}, "msg": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 
scope host failed for reason: RTNETLINK answers: Cannot assign requested address."}
PLAY RECAP *********************************************************************
subcloud4 : ok=147 changed=41 unreachable=0 failed=1
Preliminary assessment from Tao Lui:
During the first deployment, the mgmt/cluster interfaces had already been reconfigured prior to unlock (no longer on lo).
The bootstrap replay failed when removing the cluster IP from the lo interface.
Severity
--------
Major
Steps to Reproduce
------------------
1) Set up a DC System Controller
2) Boot a subcloud active controller node
3) Add the subcloud to the DC system: "dcmanager subcloud add subcloud4 ...."
4) The subcloud failed at configuration because of a missing ceph-cluster backend
5) Deleted the subcloud from the DC system (dcmanager subcloud delete subcloud4)
6) Re-added the subcloud with the ceph-cluster backend update (dcmanager subcloud add subcloud4 ....)
7) The replay failed early in bootstrapping with the above error message
TC-name:
Expected Behavior
------------------
Subcloud added to DC system successfully on replay
Actual Behavior
----------------
Subcloud add failed early on bootstrapping
Reproducibility
---------------
Tested once
System Configuration
--------------------
DC system
Lab-name: wcp_80-91
subcloud4: wcp_85_86
Branch/Pull Time/Commit
-----------------------
2020-02-24_20-23-53
Last Pass
---------
unknown
Timestamp/Logs
--------------
2020-02-25-18-21-02
+----+-----------+------------+--------------+------------------+---------+
| id | name | management | availability | deploy status | sync |
+----+-----------+------------+--------------+------------------+---------+
| 1 | subcloud1 | unmanaged | online | complete | unknown |
| 2 | subcloud5 | managed | online | complete | in-sync |
| 4 | subcloud4 | unmanaged | offline | bootstrap-failed | unknown | |
|
2020-02-26 15:07:35 |
Yosief Gebremariam |
description |
Brief Description
-----------------
Initially, the "dcmanager subcloud add subcloud4" command failed on the subcloud because of a missing ceph-cluster backend. After removing the subcloud from the DC system, I attempted to re-add the subcloud. The replay failed early while bootstrapping the subcloud, with the error message below:
failed: [subcloud4] (item={'_ansible_parsed': True, 'stderr_lines': [u'RTNETLINK answers: Cannot assign requested address'], u'changed': True, u'stdout': u'', '_ansible_item_result': True, u'msg': u'non-zero return code', u'delta': u'0:00:00.002213', 'stdout_lines': [], 'failed_when_result': False, '_ansible_item_label': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'end': u'2020-02-25 18:21:44.547733', '_ansible_no_log': False, 'item': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'cmd': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'failed': False, u'stderr': u'RTNETLINK answers: Cannot assign requested address', u'rc': 2, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'removes': None, u'argv': None, u'creates': None, u'chdir': None, u'stdin': None}}, u'start': u'2020-02-25 18:21:44.545520', '_ansible_ignore_errors': None}) => {"changed": false, "item": {"changed": true, "cmd": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "delta": "0:00:00.002213", "end": "2020-02-25 18:21:44.547733", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "_uses_shell": true, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "msg": "non-zero return code", "rc": 2, "start": "2020-02-25 18:21:44.545520", "stderr": "RTNETLINK answers: Cannot assign requested address", "stderr_lines": ["RTNETLINK answers: Cannot assign requested address"], "stdout": "", "stdout_lines": []}, "msg": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 
scope host failed for reason: RTNETLINK answers: Cannot assign requested address."}
PLAY RECAP *********************************************************************
subcloud4 : ok=147 changed=41 unreachable=0 failed=1
Preliminary assessment from Tao Lui:
During the first deployment, the mgmt/cluster interfaces had already been reconfigured prior to unlock (no longer on lo).
The bootstrap replay failed when removing the cluster IP from the lo interface.
Severity
--------
Major
Steps to Reproduce
------------------
1) Set up a DC System Controller
2) Boot a subcloud active controller node
3) Add the subcloud to the DC system: "dcmanager subcloud add subcloud4 ...."
4) The subcloud failed at configuration because of a missing ceph-cluster backend
5) Deleted the subcloud from the DC system (dcmanager subcloud delete subcloud4)
6) Re-added the subcloud with the ceph-cluster backend update (dcmanager subcloud add subcloud4 ....)
7) The replay failed early in bootstrapping with the above error message
TC-name:
Expected Behavior
------------------
Subcloud added to DC system successfully on replay
Actual Behavior
----------------
Subcloud add failed early on bootstrapping
Reproducibility
---------------
Tested once
System Configuration
--------------------
DC system
Lab-name: wcp_80-91
subcloud4: wcp_85_86
Branch/Pull Time/Commit
-----------------------
2020-02-24_20-23-53
Last Pass
---------
unknown
Timestamp/Logs
--------------
2020-02-25-18-21-02
+----+-----------+------------+--------------+------------------+---------+
| id | name | management | availability | deploy status | sync |
+----+-----------+------------+--------------+------------------+---------+
| 1 | subcloud1 | unmanaged | online | complete | unknown |
| 2 | subcloud5 | managed | online | complete | in-sync |
| 4 | subcloud4 | unmanaged | offline | bootstrap-failed | unknown | |
Brief Description
-----------------
Initially, the "dcmanager subcloud add subcloud4" command failed on the subcloud because of a missing ceph-cluster backend. After removing the subcloud from the DC system, I attempted to re-add the subcloud. The replay failed early while bootstrapping the subcloud, with the error message below:
failed: [subcloud4] (item={'_ansible_parsed': True, 'stderr_lines': [u'RTNETLINK answers: Cannot assign requested address'], u'changed': True, u'stdout': u'', '_ansible_item_result': True, u'msg': u'non-zero return code', u'delta': u'0:00:00.002213', 'stdout_lines': [], 'failed_when_result': False, '_ansible_item_label': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'end': u'2020-02-25 18:21:44.547733', '_ansible_no_log': False, 'item': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'cmd': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'failed': False, u'stderr': u'RTNETLINK answers: Cannot assign requested address', u'rc': 2, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u'ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host', u'removes': None, u'argv': None, u'creates': None, u'chdir': None, u'stdin': None}}, u'start': u'2020-02-25 18:21:44.545520', '_ansible_ignore_errors': None}) => {"changed": false, "item": {"changed": true, "cmd": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "delta": "0:00:00.002213", "end": "2020-02-25 18:21:44.547733", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "_uses_shell": true, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 scope host", "msg": "non-zero return code", "rc": 2, "start": "2020-02-25 18:21:44.545520", "stderr": "RTNETLINK answers: Cannot assign requested address", "stderr_lines": ["RTNETLINK answers: Cannot assign requested address"], "stdout": "", "stdout_lines": []}, "msg": "ip addr delete aefd::2/64 brd aefd::ffff:ffff:ffff:ffff dev lo:5 
scope host failed for reason: RTNETLINK answers: Cannot assign requested address."}
PLAY RECAP *********************************************************************
subcloud4 : ok=147 changed=41 unreachable=0 failed=1
Preliminary assessment from Tao Lui:
During the first deployment, the mgmt/cluster interfaces had already been reconfigured prior to unlock (no longer on lo).
The bootstrap replay failed when removing the cluster IP from the lo interface.
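To illustrate the assessment above: the delete fails because the address is no longer assigned to lo, so a replay-safe removal would first check whether the address is present. The following is only a minimal sketch of that idea, not the fix that was merged. The function names `address_present` and `delete_if_present` are hypothetical, and it assumes the one-line format printed by `ip -o addr show`, where the address follows the `inet` or `inet6` keyword.

```python
import ipaddress
import subprocess


def address_present(ip_addr_output: str, cidr: str) -> bool:
    """Return True if `cidr` (e.g. 'aefd::2/64') is listed in the text
    printed by `ip -o addr show dev <dev>` (assumed one-line format)."""
    wanted = ipaddress.ip_interface(cidr)
    for line in ip_addr_output.splitlines():
        fields = line.split()
        for keyword in ("inet", "inet6"):
            if keyword not in fields:
                continue
            idx = fields.index(keyword)
            if idx + 1 >= len(fields):
                continue
            try:
                found = ipaddress.ip_interface(fields[idx + 1])
            except ValueError:
                continue  # token is not an address; ignore this line
            if found == wanted:
                return True
    return False


def delete_if_present(cidr: str, dev: str, run=subprocess.run) -> bool:
    """Delete `cidr` from `dev` only when it is currently assigned.

    Returns True if a delete was issued, False if it was skipped.
    `run` is injectable so the logic can be exercised without root.
    """
    shown = run(["ip", "-o", "addr", "show", "dev", dev],
                capture_output=True, text=True, check=True)
    if not address_present(shown.stdout, cidr):
        return False  # nothing to remove, so a replay can continue
    run(["ip", "addr", "del", cidr, "dev", dev], check=True)
    return True
```

With this guard, a replay on a host whose cluster address has already moved off lo would skip the delete instead of hitting "RTNETLINK answers: Cannot assign requested address".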
Severity
--------
Major
Steps to Reproduce
------------------
1) Set up a DC System Controller
2) Boot a subcloud active controller node
3) Add the subcloud to the DC system: "dcmanager subcloud add subcloud4 ...."
4) The subcloud fails at controller-0 configuration because of a missing ceph-cluster backend
5) Delete the failed subcloud from the DC system (dcmanager subcloud delete subcloud4)
6) Re-add the subcloud with the ceph-cluster backend (dcmanager subcloud add subcloud4 ....)
7) The replay failed early in bootstrapping with the above error message
TC-name:
Expected Behavior
------------------
Subcloud added to DC system successfully on replay
Actual Behavior
----------------
Subcloud add failed early on bootstrapping
Reproducibility
---------------
Tested once
System Configuration
--------------------
DC system
Lab-name: wcp_80-91
subcloud4: wcp_85_86
Branch/Pull Time/Commit
-----------------------
2020-02-24_20-23-53
Last Pass
---------
unknown
Timestamp/Logs
--------------
2020-02-25-18-21-02
+----+-----------+------------+--------------+------------------+---------+
| id | name | management | availability | deploy status | sync |
+----+-----------+------------+--------------+------------------+---------+
| 1 | subcloud1 | unmanaged | online | complete | unknown |
| 2 | subcloud5 | managed | online | complete | in-sync |
| 4 | subcloud4 | unmanaged | offline | bootstrap-failed | unknown | |
|
2020-02-28 20:12:33 |
Ghada Khalil |
tags |
|
stx.4.0 stx.distcloud |
|
2020-02-28 20:12:46 |
Ghada Khalil |
starlingx: importance |
Undecided |
Medium |
|
2020-02-28 20:12:50 |
Ghada Khalil |
starlingx: status |
New |
Triaged |
|
2020-03-26 19:27:19 |
Dariush Eslimi |
starlingx: assignee |
|
Tee Ngo (teewrs) |
|
2020-03-30 16:33:34 |
Bill Zvonar |
bug |
|
|
added subscriber Daniel Badea |
2020-04-13 15:27:59 |
Bill Zvonar |
removed subscriber Daniel Badea |
|
|
|
2020-05-06 19:16:54 |
Bart Wensley |
starlingx: assignee |
Tee Ngo (teewrs) |
Jessica Castelino (jcasteli) |
|
2020-05-07 11:13:34 |
Bart Wensley |
bug |
|
|
added subscriber Bart Wensley |
2020-05-28 19:16:34 |
Bill Zvonar |
bug |
|
|
added subscriber Allain Legacy |
2020-06-01 14:43:01 |
OpenStack Infra |
starlingx: status |
Triaged |
In Progress |
|
2020-06-11 23:44:47 |
OpenStack Infra |
starlingx: status |
In Progress |
Fix Released |
|
2021-06-04 13:26:02 |
OpenStack Infra |
tags |
stx.4.0 stx.distcloud |
in-f-centos8 stx.4.0 stx.distcloud |
|
2021-06-16 12:26:26 |
OpenStack Infra |
bug watch added |
|
https://github.com/kubernetes-client/python/issues/765 |
|