When re-deploying multiple times with a bonded control plane, tripleo-kernel can disable nic1
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | New | Undecided | Unassigned |
Bug Description
Description of problem:
As we can see here, tripleo-kernel disables the eno1 interface [1], which is undesired behavior.
This happens because in a previous deployment eno1 was part of a bond with eno2, so it had no IP address set on it. We can see this by printing the facts right before we disable it [2].
Adding this task to the block should prevent this issue from happening:
~~~
- name: Apply workaround for node reboot
  block:
    - name: Update facts before attempting to disable interfaces
      setup:
~~~
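To illustrate why refreshing the facts should help, here is a minimal Python sketch of the selection logic. It is not the tripleo-kernel implementation: the function name `interfaces_without_ipv4`, the simplified facts dictionaries, and the `192.0.2.10` address are all hypothetical, and it assumes the task picks every interface whose facts carry no `ipv4` entry. With facts left over from the bonded deployment, eno1 looks like an interface without an IP and gets selected; with freshly gathered facts it does not.

```python
def interfaces_without_ipv4(facts):
    """Return the names of interfaces that have no IPv4 entry in `facts`.

    `facts` is a simplified, hypothetical stand-in for Ansible network facts:
    an "interfaces" list plus one dictionary per interface name.
    """
    selected = []
    for name in facts.get("interfaces", []):
        if name == "lo":
            # The loopback interface is never a candidate.
            continue
        if "ipv4" not in facts.get(name, {}):
            selected.append(name)
    return sorted(selected)


# Assumed facts left over from the previous deployment: eno1 and eno2 are
# bond members, so the IP address lives on bond0 and not on eno1.
stale_facts = {
    "interfaces": ["lo", "eno1", "eno2", "bond0"],
    "eno1": {"active": True},
    "eno2": {"active": True},
    "bond0": {"ipv4": {"address": "192.0.2.10"}},
}

# Assumed facts gathered on the freshly provisioned node, before
# os-net-config has run: eno1 carries the IP address itself.
fresh_facts = {
    "interfaces": ["lo", "eno1", "eno2"],
    "eno1": {"ipv4": {"address": "192.0.2.10"}},
    "eno2": {"active": True},
}

print(interfaces_without_ipv4(stale_facts))
print(interfaces_without_ipv4(fresh_facts))
```

With the stale facts, eno1 is in the list of interfaces that would have BOOTPROTO set to none; with the fresh facts, only eno2 is.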
Version-Release number of selected component (if applicable):
How reproducible:
50% of the time
Steps to Reproduce:
1. Deploy successfully with a bonded interface using nic1.
2. Delete the overcloud and redeploy using the same templates.
Actual results:
tripleo-kernel will disable nic1
Expected results:
tripleo-kernel shouldn't disable nic1 at this stage because os-net-config hasn't run yet.
Additional info:
[1]
~~~
2020-11-27 21:51:05,980 p=142287 u=mistral n=ansible | TASK [tripleo-kernel : Replace BOOTPROTO to none for interfaces which does not have IP] ***
2020-11-27 21:51:05,980 p=142287 u=mistral n=ansible | Friday 27 November 2020 21:51:05 -0500 (0:00:01.768) 0:02:25.407 *******
[...]
2020-11-27 21:51:06,898 p=142287 u=mistral n=ansible | changed: [ess1612-
546.37639, 'gr_name': 'root', 'pw_name': 'root', 'wusr': True, 'rusr': True, 'xusr': False, 'wgrp': False, 'rgrp': True, 'xgrp': False, 'woth': False, 'roth': True, 'xoth': False, 'isuid': False, 'isgid': False}) => {"ansible_
false, "isfifo": false, "isgid": false, "islnk": false, "isreg": true, "issock": false, "isuid": false, "mode": "0644", "mtime": 1606531546.37439, "nlink": 1, "path": "/etc/sysconfig
[...]
2020-11-27 21:51:10,102 p=142287 u=mistral n=ansible | TASK [tripleo-kernel : Reboot debug message] *******
2020-11-27 21:51:10,103 p=142287 u=mistral n=ansible | Friday 27 November 2020 21:51:10 -0500 (0:00:04.122) 0:02:29.530 *******
2020-11-27 21:51:10,163 p=142287 u=mistral n=ansible | ok: [ess1612-
"msg": "Going to reboot the node after applying kernel args..."
}
2020-11-27 21:51:10,374 p=142287 u=mistral n=ansible | TASK [tripleo-kernel : Reboot after kernel args update] *******
2020-11-27 21:51:10,374 p=142287 u=mistral n=ansible | Friday 27 November 2020 21:51:10 -0500 (0:00:00.271) 0:02:29.801 *******
2020-11-27 22:09:36,896 p=142287 u=mistral n=ansible | fatal: [ess1612-
2020-11-27 22:09:36,897 p=142287 u=mistral n=ansible | NO MORE HOSTS LEFT *******
2020-11-27 22:09:36,898 p=142287 u=mistral n=ansible | PLAY RECAP *******
2020-11-27 22:09:36,899 p=142287 u=mistral n=ansible | ess1612-
2020-11-27 22:09:36,899 p=142287 u=mistral n=ansible | ess1612-ctrl-0 : ok=44 changed=24 unreachable=0 failed=0 skipped=22 rescued=0 ignored=0
2020-11-27 22:09:36,899 p=142287 u=mistral n=ansible | ess1612-ctrl-1 : ok=43 changed=24 unreachable=0 failed=0 skipped=22 rescued=0 ignored=0
2020-11-27 22:09:36,899 p=142287 u=mistral n=ansible | ess1612-ctrl-2 : ok=43 changed=24 unreachable=0 failed=0 skipped=22 rescued=0 ignored=0
2020-11-27 22:09:36,899 p=142287 u=mistral n=ansible | undercloud : ok=8 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
2020-11-27 22:09:36,899 p=142287 u=mistral n=ansible | Friday 27 November 2020 22:09:36 -0500 (0:18:26.525) 0:20:56.326 *******
2020-11-27 22:09:36,899 p=142287 u=mistral n=ansible | =======
~~~
[2]
~~~
"fqdn": "ess1612-
},
},
},
},
~~~
This issue was fixed in the openstack/tripleo-ansible 3.0.0 release.