possible timing issues in windows contrail ansible deployer
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenContrail | New | Medium | Michał Kostrzewa |
Bug Description
The first time ansible-playbook configure_
Please search for "Reboot the system" in the logs below to see the differences.
[root@a5s38node4 ansible]# ansible-playbook configure_
PLAY [windows_host] *******
TASK [Gathering Facts] *******
ok: [10.84.14.226]
ok: [10.84.14.234]
TASK [configure_
ok: [10.84.14.226]
changed: [10.84.14.234]
TASK [configure_
ok: [10.84.14.234]
ok: [10.84.14.226]
TASK [configure_
ok: [10.84.14.226]
ok: [10.84.14.234]
TASK [configure_
changed: [10.84.14.234]
changed: [10.84.14.226]
TASK [configure_
fatal: [10.84.14.234]: FAILED! => {"changed": false, "elapsed": 621, "msg": "timed out waiting for reboot uptime check success: ('Connection aborted.', error(111, 'Connection refused'))", "rebooted": true}
changed: [10.84.14.226]
TASK [configure_
ok: [10.84.14.226]
TASK [configure_
changed: [10.84.14.226]
TASK [configure_
changed: [10.84.14.226]
TASK [configure_
changed: [10.84.14.226]
TASK [configure_
changed: [10.84.14.226]
TASK [configure_
changed: [10.84.14.226]
TASK [configure_
changed: [10.84.14.226]
TASK [configure_
changed: [10.84.14.226]
to retry, use: --limit @/root/
PLAY RECAP *******
10.84.14.226 : ok=14 changed=9 unreachable=0 failed=0
10.84.14.234 : ok=5 changed=2 unreachable=0 failed=1
[root@a5s38node4 ansible]#
[root@a5s38node4 ansible]# ansible-playbook configure_
PLAY [windows_host] *******
TASK [Gathering Facts] *******
ok: [10.84.14.226]
ok: [10.84.14.234]
TASK [configure_
ok: [10.84.14.226]
changed: [10.84.14.234]
TASK [configure_
ok: [10.84.14.226]
changed: [10.84.14.234]
TASK [configure_
ok: [10.84.14.226]
changed: [10.84.14.234]
TASK [configure_
changed: [10.84.14.226]
changed: [10.84.14.234]
TASK [configure_
changed: [10.84.14.226]
changed: [10.84.14.234]
TASK [configure_
ok: [10.84.14.226]
ok: [10.84.14.234]
TASK [configure_
changed: [10.84.14.234]
changed: [10.84.14.226]
TASK [configure_
ok: [10.84.14.226]
changed: [10.84.14.234]
TASK [configure_
ok: [10.84.14.226]
changed: [10.84.14.234]
TASK [configure_
ok: [10.84.14.226]
changed: [10.84.14.234]
TASK [configure_
changed: [10.84.14.226]
changed: [10.84.14.234]
TASK [configure_
ok: [10.84.14.226]
changed: [10.84.14.234]
TASK [configure_
ok: [10.84.14.226]
changed: [10.84.14.234]
PLAY RECAP *******
10.84.14.226 : ok=14 changed=4 unreachable=0 failed=0
10.84.14.234 : ok=14 changed=12 unreachable=0 failed=0
[root@a5s38node4 ansible]# ansible-playbook install_
PLAY [windows_host] *******
TASK [Gathering Facts] *******
ok: [10.84.14.234]
ok: [10.84.14.226]
TASK [install_contrail : Check if OpenStack Keystone configuration is present] ***
skipping: [10.84.14.226]
skipping: [10.84.14.234]
TASK [install_contrail : Create artifacts directory] *******
changed: [10.84.14.226]
changed: [10.84.14.234]
TASK [install_contrail : Run contrail-
changed: [10.84.14.234]
changed: [10.84.14.226]
TASK [install_contrail : Copy dlls to testbed] *******
changed: [10.84.14.226]
changed: [10.84.14.234]
TASK [install_contrail : Import vRouter certificate] *******
changed: [10.84.14.234]
changed: [10.84.14.226]
TASK [install_contrail : Install vRouter Extension] *******
changed: [10.84.14.234]
changed: [10.84.14.226]
TASK [install_contrail : Install vRouter Agent] *******
changed: [10.84.14.234]
changed: [10.84.14.226]
TASK [install_contrail : Install Docker Driver] *******
changed: [10.84.14.234]
changed: [10.84.14.226]
TASK [install_contrail : Get auth token from Keystone] *******
ok: [10.84.14.234 -> localhost]
ok: [10.84.14.226 -> localhost]
TASK [install_contrail : set_fact] *******
ok: [10.84.14.226]
ok: [10.84.14.234]
TASK [install_contrail : Create virtual router in Contrail] *******
ok: [10.84.14.226 -> localhost]
ok: [10.84.14.234 -> localhost]
TASK [install_contrail : Install contrail docker driver] *******
changed: [10.84.14.234]
changed: [10.84.14.226]
TASK [install_contrail : Start contrail docker driver] *******
fatal: [10.84.14.234]: FAILED! => {"can_pause_
fatal: [10.84.14.226]: FAILED! => {"can_pause_
to retry, use: --limit @/root/
PLAY RECAP *******
10.84.14.226 : ok=12 changed=8 unreachable=0 failed=1
10.84.14.234 : ok=12 changed=
Changed in opencontrail:
importance: Undecided → Medium
assignee: nobody → Michał Kostrzewa (mkostrzewa)
The timeout for reboot is 10 minutes.
It seems to be a reasonable compromise between two needs:
1) Detecting that a compute node could not be restarted.
2) Allowing a compute node to handle a reasonable number of updates.
There is no guarantee at all about reboot time, so whatever we pick is a compromise.
Is there a specific timeout that you would suggest we set?
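One way to avoid committing to a single value would be to make the timeout overridable per deployment. The task below is only a sketch: the actual reboot task in the playbook is not shown in this report, so the task name (taken from the "Reboot the system" search string above) and the variable `windows_reboot_timeout` are assumptions. `reboot_timeout` is the `win_reboot` module option that limits how long Ansible waits for the host to come back, in seconds.

```yaml
# Hypothetical task; the real task in the role may be named or structured differently.
- name: Reboot the system
  win_reboot:
    # Seconds to wait for the host to respond again after the reboot.
    # Falls back to 600 (the 10 minutes discussed above) when the
    # assumed variable windows_reboot_timeout is not set.
    reboot_timeout: "{{ windows_reboot_timeout | default(600) }}"
```

With a variable like this, a slow testbed could pass a larger value on the command line, for example `-e windows_reboot_timeout=1200`, while the default stays at 10 minutes for everyone else.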