ansible: service endpoint reconfiguration timeout
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | High | Al Bailey |
Bug Description
Brief Description
-----------------
On virtual AIO-SX and 2+2 installs, I regularly see the following failure:
TASK [persist-config : Wait for service endpoints reconfiguration to complete] *******
fatal: [localhost]: FAILED! => {"changed": false, "elapsed": 180, "msg": "Timeout waiting for service endpoints reconfiguration to complete"}
I suspect that this is related to the relative load/processing power of the host system.
Severity
--------
Major: System/Feature is usable but only after applying the workaround described below
Steps to Reproduce
------------------
ansible-playbook /usr/share/
Expected Behavior
------------------
PLAY RECAP *******
localhost : ok=190 changed=121 unreachable=0 failed=0
Actual Behavior
----------------
PLAY RECAP *******
localhost : ok=99 changed=34 unreachable=0 failed=1
Reproducibility
---------------
I'm using two hosts for virtual installs. The failure occurs 100% of the time on one host and intermittently on the other.
System Configuration
-------
AIO-SX and 2+2 installs
Branch/Pull Time/Commit
-------
Private build of StarlingX master on 5/29
Last Pass
---------
Worked consistently on builds prior to 5/25
Timestamp/Logs
--------------
N/A
Test Activity
-------------
Developer Testing
Workaround
----------
# Bump the timeout after install and before running ansible
controller-0:~$ grep timeout /usr/share/
timeout: 180
controller-0:~$ sudo sed -i 's/180/360/g' /usr/share/
controller-0:~$ grep timeout /usr/share/
timeout: 360
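Note that `s/180/360/g` rewrites every occurrence of `180` in the file, not just the timeout. A minimal sketch of a narrower substitution follows, assuming GNU sed. Because the real file path is elided in this report, the sketch runs against a temporary stand-in file whose contents (including the `delay: 1800` line and the `/tmp/example-180.flag` path) are hypothetical and exist only to show the difference.

```shell
# Hypothetical stand-in for the task file; the real path is elided in this report.
task_file="$(mktemp)"
cat > "$task_file" <<'EOF'
- name: Wait for service endpoints reconfiguration to complete
  wait_for:
    path: /tmp/example-180.flag
    state: present
    timeout: 180
- name: Hypothetical unrelated task that also contains the digits 180
  pause:
    delay: 1800
EOF

# Only rewrite lines that consist of "timeout: 180" (with optional indentation),
# leaving any other 180 in the file untouched.
sed -i -E 's/^([[:space:]]*timeout:[[:space:]]*)180[[:space:]]*$/\1360/' "$task_file"

# Show the result of the substitution.
grep -n -e 'timeout' -e '180' "$task_file"
```

The same expression could be pointed at the real task file in place of `$task_file`. The value it changes is the `timeout` of the task shown next.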
This corresponds to:
# If this is initial play or replay with management and/or oam network config change, must
# wait for the keystone endpoint runtime manifest to complete and restart
# sysinv agent and api.
- name: Wait for service endpoints reconfiguration to complete
  wait_for:
    path: /etc/platform/
    state: present
    timeout: 180
    msg: Timeout waiting for service endpoints reconfiguration to complete
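To pick a timeout based on measurement rather than guesswork, it may help to time how long the flag file takes to appear on a slow host. The sketch below is a rough shell equivalent of the `wait_for` task above, not part of the playbook. The function name `wait_for_file` is hypothetical, and the demo uses a temporary file because the real flag path is elided in this report.

```shell
# wait_for_file PATH TIMEOUT_SECONDS
# Polls about once per second until PATH exists and prints the elapsed seconds.
# Returns 0 if the file appeared, 1 if the timeout was reached first.
wait_for_file() {
    path="$1"
    timeout="$2"
    elapsed=0
    while [ ! -e "$path" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timeout after ${elapsed}s waiting for $path"
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "found $path after ${elapsed}s"
    return 0
}

# Demo: a hypothetical flag file that is created after roughly two seconds.
flag="$(mktemp -u)"
( sleep 2; touch "$flag" ) &
wait_for_file "$flag" 10
```

Running this against the real flag file with a generous limit while the playbook is executing would show whether 180 seconds is simply too short for the host, as suspected in the description.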
Marking as release gating; initial system config can fail. Related to ansible deployment.