Allow deployment without --limit to complete when one or more compute hosts are unreachable
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
kolla-ansible | In Progress | Medium | Mark Goddard | |
Bug Description
As deployments scale, it becomes more likely that some hosts will be unreachable: they may be down for maintenance or have suffered a temporary failure. Ideally, Kolla Ansible should remain usable in this scenario. Controllers are more critical than compute nodes, so this issue addresses unreachable compute nodes only. Currently, if one or more compute hosts are unreachable, Kolla Ansible cannot complete the deployment for the remaining hosts.
When --limit is not used, each host runs the setup module for itself only. A host should be able to fail at this point and drop out of execution without affecting the others. However, a bug causes us to hit the --limit code path even in this case, and so to become subject to its limitations. These include the use of delegated fact gathering, and failing the delegated host when the delegating host is unreachable.
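The intended behaviour without --limit can be illustrated with a toy model. This is only a sketch of the expectation described above, not Kolla Ansible or Ansible code: the function name, the host names, and the `reachable` set are all hypothetical, and the --limit (delegated) path is deliberately not modelled.

```python
def gather_own_facts(hosts, reachable):
    """Toy model: every host gathers facts for itself only.

    `hosts` and `reachable` are hypothetical inputs for illustration.
    An unreachable host drops out of the run; the others carry on.
    """
    active, dropped = [], []
    for host in hosts:
        if host in reachable:
            active.append(host)   # facts gathered, host stays in the play
        else:
            dropped.append(host)  # host drops out without affecting others
    return active, dropped


hosts = ["controller1", "compute1", "compute2"]
active, dropped = gather_own_facts(hosts, reachable={"controller1", "compute1"})
print("continuing with:", active)
print("dropped:", dropped)
```

In this model the deployment would proceed on the hosts in `active`, which is the outcome this issue asks for when a compute host is down.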
Changed in kolla-ansible:
importance: Undecided → Medium
Changed in kolla-ansible:
status: New → Confirmed
Changed in kolla-ansible:
status: Confirmed → In Progress
Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/899592