Allow deployment without --limit to complete when one or more compute hosts are unreachable

Bug #2041859 reported by Mark Goddard
This bug affects 1 person
Affects: kolla-ansible
Status: In Progress
Importance: Medium
Assigned to: Mark Goddard

Bug Description

As deployments scale, it becomes more likely that some hosts will be unreachable, whether they are down for maintenance or have some temporary failure. Ideally, Kolla Ansible should remain usable in this scenario. Controllers are more critical than compute nodes, so this issue addresses unreachable compute nodes only. Currently, if one or more compute hosts are unreachable, Kolla Ansible cannot complete the deployment for the remaining hosts.

When --limit is not used, each host runs the setup module for itself only. A host should be able to fail at this point and drop out of execution without affecting the others. However, a bug causes execution to take the --limit code path even when --limit is not given, and so it becomes subject to that path's limitations. These include the use of delegated fact gathering, and failing the delegated host when the delegating host is unreachable.
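
The intended behaviour without --limit can be illustrated with a small toy model. This is only a sketch and not Kolla Ansible or Ansible code: the `Host` class, the `gather_own_facts` function and the host names are hypothetical, chosen to show that an unreachable host drops out while the others carry on.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical stand-in for an inventory host (not a Kolla Ansible type)."""
    name: str
    reachable: bool = True
    facts: dict = field(default_factory=dict)


def gather_own_facts(hosts):
    """Model the path without --limit: each host gathers facts for itself only.

    Returns (active, dropped). An unreachable host is dropped from the run
    and does not prevent the remaining hosts from continuing.
    """
    active, dropped = [], []
    for host in hosts:
        if host.reachable:
            host.facts = {"hostname": host.name}  # placeholder for real facts
            active.append(host)
        else:
            dropped.append(host)
    return active, dropped


inventory = [
    Host("controller1"),
    Host("compute1"),
    Host("compute2", reachable=False),  # e.g. down for maintenance
]
active, dropped = gather_own_facts(inventory)
print([h.name for h in active])   # ['controller1', 'compute1']
print([h.name for h in dropped])  # ['compute2']
```

In this model the deployment would proceed on `controller1` and `compute1` only, which is the outcome this bug asks for when a compute host is unreachable.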

Mark Goddard (mgoddard)
Changed in kolla-ansible:
importance: Undecided → Medium
Changed in kolla-ansible:
status: New → Confirmed
Changed in kolla-ansible:
status: Confirmed → In Progress
Maksim Malchuk (mmalchuk) wrote :
Changed in kolla-ansible:
assignee: nobody → Mark Goddard (mgoddard)