Fuel for OpenStack

Comment 0 for bug 1629031

Alexander Gordeev (a-gordeev) wrote on 2016-09-29:

Detailed bug description:
  IBP provisioning requires a file with provisioning data to be uploaded to every node before the actual provisioning script is executed.
  This upload task runs synchronously, one node after another.
  The upload to the next node does not start until the task has completed for the current node.
  If a node becomes unresponsive for some reason, it takes 11.5 minutes to detect the failure.
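
  The effect of this sequential behaviour can be illustrated with a small simulation. This is only a sketch: `simulate_sequential_upload`, the node hashes and the `response_sec` field are hypothetical names made up for the illustration and are not part of Astute. The 690-second timeout is simply the reported 11.5 minutes expressed in seconds.

```ruby
# Hypothetical simulation; none of these names come from Astute.
TIMEOUT_SEC = 690 # 11.5 minutes, the delay reported above

# Each node either answers after :response_sec seconds or never (nil).
# Returns the second at which each node's upload starts, plus the total time.
def simulate_sequential_upload(nodes, timeout_sec = TIMEOUT_SEC)
  clock = 0
  starts = {}
  nodes.each do |node|
    starts[node[:name]] = clock
    # An unresponsive node blocks the loop for the full timeout.
    clock += [node[:response_sec] || timeout_sec, timeout_sec].min
  end
  [starts, clock]
end

nodes = [
  { name: 'node-1', response_sec: 2 },
  { name: 'node-2', response_sec: nil }, # mcollective stopped
  { name: 'node-3', response_sec: 2 }
]

starts, total = simulate_sequential_upload(nodes)
starts.each { |name, second| puts "#{name} starts at #{second} s" }
puts "total: #{total} s"
```

  With these assumed numbers, node-3 does not start until second 692, although its own upload needs only 2 seconds.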

Steps to reproduce:
0. Emulate an mcollective outage on some slave nodes: stop the mcollective service on some of the nodes.
1. Start provisioning of the nodes

Expected results:
The provisioning task provisions all nodes on which mcollective is still operational, with no significant delay caused by nodes that are unavailable to mcollective.
Actual result:
The provisioning task provisions all nodes on which mcollective is still operational, but every unavailable node adds another 11.5 minutes of delay. E.g., 8 unavailable nodes add 8 × 11.5 = 92 minutes, about 1.5 hours.
Reproducibility:
Always
Workaround:
???
Impact:
UX: abnormally increased provisioning time at large scale.
Description of the environment:
Fuel 9.0
Additional information:
Perhaps a shorter timeout for non-responding mcollective agents should be set for UploadTask. Since UploadTask for provisioning usually takes a few seconds, the timeout should be about a dozen seconds, not a dozen minutes.

https://github.com/openstack/fuel-astute/blob/stable/mitaka/lib/astute/image_provision.rb#L48-L58
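
A shorter timeout of the kind suggested above might look roughly like the following sketch. It uses Ruby's standard `Timeout` module rather than the real mcollective client, and `upload_with_timeout`, `UPLOAD_TIMEOUT_SEC` and the block passed in are hypothetical stand-ins, not Astute's actual interface. The value 12 only reflects the "dozen seconds" mentioned above.

```ruby
require 'timeout'

# Hypothetical value: about a dozen seconds rather than a dozen minutes.
UPLOAD_TIMEOUT_SEC = 12

# Runs the given upload block, giving up after timeout_sec seconds.
# Returns :ok on success and :timeout if the block did not finish in time.
def upload_with_timeout(timeout_sec = UPLOAD_TIMEOUT_SEC)
  Timeout.timeout(timeout_sec) { yield }
  :ok
rescue Timeout::Error
  :timeout
end

# Usage: a responsive node returns at once, a stuck one is abandoned quickly.
# A tiny timeout is used here so that the example finishes fast.
puts upload_with_timeout(0.1) { :uploaded } # prints "ok"
puts upload_with_timeout(0.1) { sleep 5 }   # prints "timeout"
```

`Timeout` interrupts the block asynchronously, which is acceptable for an illustration; a real fix would more likely pass a shorter timeout to the mcollective call itself.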