Activity log for bug #1629031

Date Who What changed Old value New value Message
2016-09-29 17:24:43 Alexander Gordeev bug added bug
2016-09-29 17:28:09 Alexander Gordeev attachment added example of astute.log https://bugs.launchpad.net/fuel/+bug/1629031/+attachment/4751173/+files/delayed_provisioning.log
2016-09-29 17:29:03 Alexander Gordeev tags module-astute scale
2016-09-29 17:42:50 Alexander Gordeev description
Old value:
  Detailed bug description:
    IBP provisioning requires specific file with provisioning data to be uploaded on every node before the actual provisioning script would be executed.
    This upload data task executes synchronously for every node in a row; one by one.
    Unless the task accomplished for one node, it won't start to upload the data for next node.
    If one node for some reasons becomes irresponsible, than it would take 11.5minutes to recognize the failure.
  Steps to reproduce:
    0. Emulate mcollective outage for some slave nodes: stop mcollective service on some of nodes.
    1. Start provisioning of the nodes
  Expected results:
    Provisioning task provisions all nodes on which mcollective is still operatable. No significant delay due to some nodes being unavailable for mcollective.
  Actual result:
    Provisioning task provisions all nodes on which mcollective is still operatable. Every unavailable node add additional 11.5mins of delay to provisioning. Eg.: for 8 unavailable nodes it would be 1.5H
  Reproducibility:
    Always
  Workaround:
    ???
  Impact:
    UX, abnormally increased provisioning time on large scale.
  Description of the environment:
    Fuel 9.0
  Additional information:
    Perhaps, somebody should add more shorter timeout for non-responding mcollective agents for UploadTask. Since UploadTask for provisioning usually takes few secs, than the timeout should be adjusted to a dozen of seconds. Not a dozen of minutes.
    https://github.com/openstack/fuel-astute/blob/stable/mitaka/lib/astute/image_provision.rb#L48-L58
New value:
  Detailed bug description:
    IBP provisioning requires specific file with provisioning data to be uploaded on every node before the actual provisioning script would be executed.
    This upload data task is executed synchronously for every node in a row; one by one.
    Unless the task accomplished for one node, it won't start to upload the data for next node.
    If one node for some reasons becomes irresponsible, then it would take 11.5minutes to recognize the failure.
  Steps to reproduce:
    0. Emulate mcollective outage for some slave nodes: stop mcollective service on some of nodes.
    1. Start provisioning of the nodes
  Expected results:
    Provisioning task provisions all nodes on which mcollective is still operatable. No significant delay due to some nodes are being unavailable for mcollective.
  Actual result:
    Provisioning task provisions all nodes on which mcollective is still operatable. Every unavailable node add additional 11.5mins of delay to provisioning. Eg.: for 8 unavailable nodes it would be 1.5H
  Reproducibility:
    Always
  Workaround:
    ???
  Impact:
    UX, abnormally increased provisioning time on large scale.
  Description of the environment:
    Fuel 9.0
  Additional information:
    Perhaps, somebody should add more shorter timeout for non-responding mcollective agents for UploadTask. Since UploadTask for provisioning usually takes few secs, therefore the timeout should be adjusted to a dozen of seconds. Not a dozen of minutes.
    https://github.com/openstack/fuel-astute/blob/stable/mitaka/lib/astute/image_provision.rb#L48-L58
2016-09-29 17:43:56 Vladimir Sharshov fuel: milestone 9.2
2016-09-29 17:43:58 Vladimir Sharshov fuel: assignee Vladimir Sharshov (vsharshov)
2016-09-29 17:44:00 Vladimir Sharshov fuel: importance Undecided High
2016-09-29 17:44:03 Vladimir Sharshov fuel: status New Confirmed
2016-09-29 17:54:38 Alexander Gordeev tags module-astute scale area-python module-astute scale
2016-09-30 13:39:51 Alexander Gordeev description
Old value:
  Detailed bug description:
    IBP provisioning requires specific file with provisioning data to be uploaded on every node before the actual provisioning script would be executed.
    This upload data task is executed synchronously for every node in a row; one by one.
    Unless the task accomplished for one node, it won't start to upload the data for next node.
    If one node for some reasons becomes irresponsible, then it would take 11.5minutes to recognize the failure.
  Steps to reproduce:
    0. Emulate mcollective outage for some slave nodes: stop mcollective service on some of nodes.
    1. Start provisioning of the nodes
  Expected results:
    Provisioning task provisions all nodes on which mcollective is still operatable. No significant delay due to some nodes are being unavailable for mcollective.
  Actual result:
    Provisioning task provisions all nodes on which mcollective is still operatable. Every unavailable node add additional 11.5mins of delay to provisioning. Eg.: for 8 unavailable nodes it would be 1.5H
  Reproducibility:
    Always
  Workaround:
    ???
  Impact:
    UX, abnormally increased provisioning time on large scale.
  Description of the environment:
    Fuel 9.0
  Additional information:
    Perhaps, somebody should add more shorter timeout for non-responding mcollective agents for UploadTask. Since UploadTask for provisioning usually takes few secs, therefore the timeout should be adjusted to a dozen of seconds. Not a dozen of minutes.
    https://github.com/openstack/fuel-astute/blob/stable/mitaka/lib/astute/image_provision.rb#L48-L58
New value:
  Detailed bug description:
    IBP provisioning requires specific file with provisioning data to be uploaded on every node before the actual provisioning script would be executed.
    This upload data task is executed synchronously for every node in a row; one by one.
    Unless the task accomplished for one node, it won't start to upload the data for next node.
    If one node for some reasons becomes irresponsible, then it would take 11.5minutes to recognize the failure.
  Steps to reproduce:
    0. Emulate mcollective outage for some slave nodes: stop mcollective service on some of nodes.
    1. Start provisioning of the nodes
  Expected results:
    Provisioning task provisions all nodes on which mcollective is still operatable. No significant delay due to some nodes are being unavailable for mcollective.
  Actual result:
    Provisioning task provisions all nodes on which mcollective is still operatable. Every unavailable node add additional 11.5mins of delay to provisioning. Eg.: for 8 unavailable nodes it would be 1.5H
  Reproducibility:
    Always
  Workaround:
    ???
  Impact:
    UX, abnormally increased OS provisioning time on large scale could lead to provisioning task failure by timeout.
  Description of the environment:
    Fuel 9.0
  Additional information:
    Perhaps, somebody should add more shorter timeout for non-responding mcollective agents for UploadTask. Since UploadTask for provisioning usually takes few secs, therefore the timeout should be adjusted to a dozen of seconds. Not a dozen of minutes.
    https://github.com/openstack/fuel-astute/blob/stable/mitaka/lib/astute/image_provision.rb#L48-L58
2016-12-27 15:11:18 OpenStack Infra fuel: status Confirmed In Progress
2016-12-28 14:49:06 Vladimir Sharshov nominated for series fuel/newton
2016-12-28 14:49:06 Vladimir Sharshov bug task added fuel/newton
2016-12-28 14:49:14 Vladimir Sharshov fuel/newton: status New In Progress
2016-12-28 14:49:17 Vladimir Sharshov fuel/newton: importance Undecided High
2016-12-28 14:49:19 Vladimir Sharshov fuel/newton: assignee Vladimir Sharshov (vsharshov)
2016-12-28 14:49:32 Vladimir Sharshov fuel/newton: milestone 10.x-updates
2016-12-28 14:50:24 OpenStack Infra fuel: status In Progress Fix Committed
2016-12-28 16:46:47 OpenStack Infra fuel/newton: status In Progress Fix Committed
2016-12-28 20:19:02 OpenStack Infra tags area-python module-astute scale area-python in-stable-mitaka module-astute scale
2017-02-03 10:06:33 Andrew Kalach fuel: status Fix Committed Fix Released
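
Note on the delay figures in the description: the "1.5H for 8 unavailable nodes" estimate follows from the uploads running one node at a time, so each unresponsive node adds its full timeout to the total. The sketch below only reproduces that arithmetic. It is not Astute code; the function name `serial_upload_delay` is hypothetical, and the 12-second value is just one reading of the "a dozen of seconds" suggestion in the report.

```python
def serial_upload_delay(unavailable_nodes, timeout_minutes=11.5):
    """Worst-case extra delay (in minutes) when uploads run strictly
    one node after another and every unavailable node has to time out."""
    return unavailable_nodes * timeout_minutes


# Figures from the report: 8 unavailable nodes, 11.5 minutes each.
reported = serial_upload_delay(8)
print(f"{reported} min = {reported / 60:.2f} h")  # 92.0 min = 1.53 h

# Assumed alternative: a 12-second timeout ("a dozen of seconds").
shortened = serial_upload_delay(8, timeout_minutes=12 / 60)
print(f"{shortened:.1f} min")  # 1.6 min
```

With the same serial execution, shortening the per-node timeout alone would reduce the worst case for 8 unavailable nodes from about an hour and a half to under two minutes.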