http://paste.openstack.org/show/487455/

Long story short, what actually happened:

1) Provisioning of 50 target nodes started:
2016-02-17 17:57:25 INFO [1071] Starting OS provisioning for nodes: 102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152

2) It went smoothly: all changes were successfully applied to the cobbler node profiles. Then the upload of the provision data (provision.json) started. Technically, that upload is implemented via the mcollective service.

3) provision.json was uploaded to nodes 102,103,104,105,106,107,108,109,110,111.

4) For some reason the next target node, 112, was offline at that moment, so the upload failed. The last entries in its log files end at 17:30:30:
2016-02-17T17:30:30.578712+00:00 debug: 17:30:30.400765 #2746] DEBUG -- : runnerstats.rb:56:in `block in sent' Incrementing replies stat
2016-02-17T17:30:30.578844+00:00 warning: 17:30:30.405476 #2746] WARN -- : netio.rb:387:in `_init_line_read' PLMC7: Exiting after signal: SignalException: SIGTERM
2016-02-17T17:30:30.578844+00:00 debug: 17:30:30.405615 #2746] DEBUG -- : rabbitmq.rb:350:in `disconnect' Disconnecting from RabbitMQ
2016-02-17T17:30:30.578968+00:00 info: 17:30:30.405943 #2746] INFO -- : rabbitmq.rb:20:in `on_disconnect' Disconnected from stomp://mcollective@10.20.0.2:61613

5) astute did 10 retries, with no luck:
2016-02-17 17:58:38 DEBUG [1071] Retry #1 to run mcollective agent on nodes: '112'
2016-02-17 17:59:41 DEBUG [1071] Retry #2 to run mcollective agent on nodes: '112'
2016-02-17 18:00:43 DEBUG [1071] Retry #3 to run mcollective agent on nodes: '112'
2016-02-17 18:01:46 DEBUG [1071] Retry #4 to run mcollective agent on nodes: '112'
2016-02-17 18:02:49 DEBUG [1071] Retry #5 to run mcollective agent on nodes: '112'
2016-02-17 18:03:51 DEBUG [1071] Retry #6 to run mcollective agent on nodes: '112'
2016-02-17 18:04:54 DEBUG [1071] Retry #7 to run mcollective agent on nodes: '112'
2016-02-17 18:05:56 DEBUG [1071] Retry #8 to run mcollective agent on nodes: '112'
2016-02-17 18:06:59 DEBUG [1071] Retry #9 to run mcollective agent on nodes: '112'
2016-02-17 18:08:02 DEBUG [1071] Retry #10 to run mcollective agent on nodes: '112'
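The retry/give-up behaviour here comes from astute's mcollective client wrapper (check_results_with_retries in mclient.rb, visible in the trace in step 6 below): run the agent, compare the responding node UIDs against the expected set, and retry until the retry budget runs out. A minimal, hypothetical sketch of that pattern, for illustration only (all names below are made up, it is not the actual astute source):

# Illustrative only: "retry until all expected nodes respond, then give up".
RETRIES = 10

def upload_with_retries(expected_uids)
  pending = expected_uids.dup
  RETRIES.times do |i|
    responded = run_uploadfile_agent(pending)  # hypothetical call to the 'uploadfile' mcollective agent
    pending -= responded                       # keep only the nodes that did not answer
    return if pending.empty?
    log_debug "Retry ##{i + 1} to run mcollective agent on nodes: '#{pending.join(',')}'"  # hypothetical logger
  end
  raise "MCollective agents 'uploadfile' '#{pending.join(',')}' didn't respond within the allotted time."
end

With node 112 permanently offline, pending never becomes empty, so after 10 retries the exception is raised; that is the error seen in step 6 below.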
6) astute gave up with a trace:
2016-02-17 18:09:04 ERROR [1071] MCollective agents 'uploadfile' '112' didn't respond within the allotted time.
trace: ["/usr/share/gems/gems/astute-8.0.0/lib/astute/mclient.rb:114:in `check_results_with_retries'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/mclient.rb:60:in `method_missing'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/image_provision.rb:46:in `upload_provision'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/image_provision.rb:22:in `block in provision'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/image_provision.rb:22:in `each'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/image_provision.rb:22:in `provision'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/provision.rb:296:in `image_provision'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/provision.rb:241:in `block in provision_piece'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/provision.rb:288:in `call'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/provision.rb:288:in `report_image_provision'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/provision.rb:240:in `provision_piece'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/provision.rb:336:in `call'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/provision.rb:336:in `sleep_not_greater_than'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/provision.rb:115:in `loop'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/provision.rb:114:in `catch'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/provision.rb:46:in `provision'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/orchestrator.rb:123:in `provision'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/server/dispatcher.rb:51:in `provision'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/server/dispatcher.rb:37:in `image_provision'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/server/server.rb:189:in `dispatch_message'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/server/server.rb:146:in `block in dispatch'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/server/task_queue.rb:64:in `call'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/server/task_queue.rb:64:in `block in each'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/server/task_queue.rb:56:in `each'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/server/task_queue.rb:56:in `each'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/server/server.rb:144:in `each_with_index'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/server/server.rb:144:in `dispatch'", "/usr/share/gems/gems/astute-8.0.0/lib/astute/server/server.rb:123:in `block in perform_main_job'"]
{"status"=>"error", "error"=>
2016-02-17 18:09:04 DEBUG [1071] Data send by DeploymentProxyReporter to report it up: {"status"=>"error", "error"=>
2016-02-17 18:09:04 INFO [1071] Changing node netboot state node-102
2016-02-17 18:09:04 INFO [1071] Casting message to Nailgun: {"method"=>"provision_resp", "args"=> "status"=>"error", "error"=>

7) However, the provisioning task proceeded further, ignoring that error, because of https://github.com/openstack/fuel-astute/blob/stable/8.0/lib/astute/image_provision.rb#L23: upload_provision() failed, thus run_provision() was not executed, and failed_uids was not set correctly either.

8) failed_uids was empty. This, in turn, led to a false positive result of the run_provision() execution, so astute mistakenly assumed that all target nodes had been provisioned without errors.
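To make steps 7 and 8 concrete, here is a minimal, hypothetical reconstruction of the control-flow problem. It is not the actual stable/8.0 source, only an illustration of how an exception from upload_provision() can be swallowed without recording the failed node:

# Hypothetical sketch, not the real image_provision.rb code; names mirror the bug description.
def provision(ctx, nodes)
  failed_uids = []
  begin
    upload_provision(ctx, nodes)              # raises when node 112 never responds
    failed_uids = run_provision(ctx, nodes)   # never reached once the exception is raised
  rescue => e
    log_error "Provision via image failed: #{e.message}"   # hypothetical logging helper
    # Nothing adds node 112's uid to failed_uids here, so the caller
    # receives an empty failure list and treats the step as successful.
  end
  failed_uids
end

In this shape, the rescue turns a hard failure into a silent one: the error is logged and reported (step 6), but failed_uids stays [], which is exactly the false positive described in step 8.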
9) astute changed netboot to false for all target nodes and tried to reboot them into the target OS, because of these lines: https://github.com/openstack/fuel-astute/blob/stable/8.0/lib/astute/provision.rb#L243-L254

10) The target nodes then tried to boot from their local disks. Since all disks had been wiped before provisioning, there was no valid boot sector, so the nodes failed with a 'boot sector signature not found' error.
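Step 9 is a direct consequence of the empty failed_uids from step 8: the post-provision code treats every node that is not listed in failed_uids as successfully imaged, disables its netboot flag and reboots it. A hypothetical sketch of that selection logic (illustrative names, not the actual provision.rb code around L243-L254):

# Illustrative only: with failed_uids == [], every target node counts as "successful".
successful_uids = all_node_uids - failed_uids
successful_uids.each do |uid|
  change_node_netboot(uid, false)   # hypothetical: tell cobbler to stop PXE-booting this node
  reboot_node(uid)                  # hypothetical: node now boots from its wiped local disk
end
# Result: every rebooted node fails with 'boot sector signature not found'.

With a correctly populated failed_uids, node 112 (and any other node whose upload failed) would have been excluded from this loop.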