Steps to reproduce:
To deployed env with total number of nodes 120 add 80 new nodes.
Try to provision added nodes.
Expected result - provision should be successful
Actual result provision fails with error:
2016-08-01 06:27:28 WARNING [1444] Cobbler problem. Try to repeat: 1 attempt
2016-08-01 06:28:08 WARNING [1444] Cobbler problem. Try to repeat: 2 attempt
2016-08-01 06:28:48 ERROR [1444] Error occured while provisioning:
#<Net::ReadTimeout: Net::ReadTimeout>
2016-08-01 06:28:48 INFO [1444] Casting message to Nailgun:
{"method"=>"provision_resp",
"args"=>
{"task_uuid"=>"3ec17e5f-c596-4c08-8677-22efa082cebc",
"status"=>"error",
"error"=>"Net::ReadTimeout",
"progress"=>100}}
2016-08-01 06:28:49 INFO [1444] 3ec17e5f-c596-4c08-8677-22efa082cebc: stdout: stderr: Warning: Permanently added '172.29.193.78' (ECDSA) to the list of known hosts.
exit code: 0
2016-08-01 06:28:49 INFO [1444] Provision summary: time was spent 00:20:04
2016-08-01 06:28:49 ERROR [1444] Error running provisioning: Net::ReadTimeout, trace:
["/usr/share/ruby/net/protocol.rb:158:in `rescue in rbuf_fill'",
"/usr/share/ruby/net/protocol.rb:152:in `rbuf_fill'",
"/usr/share/ruby/net/protocol.rb:134:in `readuntil'",
"/usr/share/ruby/net/protocol.rb:144:in `readline'",
"/usr/share/ruby/net/http/response.rb:39:in `read_status_line'",
"/usr/share/ruby/net/http/response.rb:28:in `read_new'",
"/usr/share/ruby/net/http.rb:1406:in `block in transport_request'",
"/usr/share/ruby/net/http.rb:1403:in `catch'",
"/usr/share/ruby/net/http.rb:1403:in `transport_request'",
"/usr/share/ruby/net/http.rb:1376:in `request'",
"/usr/share/ruby/net/http.rb:1322:in `request_post'",
"/usr/share/ruby/xmlrpc/client.rb:475:in `do_rpc'",
"/usr/share/ruby/xmlrpc/client.rb:279:in `call2'",
"/usr/share/ruby/xmlrpc/client.rb:260:in `call'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/cobbler.rb:115:in `sync'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/cobbler_manager.rb:173:in `sync'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/cobbler_manager.rb:124:in `edit_nodes'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:235:in `provision_piece'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:122:in `block (3 levels) in provision_and_watch_progress'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:336:in `call'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:336:in `sleep_not_greater_than'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:116:in `block (2 levels) in provision_and_watch_progress'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:115:in `loop'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:115:in `block in provision_and_watch_progress'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:114:in `catch'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:114:in `provision_and_watch_progress'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:46:in `provision'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/orchestrator.rb:127:in `provision'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/dispatcher.rb:52:in `provision'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/dispatcher.rb:37:in `image_provision'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/server.rb:187:in `dispatch_message'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/server.rb:146:in `block in dispatch'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/task_queue.rb:64:in `call'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/task_queue.rb:64:in `block in each'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/task_queue.rb:56:in `each'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/task_queue.rb:56:in `each'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/server.rb:144:in `each_with_index'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/server.rb:144:in `dispatch'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/server.rb:121:in `block in perform_main_job'"]
The problem is that "cobbler sync" takes more than 30 second. And default xml::rpm timeout is set to 30.
If we modify cobbler.rb with:
@remote.timeout = 60
Than "cobbler sync" finished, but another problem acquires:
2016-08-01 21:24:16 DEBUG [431] Data received by ProvisiningProxyReporter to report it up:
{"status"=>"error", "error"=>"end of file reached", "progress"=>100}
2016-08-01 21:24:16 DEBUG [431] Data send by DeploymentProxyReporter to report it up:
{"status"=>"error", "error"=>"end of file reached", "progress"=>100}
2016-08-01 21:24:16 INFO [431] Casting message to Nailgun:
{"method"=>"provision_resp",
"args"=>
{"task_uuid"=>"0b0e2702-7eb4-4d7e-b785-70933c72aa3e",
"status"=>"error",
"error"=>"end of file reached",
"progress"=>100}}
2016-08-01 21:24:16 DEBUG [431] 0b0e2702-7eb4-4d7e-b785-70933c72aa3e: MC agent 'execute_shell_command', method 'execute', results:
{:sender=>"33",
:statuscode=>0,
:statusmsg=>"OK",
:data=>{:stdout=>"", :stderr=>"", :exit_code=>0}}
2016-08-01 21:24:16 DEBUG [431] Unlock discovery for failed nodes. Result: [{"uid"=>"33", "exit code"=>0}]
2016-08-01 21:24:16 DEBUG [431] 0b0e2702-7eb4-4d7e-b785-70933c72aa3e: MC agent 'execute_shell_command', method 'execute', results:
{:sender=>"master",
:statuscode=>0,
:statusmsg=>"OK",
:data=>
{:stdout=>"",
:stderr=>
"Warning: Permanently added '172.29.193.78' (ECDSA) to the list of known hosts.\r\n",
:exit_code=>0}}
2016-08-01 21:24:16 INFO [431] 0b0e2702-7eb4-4d7e-b785-70933c72aa3e: stdout: stderr: Warning: Permanently added '172.29.193.78' (ECDSA) to the list of known hosts.
exit code: 0
2016-08-01 21:24:16 INFO [431] Provision summary: time was spent 00:01:41
2016-08-01 21:24:16 ERROR [431] Error running provisioning: end of file reached, trace:
["/usr/share/ruby/net/protocol.rb:153:in `read_nonblock'",
"/usr/share/ruby/net/protocol.rb:153:in `rbuf_fill'",
"/usr/share/ruby/net/protocol.rb:134:in `readuntil'",
"/usr/share/ruby/net/protocol.rb:144:in `readline'",
"/usr/share/ruby/net/http/response.rb:39:in `read_status_line'",
"/usr/share/ruby/net/http/response.rb:28:in `read_new'",
"/usr/share/ruby/net/http.rb:1406:in `block in transport_request'",
"/usr/share/ruby/net/http.rb:1403:in `catch'",
"/usr/share/ruby/net/http.rb:1403:in `transport_request'",
"/usr/share/ruby/net/http.rb:1376:in `request'",
"/usr/share/ruby/net/http.rb:1322:in `request_post'",
"/usr/share/ruby/xmlrpc/client.rb:475:in `do_rpc'",
"/usr/share/ruby/xmlrpc/client.rb:279:in `call2'",
"/usr/share/ruby/xmlrpc/client.rb:260:in `call'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/cobbler.rb:93:in `item_exists'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/cobbler.rb:108:in `system_exists?'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/cobbler.rb:161:in `netboot'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/cobbler_manager.rb:132:in `block in netboot_nodes'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/cobbler_manager.rb:128:in `each'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/cobbler_manager.rb:128:in `netboot_nodes'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:180:in `remove_nodes'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:526:in `prepare_nodes'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/provision.rb:45:in `provision'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/orchestrator.rb:127:in `provision'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/dispatcher.rb:52:in `provision'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/dispatcher.rb:37:in `image_provision'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/server.rb:187:in `dispatch_message'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/server.rb:146:in `block in dispatch'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/task_queue.rb:64:in `call'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/task_queue.rb:64:in `block in each'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/task_queue.rb:56:in `each'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/task_queue.rb:56:in `each'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/server.rb:144:in `each_with_index'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/server.rb:144:in `dispatch'",
"/usr/share/gems/gems/astute-9.0.0/lib/astute/server/server.rb:121:in `block in perform_main_job'"]
The root cause of this is apache proxy, that breaks connection. It can be tuned with:
cat /etc/httpd/conf.modules.d/20-reqtimeout.conf
<IfModule reqtimeout_module>
RequestReadTimeout header=60 body=60
</IfModule>
After this - provision started to work.
We need to make this parameters configurable.
Could you please gather a diagnostic snapshot.