[astute] Deployment failed with MCollective call failed in agent 'uploadfile', method 'upload'

Bug #1320779 reported by Andrey Sledzinskiy
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Invalid
Importance: High
Assigned to: Vladimir Sharshov

Bug Description

http://jenkins-product.srt.mirantis.net:8080/view/0_0_swarm/job/master_fuelmain.system_test.centos.thread_3/61/testReport/%28root%29/deploy_stop_reset_on_ha/deploy_stop_reset_on_ha/

Steps:
1. Create a cluster: CentOS, HA, Flat Nova Network, Cinder for volumes
2. Add 3 controllers and start deploying the cluster
3. Stop on provisioning
4. Wait for the nodes to reach the online state
5. Add 2 compute nodes
6. Start deployment

Deployment failed at the deploying stage with the following errors:
Unexpected error 65f0347c-8afc-490b-9f42-683f766ff94d: MCollective call failed in agent 'uploadfile', method 'upload', failed nodes:
ID: 4 - Reason: Input/output error - /var/lib/astute
ID: 3 - Reason: Input/output error - /var/lib/astute
 traceback /usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/mclient.rb:116:in `check_results_with_retries'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/mclient.rb:62:in `method_missing'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/deployment_engine.rb:173:in `block (2 levels) in upload_ssh_keys'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/deployment_engine.rb:169:in `each'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/deployment_engine.rb:169:in `block in upload_ssh_keys'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/deployment_engine.rb:167:in `each'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/deployment_engine.rb:167:in `upload_ssh_keys'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/deployment_engine.rb:45:in `block in deploy'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/deployment_engine.rb:42:in `each_slice'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/deployment_engine.rb:42:in `deploy'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/orchestrator.rb:43:in `deploy'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/server/dispatcher.rb:105:in `deploy'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/server/server.rb:126:in `dispatch_message'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/server/server.rb:89:in `block in dispatch'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/server/task_queue.rb:64:in `call'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/server/task_queue.rb:64:in `block in each'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/server/task_queue.rb:56:in `each'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/server/task_queue.rb:56:in `each'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/server/server.rb:87:in `each_with_index'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/server/server.rb:87:in `dispatch'
/usr/lib64/ruby/gems/2.1.0/gems/astute-0.0.2/lib/astute/server/server.rb:72:in `block in perform_main_job'

Logs are attached

Andrey Sledzinskiy (asledzinskiy) wrote :
Changed in fuel:
assignee: Fuel Astute Team (fuel-astute) → Vladimir Sharshov (vsharshov)
Vladimir Sharshov (vsharshov) wrote :

Could not reproduce.

The 'Input/output error' suggests a broken filesystem on nodes 3 and 4.

ID: 4 - Reason: Input/output error - /var/lib/astute
ID: 3 - Reason: Input/output error - /var/lib/astute

but nodes 2, 1, and 5 were fine.

2014-05-18T23:07:44 debug: [402] 65f0347c-8afc-490b-9f42-683f766ff94d: MC agent 'uploadfile', method 'upload', results: {:sender=>"2", :statuscode=>0, :statusmsg=>"OK", :data=>{:msg=>"File was uploaded!"}}
2014-05-18T23:07:44 debug: [402] 65f0347c-8afc-490b-9f42-683f766ff94d: MC agent 'uploadfile', method 'upload', results: {:sender=>"5", :statuscode=>0, :statusmsg=>"OK", :data=>{:msg=>"File was uploaded!"}}
2014-05-18T23:07:44 debug: [402] 65f0347c-8afc-490b-9f42-683f766ff94d: MC agent 'uploadfile', method 'upload', results: {:sender=>"4", :statuscode=>5, :statusmsg=>"Input/output error - /var/lib/astute", :data=>{:msg=>nil}}
2014-05-18T23:07:44 debug: [402] 65f0347c-8afc-490b-9f42-683f766ff94d: MC agent 'uploadfile', method 'upload', results: {:sender=>"1", :statuscode=>0, :statusmsg=>"OK", :data=>{:msg=>"File was uploaded!"}}
2014-05-18T23:07:44 debug: [402] 65f0347c-8afc-490b-9f42-683f766ff94d: MC agent 'uploadfile', method 'upload', results: {:sender=>"3", :statuscode=>5, :statusmsg=>"Input/output error - /var/lib/astute", :data=>{:msg=>nil}}
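
The per-node results in these debug lines can be separated mechanically. The following is a minimal sketch, assuming the log format shown above stays stable; `RESULT_RE` and `failed_senders` are hypothetical names for illustration and are not part of Astute.

```ruby
# Hypothetical helper (not part of Astute): pulls sender, status code and
# status message out of debug lines shaped like the ones quoted above.
RESULT_RE = /results: \{:sender=>"(?<sender>\d+)", :statuscode=>(?<code>\d+), :statusmsg=>"(?<msg>[^"]*)"/

# Returns a hash of node ID => status message for every non-zero status code.
def failed_senders(lines)
  lines.each_with_object({}) do |line, failed|
    match = RESULT_RE.match(line)
    next if match.nil? || match[:code].to_i.zero?

    failed[match[:sender]] = match[:msg]
  end
end

# Two lines taken from the log above, shortened to the relevant part.
SAMPLE_LOG = <<~'LOG'
  MC agent 'uploadfile', method 'upload', results: {:sender=>"2", :statuscode=>0, :statusmsg=>"OK", :data=>{:msg=>"File was uploaded!"}}
  MC agent 'uploadfile', method 'upload', results: {:sender=>"4", :statuscode=>5, :statusmsg=>"Input/output error - /var/lib/astute", :data=>{:msg=>nil}}
LOG

failed_senders(SAMPLE_LOG.lines).each do |node_id, message|
  puts "node #{node_id}: #{message}"
end
```

Run against the full Astute log, this would list only the nodes whose upload did not return status code 0.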

Also, we can't attribute this to a problem with the filesystem erase command, because OS installation succeeded on all nodes in the cluster.

Changed in fuel:
status: New → Incomplete
Andrey Sledzinskiy (asledzinskiy) wrote :

Bug was reproduced again
http://jenkins-product.srt.mirantis.net:8080/view/0_0_swarm/job/master_fuelmain.system_test.centos.thread_3/66/testReport/%28root%29/deploy_stop_reset_on_ha/deploy_stop_reset_on_ha/
Logs are attached

However, after the stop the controllers went offline with ext4-fs errors, so this is probably a problem with our virtual environment.
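
If a node is still reachable, a quick write probe can help tell a damaged filesystem apart from an orchestration problem. This is only a sketch under assumptions: `probe_write` is a hypothetical helper, not part of Astute or MCollective, and it simply tries to create and delete a small file in the directory named in the error.

```ruby
require 'tmpdir'

# Hypothetical probe (not part of Astute or MCollective): tries to create and
# remove a small file in +dir+ and reports any system-level error.
def probe_write(dir)
  path = File.join(dir, ".write_probe_#{Process.pid}")
  File.write(path, 'probe')
  { ok: true, error: nil }
rescue SystemCallError => e
  # Covers Errno::EIO, Errno::EROFS, Errno::ENOENT and other errno failures.
  { ok: false, error: e.message }
ensure
  begin
    File.delete(path) if path && File.exist?(path)
  rescue SystemCallError
    # Cleanup is best effort on a damaged filesystem.
  end
end

# The system temp directory serves as a control; /var/lib/astute is the path
# named in the error.
[Dir.tmpdir, '/var/lib/astute'].each do |dir|
  result = probe_write(dir)
  puts "#{dir}: #{result[:ok] ? 'writable' : result[:error]}"
end
```

An input/output or read-only error from the probe on the affected path, with the control directory behaving the same or differently, would point at the node's storage rather than at the upload agent.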

Andrey Sledzinskiy (asledzinskiy) wrote :
Mike Scherbakov (mihgen)
Changed in fuel:
milestone: 5.0 → 5.1
Nastya Urlapova (aurlapova) wrote :
Dmitry Ilyin (idv1985)
summary: - Deployment failed with MCollective call failed in agent 'uploadfile',
- method 'upload'
+ [astute] Deployment failed with MCollective call failed in agent
+ 'uploadfile', method 'upload'
Vladimir Sharshov (vsharshov) wrote :

Reproduced only twice, the last time more than a month ago. Closing.

Changed in fuel:
status: Incomplete → Invalid