Test HA scalability fails: 'Deployment has failed. Method deploy. Disabling the upload of disk image because glance was not installed properly'

Bug #1390480 reported by Artem Panchenko
14
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Fuel for OpenStack
In Progress
Critical
Dmitry Ilyin

Bug Description

api: '1.0'
astute_sha: c72dac7b31646fbedbfc56a2a87676c6d5713fcf
auth_required: true
build_id: 2014-11-05_21-28-15
build_number: '79'
feature_groups:
- mirantis
fuellib_sha: 03c63d4233f6e1c182cc58213cd4542aba5b706d
fuelmain_sha: 35eed79fd6dcfc86a9206ffcd90d385ba5dcca56
nailgun_sha: 01579a53c2be841dc4cafb0be00213243dde7667
ostf_sha: 9c6fadca272427bb933bc459e14bb1bad7f614aa
production: docker
release: '6.0'

Adding of controller nodes fails - deployment finishes with error:

'Deployment has failed. Method deploy. Disabling the upload of disk image because glance was not installed properly.
Inspect Astute logs for the details'

This bug was reproduced on CI during system tests:

http://jenkins-product.srt.mirantis.net:8080/view/6.0_swarm/job/6.0_fuelmain.system_test.ubuntu.thread_4/17/?

Steps to reproduce:

 1. Create environment HA, Ubuntu, NovaFlat.
 2. Add 1 controller node.
 3. Deploy the cluster. Result - success.
 4. Add 2 controller nodes
 5. Deploy changes. Result - success.
 6. Add 2 controller nodes.
 7. Deploy changes.

Expected result:

 - cluster is successfully deployed

Actual:

 - deployment fails (see error above)

Here is the part of Astute's logs:

http://paste.openstack.org/show/130464/

and output of 'pcs status' command on controllers:

http://paste.openstack.org/show/130465/

As you can see after 3rd deployment (step 7) 2 controllers were marked as 'error' (nodes which were added on step 4), but new 2 controllers were successfully deployed. Puppet logs on failed nodes (node-2, node-3) contain the following errors:

http://paste.openstack.org/show/130476/

Also, I tried to execute 'glance image-list' command few times and it randomly returned errors:

http://paste.openstack.org/show/130478/

Diagnostic snapshot is attached.

Tags: ha
Revision history for this message
Artem Panchenko (apanchenko-8) wrote :
Revision history for this message
Andrey Sledzinskiy (asledzinskiy) wrote :

It failed again at the same controllers (node-3, node-4) specified in desription with
2014-11-10 09:11:13 ERR

 (/Stage[corosync_setup]/Osnailyfacter::Cluster_ha::Virtual_ips/Cluster::Virtual_ips[management]/Cluster::Virtual_ip[management]/Cs_resource[vip__management]) Could not evaluate: Execution of '/usr/sbin/cibadmin --patch --sync-call --xml-text <diff crm_feature_set='3.0.7'>

Logs are attached

Revision history for this message
Andrey Sledzinskiy (asledzinskiy) wrote :
Changed in fuel:
status: New → Confirmed
importance: High → Critical
Revision history for this message
Dmitry Borodaenko (angdraug) wrote :

This bug is blocking validation of Linux kernel upgrade.

Revision history for this message
Matthew Mosesohn (raytrac3r) wrote :

This could be failing because of swift configuration. This was fixed recently by Andrew Woodward.

Dmitry Ilyin (idv1985)
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Dmitry Ilyin (idv1985)
Revision history for this message
Dmitry Borodaenko (angdraug) wrote :

Is this bug In Progress? Is Matt's comment #5 correct?

Revision history for this message
Matthew Mosesohn (raytrac3r) wrote :

Artem, can you confirm this bug is reproducable after swift fixes were merged?

Changed in fuel:
status: Confirmed → Incomplete
Revision history for this message
Dmitry Ilyin (idv1985) wrote :

This bug is related to our cs_resource behaving not as good as it should be. I've tried to port upstream versions but found it both too hard and in vain because upstream version is far from perfect too.

I should repair this provider somehow.

Dmitry Ilyin (idv1985)
summary: - Test HA scalability fails: 'Deployment has failed. Method deploy.
+ [crmTest HA scalability fails: 'Deployment has failed. Method deploy.
Disabling the upload of disk image because glance was not installed
properly'
summary: - [crmTest HA scalability fails: 'Deployment has failed. Method deploy.
+ Test HA scalability fails: 'Deployment has failed. Method deploy.
Disabling the upload of disk image because glance was not installed
properly'
Changed in fuel:
status: Incomplete → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/134964
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=c045ce3078c3d5e0d5df786041875332b8f3fac2
Submitter: Jenkins
Branch: master

commit c045ce3078c3d5e0d5df786041875332b8f3fac2
Author: Dmitry Ilyin <email address hidden>
Date: Thu Nov 27 18:11:16 2014 +0300

    Fix idempotency of cs_resource

    * insync? to drop status metadata from checks
    * code cleanup
    * fix rspec for cs_resource type
    * switch location add implementation from pcs
      to cibadmin --patch to solve problems with
      cib changes not being synced to other nodes

    related-blueprint: pacemaker-improvements
    Related-Bug: 1391599
    Related-Bug: 1390480
    Related-Bug: 1396481

    Change-Id: I5410b91ea01fc8c6805de6becdf0800d0d486188
    Signed-off-by: Sergii Golovatiuk <email address hidden>

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.