On Ubuntu in HA mode on a vCenter machine, secondary controllers deploy OpenStack in parallel with the primary controller.

Bug #1371044 reported by Tatyana Dubyk
This bug affects 1 person
Affects             Status   Importance  Assigned to                Milestone
Fuel for OpenStack  Invalid  High        Fuel Library (Deprecated)
5.1.x               Invalid  High        Fuel Library (Deprecated)
6.0.x               Invalid  High        Fuel Library (Deprecated)

Bug Description

On Ubuntu in HA mode on a vCenter machine, secondary controllers deploy OpenStack in parallel with the primary controller.

==============vcenter settings===========================
export VCENTER_IP='172.16.0.254'
export <email address hidden>'
export VCENTER_PASSWORD='Qwer!1234'
export VCENTER_CLUSTERS='Cluster1,Cluster2'

=====================================================
Configuration:
===================================================
Steps to reproduce:
1. Set up a lab on the vCenter machine from the 5.1-9 (RC3) ISO.
2. Create an environment:
   OS: Ubuntu (HA mode)
   roles: 1 controller + 1 cinder (vmdk) on each of 3 nodes
3. Start the deployment.

Error: unable to connect to node 'rabbit@node-2': nodedown

node-2 is the primary controller, and it is available.

Expected result: OpenStack deployment must finish on the primary controller
                 first, and only then start on the secondary controllers.
Actual result: Secondary controllers deploy OpenStack in parallel with the
                 primary controller.

---------------------fuel-version-------------------------------
[root@nailgun ~]# fuel --fuel-version
api: '1.0'
astute_sha: f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13
auth_required: true
build_id: 2014-09-17_04-49-39
build_number: '9'
feature_groups:
- mirantis
fuellib_sha: d9b16846e54f76c8ebe7764d2b5b8231d6b25079
fuelmain_sha: 8ef433e939425eabd1034c0b70e90bdf888b69fd
nailgun_sha: 51231834c61920a5dea8ce402ad027b2505d632d
ostf_sha: 64cb59c681658a7a55cc2c09d079072a41beb346
production: docker
release: '5.1'
release_versions:
  2014.1.1-5.1:
    VERSION:
      api: '1.0'
      astute_sha: f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13
      build_id: 2014-09-17_04-49-39
      build_number: '9'
      feature_groups:
      - mirantis
      fuellib_sha: d9b16846e54f76c8ebe7764d2b5b8231d6b25079
      fuelmain_sha: 8ef433e939425eabd1034c0b70e90bdf888b69fd
      nailgun_sha: 51231834c61920a5dea8ce402ad027b2505d632d
      ostf_sha: 64cb59c681658a7a55cc2c09d079072a41beb346
      production: docker
      release: '5.1'

Tags: vcenter
Tatyana Dubyk (tdubyk) wrote :
Tatyana Dubyk (tdubyk) wrote :
Tatyana Dubyk (tdubyk) wrote :
Ihor Kalnytskyi (ikalnytskyi) wrote :

I'm not sure, but this doesn't look like a problem. I think the situation is:

1. Each node has two roles: controller and cinder.
2. Priorities: Primary Controller (500), Controller (600) and Cinder (700).
3. Therefore, the node is deployed as primary controller first and then stops, since the secondary controllers need to be deployed right after the primary controller.
4. Once the secondary controllers are deployed, all the nodes start deploying cinder in parallel and finish as ready.

So I think it's OK, but I believe it should be confirmed by someone from the library team.
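
The ordering described in the list above can be sketched roughly as follows. This is only an illustration, not Fuel's actual code: the priorities 500/600/700 come from the comment, while the function name `deployment_stages`, the role key strings, and the node layout are hypothetical. It assumes that lower priority values run first and that everything sharing a priority deploys in parallel.

```python
from collections import defaultdict

# Priorities quoted in the comment above; lower values are assumed to deploy first.
ROLE_PRIORITY = {"primary-controller": 500, "controller": 600, "cinder": 700}


def deployment_stages(node_roles):
    """Group (node, role) pairs into stages ordered by role priority.

    node_roles maps a node name to the list of roles assigned to it.
    All pairs within one stage are assumed to deploy in parallel.
    """
    by_priority = defaultdict(list)
    for node, roles in node_roles.items():
        for role in roles:
            by_priority[ROLE_PRIORITY[role]].append((node, role))
    return [sorted(by_priority[p]) for p in sorted(by_priority)]


# Hypothetical layout matching the report: 3 nodes, controller + cinder on each,
# with node-2 as the primary controller (as stated by the reporter).
nodes = {
    "node-1": ["controller", "cinder"],
    "node-2": ["primary-controller", "cinder"],
    "node-3": ["controller", "cinder"],
}

for stage in deployment_stages(nodes):
    print(stage)
```

Under these assumptions the primary controller forms a stage of its own, the two secondary controllers share the next stage, and cinder is deployed on all three nodes in the last stage.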

Matthew Mosesohn (raytrac3r) wrote :

In my experience, deployment wasn't done in parallel. I tested with the latest RC for 5.1 and got a different failure:
 (/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[172.16.0.128-172.16.0.254]) Could not evaluate: Oops - not sure what happened: 751: unexpected token at '<html><body><h1>504 Gateway Time-out</h1>. Will attach diagnostic snapshot.

Matthew Mosesohn (raytrac3r) wrote :
Bogdan Dobrelya (bogdando) wrote :

The primary controller should always go before the other ones, otherwise the Galera cluster won't be deployed.

Bogdan Dobrelya (bogdando) wrote :

According to the logs in #1, the node-1 deployment failed and then the node-2 and node-3 deployments were triggered. I cannot see any broken execution ordering for the controllers:

2014-09-17T12:37:57.939591 node-1 ./node-1.test.domain.local/puppet-apply.log:2014-09-17T12:37:57.939591+01:00 err: (/Stage[netconfig]/L23network::L2/Package[openvswitch-datapath]/ensure) change from absent to present failed: SIGTERM
...
2014-09-17T13:50:13.468241 node-2 ./node-2.test.domain.local/puppet-apply.log:2014-09-17T13:50:13.468241+01:00 notice: Finished catalog run in 988.10 seconds
2014-09-17T14:07:51.098376 node-3 ./node-3.test.domain.local/puppet-apply.log:2014-09-17T14:07:51.098376+01:00 notice: Finished catalog run in 1026.78 seconds
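
One way to cross-check the ordering from the two "Finished catalog run" lines quoted above is to subtract each reported duration from its finish timestamp. The sketch below assumes the line layout shown in the excerpt (timestamp, node name, then the message); the helper name `catalog_runs` is hypothetical, and the computed start is only approximate because the duration covers the catalog run itself rather than the whole deployment task.

```python
import re
from datetime import datetime, timedelta

# Matches "<timestamp> <node> ... Finished catalog run in <N> seconds".
FINISHED = re.compile(
    r"^(?P<stamp>\S+) (?P<node>\S+) .*Finished catalog run in (?P<secs>[\d.]+) seconds"
)


def catalog_runs(lines):
    """Return {node: (approx_start, finish)} for each 'Finished catalog run' line."""
    runs = {}
    for line in lines:
        match = FINISHED.match(line)
        if not match:
            continue  # skip lines that do not report a finished run
        finish = datetime.strptime(match["stamp"], "%Y-%m-%dT%H:%M:%S.%f")
        start = finish - timedelta(seconds=float(match["secs"]))
        runs[match["node"]] = (start, finish)
    return runs


# The two lines quoted in the comment above, copied verbatim.
excerpt = [
    "2014-09-17T13:50:13.468241 node-2 ./node-2.test.domain.local/puppet-apply.log:2014-09-17T13:50:13.468241+01:00 notice: Finished catalog run in 988.10 seconds",
    "2014-09-17T14:07:51.098376 node-3 ./node-3.test.domain.local/puppet-apply.log:2014-09-17T14:07:51.098376+01:00 notice: Finished catalog run in 1026.78 seconds",
]

runs = catalog_runs(excerpt)
node3_start = runs["node-3"][0]
node2_finish = runs["node-2"][1]
print("node-3 started after node-2 finished:", node3_start > node2_finish)
```

With these two lines, the estimated start of the node-3 run falls after the node-2 run finished, which is consistent with the observation that the controllers were not run in parallel.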
