Neutron API inaccessible during deployment due to multiple AMQP reconnects

Bug #1387792 reported by Dmitry Tyzhnenko
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Committed
Importance: High
Assigned to: Fuel Library (Deprecated)

Bug Description

api: '1.0'
astute_sha: 97eea90efe0a1f17b4934919d6e459d270c10372
auth_required: true
build_id: 2014-10-30_04-21-22
build_number: '63'
feature_groups:
- mirantis
- techpreview
fuellib_sha: 45b6fc42091a0a33d3e48fbe78b782ce743aedc1
fuelmain_sha: 2ade7c571380a091048d103a6affff634b5b2520
nailgun_sha: 02c6bb2e54bbec76da33167eaf5f2e0b3e2e50a7
ostf_sha: f47fd1d66a7255213ee075d5c11b8f111f922000
production: docker
release: 6.0-techpreview
release_versions:
  2014.2-6.0-techpreview:
    VERSION:
      api: '1.0'
      astute_sha: 97eea90efe0a1f17b4934919d6e459d270c10372
      build_id: 2014-10-30_04-21-22
      build_number: '63'
      feature_groups:
      - mirantis
      - techpreview
      fuellib_sha: 45b6fc42091a0a33d3e48fbe78b782ce743aedc1
      fuelmain_sha: 2ade7c571380a091048d103a6affff634b5b2520
      nailgun_sha: 02c6bb2e54bbec76da33167eaf5f2e0b3e2e50a7
      ostf_sha: f47fd1d66a7255213ee075d5c11b8f111f922000
      production: docker
      release: 6.0-techpreview

Environment config:

Ubuntu HA + Neutron VLAN
Ceph for Images and Volumes

3 Controller + Ceph
36 Compute + Ceph
13 Compute

What's wrong:

Deployment finished with an error: the deployment timeout was exceeded.

Dmitry Tyzhnenko (dtyzhnenko) wrote :

Network configuration:
Public, Management and Storage networks have no VLAN ID.

eth0 - admin(pxe)
eth1 - Public
eth2 - Management
eth3 - Storage
eth4 - Private VLANs

Changed in fuel:
assignee: nobody → Fuel Library Team (fuel-library)
Bogdan Dobrelya (bogdando) wrote :

Here is what I can figure out from the logs (http://pastebin.com/NA0V2r4q):

The root cause looks like Neutron API inaccessibility caused by RabbitMQ reconnecting too often: 21 times during the deployment window from 15:11 to 16:19 (when Astute reported the deployment as expired), which is roughly one reconnect every 3 to 4 minutes.
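
As a quick check of the arithmetic, here is a minimal Python sketch that computes the average interval from the figures quoted in the comment (21 reconnects between 15:11 and 16:19). The helper name `minutes_per_reconnect` is made up for illustration, and the code assumes both timestamps fall on the same day.

```python
from datetime import datetime


def minutes_per_reconnect(start: str, end: str, reconnects: int) -> float:
    """Average minutes between reconnects over a same-day HH:MM window."""
    fmt = "%H:%M"
    window = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return window.total_seconds() / 60 / reconnects


# Figures taken from the comment above: 21 reconnects from 15:11 to 16:19.
rate = minutes_per_reconnect("15:11", "16:19", 21)
print(f"{rate:.1f} min per reconnect")  # prints "3.2 min per reconnect"
```

The window is 68 minutes, so the average comes to a little over 3 minutes per reconnect, in the same ballpark as the estimate in the comment.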

Dmitry Borodaenko (angdraug) wrote :

I've renamed the bug as per Bogdan's comment; the original title was too generic.

summary: - Deployment has failed. Timeout of deployment is exceeded.
+ Neutron API inaccessible during deployment due to multiple AMQP
+ reconnects
Dmitry Borodaenko (angdraug) wrote :

The Ceph debug messages are a red herring; the Puppet resource is just making sure that the Ceph cluster has not already been created.

Changed in fuel:
status: New → Confirmed
Bogdan Dobrelya (bogdando) wrote :

I believe this bug was the root cause of the reconnects: https://bugs.launchpad.net/fuel/+bug/1389299
Because the start action was running in the foreground instead of detached, pcs killed the agents on each timeout and respawned them again...
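
The foreground-versus-detached distinction can be illustrated with a small Python sketch. This is only an analogy under stated assumptions, not the actual resource agent code or the patch from #1389299: `SERVICE_CMD`, `start_foreground` and `start_detached` are hypothetical names, and a sleeping Python child process stands in for the long-running service.

```python
import subprocess
import sys

# Hypothetical stand-in for a long-running service process.
SERVICE_CMD = [sys.executable, "-c", "import time; time.sleep(5)"]


def start_foreground(timeout: float) -> bool:
    """Start the service and block on it.

    The caller never regains control until the service exits, so the
    supervisor's timeout fires and the service is killed.
    """
    try:
        subprocess.run(SERVICE_CMD, timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child when the timeout expires.
        return False


def start_detached() -> subprocess.Popen:
    """Spawn the service in its own session and return immediately."""
    return subprocess.Popen(
        SERVICE_CMD,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # has an effect on POSIX systems only
    )
```

In this sketch, `start_foreground(timeout=0.5)` returns `False` because the child is killed when the timeout expires, which mirrors the kill-and-respawn cycle described above. `start_detached()` returns right away and leaves the child running, so a supervisor would see the start action complete within its timeout.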

Please try to reproduce this issue with the patch from #1389299 merged.

Changed in fuel:
status: Confirmed → Incomplete
assignee: Fuel Library Team (fuel-library) → Fuel QA Team (fuel-qa)
Dmitry Borodaenko (angdraug) wrote :

The patch from bug #1389299 is included in ISO #76 and later.

Changed in fuel:
assignee: Fuel QA Team (fuel-qa) → Fuel Library Team (fuel-library)
Bogdan Dobrelya (bogdando) wrote :

This bug should have been fixed by https://bugs.launchpad.net/fuel/+bug/1389299 as well.

Changed in fuel:
status: Incomplete → Fix Committed