Failed tasks: Task[openstack-network-networks/14] (503 from neutron-api)

Bug #1613785 reported by Sergey Galkin
Affects             Status     Importance  Assigned to    Milestone
Fuel for OpenStack  Invalid    High        Sergey Galkin
  Mitaka            Confirmed  High        Sergey Galkin

Bug Description

I tried to deploy 69 of 484 nodes.
The deployment failed with the error 'Deployment has failed. All nodes are finished. Failed tasks: Task[openstack-network-networks/14] Stopping the deployment process!'
On node-14, in puppet.log, I found:
Execution of '/usr/bin/neutron subnet-list --format=csv --column=id --quote=none' returned 1: <html><body><h1>503 Service Unavailable<
http://paste.openstack.org/show/558417/

/var/log/neutron/server.log contains many entries like:
2016-08-16 03:04:47.064 3228 ERROR oslo.messaging._drivers.impl_rabbit [req-913b8054-603f-4bb4-bdc6-6a110a96107a - - - - -] AMQP server on 10.41.0.8:5673 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds.

grep 'ERROR oslo.messaging._drivers.impl_rabbit' /var/log/neutron/server.log | wc -l
9994
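
A raw count does not show whether the failures come in bursts or are spread evenly. A minimal sketch that buckets the same errors per minute and per AMQP endpoint, assuming every entry follows the format of the single line quoted above; the function name and the regex are illustrative, not part of any OpenStack tooling:

```python
import re
from collections import Counter

# Captures the timestamp truncated to the minute and the AMQP endpoint.
# The pattern is an assumption derived from the one sample line above.
LINE_RE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d+ .*"
    r"ERROR oslo\.messaging\._drivers\.impl_rabbit .*"
    r"AMQP server on (?P<endpoint>\S+) is unreachable"
)


def unreachable_per_minute(lines):
    """Count 'AMQP server ... is unreachable' errors per (minute, endpoint)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group("minute"), match.group("endpoint"))] += 1
    return counts
```

It can be fed an open file object for /var/log/neutron/server.log; sorting the resulting keys gives a timeline of when each endpoint was unreachable.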

In /<email address hidden> there are many entries like:

=ERROR REPORT==== 15-Aug-2016::22:24:44 ===
closing AMQP connection <0.4366.2> (10.41.0.8:41296 -> 10.41.0.36:5673):
missed heartbeats from client, timeout: 60s

Now /usr/bin/neutron subnet-list on node-14 sometimes works and sometimes returns:

root@node-14:/var/log# /usr/bin/neutron subnet-list

<html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>
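
To put a number on "sometimes", the command can be run repeatedly and the outcomes tallied. A rough sketch, assuming the error page may arrive on either stdout or stderr and that a successful run exits with 0; `classify`, `run_subnet_list` and `tally` are hypothetical helper names:

```python
import re
import subprocess
from collections import Counter

# Pulls the HTTP status out of an error page such as
# "<html><body><h1>504 Gateway Time-out</h1>".
STATUS_RE = re.compile(r"<h1>(\d{3}) ")


def classify(output, returncode):
    """Label one CLI run as 'ok', an HTTP status like '504', or 'error'."""
    match = STATUS_RE.search(output)
    if match:
        return match.group(1)
    return "ok" if returncode == 0 else "error"


def run_subnet_list():
    """Run the command from above; return (combined output, exit code)."""
    proc = subprocess.run(
        ["/usr/bin/neutron", "subnet-list"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: the page may land on stderr
        universal_newlines=True,
    )
    return proc.stdout, proc.returncode


def tally(attempts, runner=run_subnet_list):
    """Run the command `attempts` times and count each outcome."""
    return Counter(classify(*runner()) for _ in range(attempts))
```

For example, `tally(50)` on node-14 would return a count of successful runs versus 503 and 504 responses.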

Sergey Galkin (sgalkin) wrote :

[root@fuel ~]# shotgun2 short-report
cat /etc/fuel_build_id:
 598
cat /etc/fuel_build_number:
 598
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-release-9.0.0-1.mos6349.noarch
 fuelmenu-9.0.0-1.mos274.noarch
 fuel-notify-9.0.0-1.mos8460.noarch
 fuel-ostf-9.0.0-1.mos936.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8743.noarch
 fuel-mirror-9.0.0-1.mos141.noarch
 fuel-openstack-metadata-9.0.0-1.mos8743.noarch
 rubygem-astute-9.0.0-1.mos750.noarch
 fuel-misc-9.0.0-1.mos8460.noarch
 python-fuelclient-9.0.0-1.mos325.noarch
 fuel-9.0.0-1.mos6349.noarch
 fuel-utils-9.0.0-1.mos8460.noarch
 fuel-setup-9.0.0-1.mos6349.noarch
 nailgun-mcagents-9.0.0-1.mos750.noarch
 fuel-library9.0-9.0.0-1.mos8460.noarch
 network-checker-9.0.0-1.mos74.x86_64
 fuel-agent-9.0.0-1.mos285.noarch
 fuel-ui-9.0.0-1.mos2717.noarch
 fuel-migrate-9.0.0-1.mos8460.noarch
 python-packetary-9.0.0-1.mos141.noarch
 fuel-bootstrap-cli-9.0.0-1.mos285.noarch
 shotgun-9.0.0-1.mos90.noarch
 fuel-nailgun-9.0.0-1.mos8743.noarch

Sergey Galkin (sgalkin) wrote :
tags: added: area-library
Changed in fuel:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
milestone: none → 10.0
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → MOS Oslo (mos-oslo)
tags: removed: area-library
Dmitry Mescheryakov (dmitrymex) wrote :

I looked into the environment while it was alive. The obstacle to debugging this issue is that the master node lacked /var/log/remote/node-X/lrmd.log for the controllers. As a result, it was impossible to say why Pacemaker periodically restarts RabbitMQ.

Dmitry Mescheryakov (dmitrymex) wrote :

So, to proceed, we need these logs enabled, though in general it feels that the issue is not in RabbitMQ or Pacemaker but rather in the hardware or network equipment. Sergey, please move this bug to the New/Confirmed state once the issue reproduces and lrmd.log is available.

Changed in fuel:
assignee: MOS Oslo (mos-oslo) → Sergey Galkin (sgalkin)
status: Confirmed → Incomplete
tags: added: move-to-9.2
Vitaly Sedelnik (vsedelnik) wrote :

Marking as Invalid because it has stayed in Incomplete for a month.

Changed in fuel:
status: Incomplete → Invalid
Dave Johnston (dave-johnston) wrote :

I can reproduce this issue repeatedly using Fuel 9.2 and a VirtualBox environment consisting of 1 controller and 1 compute node.
The controller is also used as the ODL node.
