Mirantis OpenStack

Bug #1463802

Artem Panchenko (apanchenko-8) wrote on 2015-06-10: Nova can't boot instances after primary controller graceful shutdown 'MessagingTimeout: Timed out waiting for a reply to message ID xxx'

Fuel version info (6.1 build #521 RC1): http://paste.openstack.org/show/277715/

After the primary controller is shut down, OSTF tests that create Nova instances fail because every newly booted instance ends up in the ERROR state:

http://paste.openstack.org/show/281028/

Here is part of nova-conductor.log (node-16):

http://paste.openstack.org/show/281014/

RabbitMQ cluster status looks good:

[root@fuel-lab-cz5558 ~]# runc 2 rabbitmqctl cluster_status
DEPRECATION WARNING: /etc/fuel/client/config.yaml exists and will be used as the source for settings. This behavior is deprecated. Please specify the path to your custom settings file in the FUELCLIENT_CUSTOM_SETTINGS environment variable.
node-16.mirantis.com
Cluster status of node 'rabbit@node-16' ...
[{nodes,[{disc,['rabbit@node-16','rabbit@node-7']}]},
 {running_nodes,['rabbit@node-7','rabbit@node-16']},
 {cluster_name,<<"rabbit@node-5.mirantis.com">>},
 {partitions,[]}]
...done.
node-7.mirantis.com
Cluster status of node 'rabbit@node-7' ...
[{nodes,[{disc,['rabbit@node-16','rabbit@node-7']}]},
 {running_nodes,['rabbit@node-16','rabbit@node-7']},
 {cluster_name,<<"rabbit@node-5.mirantis.com">>},
 {partitions,[]}]
...done.
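
For anyone who wants to repeat this check across several controllers, here is a minimal Python sketch that reads one `rabbitmqctl cluster_status` block and reports whether every disc node is running and no partitions are listed. It assumes the Erlang-term output format shown above; the names `cluster_is_healthy` and `SAMPLE_STATUS` are made up for illustration. Note that it only reflects what `rabbitmqctl` reports: in this bug the cluster looks healthy by this measure while RPC calls still time out.

```python
import re

# Sample taken from the node-16 output in this report (cluster_name line omitted).
SAMPLE_STATUS = """Cluster status of node 'rabbit@node-16' ...
[{nodes,[{disc,['rabbit@node-16','rabbit@node-7']}]},
 {running_nodes,['rabbit@node-7','rabbit@node-16']},
 {partitions,[]}]
...done.
"""


def _names(key, text):
    """Return the set of quoted node names in the list that follows `{key,`."""
    match = re.search(r"\{%s,\s*\[(.*?)\]\}" % key, text, re.S)
    if match is None:
        return set()
    return set(re.findall(r"'([^']+)'", match.group(1)))


def cluster_is_healthy(text):
    """True if all disc nodes are running and the partitions list is empty."""
    disc = _names("disc", text)
    running = _names("running_nodes", text)
    partitioned = _names("partitions", text)
    return bool(disc) and disc == running and not partitioned
```

Calling `cluster_is_healthy(SAMPLE_STATUS)` on the block above returns `True`, which matches the "looks good" reading of the output.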

Here is the AMQP queue info:

http://paste.openstack.org/show/281029/

Steps to reproduce:

1. Create an environment: Ubuntu, NeutronGRE, Ceph, Sahara, Ceilometer
2. Add 1 controller, 2 controller+ceph, 1 compute and 3 mongo nodes
3. Deploy changes
4. Run OSTF
5. Shut down the primary controller gracefully using the `poweroff` command
6. Run OSTF

Expected result:

- all tests pass except 'Check that required services are running'

Actual:

- all tests which create Nova instances fail

Also, I couldn't find out why, but all API requests to Nova take a long time; for example, a simple `nova list` command takes 17 seconds:

http://paste.openstack.org/show/281031/
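
To compare latency before and after the shutdown in a repeatable way, a small wrapper like the following could be used. This is only a sketch: `time_command` is a hypothetical helper, and it measures the wall-clock time of whatever command line it is given, so it says nothing about where inside Nova the time is spent.

```python
import subprocess
import time


def time_command(argv, timeout=120):
    """Run argv, discard its output, and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    proc = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    return proc.returncode, time.monotonic() - start


# Example (assumes the nova client is installed and credentials are sourced):
#   code, seconds = time_command(["nova", "list"])
#   print("exit code %d, took %.1f s" % (code, seconds))
```

Running it a few times on each remaining controller would show whether the delay is constant or tied to a particular node.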

Diagnostic snapshot (environment ID - 2, nodes: 5,16,7,6,11,13,14): https://drive.google.com/file/d/0BzaZINLQ8-xkNDFrX2RKRS1GOWs/view?usp=sharing
