aodh listener does not retry connecting to RPC after a connection failure

Bug #1557154 reported by Emilien Macchi
This bug affects 1 person
Affects: Aodh
Status: Invalid
Importance: Medium
Assigned to: Liusheng
Milestone: (not shown)

Bug Description

If the Aodh Listener can't connect to AMQP, it doesn't loop and try again later, as many other OpenStack services do; the process simply fails.

See:
Aodh Listener tries to start:
http://logs.openstack.org/99/286899/60/check/gate-puppet-openstack-integration-scenario001-tempest-dsvm-centos7/069651c/console.html#_2016-03-14_17_12_07_627

But the RabbitMQ resources for Aodh are only created afterwards:
http://logs.openstack.org/99/286899/60/check/gate-puppet-openstack-integration-scenario001-tempest-dsvm-centos7/069651c/console.html#_2016-03-14_17_13_15_741

Aodh Listener fails to start:
http://logs.openstack.org/99/286899/60/check/gate-puppet-openstack-integration-scenario001-tempest-dsvm-centos7/069651c/logs/aodh/notifier.txt.gz#_2016-03-14_17_13_14_909

And it never tries again. That's a bug, because other OpenStack services keep retrying in a loop.

gordon chung (chungg)
Changed in aodh:
status: New → Triaged
importance: Undecided → Medium
Liusheng (liusheng)
Changed in aodh:
assignee: nobody → Liusheng (liusheng)
milestone: none → newton-1
Liusheng (liusheng) wrote :

I tried to reproduce this issue: after stopping rabbitmq-server in my devstack and then starting aodh-notifier, it does keep retrying the connection to RabbitMQ, see:

2016-04-07 11:28:19.743 98183 DEBUG oslo_service.service [-] coordination.backend_url = None log_opt_values /usr/local/lib/python2.7/dist-packages/oslo_config/cfg.py:2525
2016-04-07 11:28:19.743 98183 DEBUG oslo_service.service [-] coordination.check_watchers = 10.0 log_opt_values /usr/local/lib/python2.7/dist-packages/oslo_config/cfg.py:2525
2016-04-07 11:28:19.744 98183 DEBUG oslo_service.service [-] coordination.heartbeat = 1.0 log_opt_values /usr/local/lib/python2.7/dist-packages/oslo_config/cfg.py:2525
2016-04-07 11:28:19.744 98183 DEBUG oslo_service.service [-] ******************************************************************************** log_opt_values /usr/local/lib/python2.7/dist-packages/oslo_config/cfg.py:2527
2016-04-07 11:28:19.753 98183 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 10.229.40.107:5672 is unreachable: [Errno 111] Connection refused. Trying again in 1 seconds.
2016-04-07 11:28:20.759 98183 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 10.229.40.107:5672 is unreachable: [Errno 111] Connection refused. Trying again in 2 seconds.
2016-04-07 11:28:22.767 98183 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 10.229.40.107:5672 is unreachable: [Errno 111] Connection refused. Trying again in 4 seconds.
2016-04-07 11:28:26.778 98183 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 10.229.40.107:5672 is unreachable: [Errno 111] Connection refused. Trying again in 6 seconds.
2016-04-07 11:28:32.789 98183 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 10.229.40.107:5672 is unreachable: [Errno 111] Connection refused. Trying again in 8 seconds.
2016-04-07 11:28:40.802 98183 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 10.229.40.107:5672 is unreachable: [Errno 111] Connection refused. Trying again in 10 seconds.
2016-04-07 11:28:50.818 98183 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 10.229.40.107:5672 is unreachable: [Errno 111] Connection refused. Trying again in 12 seconds.
2016-04-07 11:29:02.835 98183 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 10.229.40.107:5672 is unreachable: [Errno 111] Connection refused. Trying again in 14 seconds.
2016-04-07 11:29:16.857 98183 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 10.229.40.107:5672 is unreachable: [Errno 111] Connection refused. Trying again in 16 seconds.

So the problem in your log does not seem to be caused by aodh-notifier failing to retry its connection to RabbitMQ.
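
For readers who want to see what this kind of retry loop looks like, here is a minimal sketch that reproduces the delay schedule visible in the log above (1, 2, 4, 6, 8, ... seconds). It is not the oslo.messaging implementation. The names `retry_delays` and `wait_for_broker`, the 30-second cap, and the 5-second socket timeout are assumptions made purely for illustration.

```python
import socket
import time


def retry_delays(first=1, step=2, maximum=30):
    """Yield 1, 2, 4, 6, ... seconds, capped at `maximum`.

    The first values match the intervals in the log above; the cap of
    30 seconds is an assumption, since the log stops at 16 seconds.
    """
    yield min(first, maximum)
    delay = step
    while True:
        yield min(delay, maximum)
        delay += step


def wait_for_broker(host, port, max_attempts=None, sleep=time.sleep):
    """Try a plain TCP connection until it succeeds.

    Returns the number of failed attempts before success. If
    `max_attempts` is given, the last connection error is re-raised
    once that many attempts have failed.
    """
    failures = 0
    for delay in retry_delays():
        try:
            # Hypothetical reachability check: open and close a TCP socket.
            socket.create_connection((host, port), timeout=5).close()
            return failures
        except OSError as exc:
            failures += 1
            if max_attempts is not None and failures >= max_attempts:
                raise
            print(f"AMQP server on {host}:{port} is unreachable: {exc}. "
                  f"Trying again in {delay} seconds.")
            sleep(delay)
```

Called as `wait_for_broker("10.229.40.107", 5672)`, this would block until the broker accepts TCP connections, which is the behaviour the original report expected from the service at startup.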

Changed in aodh:
status: Triaged → Incomplete
gordon chung (chungg) wrote :

no feedback.

Changed in aodh:
status: Incomplete → Invalid