Ceilometer agent compute cannot reconnect to RabbitMQ after RabbitMQ failover
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Mirantis OpenStack | Invalid | High | Dmitry Sutyagin |
6.1.x | Invalid | High | Dmitry Sutyagin |
Bug Description
After a RabbitMQ failover, the following error is observed in the ceilometer compute agent log:
node-3:
Thu Mar 3 16:12:34 UTC 2016
node-3:
2016-03-02 03:38:01.240 17372 INFO ceilometer.agent [-] Polling pollster network.
2016-03-02 03:38:01.347 17372 INFO ceilometer.agent [-] Polling pollster memory.usage in the context of meter_source
2016-03-02 03:38:01.357 17372 INFO ceilometer.agent [-] Polling pollster instance in the context of meter_source
2016-03-02 03:38:01.461 17372 INFO ceilometer.agent [-] Polling pollster network.
2016-03-02 03:39:00.040 17372 ERROR oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
2016-03-02 03:39:00.040 17372 TRACE oslo.messaging.
No new messages appear in the log after that, although the service is still running.
strace shows that the service is constantly calling epoll_wait:
node-3:
Process 17372 attached - interrupt to quit
epoll_wait(5, {}, 1023, 26) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 49) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
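To make the pattern in the strace output easier to quantify, here is a minimal Python sketch that tallies epoll_wait calls by timeout and by return value. The function name `summarize_epoll` and the regular expression are assumptions made for illustration, and the sample is a short excerpt of the output above. A timeout of 0 makes epoll_wait return immediately, and a return value of 0 means no file descriptor was ready, so a large share of such calls is consistent with the process polling in a loop instead of blocking on a broker socket.

```python
import re
from collections import Counter

# Matches strace lines such as: epoll_wait(5, {}, 1023, 26) = 0
# Groups: epoll fd, maxevents, timeout in ms, return value.
EPOLL_RE = re.compile(
    r"epoll_wait\((\d+), .*?, (\d+), (-?\d+)\)\s*=\s*(-?\d+)"
)


def summarize_epoll(lines):
    """Count epoll_wait calls by timeout (ms) and by return value."""
    timeouts = Counter()
    returns = Counter()
    for line in lines:
        match = EPOLL_RE.search(line)
        if match is None:
            continue  # not an epoll_wait line, e.g. the "Process ... attached" banner
        timeouts[int(match.group(3))] += 1
        returns[int(match.group(4))] += 1
    return timeouts, returns


# Short excerpt of the strace output quoted in this report.
sample = """\
Process 17372 attached - interrupt to quit
epoll_wait(5, {}, 1023, 26) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 0) = 0
epoll_wait(5, {}, 1023, 49) = 0
""".splitlines()

timeouts, returns = summarize_epoll(sample)
print("calls per timeout (ms):", dict(timeouts))
print("calls per return value:", dict(returns))
```

Feeding the complete strace capture through `summarize_epoll` would show how many calls use a zero timeout and whether any call ever reports a ready descriptor.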
Changed in mos:
status: New → Confirmed
importance: Undecided → High
Best I could do in terms of getting a traceback:
(gdb) bt
#0 0x00007f127bd64f82 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x0000000000512b8e in ?? ()
#2 0x00000000004b5d01 in PyEval_EvalFrameEx ()
#3 0x00000000004b6257 in PyEval_EvalFrameEx ()
#4 0x00000000004bc463 in PyEval_EvalCodeEx ()
#5 0x00000000004b645b in PyEval_EvalFrameEx ()
#6 0x00000000004bc463 in PyEval_EvalCodeEx ()
#7 0x00000000004491df in ?? ()
#8 0x000000000041b10a in PyObject_Call ()
#9 0x00000000004306be in ?? ()
#10 0x000000000041b10a in PyObject_Call ()
#11 0x00000000004b54d6 in PyEval_CallObjectWithKeywords ()
#12 0x00007f127b652a66 in ?? () from /usr/lib/python2.7/dist-packages/greenlet.so
#13 0x00007f127b6523b0 in ?? () from /usr/lib/python2.7/dist-packages/greenlet.so
#14 0x00007f127b652f36 in ?? () from /usr/lib/python2.7/dist-packages/greenlet.so
#15 0x00000000004b5d01 in PyEval_EvalFrameEx ()
#16 0x00000000004b6257 in PyEval_EvalFrameEx ()
#17 0x00000000004bc463 in PyEval_EvalCodeEx ()
#18 0x00000000004b645b in PyEval_EvalFrameEx ()
#19 0x00000000004b6257 in PyEval_EvalFrameEx ()
#20 0x00000000004b6257 in PyEval_EvalFrameEx ()
#21 0x00000000004b6257 in PyEval_EvalFrameEx ()
#22 0x00000000004b6257 in PyEval_EvalFrameEx ()
#23 0x00000000004b6257 in PyEval_EvalFrameEx ()
#24 0x00000000004bc463 in PyEval_EvalCodeEx ()
#25 0x00000000004b645b in PyEval_EvalFrameEx ()
#26 0x00000000004bc463 in PyEval_EvalCodeEx ()
#27 0x00000000004b645b in PyEval_EvalFrameEx ()
#28 0x00000000004b6257 in PyEval_EvalFrameEx ()
#29 0x00000000004bc463 in PyEval_EvalCodeEx ()
#30 0x00000000004bcf12 in PyEval_EvalCode ()
#31 0x00000000004dc202 in ?? ()
#32 0x00000000004dcdf4 in PyRun_FileExFlags ()
#33 0x00000000004dd8fe in PyRun_SimpleFileExFlags ()
#34 0x00000000004ee202 in Py_Main ()
#35 0x00007f127bc9276d in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#36 0x000000000041cbd9 in _start ()
Not really useful I guess.
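The gdb backtrace only exposes C-level interpreter frames, so it does not say which Python code is spinning. One possible way to get Python-level frames on a future reproduction is a signal handler that dumps the stack of every OS thread. This is a rough sketch using only the standard library; the helper names `format_thread_stacks` and `install_stack_dump_handler` are hypothetical and not part of ceilometer or oslo.messaging. Two caveats: the handler must be installed before the process hangs, and under a greenlet-based hub `sys._current_frames()` only reports the greenlet currently running in each OS thread, not every suspended greenlet.

```python
import signal
import sys
import traceback


def format_thread_stacks():
    """Return the current Python stack of every OS thread as one string."""
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        chunks.append("Thread %d:\n" % thread_id)
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)


def install_stack_dump_handler(signum=signal.SIGUSR1):
    """Write all thread stacks to stderr whenever signum is received.

    Hypothetical helper: it would have to be called at service startup,
    before the hang occurs.
    """
    def handler(received_signum, frame):
        sys.stderr.write(format_thread_stacks())

    signal.signal(signum, handler)


# Direct call for demonstration; in a service the handler would be
# triggered from outside with `kill -USR1 <pid>`.
dump = format_thread_stacks()
print(dump)
```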