Nailgun Receiver hangs and does not process Astute messages
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Fix Committed | Medium | Georgy Kibardin |
Mitaka | In Progress | Medium | Bulat Gaifullin |
Newton | Fix Committed | Medium | Georgy Kibardin |
Bug Description
Sometimes, if something goes wrong (an unexpected RabbitMQ crash or something similar), the TCP connections are invalidated. However, Nailgun Receiver may not be notified of this and may get stuck in a 'recv' call:
[root@nailgun ~]# strace -p 44
Process 44 attached
recvfrom(3,
where 3 is the socket connected to RabbitMQ and 44 is the PID of the receiver process.
However, on the RabbitMQ side we can see that there are no consumers on the nailgun queue (the queue Nailgun uses to retrieve messages from Astute), and there are messages waiting in the queue:
[root@nailgun /]# rabbitmqctl list_queues name consumers messages
Listing queues ...
nailgun 0 7
naily 7 0
naily_
naily_
naily_
naily_
naily_
naily_
naily_
Moreover, the socket on the Receiver side has only the "FREAD" flag set, while it should at least have "O_NONBLOCK" (since it is initially opened that way). This is very strange, and it means that we hang waiting for input from the socket and will only find out that the connection is dead when TCP keepalive checks it (after 7200 seconds by default).
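The flag state described above can also be checked from inside a Python process by querying the descriptor flags. This is a minimal sketch, assuming Linux; the helper name `is_nonblocking` is hypothetical and is not part of Nailgun.

```python
import fcntl
import os
import socket


def is_nonblocking(sock):
    """Return True if O_NONBLOCK is set on the socket's file descriptor."""
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFL)
    return bool(flags & os.O_NONBLOCK)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)
    print(is_nonblocking(s))  # flag expected to be set here
    s.setblocking(True)
    print(is_nonblocking(s))  # flag expected to be cleared here
    s.close()
```

If such a check reports a blocking socket where a non-blocking one was expected, a `recv` on it can wait indefinitely for a peer that is already gone.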
Apparently, we must have some mechanism to detect this sooner. So we need to either enable RabbitMQ heartbeats for the Receiver, or decrease the TCP keepalive timeout to, let's say, 1 minute.
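As an illustration of the second option only, per-socket keepalive can be tuned with standard socket options. This is a sketch rather than the actual fix: the function name `enable_fast_keepalive` and the chosen values are assumptions, and the `TCP_KEEP*` options are Linux-specific. How the Receiver would get access to the underlying socket of its AMQP connection is not covered here.

```python
import socket


def enable_fast_keepalive(sock, idle=30, interval=10, probes=3):
    """Enable TCP keepalive with short timers on one socket.

    A dead peer is then detected after roughly
    idle + interval * probes seconds (about 60 s with these defaults).
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The TCP_KEEP* constants are not available on every platform,
    # so any that are missing are skipped.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before giving up
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_fast_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

Setting the options per socket leaves the system-wide keepalive defaults untouched, so other services on the node are not affected.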
P.S.: the hung receiver issue has occurred a few times on QA environments.
tags: added: module-nailgun
Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Georgy Kibardin (gkibardin)
status: Confirmed → In Progress
tags: added: keep-in-9.0
(This check performed automatically)
Please, make sure that bug description contains the following sections filled in with the appropriate data related to the bug you are describing:
actual result
steps to reproduce
For more detailed information on the contents of each of the listed sections see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug