Rabbit reports "file descriptor limit alarm set", does not accept connections

Bug #1905423 reported by Peter Sabaini
This bug affects 1 person
Affects: rabbitmq-server (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

We had an incident where rabbitmq-server would stop accepting connections, with the following message in the log:

=WARNING REPORT==== 24-Nov-2020::09:46:02 ===
file descriptor limit alarm set.

********************************************************************
*** New connections will not be accepted until this alarm clears ***
********************************************************************

However, the configured fd limits are quite high:

root@juju-efcc41-12-lxd-0:~# cat /etc/default/rabbitmq-server
# Generated by juju
# bump ulimit so rabbit can support lots of connections
ulimit -n 65536

root@juju-efcc41-12-lxd-0:~# grep Limit /lib/systemd/system/rabbitmq-server.service
LimitNOFILE=65536

On the other hand, checking fd usage with `lsof -u rabbitmq` reported only about 900 open fds.

After bouncing rabbitmq, it resumed accepting connections.
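
The files shown above only state the configured limits; the limit a running process actually received can be read from /proc. A minimal sketch for Linux follows. The function names nofile_soft_limit and open_fd_count are made up for illustration and are not part of rabbitmq-server:

```shell
#!/bin/sh
# Hypothetical helpers (illustration only), relying on the Linux /proc layout.

# Print the soft "Max open files" limit that process $1 is running with.
# The line in /proc/PID/limits reads: Max open files <soft> <hard> files
nofile_soft_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Count the file descriptors that process $1 currently holds open.
open_fd_count() {
    ls "/proc/$1/fd" | wc -l
}
```

Assuming the Erlang VM running rabbit shows up with "beam" in its command line, something like `pid=$(pgrep -u rabbitmq -f beam | head -n 1)` followed by `nofile_soft_limit "$pid"` and `open_fd_count "$pid"` (run as root) would show whether the 65536 setting actually reached the process.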

Versions:

rabbitmq-server 3.5.7-1ubuntu0.16.04.2
Ubuntu 16.04.5 LTS

Sergio Durigan Junior (sergiodj) wrote :

Thank you for taking the time to file a bug report.

Would you be able to come up with a way for us to reproduce this bug? I understand that it must be happening in a production environment where thousands of connections are being made, but maybe we can come up with a way to reproduce it locally as well.

Does this bug happen often, or is it the first time it happened?

I tried finding some similar cases on the internet, but what I found were reports of real scenarios where increasing the fd limit solved the problem. Curiously, one of the solutions was to add the LimitNOFILE directive to the systemd unit file, which is already present in your case.

I'm subscribing the Ubuntu Server team to this bug so that we can follow it closely.

Changed in rabbitmq-server (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for rabbitmq-server (Ubuntu) because there has been no activity for 60 days.]

Changed in rabbitmq-server (Ubuntu):
status: Incomplete → Expired
Changed in rabbitmq-server (Ubuntu):
status: Expired → New
Peter Sabaini (peter-sabaini) wrote :

I unfortunately do not have a way to reproduce this. This indeed happened in a large-ish production env with lots of clients. It's the first time I saw this.

One thing I'm wondering, if rabbit were started with

$ sudo rabbitmqctl start_app

the systemd unit file probably would not go into effect, and the default limits would apply, right?

It's possible that, in the course of troubleshooting, this rabbit instance had been started up via rabbitmqctl (I don't know for sure, though).

Sergio Durigan Junior (sergiodj) wrote :

Thanks for the reply, Peter.

Yes, if you start rabbit via rabbitmqctl directly from the command line, then the systemd unit will not be invoked, and the process will use the default limits specified by the system. If that is how it was started, then what you can do is modify the limits before executing the command, and see whether that helps mitigate the issue.
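
As a rough illustration of setting the limit before executing a command: a process inherits the open-files limit of the shell that launches it, so the limit can be set in a subshell right before the launch. The helper name with_nofile_limit below is hypothetical, a sketch rather than anything shipped with the package:

```shell
#!/bin/sh
# with_nofile_limit is a hypothetical helper, not part of RabbitMQ or Ubuntu.
# Usage: with_nofile_limit LIMIT COMMAND [ARGS...]
# Runs COMMAND with the given soft open-files limit. The subshell keeps the
# calling shell's own limit unchanged.
with_nofile_limit() {
    (
        limit=$1
        shift
        # Only launch the command if the limit could actually be applied.
        ulimit -S -n "$limit" && "$@"
    )
}
```

For example, `with_nofile_limit 256 sh -c 'ulimit -S -n'` prints 256. Going above the current hard limit requires root, so something like `with_nofile_limit 65536 rabbitmq-server -detached` would have to be run from a shell that is allowed to raise the limit that far.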

Since it is still unclear how to reproduce the issue reliably, and given that the program might have been executed without setting the limits first, I'm marking this as Incomplete again. Let us know if you come up with more details about the issue. Thanks!

Changed in rabbitmq-server (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for rabbitmq-server (Ubuntu) because there has been no activity for 60 days.]

Changed in rabbitmq-server (Ubuntu):
status: Incomplete → Expired