Root file descriptor limits being applied to non-root processes with higher limits
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Ubuntu | Triaged | Medium | Unassigned | | |
Bug Description
We've found that when we run Zimbra Collaboration Suite on Ubuntu 10.04 LTS, we often get "bad file descriptor" errors. We can work around this by modifying limits.conf to give root a higher file descriptor limit. However, the processes in question run as the zimbra user, which already has a higher file descriptor limit than root. This issue did not occur in Ubuntu 8.04 LTS.
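One way to confirm which limit a process is really running with is to read it from the process itself instead of from limits.conf. The following is only a minimal sketch, assuming Linux with /proc mounted; the function names are illustrative and are not part of Zimbra or Ubuntu tooling.

```python
import resource


def parse_max_open_files(limits_text):
    """Return (soft, hard) as strings from the text of /proc/<pid>/limits.

    Values are kept as strings because either one may be "unlimited".
    Returns None if the "Max open files" row is not present.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # fields: ['Max', 'open', 'files', <soft>, <hard>, 'files']
            return fields[3], fields[4]
    return None


def current_nofile_limits():
    """Return (soft, hard) RLIMIT_NOFILE for the calling process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

For a running Java or nginx process, passing the contents of /proc/<pid>/limits to parse_max_open_files would show whether it inherited the zimbra user's limit or root's.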
Details from the java process:
1. Whether we use jdk1.5.08 or jdk1.6.22, without and with
http://
close, do not repeat the close operation to close a valid file descriptor just
open) , we still see this bug.
2. I instrumented the source code of SocketOutputStream. When we hit the BFD
exception, prior to writing to the fd, the fd's number is a positive integer,
which indicates the fd is valid.
strace did show something different for the thread that hit the exception:
when the thread attempts to write to the output stream, the underlying trace
shows only connect, without poll
connect(1029, {sa_family=AF_INET, sin_port=
sin_addr=
progress
while a healthy connection usually has a poll immediately after connect returns
EINPROGRESS.
[pid 2609] 14:43:18 connect(176, {sa_family=AF_INET, sin_port=
sin_addr=
progress)
[pid 2609] 14:43:18 poll([{fd=176, events=POLLOUT}], 1, 30000) = 1 ([{fd=176,
revents=POLLOUT}])
The socket is non-blocking and the connection cannot be completed immediately.
It is possible to select(2) or poll(2) for completion by selecting the socket
for writing. After select indicates writability, use getsockopt(2) to read the
SO_ERROR option at level SOL_SOCKET to determine whether connect completed
why the poll call is missing.
3. We also experience Java crashes along with BFD.
https:/
4. This only happens on
DISTRIB_ID=Ubuntu
DISTRIB_
DISTRIB_
DISTRIB_
Linux zmp-2 2.6.32-25-server #45-Ubuntu SMP Sat Oct 16 20:06:58 UTC 2010 x86_64
GNU/Linux
5. In the lab, BFD can always be worked around by raising the number of fds for
root. Both BFD and the Java crash are then gone.
6. On this Ubuntu 10.04 server, nginx will segfault if the root fd limit is
1024 while the default worker_connections is 10240. On other platforms, or on
Ubuntu 8, nginx will not work properly either, but it does not segfault.
Looks to be related to:
https://bugzilla.zimbra.com/show_bug.cgi?id=42870
( On a personal note, thanks for the thorough report, having the right information at hand is nice :) )