Root file descriptor limits being applied to non-root processes with higher limits

Bug #672749 reported by Quanah Gibson-Mount
This bug affects 1 person
Affects: Ubuntu
Status: Triaged
Importance: Medium
Assigned to: Unassigned

Bug Description

We've found that when we run Zimbra Collaboration Suite on Ubuntu 10.04 LTS, we often get "bad file descriptor" errors. We can work around this by modifying limits.conf to give root a higher number of file descriptors. However, the processes in question run as the zimbra user, which already has a higher file descriptor limit. This issue did not occur in Ubuntu 8.04 LTS.

Details from the java process:

1. Whether we use jdk1.5.08 or jdk1.6.22, and with or without
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6593729 (After failed file
close, do not repeat the close operation to close a valid file descriptor just
open), we still see this bug.

2. I instrumented the source code of SocketOutputStream. When we hit the BFD
(bad file descriptor) exception, just before writing to the fd, the fd number
is a positive integer, which indicates that the fd is valid.
strace, however, showed something different for the thread that hit the
exception. When the thread attempts to write to the output stream, the
underlying trace shows only a connect with no poll:
 connect(1029, {sa_family=AF_INET, sin_port=htons(389), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)

A healthy connection, by contrast, usually has a poll immediately after
connect returns EINPROGRESS:
[pid 2609] 14:43:18 connect(176, {sa_family=AF_INET, sin_port=htons(389), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 2609] 14:43:18 poll([{fd=176, events=POLLOUT}], 1, 30000) = 1 ([{fd=176, revents=POLLOUT}])

The socket is non-blocking and the connection cannot be completed immediately.
It is possible to select(2) or poll(2) for completion by selecting the socket
for writing. After select indicates writability, use getsockopt(2) to read the
SO_ERROR option at level SOL_SOCKET to determine whether connect completed

So the question is why the poll call is missing in the failing case.
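For reference, the connect / poll / SO_ERROR sequence quoted above can be sketched in a few lines. This is only a minimal illustration in Python of what a healthy non-blocking connect looks like at the syscall level, not the JDK's actual implementation; the function name `nonblocking_connect` and the default timeout (taken from the 30000 ms seen in the healthy trace) are assumptions made for the example.

```python
import errno
import os
import select
import socket


def nonblocking_connect(host, port, timeout_ms=30000):
    """Hypothetical helper: non-blocking connect, then poll for completion."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)

    # connect_ex returns the errno instead of raising.
    # EINPROGRESS is the expected result for a non-blocking socket.
    err = sock.connect_ex((host, port))
    if err not in (0, errno.EINPROGRESS):
        sock.close()
        raise OSError(err, os.strerror(err))

    # Wait for writability, as in the healthy trace:
    #   poll([{fd=..., events=POLLOUT}], 1, 30000)
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLOUT)
    if not poller.poll(timeout_ms):
        sock.close()
        raise TimeoutError("connect did not complete in time")

    # Writability alone does not mean success; SO_ERROR holds the outcome.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock
```

In the failing trace, the step corresponding to `poller.poll(...)` never appears after connect returns EINPROGRESS.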

3. We also see Java crashes along with the BFD errors:
https://bugzilla.zimbra.com/show_bug.cgi?id=52285#c2

4. This only happens on:
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.1 LTS"
Linux zmp-2 2.6.32-25-server #45-Ubuntu SMP Sat Oct 16 20:06:58 UTC 2010 x86_64 GNU/Linux

5. In the lab, BFD can always be worked around by raising the number of fds
for root. Both the BFD errors and the Java crashes then go away.
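To confirm which limit a given process is actually running with (as opposed to what limits.conf says for its user), the kernel's view can be read from /proc/<pid>/limits. The following is a rough Python sketch; the helper names `parse_max_open_files` and `open_file_limits` are made up for this example, and it assumes the usual Linux layout of the "Max open files" line.

```python
import os
import resource
import sys


def parse_max_open_files(limits_text):
    """Return (soft, hard) as strings from the text of /proc/<pid>/limits."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Fields: "Max", "open", "files", soft, hard, units.
            # Values are kept as strings because they may be "unlimited".
            return fields[3], fields[4]
    raise ValueError("no 'Max open files' line found")


def open_file_limits(pid):
    """Read the open-file limits the kernel applies to a running process."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_max_open_files(f.read())


if __name__ == "__main__":
    # Usage: python fdlimits.py [pid]  (defaults to this process)
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    print("this process (getrlimit):", resource.getrlimit(resource.RLIMIT_NOFILE))
    print(f"pid {pid} (soft, hard):", open_file_limits(pid))
```

Running it against the pid of the Java process and against a root-owned process should show whether the zimbra processes really have the higher limit at runtime.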

6. On this Ubuntu 10.04 server, nginx segfaults if the root fd limit is
1024 while the default worker_connections is 10240. On other platforms, or on
Ubuntu 8, nginx does not work properly in that case either, but it does not
segfault.

Paul Tagliamonte (paultag) wrote :

Looks to be related to:

https://bugzilla.zimbra.com/show_bug.cgi?id=42870

( On a personal note, thanks for the thorough report, having the right information at hand is nice :) )

Changed in ubuntu:
status: New → Triaged
importance: Undecided → Medium