EAGAIN on file when using RNG after daemon fork

Bug #430908 reported by Stephen Day
This bug affects 3 people
Affects    Status     Importance  Assigned to    Milestone
paramiko   Confirmed  Medium      Robey Pointer

Bug Description

I ran into this while building a daemon that uses paramiko. The problem appeared only after I moved the import of the paramiko module ahead of the fork point. The workaround was to move the import into a function that runs after the fork.

Here is the backtrace:

2009-09-16 11:19:25,713 DEBUG [paramiko.transport] starting thread (client mode): 0x9692d8cL
2009-09-16 11:19:25,767 INFO [paramiko.transport] Connected (version 2.0, client OpenSSH_4.3)
2009-09-16 11:19:25,768 ERROR [paramiko.transport] Unknown exception: [Errno 11] Resource temporarily unavailable
2009-09-16 11:19:25,768 ERROR [paramiko.transport] Traceback (most recent call last):
2009-09-16 11:19:25,768 ERROR [paramiko.transport] File "build/bdist.linux-i686/egg/paramiko/transport.py", line 1510, in run
2009-09-16 11:19:25,768 ERROR [paramiko.transport] self._send_kex_init()
2009-09-16 11:19:25,768 ERROR [paramiko.transport] File "build/bdist.linux-i686/egg/paramiko/transport.py", line 1675, in _send_kex_init
2009-09-16 11:19:25,768 ERROR [paramiko.transport] m.add_bytes(randpool.get_bytes(16))
2009-09-16 11:19:25,769 ERROR [paramiko.transport] File "build/bdist.linux-i686/egg/paramiko/rng.py", line 107, in get_bytes
2009-09-16 11:19:25,769 ERROR [paramiko.transport] entropy_data = self.entropy.read(N)
2009-09-16 11:19:25,769 ERROR [paramiko.transport] File "build/bdist.linux-i686/egg/paramiko/rng_posix.py", line 33, in read
2009-09-16 11:19:25,769 ERROR [paramiko.transport] return self.file.read(bytes)
2009-09-16 11:19:25,769 ERROR [paramiko.transport] IOError: [Errno 11] Resource temporarily unavailable
2009-09-16 11:19:25,769 ERROR [paramiko.transport]

Looking at the code, we can see that all StrongLockingRandomPool objects will use a global rng_device:

if ((platform is not None and platform.system().lower() == 'windows') or
        sys.platform == 'win32'):
    # MS Windows
    from paramiko import rng_win32
    rng_device = rng_win32.open_rng_device()
else:
    # Assume POSIX (any system where /dev/urandom exists)
    from paramiko import rng_posix
    rng_device = rng_posix.open_rng_device()

It seems this file descriptor gets closed during the daemon fork if this module is imported before daemonizing (I am using python-daemon). I would argue that each Transport should have its own file descriptor for the random number generator. That way, the descriptor is created together with the Transport, which most likely happens inside the daemon context. The other option is to catch this error and reopen the fd.
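A minimal sketch of the per-Transport idea, assuming a POSIX system with /dev/urandom. `EntropySource` and `DemoTransport` are hypothetical names used only for illustration; they are not paramiko classes. The 16-byte read mirrors the `get_bytes(16)` call in the traceback.

```python
import os


class EntropySource:
    """Hypothetical per-object handle on the system RNG device."""

    def __init__(self, path="/dev/urandom"):
        # The descriptor is opened when the object is created, so an object
        # built after the daemon fork gets a descriptor that is still valid.
        self._fd = os.open(path, os.O_RDONLY)

    def fileno(self):
        return self._fd

    def read(self, n):
        # Loop because a single read() may return fewer bytes than requested.
        data = b""
        while len(data) < n:
            chunk = os.read(self._fd, n - len(data))
            if not chunk:
                raise OSError("entropy source returned no data")
            data += chunk
        return data

    def close(self):
        if self._fd is not None:
            os.close(self._fd)
            self._fd = None


class DemoTransport:
    """Hypothetical stand-in for a transport that owns its entropy source."""

    def __init__(self):
        self.entropy = EntropySource()

    def kex_cookie(self):
        # 16 random bytes, as requested by _send_kex_init in the traceback.
        return self.entropy.read(16)

    def close(self):
        self.entropy.close()
```

The cost is one open descriptor per transport for as long as the transport lives.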

Robey Pointer (robey)
Changed in paramiko:
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Robey Pointer (robey)
Robey Pointer (robey) wrote :

that might use up file descriptors fast. odd that the error is EAGAIN instead of some error that would indicate the file is closed. i wonder if the kernel is catching that 2 processes have the same pool reference and that's its passive-aggressive way of saying quit it.

might make sense to catch EAGAIN there and re-open /dev/urandom?
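The catch-and-reopen suggestion could look roughly like the sketch below. It is an illustration under assumptions, not paramiko code: `ReopeningEntropy` is a hypothetical name, the `opener` and `reader` parameters exist only so the behaviour can be exercised without a real device, and EBADF is treated as retryable in addition to EAGAIN on the assumption that a closed descriptor may show up that way.

```python
import errno
import os

# Errors that are taken to mean "the descriptor is no longer usable".
RETRYABLE = (errno.EAGAIN, errno.EBADF)


class ReopeningEntropy:
    """Hypothetical wrapper that reopens the RNG device after a failed read."""

    def __init__(self, path="/dev/urandom", opener=None, reader=os.read):
        self._opener = opener or (lambda: os.open(path, os.O_RDONLY))
        self._reader = reader
        self._fd = self._opener()

    def read(self, n):
        try:
            return self._reader(self._fd, n)
        except OSError as exc:
            if exc.errno not in RETRYABLE:
                raise
        # The stale descriptor is deliberately left alone: after
        # daemonization its number may already belong to something else,
        # and closing it here could close the wrong file.
        self._fd = self._opener()
        # A second failure propagates to the caller.
        return self._reader(self._fd, n)
```

Only one retry is attempted, so a persistent failure still reaches the caller as an exception.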

Stephen Day (sjaday) wrote : Re: [Bug 430908] Re: EAGAIN on file when using RNG after daemon fork

You are right about the file descriptors, but moving away from a global
variable would make things a little less pathological.

It's been a while since I filed this, but if my memory serves me correctly,
your suggestion should work. You could also look at how often the random
numbers are actually required; perhaps, explicitly opening and closing the
file as needed, rather than leaving it open, might be cleaner if accesses
are sparse.

Thanks for the reply on this issue.
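The open-and-close-as-needed idea mentioned above can be sketched as follows, assuming a POSIX system with /dev/urandom. `read_entropy` is a hypothetical helper name, not part of paramiko.

```python
def read_entropy(n, path="/dev/urandom"):
    """Open the RNG device, read exactly n bytes, and close it again."""
    chunks = []
    remaining = n
    # buffering=0 avoids reading more entropy than was asked for.
    with open(path, "rb", buffering=0) as device:
        while remaining > 0:
            chunk = device.read(remaining)
            if not chunk:
                raise OSError("entropy source returned no data")
            chunks.append(chunk)
            remaining -= len(chunk)
    return b"".join(chunks)
```

No descriptor stays open between calls, so nothing is left for a daemon fork to invalidate; the trade-off is an open/close pair per request, which is only reasonable if accesses are sparse.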

On Sun, Nov 1, 2009 at 9:36 PM, Robey Pointer <email address hidden> wrote:

> that might use up file descriptors fast. odd that the error is EAGAIN
> instead of some error that would indicate the file is closed. i wonder
> if the kernel is catching that 2 processes have the same pool reference
> and that's its passive-aggressive way of saying quit it.
>
> might make sense to catch EAGAIN there and re-open /dev/urandom?

Robey Pointer (robey)
Changed in paramiko:
status: Confirmed → Incomplete
status: Incomplete → Confirmed
Juan J. Martínez (jjmartinez) wrote :

I can't re-open the file, and I've tried several approaches.

In the end I was able to work around the problem as follows:

 1. When creating the daemon context, use files_preserve to list the file descriptor that you don't want closed. That's not easy, because there may be unused file descriptors before the rng file handle.
 2. When you open the daemon context, call Crypto.Random.atfork().

This way I can use paramiko with daemon and it seems to work properly, although figuring out which file descriptor belongs to the rng file is a little tricky.
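One possible way to find the descriptor for step 1 is to scan the process's own descriptor table. This is a sketch that assumes Linux, where /proc/self/fd lists open descriptors as symlinks; `fds_open_on` is a hypothetical helper name, and on systems without /proc it simply returns an empty list.

```python
import os


def fds_open_on(path="/dev/urandom", fd_dir="/proc/self/fd"):
    """Return the sorted descriptor numbers in this process that refer to path."""
    target = os.path.realpath(path)
    found = []
    try:
        names = os.listdir(fd_dir)
    except OSError:
        # No /proc-style descriptor directory available (assumption: Linux only).
        return found
    for name in names:
        try:
            # Each entry is a symlink to whatever the descriptor refers to.
            if os.readlink(os.path.join(fd_dir, name)) == target:
                found.append(int(name))
        except (OSError, ValueError):
            # The descriptor vanished while scanning, or the entry is not a number.
            continue
    return sorted(found)
```

The resulting list could then be passed as files_preserve when creating the daemon context, with Crypto.Random.atfork() called once the context is open, as described in the two steps above.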
