Paramiko livelock on SSH session rekeying

Bug #378511 reported by Andreux Fort
2
Affects Status Importance Assigned to Milestone
paramiko
Fix Released
Medium
Robey Pointer

Bug Description

This took us a little while (like 2 months ;) to track down, but we have a patch that fixes it.

In paramiko 1.7.4.

From my notes:

     When Paramiko's transport drops out after we initiate key exchange, we
     can block forever. Avoid this blocking forever without other
     modifications to the logic, by closing transports that don't complete key
     exchange within some timeout.

The livelock occurs because:

The transport decides (in packet.py) that rekeying is required because we've received more than 2^30 bytes or packets. In our case, it's 2^30 bytes.

In transport.py: _clear_to_send (a threading.Event) is cleared at the beginning of key exchange, so that other frames that come in must be part of the key exchange.

The connection at this point dies.

The _clear_to_send is NEVER set, because to be set, we have to receive packets from the transport that progress the SSH protocol.

Because there's no timeout on the clear_to_send.wait() in Transport.SendUserMessage, we block forever when this happens. This breaks all of the running paramiko, since every call to Transport.SendUserMessage on the same transport

Anyhow, here's a patch to transport.py that works for us. Excuse all the extra debugging messages, feel free to remove those.

----

$ diff transport.py#2

paramiko/transport.py#2 - /home/afort/src/paramiko/transport.py ====
--- /tmp/tmp.5989.25 2009-05-19 16:14:40.000000000 -0700
+++ /home/afort/src/paramiko/transport.py 2009-05-19 16:13:54.101609000 -0700
@@ -25,11 +25,11 @@
 import string
 import struct
 import sys
+import thread
 import threading
 import time
 import weakref

-import google3
 from paramiko import util
 from paramiko.auth_handler import AuthHandler
 from paramiko.channel import Channel
@@ -329,6 +329,7 @@
         self.saved_exception = None
         self.clear_to_send = threading.Event()
         self.clear_to_send_lock = threading.Lock()
+ self.clear_to_send_timeout = 30.0
         self.log_name = 'paramiko.transport'
         self.logger = util.get_logger(self.log_name)
         self.packetizer.set_log(self.logger)
@@ -819,6 +820,8 @@
         @param bytes: the number of random bytes to send in the payload of the
             ignored packet -- defaults to a random number from 10 to 41.
         @type bytes: int
+
+ @raise SSHException: if the session ends prematurely.
         """
         m = Message()
         m.add_byte(chr(MSG_IGNORE))
@@ -884,6 +887,8 @@
             request was successful (or an empty L{Message} if C{wait} was
             C{False}); C{None} if the request was denied.
         @rtype: L{Message}
+
+ @raise SSHException: if the session fails during send_user_message.
         """
         if wait:
             self.completion_event = threading.Event()
@@ -962,7 +967,7 @@
         @param pkey: a private key to use for authentication, if you want to
             use private key authentication; otherwise C{None}.
         @type pkey: L{PKey<pkey.PKey>}
-
+
         @raise SSHException: if the SSH2 negotiation fails, the host key
             supplied by the server is incorrect, or authentication fails.
         """
@@ -1392,7 +1397,10 @@
         """
         send a message, but block if we're in key negotiation. this is used
         for user-initiated requests.
+
+ @raise SSHException: if the session times out during key exchange.
         """
+ start = time.time()
         while True:
             self.clear_to_send.wait(0.1)
             if not self.active:
@@ -1402,6 +1410,10 @@
             if self.clear_to_send.isSet():
                 break
             self.clear_to_send_lock.release()
+ if time.time() > start + self.clear_to_send_timeout:
+ raise SSHException('Key-exchange timed out after waiting for '
+ 'clear_to_send for %.1f seconds.'
+ % self.clear_to_send_timeout)
         try:
             self._send_message(data)
         finally:
@@ -1487,6 +1499,8 @@

             while self.active:
                 if self.packetizer.need_rekey() and not self.in_kex:
+ self._log(DEBUG, '%s packetizer.need_rekey() == True' %
+ thread.get_ident())
                     self._send_kex_init()
                 try:
                     ptype, m = self.packetizer.read_message()
@@ -1577,6 +1591,8 @@
         # throws SSHException on anything unusual
         self.clear_to_send_lock.acquire()
         try:
+ self._log(DEBUG, '%s _negotiate_keys: clear_to_send.clear()' %
+ thread.get_ident())
             self.clear_to_send.clear()
         finally:
             self.clear_to_send_lock.release()
@@ -1629,6 +1645,8 @@
         """
         self.clear_to_send_lock.acquire()
         try:
+ self._log(DEBUG, '%s _send_kex_init(): clear_to_send.clear()' %
+ thread.get_ident())
             self.clear_to_send.clear()
         finally:
             self.clear_to_send_lock.release()
@@ -1662,9 +1680,16 @@
         m.add_int(0)
         # save a copy for later (needed to compute a hash)
         self.local_kex_init = str(m)
+
+ # The problem occurs after clearing clear_to_send, above. The
+ # session can go down before we send the message below or
+ # receive the MSG_NEWKEYS message, re-entering _parse_newkeys to
+ # set clear_to_send. So we wait on clear_to_send with a strict
+ # timeout, after which the session is closed.
         self._send_message(m)

     def _parse_kex_init(self, m):
+ self._log(DEBUG, '%s in _parse_kex_init()' % thread.get_ident())
         cookie = m.get_bytes(16)
         kex_algo_list = m.get_list()
         server_key_algo_list = m.get_list()
@@ -1849,6 +1874,8 @@
             self.in_kex = False
         self.clear_to_send_lock.acquire()
         try:
+ self._log(DEBUG, '%s _parse_newkeys: clear_to_send.set()' %
+ thread.get_ident())
             self.clear_to_send.set()
         finally:
             self.clear_to_send_lock.release()

---

Revision history for this message
Robey Pointer (robey) wrote :

makes sense to me. will apply.

Changed in paramiko:
assignee: nobody → Robey Pointer (robey)
importance: Undecided → Medium
milestone: none → 1.7.5
status: New → Confirmed
status: Confirmed → Fix Committed
Robey Pointer (robey)
Changed in paramiko:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.