remote interface opens and closes connections too frequently

Bug #1212341 reported by Jonathan Maron
This bug affects 1 person
Affects: Sahara
Status: Fix Released
Importance: Critical
Assigned to: Dmitry Mescheryakov

Bug Description

The remote interface (savanna/utils/remote.py) opens and closes the SSH/SFTP connection per interaction (e.g. execute_command()). In our testing, this constant re-opening and closing of connections yields the following error:

2013-08-13 23:46:40.288 6711 ERROR paramiko.transport [-]
Traceback (most recent call last):
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 346, in fire_timers
    timer()
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/eventlet/hubs/timer.py", line 56, in __call__
    cb(*args, **kw)
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/eventlet/semaphore.py", line 121, in _do_acquire
    waiter.switch()
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/eventlet/greenthread.py", line 194, in main
    result = function(*args, **kwargs)
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/savanna/context.py", line 132, in wrapper
    func(*args, **kwargs)
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/savanna/plugins/hdp/hadoopserver.py", line 37, in provision_ambari
    self._setup_and_start_ambari_agent(ambari_info.host.internal_ip)
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/savanna/plugins/hdp/hadoopserver.py", line 90, in _setup_and_start_ambari_agent
    ambari_server_ip)
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/savanna/utils/remote.py", line 131, in replace_remote_string
    with contextlib.closing(self.ssh_connection()) as ssh:
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/savanna/utils/remote.py", line 112, in ssh_connection
    self.instance.node_group.cluster.private_key)
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/savanna/utils/remote.py", line 31, in setup_ssh_connection
    ssh.connect(host, username=username, pkey=private_key)
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/paramiko/client.py", line 311, in connect
    t.start_client()
  File "/root/dev/savanna/.tox/venv/lib/python2.6/site-packages/paramiko/transport.py", line 465, in start_client
    raise e
SSHException: Error reading SSH protocol bannerSecond simultaneous read on fileno 9 detected. Unless you really know what you're doing, make sure that only one greenthread can read any particular socket. Consider using a pools.Pool. If you do know what you're doing and want to disable this error, call eventlet.debug.hub_prevent_multiple_readers(False)

  Although the error seems to indicate a threading issue, our testing clearly shows that the actual cause is the frequent opening and closing of connections.

  Rather than opening and closing these connections on a per-invocation basis, the code should be modified to use a single SSH/SFTP connection for the duration of its use.
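
  A minimal sketch of the connection reuse proposed above, not the actual Savanna code. The class name CachedConnection and its connect factory are hypothetical; in remote.py the factory would presumably wrap the existing setup_ssh_connection() call, which is an assumption here.

```python
class CachedConnection:
    """Lazily open one connection and reuse it until close() is called."""

    def __init__(self, connect):
        # `connect` is a hypothetical zero-argument factory, for example a
        # function that builds and connects a paramiko.SSHClient.
        self._connect = connect
        self._conn = None

    def get(self):
        # Open on first use only; later calls return the same object.
        if self._conn is None:
            self._conn = self._connect()
        return self._conn

    def close(self):
        # Close the cached connection once, when the caller is done with it.
        if self._conn is not None:
            self._conn.close()
            self._conn = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions
```

  Each remote operation would then call get() instead of opening its own connection, and the connection is closed once when the with block exits.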

Tags: 0.3
Changed in savanna:
status: New → Triaged
importance: Undecided → High
milestone: none → 0.3a1
Dmitry Mescheryakov (dmitrymex) wrote :

The issue needs further digging, but I can suggest a workaround for now:

Change ambariplugin.py in the following way:

    def _spawn(self, description, func, args):
        #context.spawn(description, func, args)
        func(args)

That will make deployment single-threaded

Jonathan Maron (jmaron) wrote : Re: [Bug 1212341] Re: remote interface opens and closes connections too frequently

Although it appears to manifest as a threading issue, we've identified it as an issue caused by the frequent closing and opening of SSH connections. Caching the connection solves the issue.


ruhe (ruhe)
Changed in savanna:
assignee: nobody → Dmitry Mescheryakov (dmitrymex)
ruhe (ruhe)
Changed in savanna:
importance: High → Critical
milestone: 0.3a1 → 0.2.2-rc1
ruhe (ruhe)
tags: added: 0.2.2 0.3
OpenStack Infra (hudson-openstack) wrote : Fix proposed to savanna (master)

Fix proposed to branch: master
Review: https://review.openstack.org/45716

Changed in savanna:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to savanna (master)

Reviewed: https://review.openstack.org/45716
Committed: http://github.com/stackforge/savanna/commit/a57790da420f296ff4a184fd610cb7a516721d86
Submitter: Jenkins
Branch: master

commit a57790da420f296ff4a184fd610cb7a516721d86
Author: Dmitry Mescheryakov <email address hidden>
Date: Mon Aug 26 01:27:46 2013 +0400

    Wrapping ssh calls into subprocesses

    Eventlet does not work properly with Paramiko when several connections
    are opened concurrently (see bug #1212341). The fix moves ssh calls
    from main code to subprocess to avoid the issue.

    Also changed:
     * added timeout to all remote operations
     * old SSH utilities were moved from remote.py to integration tests,
       because new ones can not be utilized there

    Fixes: bug #1212341

    Change-Id: Ib89af3a3bbcb587af46dad3431d512a21d1ba826
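
The idea in the commit message (run the remote call in a separate process, with a timeout) can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: the helper name call_in_subprocess and the pickle-over-stdin/stdout protocol are hypothetical, and the target function must not write to stdout itself.

```python
import pickle
import subprocess
import sys

# Code run by the child interpreter: read a pickled (module, function, args)
# request from stdin, call the function, write the pickled result to stdout.
_CHILD = (
    "import importlib, pickle, sys\n"
    "mod, name, args = pickle.load(sys.stdin.buffer)\n"
    "result = getattr(importlib.import_module(mod), name)(*args)\n"
    "pickle.dump(result, sys.stdout.buffer)\n"
)


def call_in_subprocess(module, function, args=(), timeout=60):
    """Run module.function(*args) in a fresh interpreter, with a timeout."""
    proc = subprocess.run(
        [sys.executable, "-c", _CHILD],
        input=pickle.dumps((module, function, tuple(args))),
        stdout=subprocess.PIPE,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
        check=True,       # raises CalledProcessError on a non-zero exit
    )
    return pickle.loads(proc.stdout)
```

Because the child is an ordinary, unpatched Python process, any SSH library it imports runs outside the parent's eventlet hub; the timeout keeps a hung remote operation from blocking the caller indefinitely.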

Changed in savanna:
status: In Progress → Fix Committed
Changed in savanna:
status: Fix Committed → Fix Released
Changed in savanna:
milestone: 0.2.2-rc1 → 0.3a1
status: Fix Released → Fix Committed
tags: removed: 0.2.2
Changed in savanna:
status: Fix Committed → Fix Released
Geraint North (geraint-north) wrote :

Just FYI, this issue just got fixed in paramiko 1.11.2:

https://github.com/paramiko/paramiko/pull/156

From the Paramiko 1.11.2 changelog:

+v1.11.2 (27th Sep 2013)
+-----------------------
+
+* #156: Fix potential deadlock condition when using Channel objects as sockets
+ (e.g. when using SSH gatewaying). Thanks to Steven Noonan and Frank Arnold
+ for catch & patch.
+
https://github.com/paramiko/paramiko/commit/e1851788768b5132181690e5ab03d4d65c466e42

Their experience of the defect was different (not using eventlet), but the fix is the same as the one that I'd come up with to work properly with eventlet.

Changed in savanna:
milestone: 0.3a1 → 0.3