Comment 0 for bug 1455357

Dennis Dmitriev (ddmitriev) wrote :

In bug https://bugs.launchpad.net/fuel/+bug/1416365 we tried to fix the issue by updating fuel-devops: the SSHClient.clear() method was added to close paramiko transports automatically when an SSHClient is deleted or exited [1].

But it doesn't work, for three reasons:
- we don't use SSHClient as a context manager;
- SSHClient instances are deleted only when Proboscis finishes, because we create them directly by IP address rather than through the Node model from devops. So an SSHClient is not deleted when the Environment model of fuel-devops is destroyed after each test case; it lives until the end of the whole job, because Proboscis never deletes test classes after a test case finishes;
- the stop_thread() method in the Paramiko transport is not designed for the load of several hundred threads closing at the same time [2].
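
The first two points can be illustrated with a small stand-in class. This is only a sketch: FakeSSHClient and LongLivedTestClass are hypothetical names, no real connection is opened, and the address is an example value. It assumes that cleanup is tied to __exit__, so it runs only when the client is used in a "with" block; a client held by a long-lived test object stays open.

```python
class FakeSSHClient:
    """Hypothetical stand-in for SSHClient; opens no real connection."""

    def __init__(self, host):
        self.host = host
        self.closed = False

    def clear(self):
        # In a real client this is where the transport would be closed.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.clear()


class LongLivedTestClass:
    """Mimics a test class that the test runner keeps alive for the whole job."""

    def __init__(self):
        self.remote = FakeSSHClient("10.20.0.2")  # example address


def run_without_context_manager():
    test = LongLivedTestClass()
    # The "test case" ends here, but 'test' still references the client,
    # so nothing has called clear() yet.
    return test


def run_with_context_manager():
    with FakeSSHClient("10.20.0.2") as remote:
        pass  # commands would be executed here
    # __exit__ has already called clear() at this point.
    return remote
```

With this stand-in, the client returned by run_without_context_manager() is still open for as long as the test object is referenced, while the one returned by run_with_context_manager() has already been closed.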

As a result:
- SSHClient.clear() is called not at the end of each test case, but at the end of the whole test group, when it exits.
- The method is called correctly by the Python interpreter, but because of the 10-second timeout when waiting for a thread (see [2]), some threads finish in time and some do not.
- The threads that do not finish in time cause the exceptions.

What we can do:
1) Always call remote.clear() from methods in fuel-devops/fuel-qa;
2) Use SSHClient as a context manager everywhere in fuel-devops/fuel-qa;
3) Store all opened transports in the 'Environment' model of fuel-devops, reuse them to reduce the number of opened transports, and delete them when the 'Environment' model is destroyed after each test case. This is not an ideal solution, but it is the easiest way to avoid refactoring the whole fuel-qa source code. 10 seconds should be enough to close fewer than 10 transports.
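
Option 3 could look roughly like the sketch below. All names here (TransportRegistry, FakeSSHClient, get, close_all) are hypothetical and are not part of fuel-devops; the stand-in client only records whether clear() was called. The idea is that one object owned by the environment hands out a single client per host and closes all of them when the environment is torn down.

```python
class FakeSSHClient:
    """Hypothetical stand-in for SSHClient; opens no real connection."""

    def __init__(self, host):
        self.host = host
        self.closed = False

    def clear(self):
        # In a real client this is where the transport would be closed.
        self.closed = True


class TransportRegistry:
    """Hypothetical per-environment cache of SSH clients, keyed by host."""

    def __init__(self, client_factory=FakeSSHClient):
        self._factory = client_factory
        self._clients = {}

    def get(self, host):
        # Reuse an existing client for this host instead of opening a new one.
        client = self._clients.get(host)
        if client is None:
            client = self._factory(host)
            self._clients[host] = client
        return client

    def close_all(self):
        # Intended to be called when the environment is destroyed
        # after a test case.
        for client in self._clients.values():
            client.clear()
        self._clients.clear()
```

A test case would then ask the registry for a client instead of constructing one directly, and teardown would call close_all() once. Because clients are shared per host, the number of transports to close at teardown stays close to the number of nodes.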

[1] https://github.com/stackforge/fuel-devops/blob/master/devops/helpers/helpers.py#L243
[2] https://github.com/paramiko/paramiko/blob/master/paramiko/transport.py#L1420-L1421