Going back to the ec2 tests and comment 19, dims pointed out that the boto package has some connection pooling, and there might be bugs in it causing issues with networking.
I noticed that boto 2.29.1 was released on 5/30, which is what we're running with, and that it contains a connection pooling related change. I'm wondering if that caused a regression:
https://github.com/boto/boto/commit/fb3a7b407488c8b2374502d10a90d431daf0aef9
The various duplicate bugs for this ssh timeout failure in tempest showed up around 6/6, which is why we reverted the tempest change that added more server load to the runs. But maybe that change was just exposing a limitation in the boto connection pooling code? I'm not really sure, but it's a theory at least.
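One way to check the theory, assuming the regression came in with 2.29.x, would be to pin boto back to the prior release in the test environment and see whether the ssh timeout failures go away. A sketch of the requirements pin (2.28.0 is the release immediately before the 2.29 series):

```
# Hypothetical requirements pin to rule out the 2.29.x connection
# pooling change; revert once the regression is confirmed or ruled out.
boto==2.28.0
```

If the failures stop reproducing with the older release, that would point pretty strongly at the pooling commit above rather than the extra server load from the reverted tempest change.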