From what I have been able to investigate over the last few days, with help from dickinsm, there is no real leak, but probably just a circular dependency (a reference cycle). In fact, with a simple gc.collect() inside the get callback, the SSL case behaves the same as a normal TCP connection.
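To illustrate why a cycle can look like a leak until the collector runs, here is a minimal sketch. The class names are hypothetical stand-ins, not Twisted's real classes, and the automatic collector is disabled only to make the effect deterministic.

```python
import gc
import weakref


class Channel:
    """Hypothetical stand-in for the HTTP-level object."""


class Transport:
    """Hypothetical stand-in for the SSL-level object."""


def make_cycle():
    # Build two objects that point at each other, then drop all
    # outside references by returning only a weak reference.
    channel = Channel()
    transport = Transport()
    channel.transport = transport
    transport.protocol = channel
    return weakref.ref(channel)


gc.disable()  # keep the automatic collector out of the way for the demo
ref = make_cycle()
alive_before = ref() is not None  # refcounts never reach zero on their own
gc.collect()                      # the cycle detector reclaims the pair
alive_after = ref() is not None
gc.enable()

print(alive_before, alive_after)
```

In CPython the objects stay alive after the function returns and disappear only once gc.collect() has run, which matches what I see with the explicit call in the callback.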
I'm attaching a small script that shows which objects' reference counts have increased between two requests, and three log files:
ssl.log - the log generated by running the file as-is:
(t2w)[maker@tumbolandia t2w]$ python leak.4.py > ssl.log
[maker@tumbolandia t2w]$ for i in {1..1000}; do curl --insecure https://localhost:4443/; done
tcp.log - created using listenTCP instead of listenSSL, and run with:
(t2w)[maker@tumbolandia t2w]$ python leak.4.py > tcp.log
[maker@tumbolandia t2w]$ for i in {1..1000}; do curl http://localhost:4443/; done
collecting.log - the log generated by simply uncommenting the gc.collect() on line 26:
(t2w)[maker@tumbolandia t2w]$ python leak.4.py > collecting.log
[maker@tumbolandia t2w]$ for i in {1..1000}; do curl --insecure https://localhost:4443/; done
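For anyone who wants to reproduce this kind of comparison without the attachment, the general technique can be sketched as below. This is not leak.4.py itself, only a rough equivalent built on gc.get_objects(); the helper names and the Leaky class are made up for the example.

```python
import gc
from collections import Counter


def type_counts():
    """Count live gc-tracked objects by type name."""
    return Counter(type(o).__name__ for o in gc.get_objects())


def growth(before, after):
    """Return {type name: increase} for types whose count grew."""
    return {name: after[name] - before[name]
            for name in after if after[name] > before[name]}


class Leaky:
    """Hypothetical object kept alive only by a reference to itself."""

    def __init__(self):
        self.me = self


gc.disable()                 # make the demo deterministic
before = type_counts()
kept = [Leaky() for _ in range(10)]
del kept                     # no outside references remain
after = type_counts()
diff = growth(before, after)

gc.collect()                 # same effect as the explicit call in the callback
remaining = type_counts()["Leaky"]
gc.enable()

print(diff.get("Leaky"), remaining)
```

Taking one snapshot per request and diffing consecutive snapshots gives the same kind of output as the logs: types that keep growing are the ones held by a cycle or by a real leak.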
So my question now is: why are DelayedCall objects kept in memory this long? It seems they hold the logs for the current request, accessible through Site._logDateTimeCall.
We can probably get rid of the circular dependency between BIOProtocol and HTTPChannel by setting the ssl attributes to None in the lostConnection method of the ssl factory, but that would be a change to Twisted itself. Shall I open another issue there?
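The idea of clearing the attributes when the connection goes away can be sketched like this. The classes, attribute names and the on_connection_lost hook are hypothetical and only mimic the shape of the problem; they are not Twisted's actual code.

```python
import gc
import weakref


class Channel:
    """Hypothetical stand-in for HTTPChannel."""
    transport = None


class Wrapper:
    """Hypothetical stand-in for the SSL-side protocol."""

    def __init__(self, channel):
        self.wrapped = channel
        channel.transport = self  # this back-reference closes the cycle

    def on_connection_lost(self):
        # Break the cycle by hand so refcounting alone frees both objects.
        self.wrapped.transport = None
        self.wrapped = None


def run(break_cycle):
    channel = Channel()
    wrapper = Wrapper(channel)
    ref = weakref.ref(channel)
    if break_cycle:
        wrapper.on_connection_lost()
    del channel, wrapper
    return ref() is None  # True if freed without a gc pass


gc.disable()  # make sure only reference counting is at work
freed_with_cycle = run(break_cycle=False)
freed_without_cycle = run(break_cycle=True)
gc.collect()  # clean up the cycle left by the first run
gc.enable()

print(freed_with_cycle, freed_without_cycle)
```

With the cycle left in place the channel survives until a collector pass; with the attributes cleared it is freed immediately, so no explicit gc.collect() would be needed.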