https: client can cause nova/cinder to leak sockets for 'get' 'show' 'delete' 'update'

Bug #1423165 reported by Stuart McLaren
This bug affects 3 people
Affects                   Status        Importance  Assigned to     Milestone
Cinder                    Invalid       Undecided   Unassigned
Glance Client             Fix Released  High        Stuart McLaren
OpenStack Compute (nova)  Invalid       Undecided   Unassigned

Bug Description

    Other OpenStack services which instantiate a 'https' glanceclient using
    ssl_compression=False and insecure=False (eg Nova, Cinder) are leaking
    sockets due to glanceclient not closing the connection to the Glance
    server.

    This could happen for a sub-set of calls, eg 'show', 'delete', 'update'.

    netstat -nopd would show the sockets would hang around forever:

    ... 127.0.0.1:9292 ESTABLISHED 9552/python off (0.00/0/0)

    urllib3's ConnectionPool relies on the garbage collector to tear down
    sockets which are no longer in use. The 'verify_callback' function used to
    validate SSL certs was holding a reference to the VerifiedHTTPSConnection
    instance which prevented the sockets being torn down.
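The retention mechanism described above can be illustrated with a small, self-contained sketch. This is not glanceclient code: `FakeContext`, `LeakyConnection`, `FixedConnection`, `make_verify_callback` and the host name are all hypothetical, and the context is simply kept alive by the caller, which simplifies the real situation. It only shows that registering a bound method keeps its instance reachable, while a callback that captures just the host does not.

```python
import gc
import weakref


class FakeContext:
    """Hypothetical stand-in for an SSL context that stores one callback."""

    def __init__(self):
        self.callback = None

    def set_verify(self, callback):
        self.callback = callback


class LeakyConnection:
    """Registers a bound method, so the context references the connection."""

    def __init__(self, host, context):
        self.host = host
        context.set_verify(self.verify_callback)

    def verify_callback(self, *args):
        return True  # a real callback would check the cert against self.host


def make_verify_callback(host):
    """Return a callback that captures only the host string."""
    def callback(*args):
        return True  # a real callback would check the cert against host
    return callback


class FixedConnection:
    """Registers a plain function, so nothing points back at the connection."""

    def __init__(self, host, context):
        self.host = host
        context.set_verify(make_verify_callback(host))


def survives_deletion(connection_cls):
    """Return True if the connection is still alive after its name is dropped."""
    context = FakeContext()  # stays alive for the whole check
    connection = connection_cls("glance.example.com", context)
    ref = weakref.ref(connection)
    del connection
    gc.collect()
    return ref() is not None


if __name__ == "__main__":
    print("leaky connection retained:", survives_deletion(LeakyConnection))
    print("fixed connection retained:", survives_deletion(FixedConnection))
```

In the sketch, anything the connection owns (such as a socket) would stay open for as long as the context holds the bound method.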

------------------

To reproduce, set up devstack with nova talking to glance over https (full certificate verification must be enabled) and
perform a nova operation such as:

 $ nova image-meta 53854ea3-23ed-4682-abf7-8415f2d6b7d9 set foo=bar

You will see connections from nova to glance which have no timer set (the 'off' column):

 $ netstat -nopd | grep 9292

 tcp 0 0 127.0.0.1:34204 127.0.0.1:9292 ESTABLISHED 9552/python off (0.00/0/0)
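To watch the leak grow over repeated operations, the netstat check can be scripted. The following is a minimal sketch, assuming the column layout of the `netstat -nopd` line shown above (protocol, queues, local address, foreign address, state, PID/program, timer). The function name `count_untimed_connections` is hypothetical, and a program name containing spaces would shift the columns.

```python
def count_untimed_connections(netstat_lines, port=9292):
    """Count ESTABLISHED TCP connections to `port` whose timer column is 'off'."""
    suffix = ":%d" % port
    count = 0
    for line in netstat_lines:
        fields = line.split()
        # Skip headers and anything that is not a full TCP row.
        if len(fields) < 8 or not fields[0].startswith("tcp"):
            continue
        foreign, state, timer = fields[4], fields[5], fields[7]
        if foreign.endswith(suffix) and state == "ESTABLISHED" and timer == "off":
            count += 1
    return count


if __name__ == "__main__":
    import sys
    # Example: netstat -nopd | python this_script.py
    print(count_untimed_connections(sys.stdin))
```

If the count keeps rising after each nova operation and never drops, the connections are not being closed.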

Changed in python-glanceclient:
assignee: nobody → Stuart McLaren (stuart-mclaren)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-glanceclient (master)

Fix proposed to branch: master
Review: https://review.openstack.org/156975

Changed in python-glanceclient:
status: New → In Progress
summary: - ttps: client can cause nova/cinder to leak sockets for 'get' 'show'
+ https: client can cause nova/cinder to leak sockets for 'get' 'show'
'delete' 'update'
Stuart McLaren (stuart-mclaren) wrote :

Note: eventually nova would no longer be able to service requests ('too many open file descriptors').
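How close a process is to that limit can be checked from inside Python. The sketch below is an illustration only, assuming a Linux host where `/proc/self/fd` is available. The helper name `fd_usage` is hypothetical, and it inspects the current process rather than a running nova daemon; for another process, `/proc/<pid>/fd` and `/proc/<pid>/limits` hold the same information.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit) for the current process.

    open_fds is None when /proc/self/fd cannot be read (non-Linux hosts).
    """
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        open_fds = len(os.listdir("/proc/self/fd"))
    except OSError:
        open_fds = None
    return open_fds, soft_limit


if __name__ == "__main__":
    open_fds, soft_limit = fd_usage()
    print("open file descriptors:", open_fds, "soft limit:", soft_limit)
```

Each leaked socket consumes one descriptor, so a steadily rising count with a fixed limit ends in the error quoted above.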

Louis Taylor (kragniz)
Changed in python-glanceclient:
importance: Undecided → High
Louis Taylor (kragniz)
Changed in python-glanceclient:
milestone: none → v0.16.1
OpenStack Infra (hudson-openstack) wrote : Fix merged to python-glanceclient (master)

Reviewed: https://review.openstack.org/156975
Committed: https://git.openstack.org/cgit/openstack/python-glanceclient/commit/?id=ef9fd9fca05f8da8325ccaa6632e34d1321130bf
Submitter: Jenkins
Branch: master

commit ef9fd9fca05f8da8325ccaa6632e34d1321130bf
Author: Stuart McLaren <email address hidden>
Date: Tue Feb 17 17:36:56 2015 +0000

    https: Prevent leaking sockets for some operations

    Other OpenStack services which instantiate a 'https' glanceclient using
    ssl_compression=False and insecure=False (eg Nova, Cinder) are leaking
    sockets due to glanceclient not closing the connection to the Glance
    server.

    This could happen for a sub-set of calls, eg 'show', 'delete', 'update'.

    netstat -nopd would show the sockets would hang around forever:

    ... 127.0.0.1:9292 ESTABLISHED 9552/python off (0.00/0/0)

    urllib's ConnectionPool relies on the garbage collector to tear down
    sockets which are no longer in use. The 'verify_callback' function used to
    validate SSL certs was holding a reference to the VerifiedHTTPSConnection
    instance which prevented the sockets being torn down.

    Change-Id: Idb3e68151c48ed623ab89d05d88ea48465429838
    Closes-bug: 1423165

Changed in python-glanceclient:
status: In Progress → Fix Committed
Changed in python-glanceclient:
status: Fix Committed → Fix Released
Dr. Jens Harbott (j-harbott) wrote :

Thanks for the fast release. Sadly this doesn't help us for our icehouse deployment, as there the client version is pinned to 0.14.2. Is there a possibility to do a 0.14.3 release that would include this bugfix?

Dr. Jens Harbott (j-harbott) wrote :

Nova stable/juno is still affected by this issue, since the fix is not available there currently due to the version cap on python-glanceclient.

Dr. Jens Harbott (j-harbott) wrote :

Cinder stable/juno is still affected by this issue, since the fix is not available there currently due to the version cap on python-glanceclient.

Changed in nova:
status: New → Invalid
Dr. Jens Harbott (j-harbott) wrote :

Davanum: Can you explain why you consider this bug invalid for nova? It causes the nova-api-os-compute daemon to run out of file descriptors after some time, leading to failing services with very obscure errors. The same applies to cinder-volume and other services, though at a lower rate, at least in our setup.

Ian Cordasco (icordasc) wrote :

Jens, likely because glanceclient has been released with the fix and the problem isn't actually in Nova.

Dr. Jens Harbott (j-harbott) wrote :

O.k., but as I tried to explain above, the fixed glanceclient cannot be used for nova in stable/juno due to version caps. The cap also cannot be easily lifted due to the Oslo namespace changes. And in the end it is still most of my nova daemons dying regularly, so for me there is still an unsolved bug here. If one could add python-glanceclient with branch stable/juno as affected and set that to Unresolved, I'd be fine with that, but as long as there is no such branch, this IMHO is a bug in nova stable/juno.

Joe Gordon (jogo) wrote :

I agree this is a bug, but it's not a nova bug per se.

It sounds like this is a glanceclient issue / issue in https://github.com/openstack/requirements/tree/stable/juno

It sounds like the right answer here is to create a stable/juno glanceclient branch.

John Griffith (john-griffith) wrote :

Going to close it for Cinder as well, as I don't know of a way to fix a broken glanceclient from the consumer end.

If you're interested, however, I did throw together a patched version of 0.14.2 here:
https://github.com/j-griffith/python-glanceclient/tree/stable/icehouse

Maybe you or somebody else could test it out, and we could convince the glance folks to push a branch for it; or people that need it can maybe just use it.

Thanks

Changed in cinder:
status: New → Invalid