Comment 60 for bug 1493303

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/270233
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=58359269b0e971e52f0eb7f97221566ca2148014
Submitter: Jenkins
Branch: master

commit 58359269b0e971e52f0eb7f97221566ca2148014
Author: Samuel Merritt <email address hidden>
Date: Tue Dec 8 16:36:05 2015 -0800

    Fix memory/socket leak in proxy on truncated SLO/DLO GET

    When a client disconnected while consuming an SLO or DLO GET response,
    the proxy would leak a socket. This could be observed via strace as a
    socket that had shutdown() called on it, but was never closed. It
    could also be observed by counting entries in /proc/<pid>/fd, where
    <pid> is the pid of a proxy server worker process.

    This is due to a memory leak in SegmentedIterable. A SegmentedIterable
    has an 'app_iter' attribute, which is a generator. That generator
    references 'self' (the SegmentedIterable object). This creates a
    cyclic reference: the generator refers to the SegmentedIterable, and
    the SegmentedIterable refers to the generator.

    Python can normally handle cyclic garbage; reference counting won't
    reclaim it, but the garbage collector will. However, an object with a
    finalizer stops the garbage collector from collecting both the object
    itself* and any cycle of which it is part.

    For most objects, "has finalizer" is synonymous with "has a __del__
    method". However, a generator has a finalizer once it's started
    running and before it finishes: basically, while it has stack frames
    associated with it**.

    When a client disconnects mid-stream, we get a memory leak. We have
    our SegmentedIterable object (call it "si"), and its associated
    generator. si.app_iter is the generator, and the generator closes over
    si, so we have a cycle; and the generator has started but not yet
    finished, so the generator needs finalization; hence, the garbage
    collector won't ever clean it up.

    The socket leak comes in because the generator *also* refers to the
    request's WSGI environment, which contains wsgi.input, which
    ultimately refers to a _socket object from the standard
    library. Python's _socket objects only close their underlying file
    descriptor when their reference counts fall to 0***.

    This commit makes SegmentedIterable.close() call
    self.app_iter.close(), thereby unwinding its generator's stack and
    making it eligible for garbage collection.

    * in Python < 3.4, at least. See PEP 442.

    ** see PyGen_NeedsFinalizing() in Objects/genobject.c and also
       has_finalizer() in Modules/gcmodule.c in Python.

    *** see sock_dealloc() in Modules/socketmodule.c in Python. See
        sock_close() in the same file for the other half of the sad story.

    This closes CVE-2016-0738.

    Closes-Bug: 1493303

    Co-Authored-By: Kota Tsuyuzaki <email address hidden>

    Change-Id: Ib86c4c45641485ce1034212bf6f53bb84f02f612
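
The cycle described in the commit message, and the effect of closing the generator, can be sketched in a few lines of Python. This is an illustration only, not Swift's code: SegmentedIterableSketch, _internal_iter, and survives_without_gc are hypothetical names, and the sketch assumes CPython's reference counting. The cyclic collector is disabled during the check so that only reference counting is at work; on Python 3.4 and later the collector would otherwise reclaim the cycle (PEP 442), so this shows the mechanism rather than reproducing the original leak.

```python
import gc
import weakref


class SegmentedIterableSketch(object):
    """Hypothetical stand-in for SegmentedIterable (not the real class)."""

    def __init__(self, segments):
        self.segments = segments
        self.cleaned_up = False
        # The generator's frame holds 'self', and 'self' holds the
        # generator: a reference cycle.
        self.app_iter = self._internal_iter()

    def _internal_iter(self):
        try:
            for segment in self.segments:
                yield segment
        finally:
            # Runs when the generator finishes or is closed.
            self.cleaned_up = True

    def __iter__(self):
        return self.app_iter

    def close(self):
        # The essence of the fix: unwind the generator's stack.
        self.app_iter.close()


def survives_without_gc(call_close):
    """Return True if the object outlives the deletion of its last name."""
    gc.disable()  # rely on reference counting only
    try:
        si = SegmentedIterableSketch([b"a", b"b", b"c"])
        ref = weakref.ref(si)
        next(iter(si))  # start the generator, like a partially read response
        if call_close:
            si.close()
        del si
        return ref() is not None
    finally:
        gc.enable()
        gc.collect()  # clean up any cycle left behind by the sketch
```

Without close(), the started generator's frame keeps the object alive after its last name is deleted. With close(), the frame is released, the cycle is gone, and reference counting alone frees the object along with anything the generator referenced.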