Ring reloads cause ballooning memory

Bug #1910157 reported by Tim Burke
This bug affects 1 person
Affects: OpenStack Object Storage (swift)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Proxy servers need to have all of the rings loaded in their heads to be useful, so we do that as part of the app initialization: https://github.com/openstack/swift/blob/2.26.0/swift/proxy/server.py#L233-L235

When we load them, we do it *off each storage policy*, meaning the ring reference is stuffed in the swift.common.storage_policy.POLICIES singleton.

We also call loadapp before forking off any workers to sanity-check configs: https://github.com/openstack/swift/blob/2.26.0/swift/common/wsgi.py#L1077-L1081 . As a result, all object rings should be loaded pre-fork, and the memory holding them can be shared between workers (forked children share the parent's pages until someone writes to them). This is, by and large, a good thing.
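The load-before-fork pattern described above can be sketched roughly as follows. This is a minimal, Unix-only illustration, not Swift's actual wsgi code: `load_then_fork` and `loader` are hypothetical names, and a bytes object stands in for ring data. It only shows that workers inherit the object without loading it again; it does not measure how much memory is actually shared.

```python
import os


def load_then_fork(loader, workers):
    """Call loader() once in the parent, then fork `workers` children.

    Each child reports the size of the object it inherited over a pipe,
    without ever calling loader() itself. Hypothetical helper for
    illustration only.
    """
    shared = loader()  # loaded exactly once, before any fork
    read_fd, write_fd = os.pipe()
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:
            # Child: the inherited object is already in memory.
            os.close(read_fd)
            os.write(write_fd, b'%d\n' % len(shared))
            os._exit(0)  # skip parent cleanup handlers in the child
        pids.append(pid)
    os.close(write_fd)
    for pid in pids:
        os.waitpid(pid, 0)
    with os.fdopen(read_fd, 'rb') as f:
        return [int(line) for line in f.read().splitlines()]
```

Every child sees the data even though the loader ran only in the parent, which is the property the proxy relies on at startup.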

Ring objects periodically check whether the on-disk data has changed, and reload it if appropriate: https://github.com/openstack/swift/blob/2.26.0/swift/common/ring/ring.py#L280-L283 . This is also good and useful (don't want to be using out-of-date ring files!), but *each worker* must reload the ring on its own, and the reloaded copy lives in that worker's private memory, so it is no longer shared. If you've got beefy boxes and default workers (auto; i.e. `multiprocessing.cpu_count()`), this can inflate the amount of memory used for rings from something measured in tens of MB to something measured in GB.

Note that many background daemons work around this by effectively restarting themselves whenever they see a ring change, and backend servers avoid it by never loading up a ring. If we ever want to change that (there's been talk about having the container-server respond with its ring version as part of sharding leader-election, for example), we'll want to keep this in mind.

clayg (clay-gerrard)
Changed in swift:
status: New → Confirmed