replicator rsync established connections confusing

Bug #1632807 reported by clayg on 2016-10-12
This bug affects 2 people
Affects: OpenStack Object Storage (swift)
Importance: Undecided
Assigned to: Unassigned

Bug Description

The object replicator makes outgoing connections to other object servers via HTTP REPLICATE requests.

The object replicator forks out new rsync processes via subprocess.

The object replicator can be doing both of these things at the "same time" in different greenthreads.

When this happens, the forked rsync process inherits the established connections from its parent, and the netstat output looks confusing:

tcp 0 0 172.1.1.14:29880 172.1.1.185:6003 ESTABLISHED 757/rsync
tcp 0 0 172.1.1.14:29879 172.1.1.185:6003 ESTABLISHED 757/rsync
tcp 0 0 172.1.1.14:29881 172.1.1.185:6003 ESTABLISHED 757/rsync
tcp 0 547528 172.1.1.14:54406 172.1.1.185:873 ESTABLISHED 757/rsync

This is taken from the .14 node. The bottom connection, from .14:54406 to the remote rsyncd port .185:873, is legit. All the other connections to .185:6003 were probably inherited from the parent replicator process doing REPLICATE requests. I think these showed up because .185 is being slow/stupid.

Darrell suggested we add the `close_fds` kwarg to the Popen call when the replicator fires up the rsync:

https://docs.python.org/2/library/subprocess.html#popen-constructor
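
A minimal sketch of what `close_fds` changes, not taken from the swift code: the helper names (`child_inherits`, `demo`) and the check script are made up for illustration, and a `socketpair` stands in for an established REPLICATE connection. It is POSIX-only. The socket is duplicated to an inheritable descriptor to mimic Python 2, where descriptors are inheritable by default and `close_fds` defaults to False.

```python
import fcntl
import os
import socket
import subprocess
import sys

# Child snippet: exit 0 if the given descriptor number is open, 1 otherwise.
CHECK_FD = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
)


def child_inherits(fd, close_fds):
    """Return True if a child started via Popen can still see `fd`."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHECK_FD, str(fd)],
        close_fds=close_fds,
    )
    return proc.wait() == 0


def demo():
    # Stand-in for a connection held open by another greenthread.
    a, b = socket.socketpair()
    # Duplicate onto a free descriptor >= 100. F_DUPFD does not set
    # close-on-exec, so the copy is inheritable, as on Python 2.
    fd = fcntl.fcntl(a.fileno(), fcntl.F_DUPFD, 100)
    try:
        without_flag = child_inherits(fd, close_fds=False)
        with_flag = child_inherits(fd, close_fds=True)
    finally:
        os.close(fd)
        a.close()
        b.close()
    return without_flag, with_flag


if __name__ == "__main__":
    print(demo())
```

If this behaves as expected, the child sees the socket only when `close_fds` is False, which would match the extra ESTABLISHED lines netstat attributes to rsync above.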

clayg (clay-gerrard) wrote :

I think I saw this in another context recently, with a bunch of lingering TIME_WAIT connections between a storage node and some other replication object server ports.

Seems like adding the kwarg would be pretty cheap - maybe we could just do it and see if it still works?

tags: added: low-hanging-fruit