Comment 3 for bug 1326105

Michi Henning (michihenning) wrote :

> 1. How will the scope-registry handle when either /run/user/[0-9]*/zmq/Registry-s or /run/user/[0-9]*/zmq/Registry-p already exists?

The algorithm is as follows:

- Check whether a connect() to the endpoint succeeds (that is, whether something at the other end accepts the connection). If so, disconnect and report an error stating that another instance of the service is already running at that endpoint. Otherwise, attempt to bind to the endpoint.

If a file is in the way with the same name, the file is unlinked before binding to the endpoint, so the file will just disappear to make way for the socket.

That unlink-before-bind behavior is implemented by Zmq and is out of our control.

If a *directory* is in the way, the attempt to bind to the endpoint fails with an "address already in use" error.
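
As a rough sketch of the behavior described above (not the actual registry code): the snippet below uses plain Unix domain sockets in Python as a stand-in for a Zmq ipc endpoint. The names bind_endpoint and EndpointInUse are made up for illustration, and the explicit unlink only imitates what Zmq does internally before binding.

```python
import os
import socket
import stat


class EndpointInUse(Exception):
    """Hypothetical error: a live server already accepts connections here."""


def bind_endpoint(path):
    """Probe path with connect(); if nobody answers, bind and listen on it."""
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
    except OSError:
        pass  # nothing is listening: missing path, stale file, or directory
    else:
        raise EndpointInUse("another instance is already running at " + path)
    finally:
        probe.close()

    # Imitate the unlink-before-bind step: a stale file or socket is removed,
    # but a directory is deliberately left alone.
    try:
        if not stat.S_ISDIR(os.lstat(path).st_mode):
            os.unlink(path)
    except FileNotFoundError:
        pass

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # If a directory is in the way, this fails with EADDRINUSE.
        server.bind(path)
    except OSError:
        server.close()
        raise
    server.listen(1)
    return server
```

With this sketch, a stale regular file at the path is silently replaced by the socket, a second call while the first server is still listening raises EndpointInUse, and a directory at the path makes bind() fail with "address already in use". Note that there is still a window between the probe and the bind, so this does not protect against two instances starting at the same instant.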

I guess we could modify the code to try to recursively remove any directory that is in the way. But I'm reluctant to do that:

- An innocent configuration error during development could potentially blow away most of the filesystem.

- Removal of the directory before binding suffers from race conditions. An unrelated process can put files or directories back there while the removal is in progress.

BTW, the endpoints for the registry are now called zmq/Registry-R, zmq/Registry-s, and zmq/Registry-p.

> 2. In addition to dealing with /run/user/[0-9]*/zmq/c-*-r possibly already existing

The same applies here: any file that exists there already will be blown away. Any *directory* with the same name will cause the bind to fail. But I'm not too worried about that scenario because the c-*-r name contains a UUID that includes 16 bits of pseudo-randomness from a Mersenne twister (the other 16 bits are a counter). It would be difficult to guess the correct name because each process seeds the random number generator differently, using std::random_device (which is implemented using /dev/random).
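
To illustrate the idea only: the sketch below builds a name from 16 pseudo-random bits plus a 16-bit counter. The exact layout of the real c-*-r names is not shown here, so the format string, the class name NameGenerator, and the method next_name are assumptions. Python's random.Random is a Mersenne twister, and os.urandom stands in for std::random_device as the per-process seed source.

```python
import itertools
import os
import random


class NameGenerator:
    """Hypothetical generator for endpoint names of the form c-XXXXXXXX-r."""

    def __init__(self):
        # Each process seeds its own Mersenne twister from the OS entropy pool.
        self._rng = random.Random(os.urandom(16))
        self._counter = itertools.count()

    def next_name(self):
        rand16 = self._rng.getrandbits(16)       # 16 pseudo-random bits
        count16 = next(self._counter) & 0xFFFF   # 16-bit counter, wraps around
        return "c-{:04x}{:04x}-r".format(rand16, count16)
```

Successive calls on one generator differ at least in the counter half, while the random half differs between processes because each one is seeded separately.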

> For '2', standard defensive programming should also be used, but that isn't enough. I suggested at the sprint that these
> endpoints should be made application specific by their name like with the other endpoints, but was told this is problematic.

I considered using a per-scope directory named <scope_id>, something like /run/user/<uid>/zmq/<scope_id>/endpoint-here

That makes testing a pain, though. I guess we could still do this, creating the scope_id directory if it doesn't exist.
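
A minimal sketch of what "creating the scope_id directory if it doesn't exist" could look like, assuming a layout of <zmq_dir>/<scope_id>/<endpoint>. The function name scope_endpoint_path, the 0o700 mode, and the scope_id validation are my assumptions, not existing code.

```python
import os


def scope_endpoint_path(zmq_dir, scope_id, endpoint):
    """Return zmq_dir/scope_id/endpoint, creating the scope directory if needed."""
    # Reject IDs that would escape zmq_dir or name the directory itself.
    if "/" in scope_id or scope_id in ("", ".", ".."):
        raise ValueError("invalid scope id: %r" % scope_id)
    scope_dir = os.path.join(zmq_dir, scope_id)
    # exist_ok=True tolerates an existing directory, but still raises
    # FileExistsError if a non-directory is in the way.
    os.makedirs(scope_dir, mode=0o700, exist_ok=True)
    return os.path.join(scope_dir, endpoint)
```

Calling it twice with the same scope_id is harmless, and a plain file sitting where the directory should be surfaces as an error instead of being removed.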

Jamie, let me know how strongly you feel about this. If you think we need it, I'll look at implementing it.