Websocketd graceful shutdown support
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenSRF | Fix Released | Wishlist | Unassigned |
Bug Description
OpenSRF 3.1
When stopping websocketd-osrf, the flow of data is immediately broken between the client and the websocketd-osrf instance. If a client is in the middle of a request at shutdown time, the client will be disconnected before the response is delivered. Because of this, it's not possible to gracefully "detach" a server from a load-balanced group (e.g. for maintenance) without potentially disrupting clients.
Contrary to what I originally thought, there does not appear to be a way to make websocketd send a signal and then wait before closing STDIN on the websocketd-osrf instance. (There is, however, a way to add a delay between closing STDIN and sending SIGTERM, which doesn't help us here.)
I propose a graceful shutdown signal, similar to the apache-websockets graceful shutdown signal.
Essentially, we inform the websocketd-osrf back-end instances of a pending websocket shutdown. Once an instance receives the signal, it enters shutdown mode, where it continues replying to the client until a gap in the communication opens in which no requests or stateful connections are pending. At that point the websocketd-osrf back-end instance disconnects the client and shuts itself down.
At this point, the client will detect the severed websocket connection and open a new connection with another available server.
The key differences between this and apache2-websockets are that it can be done without threads (in the main event loop) and that we will likely have to send the signal to the back-end processes ourselves (via process group?) instead of signaling websocketd directly, which IIUC does not relay signals.
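To make the proposal concrete, here is a minimal sketch of the shutdown check in C, assuming SIGUSR1 as the trigger (the signal the later comment in this report settles on). The counters `pending_requests` and `stateful_sessions` and all function names are hypothetical and are not taken from osrf-websocket-stdio.c; a real event loop would update the counters as requests and stateful connections open and close.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Set by the signal handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Hypothetical bookkeeping maintained by the event loop. */
static int pending_requests = 0;   /* requests still awaiting a reply */
static int stateful_sessions = 0;  /* open stateful connections */

/* Handler only records the request; all real work stays in the main loop. */
static void on_graceful_signal(int signo) {
    (void)signo;
    shutdown_requested = 1;
}

/* Install the SIGUSR1 handler. Returns 0 on success, -1 on failure. */
static int install_graceful_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_graceful_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* True once shutdown was requested and nothing is in flight, i.e. the
 * "gap in the communication" where it is safe to disconnect and exit. */
static bool should_exit_now(void) {
    assert(pending_requests >= 0 && stateful_sessions >= 0);
    return shutdown_requested
        && pending_requests == 0
        && stateful_sessions == 0;
}
```

In this sketch the main loop would call `should_exit_now()` after handling each message and, when it returns true, close the client connection and exit. No threads are needed because the handler only sets a flag.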
Changed in opensrf:
importance: Undecided → Wishlist
status: New → Confirmed
assignee: nobody → Galen Charlton (gmc)
Changed in opensrf:
assignee: Galen Charlton (gmc) → nobody
Changed in opensrf:
status: Fix Committed → Fix Released
osrf-websocket-stdio.c changes pushed to:
http://git.evergreen-ils.org/?p=working/OpenSRF.git;a=shortlog;h=refs/heads/user/berick/lp1803182-websocketd-graceful-shutdown
This teaches the back-ends to perform a graceful shutdown when they receive a SIGUSR1 signal. I have confirmed that websocketd itself ignores this signal. I've also confirmed that sending the signal to the whole process group works as expected:
kill -s USR1 -<websocketd-parent-pid>
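For tooling that would rather not shell out, the same group signal can be sent from C with `kill(2)` and a negated process group ID. This is only a sketch: it assumes the websocketd parent is the leader of its process group (which is what the negative PID in the command above relies on), and `signal_backend_group` is a hypothetical helper name, not part of OpenSRF.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>

/* Send SIGUSR1 to every process in the group led by parent_pid.
 * Assumes parent_pid is also the process group ID (group leader).
 * Returns 0 on success, -1 on failure (errno is set by kill). */
static int signal_backend_group(pid_t parent_pid) {
    /* Refuse 0, 1, and negative values: kill(0), kill(-1) and friends
     * would target our own group or nearly every process. */
    if (parent_pid <= 1) {
        return -1;
    }
    return kill(-parent_pid, SIGUSR1);
}
```

The guard on `parent_pid` is a deliberate design choice in this sketch, since a mistyped PID turns a targeted group signal into a much broader one.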
The next question is whether we can make these changes in the sample systemd service files and/or whether we need to add websocketd stop/start support to osrf_control.