WebSockets Gateway and JS Library

Bug #1268619 reported by Bill Erickson on 2014-01-13
Affects: OpenSRF
Importance: Wishlist
Assigned to: Unassigned

Bug Description

Background:

WebSockets are a standards-compliant, cross-browser mechanism to support streaming, bi-directional communication between the browser and the server. Worth a read: http://en.wikipedia.org/wiki/WebSocket

Until recently, OpenSRF supported streaming responses to Mozilla-based browsers via multipart/x-mixed-replace messages. Apart from only working in XULRunner and Firefox, it suffered other problems which were never fully resolved. (Bug #1004019 is marked as fixed, but it still occurs in other, undiagnosed forms.) And now, multipart/x-mixed-replace messages are officially deprecated by Mozilla (bug #1198983).

With no mechanism for streaming responses, we lose the ability to support long-running, real-time conversations with the client. Interfaces that rely on long-running streams of responses (the most obvious one in Evergreen being Vandelay) would need to be redesigned to use a polling mechanism instead.

Also in the context of Evergreen, where much of the staff client was developed even before the streaming translator existed (using the non-streaming gateway), supporting streaming going forward will allow us to continue migrating bulky call-and-response APIs to the network friendlier call-once-and-stream-response style APIs.

Beyond streaming, WebSockets also operate over an always-on connection. This reduces the overall network traffic required to communicate with the server over the long run, which should further reduce the effects of latency, particularly on interfaces / actions where lots of network I/O is needed.

For more of this type of discussion, see my 3-part blog post, starting with
http://blog.esilibrary.com/2013/01/14/exploring-websockets-and-evergreen/

Code:

I have working code for server and client (JS) components.

http://git.evergreen-ils.org/?p=working/OpenSRF.git;a=shortlog;h=refs/heads/collab/berick/websockets

The server runs as an Apache plugin, much like osrf_http_translator and osrf_json_gateway; however, it requires a 3rd-party library. To install, see the instructions in the file README.websockets at the base of the working repository.

WebSockets are not (currently) enabled by default, so you have to turn them on in your scripts. To activate the standard, single-browser-tab connections, add this line of code to your opensrf JS script.

OpenSRF.Session.transport = OSRF_TRANSPORT_TYPE_WS;

Once that's set, it will just work. Note that w/ websockets, all communication is asynchronous, so setting a timeout to force synchronous mode will not work.
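
Because every call is asynchronous, code that used to block for a result has to be restructured around callbacks. A minimal sketch of that pattern follows. collectResponses is a hypothetical helper and FakeSession is a stand-in so the snippet runs without a server; only the request({method, params, onresponse, oncomplete}).send() shape and r.recv().content() come from the test page later in this report.

```javascript
// Hypothetical helper: gather every streamed response, then hand the
// full list to `done` once the request reports completion.
function collectResponses(ses, method, params, done) {
  const results = [];
  ses.request({
    method: method,
    params: params,
    onresponse: function (r) {
      results.push(r.recv().content());
    },
    oncomplete: function () {
      done(results);
    }
  }).send();
}

// Stand-in for OpenSRF.ClientSession (an assumption, not the real class).
// It answers with one response per param and delivers them synchronously,
// which a real WebSocket transport would not do.
function FakeSession(service) {
  this.service = service;
}
FakeSession.prototype.request = function (args) {
  return {
    send: function () {
      args.params.forEach(function (p) {
        args.onresponse({
          recv: function () {
            return { content: function () { return p; } };
          }
        });
      });
      args.oncomplete();
    }
  };
};

let echoed = null;
collectResponses(
  new FakeSession('open-ils.actor'),
  'opensrf.system.echo',
  ['hello', 'websockets'],
  function (list) { echoed = list; }
);
```

With the real library the `done` callback fires some time after send() returns, so anything that depends on the results has to live inside it.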

Very recently I added support for sharing a single websocket connection with multiple browser tabs via Shared Web Workers (http://www.w3.org/TR/workers/#shared-workers-introduction). With this, each client need only open one socket to the server, instead of a socket per browser tab. This will reduce the process overhead on the server-side.
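
To illustrate the idea, here is a rough sketch of the routing a shared worker has to do: many tab ports feed one socket, and each reply is handed back to the tab that started the conversation. ConnectionHub, the fake socket and ports, and the `thread` / `body` message fields are all assumptions made for this sketch, not the code in the branch.

```javascript
// Hypothetical hub: one upstream socket shared by many tab ports.
class ConnectionHub {
  constructor(socket) {
    this.socket = socket;           // the single shared connection
    this.portByThread = new Map();  // conversation id -> originating port
    socket.onmessage = (evt) => this.routeResponse(evt.data);
  }

  // Called for each message a tab posts to the worker.
  forwardRequest(port, json) {
    const msg = JSON.parse(json);
    this.portByThread.set(msg.thread, port);
    this.socket.send(json);
  }

  // Called for each message arriving from the server.
  routeResponse(json) {
    const msg = JSON.parse(json);
    const port = this.portByThread.get(msg.thread);
    if (port) port.postMessage(json);
  }
}

// Fakes so the sketch runs outside a browser.
const fakeSocket = {
  sent: [],
  onmessage: null,
  send(data) { this.sent.push(data); },
  reply(data) { this.onmessage({ data: data }); } // pretend the server answered
};
function makePort() {
  return { inbox: [], postMessage(m) { this.inbox.push(m); } };
}

const hub = new ConnectionHub(fakeSocket);
const tabA = makePort();
const tabB = makePort();
hub.forwardRequest(tabA, JSON.stringify({ thread: 'a1', body: 'from A' }));
hub.forwardRequest(tabB, JSON.stringify({ thread: 'b1', body: 'from B' }));
fakeSocket.reply(JSON.stringify({ thread: 'b1', body: 'for B' }));
fakeSocket.reply(JSON.stringify({ thread: 'a1', body: 'for A' }));
```

Both requests travel over the same socket, yet each tab only sees its own reply.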

Shared web workers are supported by Chrome, Safari, and Opera. Firefox adds provisional support in version 28 (I've tested this), but it requires an about:config change to activate it, since the feature is still in beta (dom.workers.sharedWorkers.enabled). Given the recent rate of FF releases, I expect it to catch up and support shared workers out of the box fairly soon. In the meantime, we will have to stick to per-tab connections in Firefox.

To activate shared connections, use OpenSRF.Session.transport = OSRF_TRANSPORT_TYPE_WS_SHARED.
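
Given the two transports described above, a script could pick one by feature detection. This is only a sketch: pickTransport is a hypothetical helper, and the string values in TRANSPORTS are placeholders standing in for the real constants defined by opensrf.js.

```javascript
// Placeholder values; the real constants are defined by opensrf.js.
const TRANSPORTS = {
  WS: 'OSRF_TRANSPORT_TYPE_WS',
  WS_SHARED: 'OSRF_TRANSPORT_TYPE_WS_SHARED'
};

// Hypothetical helper: prefer the shared connection when the given
// global scope exposes SharedWorker, otherwise use a per-tab socket.
function pickTransport(globalScope) {
  return typeof globalScope.SharedWorker === 'function'
    ? TRANSPORTS.WS_SHARED
    : TRANSPORTS.WS;
}

// Simulated browsers: one with SharedWorker support, one without.
const withShared = pickTransport({ SharedWorker: function () {} });
const withoutShared = pickTransport({});
```

In a page, the same check against window would decide which constant to assign to OpenSRF.Session.transport.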

TODO:

 * testing, testing, and more testing, particularly in different environments.
 * more docs
 * improve the install process

Bill Erickson (berick) wrote :

Testing with Evergreen:

There is a lot of JS code in Evergreen (dojo libs and UI code) that assumes synchronous communication, which does not work with WebSockets, as noted above. Because of this, it's not possible to use the OpenSRF WebSocket JS libs as a drop-in replacement for Evergreen.

Testing for now has to happen at a lower level, using OpenSRF JS directly or, for the adventurous, the browser-based staff client prototype code. The prototype assumes all async, and I have confirmed that it works fine with WebSockets.

Bill Erickson (berick) wrote :

Another TODO:

Since we're using Apache, we can prevent new WS connections on a given server the same as we do with gateway/translator connections, via ldirector (etc.) ping files. However, we also need a way to prevent existing clients from sending new requests when an apache server wants to disconnect.

Related, we also need to add support for timeout-based disconnects. Forcing clients to disconnect after a period of inactivity ensures Apache processes are given a chance to clear the Apache memory pool for the request and, more importantly, will allow the Apache process count to settle down after periods of high activity. In other words, if your browser is sitting idle for an hour, there is no reason it should hog an Apache process.

The client code is already capable of detecting a broken socket and reconnecting. What's more, the reconnect process is roughly equivalent to a single XMLHttpRequest, so it's not something the user should generally notice. So, we only need the server pieces to set a timeout (via apache config) and send a disconnect to the client when the timeout is reached. With this, we can also support graceful shutdown by setting the timeout temporarily very low (say, via signal), which will cause a disconnect of all clients while they are idle, forcing them to reconnect to another server.
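
For illustration, a client-side wrapper that notices a closed socket and reconnects could look roughly like this. LazyConnection and the fake socket are hypothetical, and this sketch makes one particular choice: it defers the reconnect until the next request rather than reconnecting immediately.

```javascript
// Hypothetical wrapper: queues outgoing messages and (re)connects on demand.
class LazyConnection {
  constructor(createSocket) {
    this.createSocket = createSocket; // factory returning a socket-like object
    this.socket = null;
    this.isOpen = false;
    this.pending = [];
    this.connectCount = 0;
  }

  send(data) {
    this.pending.push(data);
    if (this.isOpen) { this.flush(); return; }
    if (!this.socket) this.connect(); // no socket yet, or server dropped it
  }

  connect() {
    const sock = this.createSocket();
    this.socket = sock;
    this.connectCount++;
    sock.onopen = () => { this.isOpen = true; this.flush(); };
    sock.onclose = () => { this.isOpen = false; this.socket = null; };
  }

  flush() {
    while (this.pending.length) this.socket.send(this.pending.shift());
  }
}

// Fake sockets so the sketch runs without a server.
const sockets = [];
const conn = new LazyConnection(function () {
  const s = { sent: [], onopen: null, onclose: null, send(d) { this.sent.push(d); } };
  sockets.push(s);
  return s;
});

conn.send('req-1');     // triggers the first connect; message is queued
sockets[0].onopen();    // handshake completes; queue is flushed
sockets[0].onclose();   // server disconnects the idle client
const countWhileIdle = conn.connectCount; // no reconnect has happened yet
conn.send('req-2');     // next request triggers the reconnect
sockets[1].onopen();
```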

Bill Erickson (berick) wrote :

Pushed code to do the following:

* Added support for a configurable inactivity timeout. After each message, inbound or outbound, is processed, we set a last activity time. A separate thread wakes periodically to see if the last activity time occurred beyond the timeout interval. After a period of inactivity, we send a disconnect to the client. This allows the websocket handler to exit and return control to Apache.

* How often we wake to check for inactivity is also configurable.

* Using the inactivity logic, if we receive a SIGUSR1 -- the Apache graceful / reload signal -- a shortened inactivity timeout is applied so that the client can be disconnected as soon as all open conversations are completed.

* Apache configuration for websockets lives in /etc/apache2-websockets/envvars. I put them there because standard Apache configuration directives cannot be created, since our websocket module is not an Apache module proper, but a shared library loaded by an apache module. Also, for reasons that are not clear to me, SetEnv and SetEnvIf failed to pass the environment to our websocket module. Exporting them in envvars works fine, though, and it means we don't have to enable either of the env mods, which would add to the memory requirements. (TODO: add example to repo)

* Javascript client lib now performs the connect dance under the covers, no need to manually initialize.

* When a JS client is disconnected from the server, it will reconnect when needed -- when the next request is sent -- and not automatically, so that we are not unnecessarily hogging Apache processes.
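
The inactivity logic in the first three bullets can be sketched as below. The real handler is server-side code running a separate thread; this is only a JavaScript model of the bookkeeping, with a hypothetical IdleMonitor class, an injectable clock, and made-up timeout values. It does not model waiting for open conversations to finish.

```javascript
// Hypothetical model of the idle-timeout bookkeeping.
class IdleMonitor {
  constructor(opts) {
    this.idleTimeoutMs = opts.idleTimeoutMs;         // normal inactivity limit
    this.shutdownTimeoutMs = opts.shutdownTimeoutMs; // shortened limit after a graceful signal
    this.now = opts.now;                             // injectable clock
    this.disconnect = opts.disconnect;               // callback that drops the client
    this.lastActivity = this.now();
    this.shuttingDown = false;
    this.disconnected = false;
  }

  // Call after each inbound or outbound message is processed.
  touch() { this.lastActivity = this.now(); }

  // Call when the graceful / reload signal arrives.
  beginShutdown() { this.shuttingDown = true; }

  // Call periodically, at the configured check interval.
  check() {
    if (this.disconnected) return;
    const limit = this.shuttingDown ? this.shutdownTimeoutMs : this.idleTimeoutMs;
    if (this.now() - this.lastActivity >= limit) {
      this.disconnected = true;
      this.disconnect();
    }
  }
}

// Simulated run with arbitrary values (120s idle limit, 5s when shutting down).
let clock = 0;
let drops = 0;
const monitor = new IdleMonitor({
  idleTimeoutMs: 120000,
  shutdownTimeoutMs: 5000,
  now: function () { return clock; },
  disconnect: function () { drops++; }
});

clock = 60000;  monitor.touch();  // activity at t=60s
clock = 150000; monitor.check();  // idle 90s, under the normal limit
const dropsBeforeShutdown = drops;
monitor.beginShutdown();          // graceful signal received
monitor.check();                  // idle 90s exceeds the shortened limit
monitor.check();                  // further checks are no-ops
```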

Bill Erickson (berick) wrote :

Pushed settings info to installer (README.websockets) docs.

Bill Erickson (berick) wrote :

General update:

The Apache gateway code is nominally done, but needs a whole lot of testing, poking, prodding, and sharp stick pointing.

The JS bits still need work to get the multi-tab (shared web worker) and non multi-tab versions in sync. Ditto testing.

Installer process and docs need more work.

Galen Charlton (gmc) on 2014-02-28
Changed in opensrf:
milestone: none → 2.4.0-alpha
Bill Erickson (berick) wrote :

* overhauled the apache config to use a single, stripped-down apache2.conf instead of wrestling with disabling mods, ports.conf, etc. Just copy the file into place and restart.

* Apache setup docs are considerably simpler now. It's practically scriptable at this point.

* removed (at least for now) the trailing-behind single-tab websocket JS client implementation. maintaining 2 different implementations will complicate things and we probably don't want multi-tab clients to have the ability to (trivially) open numerous WS connections, anyway. Single connection (via SharedWorkers) across all tabs is the future.

Regarding SharedWorkers and Firefox: current stable FF (version 28) requires SharedWorkers to be turned on in about:config (search for "shared"). It will be officially supported (and on by default) in FF version 29.

I believe we're ready for some brave soul to test it out. Here's a simple HTML test.

-------------
<html>
  <head>
    <script src='/js/dojo/opensrf/JSON_v1.js'></script>
    <script src='/js/dojo/opensrf/opensrf.js'></script>
    <script>
      OpenSRF.Session.transport = OSRF_TRANSPORT_TYPE_WS_SHARED;
      var ses = new OpenSRF.ClientSession('open-ils.actor'); // Evergreen
      ses.request({
        method : 'opensrf.system.echo',
        params : ['hello', 'websockets'],
        onresponse : function(r) {
          alert('onresponse() => "' + r.recv().content() + '"');
        },
        oncomplete : function() {
          alert('oncomplete()');
        }
      }).send();
    </script>
  <head>
  <body><h1>If you get alerts, you win!</h1></body>
<html>
----------------

The only thing special about this test is the OSRF_TRANSPORT_TYPE_WS_SHARED bit (and that it's all async).

Note that we don't have to load the websocket JS, because it's loaded as a shared worker from within opensrf.js when activated.

Bill Erickson (berick) wrote :

To recap:

1. Install the OpenSRF WebSockets branch (collab/berick/websockets) like normal.
2. Follow the directions in README.websockets (as root)
3. Put the test script above on your server and see if it works IN CHROME (or Opera; see notes above about FF).
4. Profit

Warren Layton (warren-layton) wrote :

Just a quick note: the closing </head> and </html> tags in the sample HTML above are missing the '/'. Not sure it makes a difference but here's what I used:

-------------
<html>
  <head>
    <script src='/js/dojo/opensrf/JSON_v1.js'></script>
    <script src='/js/dojo/opensrf/opensrf.js'></script>
    <script>
      OpenSRF.Session.transport = OSRF_TRANSPORT_TYPE_WS_SHARED;
      var ses = new OpenSRF.ClientSession('open-ils.actor'); // Evergreen
      ses.request({
        method : 'opensrf.system.echo',
        params : ['hello', 'websockets'],
        onresponse : function(r) {
          alert('onresponse() => "' + r.recv().content() + '"');
        },
        oncomplete : function() {
          alert('oncomplete()');
        }
      }).send();
    </script>
  </head>
  <body><h1>If you get alerts, you win!</h1></body>
</html>
----------------

Bill Erickson (berick) on 2014-05-19
tags: added: pullrequest
Galen Charlton (gmc) wrote :

I've tested and pushed to master (with some squashing). Thanks, Bill!

Galen Charlton (gmc) on 2014-08-20
Changed in opensrf:
importance: Undecided → Wishlist
status: New → Fix Committed
Galen Charlton (gmc) on 2016-02-04
Changed in opensrf:
status: Fix Committed → Fix Released