Eventlet 0.22.0+ changed how graceful shutdowns work

Bug #1792615 reported by Tim Burke
Affects: OpenStack Object Storage (swift)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

With eventlet 0.22.0 or later, test/probe/test_signals.py fails with errors like:

ERROR: test_account_container_reload (test.probe.test_signals.TestWSGIServerProcessHandling)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/vagrant/swift/test/probe/test_signals.py", line 104, in test_account_container_reload
    self._check_reload(server, node['ip'], node['port'])
  File "/vagrant/swift/test/probe/test_signals.py", line 78, in _check_reload
    conn.send(body)
  File "/usr/lib/python2.7/httplib.py", line 858, in send
    self.sock.sendall(data)
  File "/usr/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 32] Broken pipe

The issue boils down to a change in how eventlet handles graceful shutdowns. Previously (following [1]), our server processes would keep handling requests on any open connection indefinitely, and would only shut down completely once the client closed the connection. Now (with [2]), eventlet closes connections between pipelined requests. This followed some bug reports [3][4] complaining about long-lived server processes (which we've "solved" with swift-orphans), and it seems to be consistent with how (some?) other HTTP servers behave [5].

At the moment, this presents a slight inconvenience to clients: their connection is closed unexpectedly, with no Connection: close header sent to warn them. Clients should be tolerant of such errors, though, and know to get a new connection and retry the request. This should not greatly impact proxy <-> backend communications for now, as backend requests are not pipelined.
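
As a rough illustration of the client-side tolerance described above, here is a minimal sketch of a client that reuses one connection and, when the server has dropped it, opens a fresh connection and resends. This is not Swift or probe-test code: ReconnectingClient, RETRYABLE and the conn_factory parameter are hypothetical names, it uses Python 3's http.client rather than the httplib seen in the traceback, and it assumes the request body is a bytes object that is safe to send again.

```python
import http.client

# Errors that usually mean the peer closed a kept-alive connection.
RETRYABLE = (BrokenPipeError, ConnectionResetError,
             http.client.RemoteDisconnected)


class ReconnectingClient:
    """Hypothetical helper: reuse one connection, reconnect if it was dropped."""

    def __init__(self, host, port, conn_factory=http.client.HTTPConnection):
        self._make_conn = lambda: conn_factory(host, port)
        self._conn = None

    def request(self, method, path, body=None, headers=None, retries=1):
        for attempt in range(retries + 1):
            if self._conn is None:
                self._conn = self._make_conn()
            try:
                self._conn.request(method, path, body=body,
                                   headers=headers or {})
                resp = self._conn.getresponse()
                return resp.status, resp.read()
            except RETRYABLE:
                # The server went away between requests; discard the socket.
                self._conn.close()
                self._conn = None
                if attempt == retries:
                    raise
```

Resending is only appropriate when the request can be replayed safely, so a caller streaming a body from a non-seekable source would need to buffer or rewind it first.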

This definitely *is* a pitfall that needs to be considered for protocol enhancements that would add pipelined requests, however. In particular, replacing the multipart-MIME protocol with PUT/POST/POST or STAGE/UPDATE/COMMIT [6] should probably keep this in mind and (1) not require any in-process state between requests and (2) be willing to try a new connection to the same backend server before declaring the Putter failed. Otherwise, graceful restarts will have an increased risk of PUT failures and/or creating non-durable fragments.
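
The two recommendations above could be sketched roughly as follows. This is purely illustrative and is not Swift's actual Putter or the protocol proposed in [6]: the Putter class shown here, ConnectionDropped, the connect callable and the txn_id field are all hypothetical. The idea is that every phase carries an identifier, so whichever server process receives it can find the staged data without in-process state, and that each phase gets one attempt on a new connection before the upload is declared failed.

```python
import uuid


class ConnectionDropped(Exception):
    """Hypothetical: raised by a connection when the backend closed the socket."""


class Putter:
    """Hypothetical multi-phase uploader that tolerates graceful restarts."""

    def __init__(self, connect, max_reconnects=1):
        self._connect = connect          # callable returning a new connection
        self._conn = connect()
        self._max_reconnects = max_reconnects
        self.txn_id = uuid.uuid4().hex   # sent with every phase, (1) above
        self.failed = False

    def _send_phase(self, verb, payload=b""):
        for attempt in range(self._max_reconnects + 1):
            try:
                return self._conn.send(verb, self.txn_id, payload)
            except ConnectionDropped:
                if attempt == self._max_reconnects:
                    self.failed = True
                    raise
                # (2) above: try a new connection to the same backend.
                self._conn = self._connect()

    def stage(self, data):
        return self._send_phase("STAGE", data)

    def commit(self):
        return self._send_phase("COMMIT")
```

With this shape, a backend that closes the connection after STAGE does not doom the COMMIT, provided the server keeps the staged data somewhere any worker process can reach it, such as on disk.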

[1] https://github.com/eventlet/eventlet/commit/1c30e9b
[2] https://github.com/eventlet/eventlet/commit/7f53465
[3] https://github.com/eventlet/eventlet/issues/188
[4] https://bugs.launchpad.net/keystone/+bug/1408612
[5] http://mailman.nginx.org/pipermail/nginx/2016-October/052202.html
[6] https://review.openstack.org/#/c/427911/

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/602526
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=050f8799ca82f121f9d33c7e773b982b9763f074
Submitter: Zuul
Branch: master

commit 050f8799ca82f121f9d33c7e773b982b9763f074
Author: Tim Burke <email address hidden>
Date: Fri Sep 14 04:42:34 2018 +0000

    Use latest eventlet in probe tests

    Note that eventlet 0.22.0+ closes connections between requests when
    it stops accepting connections.

    Partial-Bug: #1792615
    Change-Id: Ia8d9ab95e2aad40e8d797acc3423a917e809ffdb
