OAuth dance is broken with TransactionRollbackError

Bug #537525 reported by Stuart Langridge
128
This bug affects 21 people
Affects Status Importance Assigned to Milestone
Ubuntu One Servers
Fix Released
Critical
Stuart Langridge

Bug Description

This bug will affect any user trying to "add their computer" to the their Ubuntu One account. This applies to all Ubuntu One sync services.

The OAuth dance is broken. Requests to /oauth/request/ go through fine, but then the request to /oauth/access/ to get an access token is dying after time with an error at least 80% of the time. The error actually returned to the desktop is a 504 gateway timeout (which will show in ~/.cache/ubuntuone/log/oauth-login.log), because haproxy's 50-second timeout cuts off the request. When the request is run without haproxy in the way, a traceback in the logs appears as below.

This problem prevents anybody from authorising any computer to Ubuntu One.

ERROR:root:Dump of OOPS 1531appserver78651 Unhandled
Traceback (most recent call last):
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/canonical/oops/wsgi.py", line 77, in __call__
    return self.app(environ, response_trapper)
  File "/var/lib/python-support/python2.5/paste/translogger.py", line 67, in __call__
    return self.application(environ, replacement_start_response)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/django/core/handlers/wsgi.py", line 243, in __call__
    response = middleware_method(request, response)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/django/middleware.py", line 77, in process_response
    transaction.commit()
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/transaction/_manager.py", line 93, in commit
    return self.get().commit()
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/transaction/_transaction.py", line 322, in commit
    self._commitResources()
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/transaction/_transaction.py", line 414, in _commitResources
    rm.tpc_begin(self)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/zope/zstorm.py", line 251, in tpc_begin
    self._store.flush()
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/store.py", line 490, in flush
    self._flush_one(obj_info)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/store.py", line 506, in _flush_one
    self._connection.execute(expr, noresult=True)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/databases/postgres.py", line 249, in execute
    return Connection.execute(self, statement, params, noresult)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/database.py", line 235, in execute
    raw_cursor = self.raw_execute(statement, params)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/databases/postgres.py", line 259, in raw_execute
    return Connection.raw_execute(self, statement, params)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/database.py", line 317, in raw_execute
    self._check_disconnect(raw_cursor.execute, *args)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/database.py", line 362, in _check_disconnect
    return function(*args, **kwargs)
TransactionRollbackError: could not serialise access due to concurrent update

ERROR:root:OOPSBz2Serializer.dump failed.
Traceback (most recent call last):
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/canonical/oops/serializer.py", line 246, in dump
    self._write_file(dump_filename, oops)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/canonical/oops/serializer.py", line 365, in _write_file
    bits = self._get_bits( data )
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/canonical/oops/serializer.py", line 471, in _get_bits
    result['dur'] = float(meta['request-end'])-float(meta['request-start'])
KeyError: 'request-end'
Traceback (most recent call last):
  File "/usr/lib/python2.5/SocketServer.py", line 464, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python2.5/SocketServer.py", line 254, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python2.5/SocketServer.py", line 522, in __init__
    self.handle()
  File "/var/lib/python-support/python2.5/paste/httpserver.py", line 432, in handle
    BaseHTTPRequestHandler.handle(self)
  File "/usr/lib/python2.5/BaseHTTPServer.py", line 316, in handle
    self.handle_one_request()
  File "/var/lib/python-support/python2.5/paste/httpserver.py", line 427, in handle_one_request
    self.wsgi_execute()
  File "/var/lib/python-support/python2.5/paste/httpserver.py", line 287, in wsgi_execute
    self.wsgi_start_response)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/ubuntuone/wsgi/middleware/meliae_middleware.py", line 29, in __call__
    return self.app(environ, start_response)
  File "/var/lib/python-support/python2.5/paste/deploy/config.py", line 276, in __call__
    return self.app(environ, start_response)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/ubuntuone/wsgi/middleware/opstats.py", line 45, in __call__
    return self.app(environ, ResponseTrapper(start_response))
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/ubuntuone/wsgi/middleware/bzrheader.py", line 18, in __call__
    return self.app(environ, ResponseTrapper(start_response))
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/canonical/oops/wsgi.py", line 77, in __call__
    return self.app(environ, response_trapper)
  File "/var/lib/python-support/python2.5/paste/translogger.py", line 67, in __call__
    return self.application(environ, replacement_start_response)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/django/core/handlers/wsgi.py", line 243, in __call__
    response = middleware_method(request, response)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/django/middleware.py", line 77, in process_response
    transaction.commit()
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/transaction/_manager.py", line 93, in commit
    return self.get().commit()
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/transaction/_transaction.py", line 322, in commit
    self._commitResources()
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/transaction/_transaction.py", line 414, in _commitResources
    rm.tpc_begin(self)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/zope/zstorm.py", line 251, in tpc_begin
    self._store.flush()
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/store.py", line 490, in flush
    self._flush_one(obj_info)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/store.py", line 506, in _flush_one
    self._connection.execute(expr, noresult=True)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/databases/postgres.py", line 249, in execute
    return Connection.execute(self, statement, params, noresult)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/database.py", line 235, in execute
    raw_cursor = self.raw_execute(statement, params)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/databases/postgres.py", line 259, in raw_execute
    return Connection.raw_execute(self, statement, params)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/database.py", line 317, in raw_execute
    self._check_disconnect(raw_cursor.execute, *args)
  File "/srv/ubuntuone.com/production/ubunet-rev-2411/utilities/../lib/storm/database.py", line 362, in _check_disconnect
    return function(*args, **kwargs)
TransactionRollbackError: could not serialise access due to concurrent update

Changed in ubuntuone-servers:
status: New → In Progress
importance: Undecided → Critical
assignee: nobody → Stuart Langridge (sil)
visibility: private → public
tags: added: desktop+ u1-lucid
description: updated
Revision history for this message
Tim Cole (tcole) wrote :

So, the issue turned out to be that postgres transactions were being held open in ubuntuone.oauth.store.fetch_access_token while we interacted with CouchDB, as part of processing the oauth_token_grant signal. This resulted in long-lived transactions which would conflict with others, producing TransactionRollbackErrors on the database side, and gateway timeouts on the proxy side.

Revision history for this message
Tim Cole (tcole) wrote :

The temporary fix we cowboyed into production was to disable the cloud_server event handlers.

Long-term, we need to make these event handlers asynchronous so that we are less vulnerable to CouchDB slowness/issues.

Also, doing significant IO in a transaction is bad. :)

Revision history for this message
John O'Brien (jdobrien) wrote :

Changes to the way events are handled in couchdb have corrected this. No more cowboy.

Changed in ubuntuone-servers:
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.