Web worker can't recover from MongoDB Failover

Bug #1361346 reported by Kurt Griffiths
This bug affects 1 person
Affects  Status   Importance  Assigned to  Milestone
zaqar    Invalid  High        Unassigned

Bug Description

When running under uwsgi, once the MongoDB primary fails over, the workers appear never to recover. The error shown in the log below is repeated again and again.

The only recourse is to run kill -HUP `cat /tmp/uwsgi-master.pid`

uri=mongodb://10.184.3.182,10.184.3.178,10.184.3.183/?replicaSet=bench&w=majority&socketTimeoutMS=5000&connectTimeoutMS=5000
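
For readers unfamiliar with the connection string, the following is a minimal sketch that splits the URI above into its seed hosts and options using only the Python standard library. The helper name `describe_uri` is hypothetical and is not part of zaqar or pymongo; pymongo performs its own, stricter parsing.

```python
from urllib.parse import urlsplit, parse_qsl

# The URI from this report, split across two literals for readability.
URI = ("mongodb://10.184.3.182,10.184.3.178,10.184.3.183/"
       "?replicaSet=bench&w=majority&socketTimeoutMS=5000&connectTimeoutMS=5000")


def describe_uri(uri):
    """Return (seed_hosts, options) for a simple mongodb:// URI.

    Hypothetical helper for illustration only; it does not handle
    credentials, ports, or a database name.
    """
    parts = urlsplit(uri)
    hosts = parts.netloc.split(",")          # comma-separated replica set seeds
    options = dict(parse_qsl(parts.query))   # option values are kept as strings
    return hosts, options


hosts, options = describe_uri(URI)
```

Here `hosts` holds the three replica set seeds, and `options` shows that both the socket and connect timeouts are set to 5000 ms with a write concern of `majority`.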

2014-08-25 19:56:08.980 14829 ERROR zaqar.queues.storage.mongodb.utils [-] No primary available
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils Traceback (most recent call last):
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils File "/root/code/marconi/zaqar/queues/storage/mongodb/utils.py", line 254, in wrapper
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils return func(*args, **kwargs)
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils File "/root/code/marconi/zaqar/queues/storage/mongodb/queues.py", line 248, in create
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils 'c': counter})
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils File "/usr/local/lib/python2.7/dist-packages/pymongo/collection.py", line 415, in insert
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils self.uuid_subtype, client)
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils File "/usr/local/lib/python2.7/dist-packages/pymongo/mongo_replica_set_client.py", line 1493, in _send_message
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils member = self.__find_primary()
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils File "/usr/local/lib/python2.7/dist-packages/pymongo/mongo_replica_set_client.py", line 1291, in __find_primary
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils raise AutoReconnect(rs_state.error_message)
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils AutoReconnect: No primary available
2014-08-25 19:56:08.980 14829 TRACE zaqar.queues.storage.mongodb.utils
2014-08-25 19:56:08.981 14829 ERROR zaqar.queues.transport.wsgi.v1_0.queues [-]
2014-08-25 19:56:08.981 14829 TRACE zaqar.queues.transport.wsgi.v1_0.queues Traceback (most recent call last):
2014-08-25 19:56:08.981 14829 TRACE zaqar.queues.transport.wsgi.v1_0.queues File "/root/code/marconi/zaqar/queues/transport/wsgi/v1_0/queues.py", line 44, in on_put
2014-08-25 19:56:08.981 14829 TRACE zaqar.queues.transport.wsgi.v1_0.queues queue_name, project=project_id)
2014-08-25 19:56:08.981 14829 TRACE zaqar.queues.transport.wsgi.v1_0.queues File "/root/code/marconi/zaqar/common/pipeline.py", line 99, in consumer
2014-08-25 19:56:08.981 14829 TRACE zaqar.queues.transport.wsgi.v1_0.queues result = target(*args, **kwargs)
2014-08-25 19:56:08.981 14829 TRACE zaqar.queues.transport.wsgi.v1_0.queues File "/root/code/marconi/zaqar/queues/storage/mongodb/utils.py", line 257, in wrapper
2014-08-25 19:56:08.981 14829 TRACE zaqar.queues.transport.wsgi.v1_0.queues raise storage_errors.ConnectionError()
2014-08-25 19:56:08.981 14829 TRACE zaqar.queues.transport.wsgi.v1_0.queues ConnectionError
2014-08-25 19:56:08.981 14829 TRACE zaqar.queues.transport.wsgi.v1_0.queues
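
The traceback shows the wrapper in zaqar/queues/storage/mongodb/utils.py catching pymongo's AutoReconnect and raising a storage ConnectionError. As a rough illustration of how a caller might retry across a failover window instead of failing on the first attempt, here is a minimal sketch. It is not zaqar's actual code: `retry_on_autoreconnect`, `StorageConnectionError`, and the `max_attempts`, `base_delay`, and `sleep` parameters are hypothetical names, and the local `AutoReconnect` stand-in is used only when pymongo is not installed.

```python
import functools
import time

try:
    from pymongo.errors import AutoReconnect
except ImportError:
    # Stand-in so the sketch runs without pymongo installed.
    class AutoReconnect(Exception):
        pass


class StorageConnectionError(Exception):
    """Hypothetical stand-in for zaqar's storage_errors.ConnectionError."""


def retry_on_autoreconnect(max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry the wrapped call on AutoReconnect with exponential backoff.

    After max_attempts failures, raise StorageConnectionError.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except AutoReconnect:
                    if attempt == max_attempts - 1:
                        break  # out of attempts; give up below
                    # Back off: base_delay, 2*base_delay, 4*base_delay, ...
                    sleep(base_delay * (2 ** attempt))
            raise StorageConnectionError()
        return wrapper
    return decorator
```

A retry of this kind only helps if the client eventually rediscovers the new primary. If the workers never recover, as described in this report, every attempt would still fail and the caller would see the same ConnectionError after a longer delay.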

Flavio Percoco (flaper87) wrote :

Is this still happening? Any way we can test this?

Changed in zaqar:
status: New → Incomplete
importance: Undecided → High
Changed in zaqar:
status: Incomplete → Invalid