DB connection error at heavy load

Bug #1213099 reported by Malini Kamalambal
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
zaqar
Fix Released
Critical
Ozgur Akan

Bug Description

The logs show a DB connection error at heavy loads. This causes the server to return a http 503.
Subsequent requests to the server (even light loads) throw the same error.

At the time marconi returns this exception, a test application that inserts/deletes from marconi messages collection runs without any issues.

Exception we get;

2013-08-15 23:38:36.194 16236 ERROR marconi.storage.mongodb.utils [-] ConnectionFailure caught
2013-08-15 23:38:36.194 16236 ERROR marconi.storage.mongodb.utils [-] ConnectionFailure caught
2013-08-15 23:38:36.195 16236 ERROR marconi.transport.wsgi.claims [-] ConnectionFailure caught
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims Traceback (most recent call last):
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims File "/usr/lib/python2.6/site-packages/marconi/transport/wsgi/claims.py", line 70, in on_post
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims **claim_options)
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims File "/usr/lib/python2.6/site-packages/marconi/storage/mongodb/utils.py", line 170, in wrapper
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims raise storage_exceptions.ConnectionError(msg)
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims ConnectionError: ConnectionFailure caught
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims
2013-08-15 23:38:36.195 16236 ERROR marconi.transport.wsgi.claims [-] ConnectionFailure caught
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims Traceback (most recent call last):
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims File "/usr/lib/python2.6/site-packages/marconi/transport/wsgi/claims.py", line 70, in on_post
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims **claim_options)
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims File "/usr/lib/python2.6/site-packages/marconi/storage/mongodb/utils.py", line 170, in wrapper
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims raise storage_exceptions.ConnectionError(msg)
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims ConnectionError: ConnectionFailure caught
2013-08-15 23:38:36.195 16236 TRACE marconi.transport.wsgi.claims
[pid: 16236|app: 0|req: 11691/67781] 10.183.251.3 () {54 vars in 919 bytes} [Thu Aug 15 23:38:31 2013] POST /v1/queues/q4/claims?limit=1 => generated 135 bytes in 5014 msecs (HTTP/1.1 503) 3 headers in 121 bytes (3 switches on core 1969)

Ozgur Akan (ozgurakan)
description: updated
Revision history for this message
Kurt Griffiths (kgriffs) wrote :

If you repro this with the MMS agent installed, do the stats show any anomalies right before you start getting lots of connection errors?

Revision history for this message
Ozgur Akan (ozgurakan) wrote :
Download full text (6.1 KiB)

Logs after the log patch.

2013-08-16 13:05:46.621 17363 ERROR marconi.storage.mongodb.utils [-] No primary available
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils Traceback (most recent call last):
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils File "/usr/lib/python2.6/site-packages/marconi/storage/mongodb/utils.py", line 165, in wrapper
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils return func(*args, **kwargs)
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils File "/usr/lib/python2.6/site-packages/marconi/storage/mongodb/messages.py", line 645, in delete
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils self._col.remove(query['_id'], w=0)
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils File "/usr/lib64/python2.6/site-packages/newrelic-1.13.1.31/newrelic/api/object_wrapper.py", line 237, in __call__
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils self._nr_instance, args, kwargs, **self._nr_kwargs)
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils File "/usr/lib64/python2.6/site-packages/newrelic-1.13.1.31/newrelic/api/function_trace.py", line 93, in literal_wrapper
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils return wrapped(*args, **kwargs)
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 563, in remove
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils options, self.__uuid_subtype), safe)
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils File "/usr/lib64/python2.6/site-packages/pymongo/mongo_replica_set_client.py", line 1344, in _send_message
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils member = self.__find_primary()
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils File "/usr/lib64/python2.6/site-packages/pymongo/mongo_replica_set_client.py", line 1164, in __find_primary
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils raise AutoReconnect(rs_state.error_message)
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils AutoReconnect: No primary available
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils
2013-08-16 13:05:46.621 17363 ERROR marconi.storage.mongodb.utils [-] No primary available
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils Traceback (most recent call last):
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils File "/usr/lib/python2.6/site-packages/marconi/storage/mongodb/utils.py", line 165, in wrapper
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils return func(*args, **kwargs)
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils File "/usr/lib/python2.6/site-packages/marconi/storage/mongodb/messages.py", line 645, in delete
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils self._col.remove(query['_id'], w=0)
2013-08-16 13:05:46.621 17363 TRACE marconi.storage.mongodb.utils File "/usr/lib64/python2.6/site-packages/newrelic-1.13.1.31/newre...

Read more...

Kurt Griffiths (kgriffs)
Changed in marconi:
importance: Undecided → High
milestone: none → havana-3
assignee: nobody → Flavio Percoco (flaper87)
status: New → Incomplete
status: Incomplete → Confirmed
Kurt Griffiths (kgriffs)
Changed in marconi:
importance: High → Critical
Revision history for this message
Ozgur Akan (ozgurakan) wrote :

This is related to https://bugs.launchpad.net/marconi/+bug/1216950
DB gets so slow that we start to get errors.

Revision history for this message
Ozgur Akan (ozgurakan) wrote :

with the latest 204 and large queue patches in place, I can't reproduce this bug.

Changed in marconi:
assignee: Flavio Percoco (flaper87) → Ozgur Akan (ozgurakan)
status: Confirmed → In Progress
Revision history for this message
Ozgur Akan (ozgurakan) wrote :

waiting for related patches to be merged.

Revision history for this message
Kurt Griffiths (kgriffs) wrote :

Changing to "invalid" (no repro after recent patches merged)

Changed in marconi:
status: In Progress → Fix Released
status: Fix Released → Invalid
Ozgur Akan (ozgurakan)
Changed in marconi:
status: Invalid → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers