perf: Requests get slower, when queues have a lot of messages

Bug #1216950 reported by Malini Kamalambal
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
zaqar
Fix Released
Critical
Kurt Griffiths

Bug Description

The requests are taking long to complete, when the queue has a lot of pre-existing messages.

Steps to Repro:
1. Run the tsung benchmark tests
2. Run a tsung test to pre-load the db with a lot of messages , on a specific queue
3. Re-run the tsung benchmark tests

Expected Results:
There will no significant perf degradation between Steps 1 & 3.

Actual Results:
The response times get slower in Step 3 (compared to Step 1)

Kurt Griffiths (kgriffs)
Changed in marconi:
status: New → Confirmed
importance: Undecided → Critical
assignee: nobody → Kurt Griffiths (kgriffs)
Revision history for this message
Ozgur Akan (ozgurakan) wrote :

When we have 500K messages, claim messages takes up to 12 seconds to return. On mongodb site it is 1-2 seconds run time.

These are some of the slow queries;

{ "$query" : { "q" : "q19" , "p" : "806067" , "c.e" : { "$lte" : { "$date" : "2013-08-27T02:03:51.366Z"}} , "e" : { "$gt" : { "$date" : "2013-08-27T02:03:51.366Z"}}} , "$hint" : { "q" : 1 , "p" : 1 , "k" : 1 , "e" : 1 , "c.e" : 1} , "$orderby" : { "k" : 1}}
{ "$query" : { "q" : "q1" , "p" : "806067" , "c.e" : { "$lte" : { "$date" : "2013-08-27T02:03:51.085Z"}} , "e" : { "$gt" : { "$date" : "2013-08-27T02:03:51.085Z"}}} , "$hint" : { "q" : 1 , "p" : 1 , "k" : 1 , "e" : 1 , "c.e" : 1} , "$orderby" : { "k" : 1}}
{ "$query" : { "q" : "q13" , "p" : "806067" , "c.e" : { "$lte" : { "$date" : "2013-08-27T02:03:50.910Z"}} , "e" : { "$gt" : { "$date" : "2013-08-27T02:03:50.910Z"}}} , "$hint" : { "q" : 1 , "p" : 1 , "k" : 1 , "e" : 1 , "c.e" : 1} , "$orderby" : { "k" : 1}}
{ "$query" : { "q" : "q14" , "p" : "806067" , "c.e" : { "$lte" : { "$date" : "2013-08-27T01:58:14.643Z"}} , "e" : { "$gt" : { "$date" : "2013-08-27T01:58:14.643Z"}}} , "$hint" : { "q" : 1 , "p" : 1 , "k" : 1 , "e" : 1 , "c.e" : 1} , "$orderby" : { "k" : 1}}
{ "$query" : { "q" : "q5" , "p" : "806067" , "c.e" : { "$lte" : { "$date" : "2013-08-27T02:03:50.925Z"}} , "e" : { "$gt" : { "$date" : "2013-08-27T02:03:50.925Z"}}} , "$hint" : { "q" : 1 , "p" : 1 , "k" : 1 , "e" : 1 , "c.e" : 1} , "$orderby" : { "k" : 1}}

Details on one of these (took 940ms to run);
{
        "op" : "query",
        "ns" : "marconi.messages",
        "query" : {
                "$query" : {
                        "q" : "q9",
                        "p" : "806067",
                        "c.e" : {
                                "$lte" : ISODate("2013-08-27T02:03:51.072Z")
                        },
                        "e" : {
                                "$gt" : ISODate("2013-08-27T02:03:51.072Z")
                        }
                },
                "$hint" : {
                        "q" : 1,
                        "p" : 1,
                        "k" : 1,
                        "e" : 1,
                        "c.e" : 1
                },
                "$orderby" : {
                        "k" : 1
                }
        },
        "cursorid" : NumberLong("2177673334012593318"),
        "ntoreturn" : 10,
        "ntoskip" : 0,
        "nscanned" : 21255,
        "keyUpdates" : 0,
        "numYield" : 20,
        "lockStats" : {
                "timeLockedMicros" : {
                        "r" : NumberLong(906938),
                        "w" : NumberLong(0)
                },
                "timeAcquiringMicros" : {
                        "r" : NumberLong(920436),
                        "w" : NumberLong(20696)
                }
        },
        "nreturned" : 10,
        "responseLength" : 1510,
        "millis" : 940,
        "ts" : ISODate("2013-08-27T02:03:52.140Z"),
        "client" : "10.177.38.136",
        "allUsers" : [ ],
        "user" : ""
}

Revision history for this message
Kurt Griffiths (kgriffs) wrote :

I am going to try try emoving range query on expiration and removing thresholding from GC, then just relying on the GC to clean up expired messages.

Changed in marconi:
milestone: none → havana-3
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to marconi (master)

Fix proposed to branch: master
Review: https://review.openstack.org/44340

Changed in marconi:
status: Confirmed → In Progress
Kurt Griffiths (kgriffs)
summary: - Requests get slower, when queues have a lot of messages
+ perf: Requests get slower, when queues have a lot of messages
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to marconi (master)

Reviewed: https://review.openstack.org/44340
Committed: http://github.com/stackforge/marconi/commit/72974b7be1ed66476ab7027bb5b3eb6c9c3e2340
Submitter: Jenkins
Branch: master

commit 72974b7be1ed66476ab7027bb5b3eb6c9c3e2340
Author: kgriffs <email address hidden>
Date: Wed Aug 28 20:30:08 2013 -0500

    fix: Requests get slower when queues have a lot of messages

    The primary culprit was an additional scan operation
    in the MongoDB driver's _list(...) method, triggered by
    a range query on the message's expiration date.

    This patch removes that field from the index, so the filtering
    is done using a single scan pass. It also removes 'e' from the
    index used for counting messages, since in testing it was
    slightly faster to leave it out (and should also slightly speed
    up insertion time since it's once less index field that mongo
    has to worry about.)

    Also in this patch, the order of the project and
    queue name fields were swapped in the index in order
    to optimize for selectivity. In addition, some minor
    edits were made for style and correctness.

    Finally, this patch also removes thresholding from
    the GC so that we don't have expired messages hanging
    around for a long time that have to be filtered out.

    Change-Id: Ie4bd125e966612f4a8022fd6af133314d05fe428
    Closes-Bug: #1216950

Changed in marconi:
status: In Progress → Fix Committed
Changed in marconi:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.