/search/devicegroup takes too long

Bug #1407625 reported by Ingeborg Hellemo
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Network Administration Visualized
Fix Released
Medium
John-Magne Bredal

Bug Description

/search/devicegroup/<GROUP> takes ages to load.

I have tested the same devicegroup in NAV 3.15.4 (/info/devicegroup/<GROUP>) and the page takes 10 sec to load.

In NAV 4.2.1 it takes 1 minute 30 seconds.

The time is not spend in database search (debug shows all times as almost 0)

Revision history for this message
John-Magne Bredal (john-m-bredal) wrote :

One could argue that waiting for something is not necessarily bad. Especially if what you are waiting for is something good. However, I would guess, the amount of time spent waiting should most probably reflect the amount of joy, fun or other attribute that makes the thing waiting for "good". Otherwise the thrill of waiting would turn in to impatience (or worse). The amount of time needed for this transition to occur, and I now speak based on personal experience, varies quite a lot depending on the person waiting.

In this case, where what you are waiting for is a table on a webpage, I would guess that the Time To Impatience is rather short. I have therefore made quite an effort to make sure that the wait next time is considerably shorter.

Changed in nav:
assignee: nobody → John-Magne Bredal (john-m-bredal)
importance: Undecided → Medium
status: New → In Progress
Revision history for this message
John-Magne Bredal (john-m-bredal) wrote :
Changed in nav:
status: In Progress → Fix Committed
Revision history for this message
Ingeborg Hellemo (ingeborg-hellemo) wrote : Re: [Bug 1407625] Re: /search/devicegroup takes too long

Better, but still a long wait. Now the same page takes about 40 seconds to
load.

Revision history for this message
John-Magne Bredal (john-m-bredal) wrote :

It would be interesting to see if it is the fetching of graphite-data that uses that much time. Please apply the following patch, turn on debug-logging for nav.metrics.data and report back any logs you get.

https://nav.uninett.no/hg/stable/rev/3b0fda80032b

Revision history for this message
Ingeborg Hellemo (ingeborg-hellemo) wrote :

Probably something is missing:

NameError at /search/devicegroup/VAKT

global name '_logger' is not defined

Request Method: GET
Request URL: https://nav.uit.no/search/devicegroup/VAKT
Django Version: 1.4.15
Exception Type: NameError
Exception Value:

global name '_logger' is not defined

Exception Location: /usr/local/lib/python2.7/nav/metrics/data.py in
get_metric_average, line 53

Revision history for this message
John-Magne Bredal (john-m-bredal) wrote :

Ack.

There have been changes in the file before the patch. This patch is needed to enable logging. Maybe you need to start on a fresh file, you'll see.

https://nav.uninett.no/hg/stable/rev/59fe7af05ca3

Revision history for this message
Ingeborg Hellemo (ingeborg-hellemo) wrote :

Patch applied:

[Wed Jan 07 13:48:31.224891 2015] [:error] [pid 15924] [Wed Jan 07 13:48:31
2015] [DEBUG] [pid=15924 nav.metrics.data] get_metric_data(['nav.devices...
8<---cut--->8

[Wed Jan 07 13:48:44.703794 2015] [:error] [pid 15924] [Wed Jan 07 13:48:44
2015] [DEBUG] [pid=15924 nav.metrics.data] get_metric_data: returning 486
results
[Wed Jan 07 13:48:44.754908 2015] [:error] [pid 15924] [Wed Jan 07 13:48:44
2015] [DEBUG] [pid=15924 nav.metrics.data] Got metric average for 486 targets
in 0:00:13.532866 seconds

That accounts for 13 seconds out of 44 (this time measured with stopwatch in
stead of one-onethousand-two-onethousand...)

Revision history for this message
John-Magne Bredal (john-m-bredal) wrote :

Thanks for the report! The request is actually sent 3 times, and 3 x 13.5 ~= 40+ which sounds like we've found the problem.

You have a slow Graphite server. If you're not running Graphite on a separate server, I would recomment that.

Anyway, in the name of happy customers and good spirit I have shaved some more (quite a lot I would hope) seconds from the page-load. I'm guessing this patch will get you down to 15 seconds. Let me know if I am right!

https://nav.uninett.no/hg/stable/rev/4c462de025ab

Changed in nav:
milestone: none → 4.2.2
Revision history for this message
Ingeborg Hellemo (ingeborg-hellemo) wrote :

<email address hidden> said:
> You have a slow Graphite server. If you're not running Graphite on a separate
> server, I would recomment that.

We have a separate graphite server with all the bells and whistles (lots of
RAM, SSD)

<email address hidden> said:
> I'm guessing this patch will get you down to 15 seconds. Let me know if I am
> right!

I guess my original python/nav/metrics/data.py is not quite up to date. It
only has 115 lines and the patch fails.

Revision history for this message
Morten Brekkevold (mbrekkevold) wrote :

On Thu, 08 Jan 2015 07:21:54 -0000 Ingeborg Hellemo wrote:

> We have a separate graphite server with all the bells and whistles (lots of
> RAM, SSD)

I don't think it matters much in this case, but you have you set up your
graphite-web instance with a memcache?

>> I'm guessing this patch will get you down to 15 seconds. Let me know if I am
>> right!
>
> I guess my original python/nav/metrics/data.py is not quite up to date. It
> only has 115 lines and the patch fails.

We'll be releasing version 4.2.2 today, which will include John-Magne's
latest changes.

Changed in nav:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.