snapshots very slow when there is a long directory listing

Bug #1185376 reported by James Tunnicliffe
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
linaro-license-protection
Fix Released
High
James Tunnicliffe

Bug Description

http://snapshots.linaro.org/openembedded/sources for example takes >1 minute to display.

Related branches

Changed in linaro-license-protection:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
James Tunnicliffe (dooferlad) wrote :

See the problem locally when have a large quantity of data in a directory. Just having 1000 small files doesn't slow it down much, but 50 large ones does. Can recreate by doing this in a directory under sampleroot and running the dev server:

#!/usr/bin/python

for index in range(0, 50):
    blob = "0" * 1024 * 20
    with open("file_%d.txt" % index, "w") as f:
        f.write("file_%d.txt" % index)
        chunk = 0

        while chunk < 10000:
            chunk += 1
            f.write(blob)

Just need to profile to find out what the issue is. Given the quantity of data seems to be key I suspect that on each listing all the files in the directory are being read completely.

Revision history for this message
James Tunnicliffe (dooferlad) wrote :

Ignore comment #1.

Revision history for this message
James Tunnicliffe (dooferlad) wrote :
Download full text (3.2 KiB)

*** PROFILER RESULTS ***
file_server (/home/dooferlad/dev/license-protection/bigdir/license_protected_downloads/views.py:395)
function called 1 times

         118210541 function calls (118202699 primitive calls) in 84.096 seconds

   Ordered by: cumulative time, internal time, call count
   List reduced from 990 to 40 due to restriction <40>

   ncalls tottime percall cumtime percall filename:lineno(function)
        1 0.003 0.003 84.096 84.096 views.py:395(file_server)
        1 0.024 0.024 83.456 83.456 views.py:66(dir_list)
      434 0.006 0.000 83.364 0.192 buildinfo.py:10(__init__)
      434 0.199 0.000 83.045 0.191 buildinfo.py:58(getInfoForFile)
      433 0.005 0.000 83.005 0.192 views.py:223(is_protected)
    94453 0.340 0.000 82.588 0.001 fnmatch.py:24(fnmatch)
    94453 0.272 0.000 82.219 0.001 fnmatch.py:67(fnmatchcase)
    93242 0.653 0.000 58.888 0.001 re.py:226(_compile)
    90616 0.074 0.000 58.771 0.001 re.py:188(compile)
    92416 0.454 0.000 57.999 0.001 sre_compile.py:495(compile)
    92416 0.395 0.000 37.048 0.000 sre_parse.py:663(parse)
92421/92416 0.209 0.000 36.218 0.000 sre_parse.py:301(_parse_sub)
92423/92416 12.157 0.000 35.925 0.000 sre_parse.py:379(_parse)
    90609 6.842 0.000 23.030 0.000 fnmatch.py:81(translate)
    92416 0.265 0.000 20.197 0.000 sre_compile.py:480(_code)
  7969627 15.356 0.000 16.175 0.000 re.py:204(escape)
  8582145 4.500 0.000 15.046 0.000 sre_parse.py:201(get)
    92416 8.838 0.000 14.642 0.000 sre_compile.py:361(_compile_info)
  8764750 9.121 0.000 10.832 0.000 sre_parse.py:182(__next)
92436/92416 3.820 0.000 5.281 0.000 sre_compile.py:32(_compile)
  8036564 3.655 0.000 4.777 0.000 sre_parse.py:138(append)
92448/92428 3.798 0.000 3.876 0.000 sre_parse.py:140(getwidth)
 32908351 3.593 0.000 3.593 0.000 {method 'append' of 'list' objects}
  1359340 2.356 0.000 3.053 0.000 sre_parse.py:257(_escape)
27486205/27486190 2.669 0.000 2.669 0.000 {len}
  7944167 0.848 0.000 0.848 0.000 {ord}
  7969759 0.820 0.000 0.820 0.000 {method 'join' of 'str' objects}
  2824089 0.496 0.000 0.496 0.000 {method 'get' of 'dict' objects}
   543357 0.181 0.000 0.285 0.000 sre_parse.py:195(match)
    92416 0.086 0.000 0.269 0.000 sre_parse.py:178(__init__)
    97988 0.190 0.000 0.266 0.000 posixpath.py:68(join)
      434 0.115 0.000 0.265 0.001 buildinfo.py:108(parseData)
    92416 0.214 0.000 0.214 0.000 {_sre.compile}
        1 0.000 0.000 0.199 0.199 __init__.py:14(render_to_response)
        1 0.000 0.000 0.199 0.199 loader.py:158(render_to_string)
      2/1 0.000 0.000 0.195 0.195 base.py:136(render)
      3/1 0.000 0.000 0.195 0.195 base.py:133(_render)
   1305/1 0.015 0.000 0.195 0.195 base.py:819(render)
   2190/1 0.002 0.000 0.193 0.193 debug.py:72(render_node)
     ...

Read more...

Revision history for this message
James Tunnicliffe (dooferlad) wrote :

For every file in the directory we look at every key in BUILDINFO.txt. Since this BUILDINFO.txt has 435 entries in it and there are 435 files in the directory, this results in many calls to fnmatch (94453), which eats 82 seconds of CPU time...

Revision history for this message
James Tunnicliffe (dooferlad) wrote :
Download full text (3.3 KiB)

Think I have a fix. Will push a MP:

*** PROFILER RESULTS ***
file_server (/home/dooferlad/dev/license-protection/bigdir/license_protected_downloads/views.py:395)
function called 1 times

         824015 function calls (816173 primitive calls) in 0.786 seconds

   Ordered by: cumulative time, internal time, call count
   List reduced from 985 to 40 due to restriction <40>

   ncalls tottime percall cumtime percall filename:lineno(function)
        1 0.003 0.003 0.786 0.786 views.py:395(file_server)
        1 0.016 0.016 0.552 0.552 views.py:66(dir_list)
      433 0.004 0.000 0.305 0.001 views.py:223(is_protected)
      434 0.004 0.000 0.290 0.001 buildinfo.py:10(__init__)
      434 0.116 0.000 0.256 0.001 buildinfo.py:122(parseData)
        1 0.000 0.000 0.178 0.178 __init__.py:14(render_to_response)
        1 0.000 0.000 0.177 0.177 loader.py:158(render_to_string)
      433 0.002 0.000 0.173 0.000 models.py:11(all_with_hashes)
[29/May/2013 14:34:26] "GET /sources HTTP/1.1" 200 228034
      2/1 0.000 0.000 0.173 0.173 base.py:136(render)
      3/1 0.000 0.000 0.173 0.173 base.py:133(_render)
   1305/1 0.012 0.000 0.173 0.173 base.py:819(render)
   2190/1 0.002 0.000 0.171 0.171 debug.py:72(render_node)
        1 0.000 0.000 0.171 0.171 loader_tags.py:100(render)
        1 0.000 0.000 0.168 0.168 loader_tags.py:48(render)
        1 0.009 0.009 0.165 0.165 defaulttags.py:131(render)
      433 0.001 0.000 0.155 0.000 query.py:616(filter)
      433 0.002 0.000 0.154 0.000 query.py:630(_filter_or_exclude)
     3468 0.014 0.000 0.109 0.000 debug.py:82(render)
      433 0.002 0.000 0.105 0.000 query.py:853(_clone)
      433 0.009 0.000 0.102 0.000 query.py:235(clone)
   188790 0.087 0.000 0.087 0.000 buildinfo.py:42(_set)
5196/1732 0.018 0.000 0.083 0.000 copy.py:145(deepcopy)
     1302 0.003 0.000 0.056 0.000 defaulttags.py:269(render)
  866/433 0.005 0.000 0.045 0.000 copy.py:234(_deepcopy_tuple)
      433 0.004 0.000 0.044 0.000 query.py:1219(add_q)
     6074 0.009 0.000 0.037 0.000 base.py:568(resolve)
     3509 0.009 0.000 0.036 0.000 functional.py:170(wrapper)
      433 0.008 0.000 0.035 0.000 query.py:1030(add_filter)
      433 0.002 0.000 0.035 0.000 copy.py:306(_reconstruct)
     5196 0.033 0.000 0.035 0.000 copy.py:267(_keep_alive)
   193462 0.032 0.000 0.032 0.000 {method 'strip' of 'str' objects}
        1 0.000 0.000 0.029 0.029 bzr_version.py:10(get_my_bzr_revno)
     4384 0.028 0.000 0.028 0.000 {posix.stat}
        1 0.000 0.000 0.027 0.027 branch.py:177(open)
     4778 0.004 0.000 0.026 0.000 base.py:717(resolve)
        1 0.000 0.000 0.026 0.026 bzrdir.py:1070(open_branch)
        1 0.000 0.000 0.026 0.026 branch.py:1995(open)
     3466 0.008 0.000 0.025 0.000 html.py:32(escape)
 ...

Read more...

Changed in linaro-license-protection:
assignee: nobody → James Tunnicliffe (dooferlad)
status: Confirmed → In Progress
Revision history for this message
Milo Casagrande (milo) wrote :

Un-marking as duplicate: bug 1183411 was due to another problem, but I witnessed this issue while investigating the other one.

Changed in linaro-license-protection:
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.