LR should be faster when getting file stats

Bug #476711 reported by Facundo Batista
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu One Client
Fix Released
High
Natalia Bidart
ubuntuone-client (Ubuntu)
Fix Released
Medium
dobey

Bug Description

os.listdir() + for f in that: os.stat(f) is kind of slow for a lot of files.

We could use something like readdir to access that info directly.

Related branches

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 476711] [NEW] LR should be faster when getting file stats

On Fri, 2009-11-06 at 17:33 +0000, Facundo Batista wrote:
> Public bug reported:
>
> os.listdir() + for f in that: os.stat(f) is kind of slow for a lot of
> files.
>
> We could use something like readdir to access that info directly.

For optimal access to this, use bzrlib's functions:
bzrlib.osutils._walkdirs_utf8 and

Ciao,
Rob

tags: added: chicharra-lucid-speed
removed: facundo-lucid
tags: added: u1-lucid
Roman Yepishev (rye)
tags: added: metadata-loading
tags: added: chicharra
Changed in ubuntuone-client:
importance: Medium → High
Changed in ubuntuone-client:
assignee: Facundo Batista (facundo) → Naty Bidart (nataliabidart)
status: Confirmed → In Progress
Revision history for this message
Natalia Bidart (nataliabidart) wrote :

Robert,

after analyzing the different options to read a dir, we found out that bzrlib is not faster than using (listdir + os.stat).
Besides, seems like there is no straightforward way of adding bzrlib as a ubuntuone-client dependency, so at first we're leaving the current LR code as is. It's worth noting that the big bottle neck on SyncDaemon startup is the loading of metadata, not the local rescan time.

Below some samples and the results. Please let me know if you think we're missing something!

nessita@miro:~$ cat test_bzrlib.py
from bzrlib._readdir_pyx import UTF8DirReader

dirpath = "/usr/lib"
i = 0
for info in UTF8DirReader().read_dir(dirpath, dirpath):
    i += 1

print "Called bzrlib.UTF8DirReader on %s. Read %i entries." % (dirpath, i)

nessita@miro:~$ cat test_listdir.py
import os

dirpath = "/usr/lib"
i = 0
for something in os.listdir(dirpath):
    fullname = os.path.join(dirpath, something)
    stat_result = os.lstat(fullname)
    i += 1

print "Called listdir + os.lstat on %s. Read %i entries." % (dirpath, i)

nessita@miro:~$ time python test_listdir.py
Called listdir + os.lstat on /usr/lib. Read 2224 entries.

real 0m0.054s
user 0m0.028s
sys 0m0.024s

nessita@miro:~$ time python test_bzrlib.py
Called bzrlib.UTF8DirReader on /usr/lib. Read 2224 entries.

real 0m0.053s
user 0m0.032s
sys 0m0.020s
nessita@miro:~$

Revision history for this message
Natalia Bidart (nataliabidart) wrote :

This report will be closed as won't fix since the improvement that can be done is not significant and, as mentioned in the previous comment, adding a new dependency to ubuntuone-client can be tedious.

The bottleneck on syncdaemon startup is, in fact, the loading of metadata, as it can be seen in the logs:

2010-03-10 11:52:14,697 - ubuntuone.SyncDaemon.fsm - INFO - loading updated metadata
2010-03-10 11:52:52,937 - ubuntuone.SyncDaemon.fsm - INFO - initialized: idx_path: 191, idx_node_id: 191, shares: 7

metadata load: 38~ seconds

2010-03-10 11:52:54,336 - ubuntuone.SyncDaemon.Main - NOTE - Local rescan starting...
.... tons of debug messages ...
2010-03-10 11:52:54,945 - ubuntuone.SyncDaemon.Main - NOTE - Local rescan finished!

local rescan: less than a second!

Revision history for this message
Natalia Bidart (nataliabidart) wrote :

This is log file from Roman:

http://paste.ubuntu.com/392696/

In this case Local rescan takes much longer, because he has UDFs created. I'll try with an UDF of my own.

Revision history for this message
Robert Collins (lifeless) wrote :

:!./test-scan
cold cache
bzr: [2.4035110473632812, 2.3174529075622559, 2.8207039833068848, 2.3048679828643799, 2.360321044921875]
osl: [2.4310698509216309, 2.3457200527191162, 2.2898061275482178, 2.2754058837890625, 2.3426480293273926]
hot cache
bzr: [0.014010165929794311, 0.013987584114074707, 0.014174475193023681]
osl: [0.032505748987197873, 0.032332829952239989, 0.032404668092727663]
cold factor: 1.01294806315
hot factor: 0.432612429371

Some notes on the numbers: these timings are with my /usr/lib which is 2802 files long.
Making a recursive loop is doable too - bzr has a tuned recursive loop though; if you want to see its performance on a whole tree, I suggest using that (I'd be happy to create a profiler for that vs os.walkdirs, for instance).
Roughly, what you see above is:
on completely cold cache, stating files takes a long time.
on hot cache, bzrlib is much faster at getting the results for a bunch of files.

Revision history for this message
Natalia Bidart (nataliabidart) wrote :

In order to add more data to make a decision, I've been gathering logs from people to analyze the trade off between metadata loading time and local rescan running time.

For a "real" user on #ubuntuone (psyphercode), from http://paste.ubuntu.com/393987/

2010-03-12 07:15:59,194 - ubuntuone.SyncDaemon.fsm - INFO - loading updated metadata
2010-03-12 07:54:56,636 - ubuntuone.SyncDaemon.fsm - INFO - initialized: idx_path: 195799, idx_node_id: 14082, shares: 1
2010-03-12 07:54:56,642 - ubuntuone.SyncDaemon.HQ - INFO - HashQueue: _hasher started
2010-03-12 07:54:57,239 - ubuntuone.SyncDaemon.DBus - INFO - DBusInterface initialized.
...
2010-03-12 07:54:57,241 - ubuntuone.SyncDaemon.Main - NOTE - Local rescan starting...
2010-03-12 07:54:57,241 - ubuntuone.SyncDaemon.local_rescan - INFO - start scan all shares
2010-03-12 08:05:48,303 - ubuntuone.SyncDaemon.Main - NOTE - ---- MARK (state: READING; queues: metadata: 0; content: 0; hash: 0, fsm-cache: hit=8 miss=587391) ----
...
2010-03-12 08:10:24,726 - ubuntuone.SyncDaemon.Main - NOTE - Local rescan finished!
2010-03-12 08:10:24,727 - ubuntuone.SyncDaemon.Main - INFO - hash queue empty. We are ready!

 * meatdata loading ~ 40 minutes
 * local rescan ~16 minutes

Revision history for this message
Natalia Bidart (nataliabidart) wrote :

Roman, would you please upload your SyncDaemon logs? I know the startup is slow for you too.

Also, if you get the chance to talk with psyphercode again, would you please ask him the output of:

find ~/.local/share/ubuntuone/syncdaemon/fsm -type f | wc -l

Thank you!

Revision history for this message
Roman Yepishev (rye) wrote :

Mar 11 12:42:49 <rye> psyphercode, find ~/.local/share/ubuntuone/syncdaemon -type f | wc -l
Mar 11 12:43:12 <psyphercode> 384034

Revision history for this message
Natalia Bidart (nataliabidart) wrote :

I've been analyzing logs for <psyphercode> as per what he attached in Bug #539573. He suffered from:

2010-03-15 07:36:33,858 - ubuntuone.SyncDaemon.fsm - INFO - loading updated metadata
2010-03-15 08:21:47,186 - ubuntuone.SyncDaemon.fsm - INFO - initialized: idx_path: 195799, idx_node_id: 16100, shares: 1

-> 45.23 minutes of metadata loading

2010-03-15 08:21:47,869 - ubuntuone.SyncDaemon.Main - NOTE - Local rescan starting...
...
2010-03-15 11:09:19,785 - ubuntuone.SyncDaemon.Main - NOTE - Local rescan finished!

-> 167.53 minutes (2.8 hours) of local rescan

tags: added: package
Changed in ubuntuone-client:
milestone: none → lucid-beta-2
Changed in ubuntuone-client (Ubuntu):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Naty Bidart (nataliabidart)
milestone: none → ubuntu-10.04-beta-2
Revision history for this message
Natalia Bidart (nataliabidart) wrote :

After a series of profiling made over Roman's account (thank you!), we've added some tweaks to local rescan code making the run 14% faster.

Revision history for this message
Natalia Bidart (nataliabidart) wrote :
Download full text (11.9 KiB)

We started with a local rescan that took:

         121249081 function calls (121023286 primitive calls) in 665.469 CPU seconds

Latest profile info for Local Rescan shows:

         17240440 function calls (17018317 primitive calls) in 506.979 CPU seconds

And the next bottleneck to tackle is cPickle.load, as per:

   ncalls tottime percall cumtime percall filename:lineno(function)
   354949 180.393 0.001 183.109 0.001 {cPickle.load}
     2581 83.872 0.032 319.376 0.124 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/filesystem_manager.py:507(get_mdobjs_in_dir)
     4539 42.603 0.009 42.603 0.009 {posix.listdir}
    58545 36.384 0.001 36.384 0.001 {posix.remove}
    58476 26.243 0.000 26.243 0.000 {method 'remove' of 'collections.deque' objects}
   354949 16.301 0.000 16.301 0.000 {open}
    37542 12.212 0.000 12.212 0.000 {posix.lstat}
   459845 12.203 0.000 271.947 0.001 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/file_shelf.py:212(__getitem__)
   886460 10.724 0.000 17.639 0.000 /usr/lib/python2.6/posixpath.py:59(join)
   354949 9.861 0.000 211.277 0.001 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/file_shelf.py:118(_load_pickle)
   459845 8.220 0.000 11.197 0.000 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/file_shelf.py:292(purge)
   393979 6.764 0.000 23.727 0.000 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/file_shelf.py:67(key_file)
   459845 4.862 0.000 7.537 0.000 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/file_shelf.py:256(__getitem__)
  2375292 3.857 0.000 3.857 0.000 {len}
  1732359 3.702 0.000 3.702 0.000 {method 'startswith' of 'str' objects}
   459845 3.496 0.000 5.244 0.000 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/file_shelf.py:287(update)
   218207 3.465 0.000 46.556 0.000 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/filesystem_manager.py:950(changed)
444246/222123 3.455 0.000 4.842 0.000 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/filesystem_manager.py:141(__init__)
  1674418 3.345 0.000 3.345 0.000 {method 'endswith' of 'str' objects}
   354949 3.238 0.000 235.622 0.001 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/file_shelf.py:142(__getitem__)
   354949 2.819 0.000 16.586 0.000 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/file_shelf.py:270(__setitem__)
     2581 2.778 0.001 369.058 0.143 /home/rtg/Projects/ubuntuone/speed-up-that-local-rescan/ubuntuone/syncdaemon/local_rescan.py:182(_get_share_info)
   354813 2.715 0.000 2.715 0.000 /usr/lib/python2.6/os.py:721(_make_stat_result)
   218207 1.992 0.000 3.211 0.000 /usr/lib/python2.6/posixpath.py:79(split)
   900176 1.743 0.000 1.743 0.000 {method 'get' of...

Revision history for this message
Facundo Batista (facundo) wrote :

Naty, the improvement is of 24%.

Getting this advance with this simple change is good enough for Lucid, so go ahead and commit this.

Changed in ubuntuone-client:
status: In Progress → Fix Committed
Changed in ubuntuone-client (Ubuntu):
assignee: Naty Bidart (nataliabidart) → Rodney Dawes (dobey)
Changed in ubuntuone-client (Ubuntu):
status: In Progress → Fix Released
tags: removed: package
Changed in ubuntuone-client:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.