Ubiquity should advise kernel to discard pages from copied files

Bug #197579 reported by John McCabe-Dansted
Affects: ubiquity (Ubuntu)
Status: Won't Fix
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

The Linux kernel supports the O_STREAMING flag for open(), which suggests to the kernel that the file will only be used once and so the pages should be discarded from the cache as soon as they are used. This could make the performance of the LiveCD much better when ubiquity is copying files.

This would be a trivial improvement if Python supported O_STREAMING; perhaps upstream Python should be persuaded to support O_STREAMING. See:
http://irclogs.ubuntu.com/2007/04/17/%23ubuntu-kernel.txt

TerryG (tgalati4) wrote :

Thanks for your submission. Sounds like a great idea. My last gutsy install took 18 minutes on 2-year-old hardware. If we can reduce that even more, that would be cool.

Marking as Confirmed.

Changed in ubiquity:
status: New → Confirmed
Colin Watson (cjwatson) wrote :

O_STREAMING doesn't actually exist any more. It looks like the modern replacement is something like:

  posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);

The downside is that you have to call this after each write. Python doesn't support this at all, although I'm sure it could be done with a trivial extension.
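
For reference, Python 3.3 and later expose this call as os.posix_fadvise on Unix, so no extension is needed there. Below is a minimal sketch of the per-write approach, assuming a Linux system. The function name copy_dropping_cache and the 1 MiB chunk size are illustrative only and are not ubiquity code. POSIX_FADV_DONTNEED generally leaves pages that have not yet been written out untouched, so the sketch syncs the destination before advising, which adds further syscalls per chunk:

```python
import os

CHUNK_SIZE = 1024 * 1024  # hypothetical chunk size (1 MiB)


def copy_dropping_cache(src_path, dst_path, chunk_size=CHUNK_SIZE):
    """Copy src_path to dst_path, advising the kernel after each chunk
    that the pages just read and written are no longer needed.

    Returns the number of bytes copied.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src_fd, dst_fd = src.fileno(), dst.fileno()
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            # Push the data out of Python's buffer and onto disk first;
            # dirty pages are not released by DONTNEED.
            dst.flush()
            os.fdatasync(dst_fd)
            # Advise on exactly the byte range handled in this iteration.
            os.posix_fadvise(src_fd, copied, len(chunk), os.POSIX_FADV_DONTNEED)
            os.posix_fadvise(dst_fd, copied, len(chunk), os.POSIX_FADV_DONTNEED)
            copied += len(chunk)
    return copied
```

The advice is only a hint; the kernel is free to keep or drop the pages regardless.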

Changed in ubiquity:
importance: Undecided → Wishlist
John McCabe-Dansted (gmatht) wrote : POSIX_FADV_DONTNEED

Actually, that could make it even easier, since there is already a tool to set POSIX_FADV_DONTNEED etc. via LD_PRELOAD.

I'll see if I can get
  http://userweb.kernel.org/~akpm/pagecache-management/
packaged and polished.

The question then becomes what policy we want to implement. In the current tool we can adjust PAGECACHE_MAX_BYTE (it defaults to 256KB, but I'd set it to more like 30MB).

However we may want a different sort of policy, e.g. cache reads but not writes. This would still reduce the pagecache usage by two thirds and would be very good on 1GB machines where filesystem.squashfs can easily fit in memory.

TerryG (tgalati4) wrote :

Since policy changes will affect the overall speed of ubiquity for a given set of installation conditions:

Is there a record generated in a ubiquity log file that captures this installation time? If not, could a simple script be added so that interested folks could keep track of installation times? Perhaps something a little more sophisticated than (pseudo code): dmesg | grep "Installation finished" - dmesg | grep "Installation started"

This would be helpful going forward. Faster installs ==> greater adoption.
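
The pseudo code above could be sketched roughly as follows. The marker strings come from the pseudo code; the timestamp format and the function name install_duration are assumptions made for illustration, and ubiquity's real log lines may look different:

```python
from datetime import datetime

# Hypothetical markers and timestamp layout; adjust to the real log format.
START_MARKER = "Installation started"
END_MARKER = "Installation finished"
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
STAMP_LENGTH = 19  # length of a timestamp such as "2008-03-01 10:00:00"


def install_duration(log_lines):
    """Return the seconds between the first start marker and the last
    finish marker, or None if either marker is missing."""
    start = end = None
    for line in log_lines:
        if START_MARKER in line and start is None:
            start = datetime.strptime(line[:STAMP_LENGTH], TIME_FORMAT)
        elif END_MARKER in line:
            end = datetime.strptime(line[:STAMP_LENGTH], TIME_FORMAT)
    if start is None or end is None:
        return None
    return (end - start).total_seconds()
```

Called with the lines of a log file, this returns the elapsed time in seconds, which could then be appended to a record of install times.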

John McCabe-Dansted (gmatht) wrote :

I am not sure this will make the install significantly faster. The current naive implementation of pagecache-management may well make the install slower as it forces each file to be written to disk before we start writing the next file. OTOH, I am sure it can make installs significantly more responsive, as it may prevent GUI code being continually swapped out. I have had cases of machines appearing to be frozen, taking several seconds to respond to any input, during installs.

I was thinking of doing something like
  pagecache-management.sh time cp -r /rofs /target
Just to check that the policy wouldn't actively slow down the install.

Colin Watson (cjwatson) wrote :

I'm also sort of worried that it'll multiply up the number of syscalls (never a great idea for straight-through performance on Unix), although you may well be right about responsiveness.

Colin Watson (cjwatson) wrote :

I doubt that we'd actually want to use pagecache-management; the copy happens in-process in order that we can display a useful progress bar, and I don't think we'd want to kill the page cache for everything done by that process. A Python extension (or work in the Python core) to expose posix_fadvise would be best.

John McCabe-Dansted (gmatht) wrote :

> I'm also sort of worried that it'll multiply up the number of syscalls (never a great idea for straight-through performance on Unix), although you may well be right about responsiveness.

I have been optimising pagecache-management for this task. This does not seem to be a problem with the version in SVN:
  http://code.google.com/p/pagecache-mangagement/
With this version we need only one extra syscall after the file has been closed to drop the cache.

This benchmarking was done on a dual-core machine. Different CPUs might have a larger overhead; OTOH, most of the CPU legwork is going to be done in decompressing squashfs.
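
To illustrate the idea of advising once per file rather than once per write, here is a rough Python sketch. It is not the pagecache-management implementation (which works via LD_PRELOAD); the function names are hypothetical, and it assumes Python 3.3+ on Linux, where os.posix_fadvise is available. A length of 0 means "to the end of the file":

```python
import os
import shutil


def drop_file_cache(path, sync=False):
    """Issue one DONTNEED advice covering the whole of path.

    If sync is true, flush the file to disk first, since pages that
    have not been written out yet are not released by the advice.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        if sync:
            os.fsync(fd)
        # offset 0, length 0: the advice applies to the entire file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)


def copy_then_drop(src_path, dst_path):
    """Copy a file normally, then drop both files from the page cache."""
    shutil.copyfile(src_path, dst_path)
    drop_file_cache(src_path)
    drop_file_cache(dst_path, sync=True)
```

Reopening the file costs an open and a close on top of the advice itself, so a real implementation would more likely advise on the descriptor it already holds just before closing it.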

> I doubt that we'd actually want to use pagecache-management; the copy happens in-process in order that we can display a useful progress bar,

In some sense the ideal would be to have a utility that communicated with ubiquity but made use of SMP, nice and ionice.

> and I don't think we'd want to kill the page cache for everything done by that process. A Python extension (or work in the Python core) to expose posix_fadvise would be best.

I have been benchmarking a few options. One is just to flush writes after several seconds. This seems fairly harmless, as ubiquity would be writing to either tmpfs or /target. Perhaps better would be to flush only writes to /target, so as to reduce the number of syscalls. Since rofs is compressed and /target is decompressed, this would reduce page thrashing by a factor of more than three, and on many systems we might want to keep the entirety of rofs in memory anyway.
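
The "only writes to /target" policy could be expressed as a small predicate like the one below. This is a sketch only: the name should_drop_cache is hypothetical, and the /target root is taken from the discussion above rather than from ubiquity's code.

```python
import os

TARGET_ROOT = "/target"  # install destination, as discussed above


def should_drop_cache(path, writing, target_root=TARGET_ROOT):
    """Return True only for files being written underneath target_root.

    Reads (for example from the compressed rofs) stay cached; writes to
    tmpfs or anywhere else outside target_root are left alone.
    """
    if not writing:
        return False
    path = os.path.abspath(path)
    root = os.path.abspath(target_root)
    # commonpath compares whole path components, so /targetfoo is not
    # mistaken for something inside /target.
    return os.path.commonpath([path, root]) == root
```

A copy loop would consult this before deciding whether to flush and advise on a given file.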

John McCabe-Dansted (gmatht) wrote :

Actually this probably isn't very important, as according to
   http://www.csn.ul.ie/~mel/projects/vm/guide/pdf/understand.pdf
The Linux kernel (now?) has a 2Q file cache replacement policy which will cache pages that have been accessed only once in a small "A1in" cache, keeping the remainder of the cache for files that have recently been accessed twice. For a description of 2Q, see:
  http://www.vldb.org/conf/1994/P439.PDF.
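
To make the idea concrete, here is a toy Python sketch of the simplified variant of 2Q: a small FIFO for pages seen once and an LRU for pages seen again. It is only an illustration of why a one-pass copy does not evict frequently used pages. It omits the A1out ghost queue of the full algorithm and makes no claim about how the Linux kernel actually implements its page cache. The class name and queue sizes are hypothetical.

```python
from collections import OrderedDict


class SimplifiedTwoQ:
    """Toy cache: a1 is a FIFO of pages seen once, am is an LRU of
    pages that were accessed again while still in a1."""

    def __init__(self, a1_size, am_size):
        self.a1_size = a1_size
        self.am_size = am_size
        self.a1 = OrderedDict()
        self.am = OrderedDict()

    def access(self, page):
        """Record an access to page; return True on a cache hit."""
        if page in self.am:
            self.am.move_to_end(page)  # refresh LRU position
            return True
        if page in self.a1:
            # Second access: promote from the FIFO to the LRU.
            del self.a1[page]
            self.am[page] = True
            if len(self.am) > self.am_size:
                self.am.popitem(last=False)  # evict least recently used
            return True
        # First access: enter the small FIFO only.
        self.a1[page] = True
        if len(self.a1) > self.a1_size:
            self.a1.popitem(last=False)  # evict oldest one-shot page
        return False
```

Streaming any number of once-only pages through this cache churns only a1, so a page already promoted to am survives the scan.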

Unfortunately, I cannot find any reference to when Linux switched to a 2Q policy. In 2006 the Linux kernel was still using a file cache replacement policy that was worse than LRU (whereas 2Q is better); see:
   http://nikitadanilov.blogspot.com/2006/10/previous-item.html

Phillip Susi (psusi) wrote :

What we really want is POSIX_FADV_NOREUSE, but Linux does not currently implement it. Also, it won't make a difference during install, since there aren't other, more important pages that should be cached instead, so I'm going to close this bug.

Changed in ubiquity (Ubuntu):
status: Confirmed → Won't Fix