Comment 506 for bug 595047

In , kernel (kernel-linux-kernel-bugs) wrote :

(In reply to comment #470)
> (In reply to comment #469)
> >
> > 1. The file cache is *very* aggressive, even pushing out to swap stuff I think
> > I might be using.
> >
>
> Now, I'm not a kernel hacker, but a programmer after all, and to me it seems to
> be an easier job to fix the aggressive file cache than to fix this "large I/O
> operations ......"-thing - which is not at all that concrete and varies over
> platforms, machine specs etc.

Isn't there already a knob for controlling the kernel's preference for swapping anonymous pages out to disk versus retaining cached/buffered block-device pages?

/proc/sys/vm/swappiness — http://kerneltrap.org/node/3000
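
For anyone who wants to check what their box is currently set to, here is a minimal sketch that reads the knob. The helper names (`parse_swappiness`, `read_swappiness`) are made up for illustration; only the /proc path comes from above. Higher values make the kernel more willing to swap out anonymous pages, lower values make it prefer dropping cache instead.

```python
from pathlib import Path

# Path of the knob mentioned above; only present on Linux.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def parse_swappiness(text):
    """Parse the file contents (a single integer plus newline) into an int."""
    return int(text.strip())


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current setting, or None if the file is missing or unreadable."""
    try:
        return parse_swappiness(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("vm.swappiness =", read_swappiness())
```

Changing the value means writing to the same file as root, which this sketch deliberately does not do.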

Our apps appear to hang because their GUI threads stall while waiting for pages (containing either executable code or auxiliary data like pixmaps) to come back into RAM from disk. Reading those pages back in takes forever because the disk queue is full of writes. The situation is made worse because the reads are not pipelined: the requests are submitted from the page fault handler, so a program running during heavy disk activity submits a request to load one page and then stalls. When that request is fulfilled, the program executes a few hundred more instructions until its instruction pointer crosses into another page that isn't resident in RAM, whereupon the page fault handler is invoked again, a new request is submitted to the disk queue, and the application hangs again. Repeat ad infinitum. Meanwhile, as the program sits stalled waiting for the page it needs, the rest of its pages are being evicted from RAM to make room for the huge disk buffers, thus perpetuating the problem.
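
To put rough numbers on that, here is a back-of-the-envelope model, not a measurement. It assumes a strict FIFO disk queue that the writer keeps refilled, a fixed service time per request, and that each page fault submits exactly one read. The function names and the figures in the usage note are invented for illustration.

```python
def faulting_stall_ms(pages_needed, queued_writes, service_ms):
    """Total wait when every page fault queues behind the full write backlog.

    Each fault submits one read at the tail of the queue, so it waits for
    `queued_writes` writes plus its own read before the program can continue.
    """
    per_fault_ms = (queued_writes + 1) * service_ms
    return pages_needed * per_fault_ms


def pipelined_stall_ms(pages_needed, queued_writes, service_ms):
    """Total wait if all the reads could be submitted at once (hypothetical).

    The backlog is then paid only once, followed by the reads themselves.
    """
    return (queued_writes + pages_needed) * service_ms
```

With 50 pages to fault in, 200 writes ahead in the queue and 5 ms per request, the one-at-a-time case comes to 50 * 201 * 5 = 50250 ms, against 250 * 5 = 1250 ms if the reads could be batched. The real numbers depend on the I/O scheduler and the disk, but the multiplication by the backlog is the point.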

I would think the easiest and most reliable solution to this problem would be for the kernel to prefer fulfilling page-in requests ahead of dirtying blocks. If there are any requests to read pages in from disk to satisfy page faults, those requests should be fulfilled and a process's request to dirty a new page should be blocked. In other words, as dirty blocks are flushed to disk, thus freeing up RAM, the process performing the huge write shouldn't be allowed to dirty another block (thus consuming that freed RAM) if there are page-ins waiting to be fulfilled.
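
In toy form, the policy I have in mind looks something like the sketch below. This is not kernel code and does not correspond to any existing kernel interface; `ToyIoPolicy` and its methods are hypothetical names that only illustrate the ordering rule: page-ins are served first, and a writer may not dirty a new block while any page-in is waiting.

```python
from collections import deque


class ToyIoPolicy:
    """Hypothetical model: page-in reads take priority over writeback."""

    def __init__(self):
        self.page_ins = deque()    # reads needed to satisfy page faults
        self.writebacks = deque()  # dirty blocks waiting to be flushed

    def submit_page_in(self, page):
        self.page_ins.append(page)

    def submit_writeback(self, block):
        self.writebacks.append(block)

    def may_dirty(self):
        """A writer may dirty a new block only when no page-in is pending."""
        return not self.page_ins

    def next_request(self):
        """Pick the next request for the disk: any page-in before any write."""
        if self.page_ins:
            return ("read", self.page_ins.popleft())
        if self.writebacks:
            return ("write", self.writebacks.popleft())
        return None
```

The big writer still makes progress whenever nothing is faulting, so its throughput only drops while an interactive program actually needs pages.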