Comment 603 for bug 620074

In , perlover (perlover-linux-kernel-bugs) wrote :

Continuing from post #571

Sorry, my English is not as good as I would like :)

I now run Fedora Core with the 3.3.2-6.fc16.x86_64 kernel. My server has 48 GB of memory and a hardware RAID1 array.

I now run the server with the following settings, which work well for me:

echo 1000 > /proc/sys/vm/dirty_writeback_centisecs
echo 20 > /proc/sys/vm/dirty_background_ratio
echo 9000000 > /proc/sys/vm/dirty_expire_centisecs
echo 30 > /proc/sys/vm/dirty_ratio
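
A quick sketch of what the two *_centisecs values above mean in ordinary units, since they are given in hundredths of a second. The helper function names are only for illustration and are not part of any tool:

```shell
#!/bin/sh
# Convert a value in centiseconds (1/100 s) to whole seconds.
centisecs_to_seconds() {
    echo $(( $1 / 100 ))
}

# Convert a value in centiseconds to whole hours (rounded down).
centisecs_to_hours() {
    echo $(( $1 / 100 / 3600 ))
}

echo "dirty_writeback_centisecs=1000 -> $(centisecs_to_seconds 1000) s between flusher wakeups"
echo "dirty_expire_centisecs=9000000 -> $(centisecs_to_hours 9000000) h before a dirty page counts as old"
```

So the flusher wakes up every 10 seconds, but a dirty page is only treated as old enough to write out after about 25 hours, which is why the daily 'sync' is what actually flushes the data.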

Before these settings, as I wrote in post #571, I regularly had freezes of 10-20 seconds every 2-5 minutes. I found that the cause is the writeback phase of dirty pages. This phase can be observed with the command "watch -n1 grep -A 1 dirty /proc/vmstat": the nr_writeback value is the number of dirty pages being written to disk right now. A writeback phase can be started, for example, by the 'sync' command or when dirty pages in memory expire (the common setting is 30 seconds). If there were many dirty pages at the next writeback (even just 2000-3000), my server froze during this phase.
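
To put those page counts into perspective, here is a minimal sketch that prints nr_dirty and nr_writeback together with their size in KiB. The function name dirty_summary is hypothetical, and the 4096-byte default page size is an assumption (check it with "getconf PAGESIZE"):

```shell
#!/bin/sh
# Read vmstat-format input ("name value" per line) on stdin and print
# nr_dirty and nr_writeback in pages and in KiB.
# $1 is the page size in bytes; 4096 is an assumed default.
dirty_summary() {
    page_size=${1:-4096}
    awk -v ps="$page_size" '
        # Exact match, so nr_dirty_threshold and similar counters are skipped.
        $1 == "nr_dirty" || $1 == "nr_writeback" {
            printf "%s %d pages %d KiB\n", $1, $2, $2 * ps / 1024
        }'
}

# Only read the real counters when the file exists (Linux).
if [ -r /proc/vmstat ]; then
    dirty_summary "$(getconf PAGESIZE)" < /proc/vmstat
fi
```

With 4 KiB pages, 3000 dirty pages is only about 12 MB, so even a small amount of dirty data was enough to trigger the freeze.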

With the settings above, I run 'sync' from crontab once a day (when the load is at its minimum). During this phase the load average rises from 1-2 to 80-90, and this lasts about 1-2 minutes. My system is frozen for those 1-2 minutes! The rest of the time (24 hours * 60 minutes - 3 minutes) the load average is 1-2 and there are no I/O freezes. Before these settings the load average was 8-9. I know that if the server loses power, the data on disk will be stale (up to 24 hours old).

I think the system blocks I/O until all dirty pages marked for writeback have actually been written to disk. I think a normal system should not block all I/O and should instead spread the writing of dirty pages over time.

I also noticed that I don't have this problem on my second server, which has the same OS, the same kernel version and the same amount of RAM. It uses software RAID1 (/dev/md*). During writeback this server works smoothly. I think software RAID has a different buffering mechanism for writing to disk. So maybe one of you could test this problem with software RAID?

I also think these articles are useful and related to this:

http://lwn.net/Articles/405076/
https://lwn.net/Articles/456904/

As I understand it, this feature was partly implemented in kernel 3.3, but things did not get better for me with the new kernel. As far as I can tell, it is still under development.

Sorry for my English

Bye! :)