Comment 527 for bug 620074

In , michiel (michiel-linux-kernel-bugs) wrote :

To tackle this bug, the people who are affected by it need to dig deep or generate good debug data, and they need to give good information about their system.

There can be several bugs out there with the same symptoms as this one, so the best thing you can do is file individual bug reports with complete information. If you cannot give complete information, don't post the report, because then it almost certainly cannot be solved. The more relevant info we get, the easier it becomes to detect the problems.

First install the newest kernel, because it has the newest code and reduces the chance that you'll run into an old, already fixed bug. At the time of writing that is 2.6.36. Then test again; if it still happens, file a bug report.

In the report, start with correct system information:
Kernel: uname -a and cat /proc/version
Architecture: also from uname -a
Distro: name and version (could be handy for distro specific patches)
CPU info: cat /proc/cpuinfo | grep -e '\(model name\|bogomips\|MHz\|flags\)'
Mem info: cat /proc/meminfo | grep MemTotal
IO scheduler used: cat /sys/block/sdX/queue/scheduler
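
The scheduler file lists every available scheduler and marks the active one with square brackets. A minimal sketch for printing just the active one for each disk; the helper name active_sched is made up for this example, and a device that shows no bracketed entry simply prints an empty value:

```shell
# Print the bracketed (active) scheduler name from a scheduler line,
# e.g. "noop deadline [cfq]" gives "cfq".
active_sched() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Walk over every block device that exposes a scheduler file.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    disk=${f#/sys/block/}
    disk=${disk%%/*}
    printf '%s: %s\n' "$disk" "$(active_sched "$(cat "$f")")"
done
```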

Hard disk configuration: whether RAID is used, type of disks, speed of the disks, partitions used and filesystems used

Hard disk speed as measured by hdparm:
hdparm -tT --direct /dev/sdX
hdparm -tT /dev/sdX

Give the output of the following commands:
lshw
dmesg
lsmod
cat /proc/swaps
cat /proc/meminfo
cat /proc/cmdline
cat /proc/config.gz | gunzip -

And give dumps of the following files.
For every disk:
 /sys/block/<disk>/queue/*
 /sys/block/<disk>/queue/iosched/*
/proc/sys/vm/*

This information lets the developers see how the system is configured. If there are known configurations or drivers that are bad and may be causing the same symptoms, they will be noticed earlier.

If you want a script to help you collect this information, you can use the one located at http://github.com/meghuizen/systeminfo, which builds a tar.bz2 that you can add as an attachment, so you'll have complete information.
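
If you would rather see what such a collection does, here is a rough sketch of the same idea. It is not the script from the link above. The output directory, the file names and the helpers save and dump_dir are choices made for this example only. It skips hdparm, which needs root and a specific device, and it records a missing tool or a permission error in the output file instead of stopping.

```shell
#!/bin/sh
# Collect the information listed above into one directory, then archive it.
OUT=${OUT:-$(mktemp -d "${TMPDIR:-/tmp}/sysinfo.XXXXXX")}

# save NAME CMD [ARGS...]: store stdout and stderr of CMD in $OUT/NAME.txt.
save() {
    name=$1
    shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >"$OUT/$name.txt" 2>&1 || echo "(exit status $?)" >>"$OUT/$name.txt"
    else
        echo "$1: not installed" >"$OUT/$name.txt"
    fi
}

# dump_dir DIR...: print each readable regular file as "path: contents".
dump_dir() {
    for d in "$@"; do
        [ -d "$d" ] || continue
        for f in "$d"/*; do
            if [ -f "$f" ] && [ -r "$f" ]; then
                printf '%s: %s\n' "$f" "$(cat "$f" 2>/dev/null)"
            fi
        done
    done
}

save uname   uname -a
save version cat /proc/version
save cpuinfo grep -E 'model name|bogomips|MHz|flags' /proc/cpuinfo
save meminfo cat /proc/meminfo
save swaps   cat /proc/swaps
save cmdline cat /proc/cmdline
save lshw    lshw
save dmesg   dmesg
save lsmod   lsmod
save config  gunzip -c /proc/config.gz

dump_dir /proc/sys/vm >"$OUT/vm.txt"
for q in /sys/block/*/queue; do
    dump_dir "$q" "$q/iosched"
done >"$OUT/block-queues.txt"

# Pack everything into one attachment (needs tar with bzip2 support).
if tar -cjf "$OUT.tar.bz2" -C "$(dirname "$OUT")" "$(basename "$OUT")" 2>/dev/null; then
    echo "wrote $OUT.tar.bz2"
else
    echo "could not create the archive; the files are in $OUT"
fi
```

Running it as root gives more complete lshw and dmesg output on most systems.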

After that, learn a bit about the I/O scheduler, to make it easier for yourself to debug and understand the situation:
  - http://www.linuxjournal.com/article/6931 (info on I/O schedulers)
  - http://www.devshed.com/c/a/BrainDump/Linux-IO-Schedulers/
  - http://kerneltrap.org/node/7637
  - kernel-source/Documentation/block/iosched-description.txt (see: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=tree;f=Documentation/block;hb=HEAD)
  - http://www.westnet.com/~gsmith/content/linux-pdflush.htm
  - http://www.docunext.com/blog/2009/10/debugging-and-reducing-io-wait.html

Some tools are very handy here. The Linux perf tool, for example, is very handy for debugging slowness and latencies in your system.

For some documentation on perf see:
  - https://perf.wiki.kernel.org/index.php/Main_Page
  - http://anton.ozlabs.org/blog/2010/01/10/using-perf-the-linux-performance-analysis-tool-on-ubuntu-karmic/
  - http://blog.fenrus.org/?p=5

perf --help also gives you a lot of information.

And other profiling tools:
  - http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/basic_profiling.txt;hb=HEAD

So to debug these problems, perf output is rather handy. If the slowdowns happen again, try to capture some perf record dumps at the same time, and maybe perf timechart dumps as well, so the developers can analyze those too.
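
A capture along those lines could look like the sketch below. The function names run and perf_capture, the DRY_RUN switch and the file name perf-record.data are made up for this example. The perf subcommands themselves are the standard ones, but check perf --help on your own version, and expect that system-wide recording usually needs root.

```shell
# run CMD...: execute CMD, or only print it when DRY_RUN is set.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# perf_capture [SECONDS]: record system-wide for SECONDS (default 30).
perf_capture() {
    secs=${1:-30}
    # System-wide samples with call graphs, stored in perf-record.data.
    run perf record -a -g -o perf-record.data -- sleep "$secs"
    # Timechart trace; this writes perf.data in the current directory.
    run perf timechart record sleep "$secs"
    # Turn that perf.data into output.svg.
    run perf timechart
}
```

Start perf_capture 30 from an empty directory while the slowdown is happening, then attach perf-record.data and output.svg to the report. With DRY_RUN=1 set, it only prints the commands it would run.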

For example, perf top shows you what is currently happening in the kernel.

perf bench can help you benchmark your system, so you can test the effect of patches, kernel versions and tuning parameters.