very sub-optimal default readahead settings on device and unused readahead setting in LVM

Bug #129488 reported by James Troup
This bug affects 7 people
Affects:     lvm2 (Ubuntu)
Status:      Invalid
Importance:  Medium
Assigned to: Unassigned
Milestone:   (none)
Nominated for Gutsy by Lowell Alleman
Nominated for Hardy by Lowell Alleman
Nominated for Intrepid by Lowell Alleman
Nominated for Jaunty by Shaya Potter
Nominated for Karmic by Stefano Maioli

Bug Description

Binary package hint: lvm2

When you create an LV, the device ends up with a readahead of 256 512-byte sectors, while a normal (non-LVM) device's readahead appears to default to 8192 512-byte sectors. This makes filesystems on LVM benchmark (and perform) very badly: our read rate dropped from 320 MB/s to 90 MB/s.

To make matters worse, LVM has an internal 'read ahead sectors' variable which is apparently unused, but is still displayed by e.g. lvdisplay, which just adds to the confusion.

This is apparently a known issue upstream:

http://linux.msede.com/lvm_mlist/archive/2004/06/0108.html

Revision history for this message
Kees Cook (kees) wrote :

From irc, work-around after boot is:

  blockdev --setra 8192 /dev/vg/lv

We need to find the right place to fix this by default. Kernel patch needed?
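
The change can be verified with blockdev's corresponding read flag (a sketch, assuming the same device path as above; note the value does not survive a reboot, which is why a boot-time fix is needed):

  blockdev --getra /dev/vg/lv   # prints the current readahead in 512-byte sectors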

Changed in lvm2:
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
Kees Cook (kees) wrote :

And here's a udev hack...
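
(The attachment itself is not reproduced here. A minimal sketch of what such a udev rule might look like — the file path, match keys, and readahead value are assumptions for illustration, not Kees's actual attachment:

  # /etc/udev/rules.d/85-lvm-readahead.rules  (hypothetical path)
  # When a device-mapper block device appears, bump its readahead
  # to 8192 512-byte sectors.
  ACTION=="add|change", KERNEL=="dm-*", RUN+="/sbin/blockdev --setra 8192 /dev/%k"
)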

James Troup (elmo)
description: updated
Revision history for this message
Alasdair G. Kergon (agk2) wrote : Re: [Bug 129488] Re: insane default readahead settings on device and unused readahead setting in LVM

On Tue, Jul 31, 2007 at 09:06:14PM -0000, James Troup wrote:
> This is apparently a known issue upstream, but not considered a
> priority:

Actually patches for better readahead support are being worked on
by Zdenek Kabelac and are nearly ready.

Alasdair
--
<email address hidden>

Revision history for this message
Rex de Jong (rex-de-jong) wrote : Re: insane default readahead settings on device and unused readahead setting in LVM

At this time, the bug still persists in Gutsy. The difference on my setup: 70 MB/s vs. 250 MB/s. Is it possible to give an update on the expected solution date?

James Troup (elmo)
description: updated
Revision history for this message
Lowell Alleman (lowell-alleman) wrote :

This issue appears to still be present on a fresh install of Hardy (Ubuntu 8.04) as well.

I'm in the process of setting up a new server, and I applied the "gross hack" posted by Kees Cook. I had applied this to my desktop system a while back, but was hoping that it would have been fixed by now.

Just to be sure this hasn't been fixed in some other way, I ran some quick read performance tests with "hdparm -t /dev/vg/lv". I tried values from 256 (default) up to 16384. I confirmed that 8192 seems to give the best performance (57 MB/sec), and 256 gives the lowest performance (29 MB/sec).
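
For anyone who wants to repeat this kind of sweep, a minimal sketch (the volume path is an assumption, and hdparm -t is only a rough sequential-read benchmark; run as root):

  #!/bin/sh
  # Benchmark a logical volume at several readahead settings.
  LV=/dev/vg/lv                  # adjust to your logical volume
  for ra in 256 512 1024 2048 4096 8192 16384; do
      blockdev --setra "$ra" "$LV"
      echo "readahead = $ra sectors:"
      hdparm -t "$LV"
  done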

Anybody have an update on when this will get fixed officially?

Revision history for this message
Shaya Potter (spotter) wrote :

This bug still exists in Intrepid, and the fix mentioned above helps.

Revision history for this message
Ulrik Mikaelsson (rawler) wrote :

Seems to exist in Jaunty as well.

Revision history for this message
Jack Wasey (jackwasey) wrote :

N.B. readahead is only half of the problem.

My 4-disk RAID10 array reads at 2/3 of the speed of my two-disk RAID0 array for big sequential reads. This is complete madness, as there are twice as many disks to read from in the RAID10 case. Anyway, this is not strictly related to this bug, but the interaction of RAID, LVM, and the overlying filesystem has clearly never been thought through for desktop users (who often have lots of disks nowadays).

My eyes were watering after 15 minutes of looking at the lvm, mdadm, and ext3 man pages: stride, chunks, superblock offsets, mayhem. (A worked stride example follows below.)
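
For what it's worth, the stride arithmetic is mechanical once the chunk size is known. A sketch for a hypothetical 4-disk RAID10 with a 64 KiB chunk and 4 KiB filesystem blocks (all numbers assumed for illustration, not taken from this bug):

  # stride       = chunk size / block size     = 64 KiB / 4 KiB = 16
  # stripe-width = stride x data-bearing disks = 16 x 2         = 32
  #                (RAID10 on 4 disks stripes across 2 mirrored pairs)
  mkfs.ext3 -b 4096 -E stride=16,stripe-width=32 /dev/md0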

Revision history for this message
Jack Wasey (jackwasey) wrote :

  lvchange -r 8192 your-vg/your-lv

seems to be safe and permanent, but does the OS ignore this value (as suggested by the original poster)?
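
One way to check whether the kernel actually honours the value (a sketch; the volume path is an example):

  lvchange -r 8192 your-vg/your-lv                        # store the value in LVM metadata
  lvdisplay /dev/your-vg/your-lv | grep -i 'read ahead'   # what LVM reports
  blockdev --getra /dev/your-vg/your-lv                   # what the kernel actually uses

If the last two disagree, the metadata value is being ignored.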

Revision history for this message
guidol (monk-e) wrote : Re: [Bug 129488] Re: very sub-optimal default readahead settings on device and unused readahead setting in LVM

I'm not so sure it works in Ubuntu yet, but it's been working fine in Debian Testing for a while now.

Jack Wasey wrote:
> lvchange -r 8192 your-vg/your-lv
>
> seems to be safe and permanent, but does the OS ignore this value (as
> suggested by the original poster)?
>

Revision history for this message
Jack Wasey (jackwasey) wrote :

sudo blockdev --report

sudo blockdev --setra 16384 /dev/sd[abcd]

I've tried a few readahead amounts, and they all seem to show similar performance.
E.g., a RAID1 rebuild jumped from ~46 to ~52 MiB/s after the command.

Revision history for this message
Jack Wasey (jackwasey) wrote :

(I meant when increasing the readahead from 256 to a large number.)

Revision history for this message
Phillip Susi (psusi) wrote :

Normal disks also have a default 128 KB readahead, not 4 MB, and that seems to be quite sufficient. I don't see why you would want the dm device layered on top of the physical disks doing MORE readahead, let alone 4 MB worth. If the setting given to lvchange is not respected, though, that is a bug.
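
For reference, the sector-to-byte conversion behind those figures (readahead is counted in 512-byte sectors):

  256 sectors  x 512 bytes = 128 KiB   (the stock block-device default)
  8192 sectors x 512 bytes = 4 MiB     (the value proposed earlier in this bug)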

Revision history for this message
Thomas Danhorn (tdanhorn) wrote :

I suggest marking this as "fixed". I did a few simple tests with hdparm -t /dev/xxx, and while I am not convinced that this is necessarily the best, let alone the only relevant, measure of performance, I found no limitation in Karmic on a current 250 GB 7200 rpm laptop hard drive with the default setting of 256 sectors (as Phillip Susi indicated, this is the default for all block devices, both LVM and non-LVM). This agrees with Jack Wasey's observations.

I get ~90+ MB/s for readaheads from 64 to 8192 sectors, and I have no reason to believe that there is a setting within that range that would perform significantly differently from the others (on my system, with the hdparm test; there is some variability in repeated measurements even with the same settings). At 32 sectors there is a slight drop to ~85 MB/s, and settings of 2 and 8 sectors give ~27 and ~40 MB/s, respectively, which shows that the readahead settings do have an effect.

To put to rest the rumor that the lvchange command has no effect (which was true in 2004 and perhaps later): I tried both lvchange -r and blockdev --setra, and they have the same effect, i.e. basically no change in performance between 64 and 8192 (the highest I tried), and a drop below 32. (Using blockdev --setra on the physical device a logical volume resides on, e.g. /dev/sdxy, rather than on the logical volume itself, has no effect, by the way.)

Long story short, in my experience the default settings are fine, and the lvchange command works as expected if you feel the need to change them. With SSDs this may have even less impact.

Phillip Susi (psusi)
Changed in lvm2 (Ubuntu):
status: Confirmed → Invalid