Bug #329880 “dmraid & attempt to access beyond end of device” : Bugs : linux package : Ubuntu

TJ (tj) wrote on 2009-02-15:

#1

Interesting. This is something I reported and posted a patch to mainline for back in early 2006. I thought it was eventually sorted out.

TJ (tj) wrote on 2009-02-15:

#2

See my original report, bug #77734, "Disk Read Errors during boot-time caused by probe of invalid partitions".

Scott James Remnant (Canonical) (canonical-scott) wrote on 2009-02-16:

#3

udev doesn't do this when the device is claimed (in fact, I'm pretty sure the kernel doesn't even provide partition information), so something must have regressed in the kernel.

KBios (kbios) wrote on 2009-02-17:

#4

Same here, on an NVIDIA RAID 0 device.
This also seems to slow down system boot.

Changed in linux:
status:	New → Confirmed

TJ (tj) wrote on 2009-02-18:

#5

Please attach /var/log/dmesg

Changed in linux:
assignee:	nobody → intuitivenipple

KBios (kbios) wrote on 2009-02-18:

#6

dmesg.log (66.9 KiB, text/plain)

Here it is.
However, it seems to cover only the first part of the boot process. Is that right?

TJ (tj) wrote on 2009-02-18:

#7

Yes, that is what I wanted, thank you.

The issue is not related to udev. It lies in how the kernel scans block devices for partitions during start-up. I've confirmed that it has the same cause as the problem I reported in bug #77734.

Here are some significant log messages:

[ 6.136056] sda: sda1
[ 6.160201] sda: p1 exceeds device capacity

[ 6.160504] sdb: unknown partition table

[ 10.488869] attempt to access beyond end of device
[ 10.488874] sda: rw=0, want=1172134336, limit=586072368
[ 10.488876] Buffer I/O error on device sda1, logical block 1172134272
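To put numbers on those messages, here is a minimal Python sketch that pulls `want` and `limit` out of the second line and compares them. The function name `parse_eod` is made up for illustration, and the regular expression is only fitted to the line format shown in this log. Both values are assumed to be counts of 512-byte sectors.

```python
import re

# Matches the "dev: rw=..., want=..., limit=..." line as it appears in the log above.
EOD_LINE = re.compile(
    r"(?P<dev>\w+): rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)"
)

def parse_eod(line):
    """Return (device, want, limit) from a want/limit log line, or None."""
    m = EOD_LINE.search(line)
    if m is None:
        return None
    return m.group("dev"), int(m.group("want")), int(m.group("limit"))

line = "[ 10.488874] sda: rw=0, want=1172134336, limit=586072368"
dev, want, limit = parse_eod(line)

overshoot = want - limit        # sectors requested past the end of sda
shortfall = 2 * limit - want    # distance from twice sda's capacity
print(dev, overshoot, shortfall)
```

The request ends at almost exactly twice the capacity of sda, which is what one would expect if the partition table describes a volume spanning two disks of about this size.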

The story is this. Once a block device is found it is scanned for a partition table. If one is found it is reported partition by partition. A successful scan is reported thus (example from a non-dmraid disk):

[ 2.343465] sda: sda1 sda2 sda3 sda4

While reporting the partitions, the logic traverses them looking for extended partitions that contain logical partitions.

If the dmraid array is striped (RAID 0), the RAID volume starts on disk sda and ends on disk sdb. The primary partition table (MBR) in this case is at sector 0 of the logical RAID volume, which just happens to also be sector 0 of sda.
The partition table entries therefore contain offsets into the logical RAID volume, which is larger than sda.
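As a rough illustration of that mismatch, the Python sketch below decodes the four primary entries of a classic MBR and flags any whose end lies past the capacity of the disk they were read from. The helper names (`make_mbr`, `mbr_entries`, `oversized`) are hypothetical, and the sample partition (start sector 63, type 0x83, ending at the `want=` value from the log) is invented for the example, since the real partition table was not posted here.

```python
import struct

TABLE_OFFSET = 446   # byte offset of the partition table within sector 0
ENTRY_SIZE = 16      # four primary entries of 16 bytes each
ENTRY_FMT = "<B3sB3sII"  # boot flag, CHS start, type, CHS end, start LBA, sector count

def make_mbr(entries):
    """Build a synthetic 512-byte MBR from (type, start_lba, count) tuples."""
    table = b"".join(
        struct.pack(ENTRY_FMT, 0, b"\0\0\0", ptype, b"\0\0\0", start, count)
        for ptype, start, count in entries
    ).ljust(4 * ENTRY_SIZE, b"\0")
    return b"\0" * TABLE_OFFSET + table + b"\x55\xaa"

def mbr_entries(sector0):
    """Yield (number, type, start_lba, count) for each non-empty primary entry."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR signature")
    for i in range(4):
        raw = sector0[TABLE_OFFSET + i * ENTRY_SIZE:TABLE_OFFSET + (i + 1) * ENTRY_SIZE]
        _boot, _chs_start, ptype, _chs_end, start, count = struct.unpack(ENTRY_FMT, raw)
        if ptype != 0 and count != 0:
            yield i + 1, ptype, start, count

def oversized(sector0, capacity_sectors):
    """Return the numbers of partitions that end beyond the given capacity."""
    return [n for n, _t, start, count in mbr_entries(sector0)
            if start + count > capacity_sectors]

SDA_SECTORS = 586072368                          # limit= from the log
mbr = make_mbr([(0x83, 63, 1172134336 - 63)])    # hypothetical partition
print(oversized(mbr, SDA_SECTORS))               # judged against sda alone
print(oversized(mbr, 2 * SDA_SECTORS))           # judged against a two-disk volume
```

The same entry is out of range when measured against sda and in range when measured against a volume twice that size, which is the situation described above.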

The logic in fs/partitions/msdos.c::msdos_partition() tries to scan each primary and extended partition when the block device is being built.

In trying to seek to the end of sda1 the LBA sector number it is using is logical, not physical. In striped arrays it almost certainly points to a sector that is physically on sdb.
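The logical-versus-physical distinction can be sketched with a textbook RAID 0 mapping. This is only a model: the two-disk layout and the 128-sector chunk size are assumptions, the function name `raid0_locate` is hypothetical, and real fake-RAID metadata formats may add offsets or use a different stripe size.

```python
def raid0_locate(logical_sector, n_disks=2, chunk_sectors=128):
    """Map a logical sector of a striped volume to (disk index, physical sector).

    Chunks are dealt out to the member disks in round-robin order.
    """
    chunk_index, offset = divmod(logical_sector, chunk_sectors)
    stripe, disk = divmod(chunk_index, n_disks)
    return disk, stripe * chunk_sectors + offset

SDA_SECTORS = 586072368          # limit= from the log
last_wanted = 1172134336 - 1     # last sector of the failed request (want= minus 1)

disk, physical = raid0_locate(last_wanted)
print(disk, physical, physical < SDA_SECTORS)
```

Under these assumptions the sector that is far out of range when read as a physical sector of sda is a valid sector on the second member disk once it is treated as a logical address.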

dmraid differs from md (multiple disk) RAID arrays in that md works within the limits of the physical disks (usually built from groups of partitions). The md superblock (4KB) is stored at the end of a partition (usually in the last 64KB).

dmraid was designed to support the 'fake' RAID meta-data signatures of Promise FasTrak and the like which were originally implemented in controller BIOS and Windows device drivers.

I've marked this bug as a duplicate of bug #77734 and added the upstream bug reference to that bug, reopening it against upstream Linux.

TJ (tj) wrote on 2009-02-18:

#8

Please post a report to the upstream kernel bug report at:

http://bugzilla.kernel.org/show_bug.cgi?id=7912

TJ (tj) wrote on 2009-02-19:

#9

After some exchanges with Linus on the upstream bug I'm un-duplicating this from #77734. That bug can deal with the kernel versions that allowed the disk I/O to be attempted. It appears the current code was introduced by:

git describe --contains a168ee84c90b39ece357da127ab388f2f64db19c
v2.6.25-rc1~1160

where __generic_make_request() calls bio_check_eod() which will generate the warning in handle_bad_sector().
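For readers who don't have the kernel source to hand, the kind of end-of-device check involved can be modelled in a few lines of Python. This is a simplified sketch, not the kernel code: the function names are hypothetical, the message text is copied from the log in this report, and the 8-sector request size is an assumption chosen so that `want` matches the logged value.

```python
def beyond_end_of_device(start_sector, nr_sectors, device_sectors):
    """Return True if the request [start, start + nr) runs past the device."""
    if nr_sectors == 0 or device_sectors == 0:
        return False  # nothing to check
    # Written to avoid computing start + nr, which could overflow a fixed-width type.
    return device_sectors < nr_sectors or device_sectors - nr_sectors < start_sector

def warning_lines(dev, rw, start_sector, nr_sectors, device_sectors):
    """Format the two warning lines in the style seen in the log."""
    return [
        "attempt to access beyond end of device",
        f"{dev}: rw={rw}, want={start_sector + nr_sectors}, limit={device_sectors}",
    ]

SDA_SECTORS = 586072368
start, count = 1172134328, 8     # assumed request ending at want=1172134336

if beyond_end_of_device(start, count, SDA_SECTORS):
    for text in warning_lines("sda", 0, start, count, SDA_SECTORS):
        print(text)
```

A request that ends exactly at the last sector passes the check, while one that ends a single sector later is rejected and produces the warning.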

As this commit refactored the block code significantly I'm not sure at this point if there was similar protection in earlier code.

This bug can deal with limiting the warning messages.

Neil Brown (upstream) believes this may be fixed by commit ac0d86f5809598ddcd6bfa0ea8245ccc910e9eac:

git describe --contains ac0d86f5809598ddcd6bfa0ea8245ccc910e9eac
v2.6.28-rc1~347

and therefore will be included in Jaunty.

Neil says:

"It is certainly a problem that I have seen before. mdadm destroy all partitions in a device that it includes in an array to try to stop other code getting confused. Presumably dmraid doesn't. But from .28 it probably shouldn't need to."

If we can show that this patch resolves the issue, it might be a candidate for an SRU back-port.

Could one of the affected users do a test with Jaunty (kernel 2.6.28) and attach the resulting /var/log/kern.log to this report for us to examine?

Jeremy Foshee (jeremyfoshee) on 2011-01-12

Changed in linux (Ubuntu):
assignee:	TJ (intuitivenipple) → nobody

Jean-Louis Dupond (dupondje) wrote on 2011-07-26:

#10

No issues with this for a long time now :)
This can be closed.

Changed in linux (Ubuntu):
status:	Confirmed → Fix Released
