Ubuntu
linux-source-2.6.20 package

Activity log for bug #77734

Date	Who	What changed	Old value	New value	Message
2007-01-02 21:37:47	TJ	bug			added bug
2007-01-03 20:00:56	TJ	description	I appear to have stumbled upon a bug in the kernel that can, in certain circumstances, both cause the kernel-boot to get stuck in an endless loop, and possibly damage the IDE drives over time (based on experience). Using Edgy Eft Desktop Live CD, preparing to install to an existing Windows system. This probably occurs during an installed system-boot too, but I've not got that far as yet. Scenario: PC with a Promise FastTrak TX2000 SoftRAID controller and 4x 60GB IDE parallel ATA drives configured as RAID 10 (Mirror + Stripe) to provide one logical 120GB drive. The PC already has Windows 2003 Server installed and booting from the RAID 10, with 2 NTFS partitions. I wanted to shrink the 2nd partition to make room to install Ubuntu 6.10 from the Live CD. See my Ubuntu forums article for a detailed explanation of my experience: http://www.ubuntuforums.org/showthread.php?p=1958918 Bug: When booting Edgy from the CD the kernel loads the Promise fasttrak controller module "pdc202xx" and then probes each of the connected IDE hard drives (for a partition table?) dmraid not being loaded so it's not dealing with the logical drive. Large drives use LBA addressing to overcome the CHS limitations of partition tables. If the probe finds a partition table on any drive, it then tries to seek to the starting sector of each partition (presumably to read its boot-sector system-id byte?), and also tries to seek into the last few sectors of the partition (looking for a superblock?). On a RAID 0 array where the striping causes the partition table to represent a larger logical drive, the starting and ending sector numbers of some partitions are beyond the end of the physical drive the partition table is written on. This causes the Disk Read Errors reported here. The fix would be for the probe to compare the physical number of cylinders reported by the drive (as seen by e.g. 
fdisk /dev/hde or fdisk /dev/hdg) to the starting/ending sector numbers for the LBA device. If the entries in the partition are beyond the end of the physical disk the probe should handle the situation gracefully (This could potentially be used as a cue to auto-loading dmraid). Once dmraid is loaded "fdisk /dev/mapper/raidarrayname" shows the correct total number of logical sectors. -------- Short extract of repetitive disk errors - usually there are hundreds or thousands ------ PDC202XX: Primary channel reset. ide2: reset: success hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error } hde: task_in_intr: error=0x04 { DriveStatusError } ide: failed opcode was: unknown end_request: I/O error, dev hde, sector 238276076 printk: 8 messages suppressed. Buffer I/O error on device hde2, logical block 47279294 hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error } hde: task_in_intr: error=0x04 { DriveStatusError } ide: failed opcode was: unknown hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }	I appear to have stumbled upon a bug in the kernel that can, in certain circumstances, both cause the kernel-boot to get stuck in an endless loop, and possibly damage the IDE drives over time (based on experience). Using Edgy Eft Desktop Live CD, preparing to install to an existing Windows system. This probably occurs during an installed system-boot too, but I've not got that far as yet. Scenario: PC with a Promise FastTrak TX2000 SoftRAID controller and 4x 60GB IDE parallel ATA drives configured as RAID 1+0 (Mirror + Stripe) to provide one logical 120GB drive. The PC already has Windows 2003 Server installed and booting from the RAID 1+0, with 2 NTFS partitions. I wanted to shrink the 2nd partition to make room to install Ubuntu 6.10 from the Live CD. 
See my Ubuntu forums article for a detailed explanation of my experience: http://www.ubuntuforums.org/showthread.php?p=1958918 Bug: When booting Edgy from the CD the kernel loads the Promise fasttrak controller module "pdc202xx" and then probes each of the connected IDE hard drives (for a partition table?) dmraid not being loaded so it's not dealing with the logical drive. The RAID 1+0 120GB logical drive consists of hde+hdf mirrored to hdg+hdh, with the partition table on hde and hdg. Large drives use LBA addressing to overcome the CHS limitations of partition tables. If the probe finds a partition table on any drive, it then tries to seek to the starting sector of each partition (presumably to read its boot-sector system-id byte?), and also tries to seek into the last few sectors of the partition (looking for a superblock?). On a RAID 0 array where the striping causes the partition table to represent a larger logical drive, the starting and ending sector numbers of some partitions are beyond the end of the physical drive the partition table is written on. This causes the Disk Read Errors reported here. The fix would be for the probe to compare the physical number of cylinders reported by the drive (as seen by e.g. fdisk /dev/hde or fdisk /dev/hdg) to the starting/ending sector numbers for the LBA device. If the entries in the partition are beyond the end of the physical disk the probe should handle the situation gracefully (This could potentially be used as a cue to auto-loading dmraid). Once dmraid is loaded "fdisk /dev/mapper/raidarrayname" shows the correct total number of logical sectors. -------- Short extract of repetitive disk errors - usually there are hundreds or thousands ------ PDC202XX: Primary channel reset. 
ide2: reset: success hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error } hde: task_in_intr: error=0x04 { DriveStatusError } ide: failed opcode was: unknown end_request: I/O error, dev hde, sector 238276076 printk: 8 messages suppressed. Buffer I/O error on device hde2, logical block 47279294 hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error } hde: task_in_intr: error=0x04 { DriveStatusError } ide: failed opcode was: unknown hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
2007-01-25 17:41:43	TJ	None: statusexplanation		Assigned to more appropriate package
2007-01-31 17:47:40	TJ	bug			assigned to linux (upstream)
2007-01-31 17:52:46	TJ	title	Disk Read Errors during boot-time probe of physical softRAID drives	Disk Read Errors during boot-time caused by probe of invalid partitions
2007-01-31 20:24:58	TJ	bug			added attachment 'msdos.c.tj.patch' (Patch for fs/partitions/msdos.c)
2007-01-31 20:29:29	TJ	bug			assigned to linux-source-2.6.17 (Debian)
2007-01-31 21:36:37	TJ	bug			added attachment 'msdos.c.tj.2.patch' (Updated patch for fs/partitions/msdos.c)
2007-01-31 21:38:37	TJ	linux-source-2.6.17: status	Unconfirmed	In Progress
2007-01-31 21:38:37	TJ	linux-source-2.6.17: assignee		intuitive-nipple
2007-01-31 21:38:37	TJ	linux-source-2.6.17: statusexplanation	Assigned to more appropriate package	Updated status to "In Progress" to reflect the availability of a universal patch for testing. Needs to be tested in systems that don't have this issue to ensure it doesn't cause any regressions.
2007-01-31 23:04:13	TJ	linux: status	Unconfirmed	In Progress
2007-02-01 02:00:13	TJ	bug			added attachment 'msdos.c.tj.7.patch' (Patch revision 3)
2007-03-26 16:21:10	Tormod Volden	linux-source-2.6.17: statusexplanation	Updated status to "In Progress" to reflect the availability of a universal patch for testing. Needs to be tested in systems that don't have this issue to ensure it doesn't cause any regressions.
2007-07-25 20:05:21	TJ	linux-source-2.6.20: status	In Progress	Fix Released
2007-07-25 20:06:02	TJ	linux: status	In Progress	Fix Released
2009-02-15 23:25:40	TJ	linux-source-2.6.20: status	Fix Released	Confirmed
2009-02-15 23:25:40	TJ	linux-source-2.6.20: assignee	intuitivenipple
2009-02-18 19:35:01	TJ	linux: status	Fix Released	Unknown
2009-02-18 19:35:01	TJ	linux: importance	Undecided	Unknown
2009-02-18 19:35:01	TJ	linux: statusexplanation	Fix applied to Andrew Morton's -mm tree in January 2007
2009-02-18 19:36:10	Bug Watch Updater	linux: status	Unknown	Confirmed
2009-02-18 19:37:45	TJ	bug			assigned to linux (Ubuntu)
2009-02-18 19:48:10	TJ	linux: status	New	Confirmed
2009-02-18 19:48:10	TJ	linux: assignee		intuitivenipple
2009-02-18 19:48:10	TJ	linux: statusexplanation		Confirmed as still affecting Jaunty by report in bug #329880. It appears Linus Torvalds rejected my patch when it was pushed from Andrew Morton's -mm tree to mainline in May 2007: ----------------------------- From: akpm@linux-foundation.org To: linux@tjworld.net, mm-commits@vger.kernel.org Subject: - filesystem-disk-errors-at-boot-time-caused-by-probe.patch removed from -mm tree Date: Tue, 08 May 2007 19:34:23 -0700 (Wed, 03:34 BST) The patch titled filesystem: Disk Errors at boot-time caused by probe of partitions has been removed from the -mm tree. Its filename was filesystem-disk-errors-at-boot-time-caused-by-probe.patch This patch was dropped because it was nacked ----------------------------- From: Linus Torvalds <torvalds@linux-foundation.org> To: akpm@linux-foundation.org Cc: linux@tjworld.net, bunk@stusta.de, Jens Axboe <jens.axboe@oracle.com> Subject: Re: [patch 012/455] filesystem: Disk Errors at boot-time caused by probe of partitions Date: Tue, 8 May 2007 09:19:32 -0700 (PDT) (17:19 BST) On Tue, 8 May 2007, akpm@linux-foundation.org wrote: > > From: TJ <linux@tjworld.net> I don't really like these kinds of addresses. Who is TJ? When I google for that name, I find a lot of hits, but all the links to tjworld.net are down. I also think the patch is wrong. IIRC, we cannot trust the "capacity" data, because not all disks report it correctly. If we did, we'd just do the check in read_dev_sector() instead. So I'm dropping this. I might be wrong about the capacity thing, we may have fixed it (Jens cc'd). But if the capacity is trustworthy, why not just do the trivial check in read_dev_sector to protect against invalid extended ones? And in add_partitions()? Linus -----------------------------
2009-03-28 10:11:50	Chucky Ellison	bug			added attachment 'dmesg.txt' (dmesg.txt)
2009-03-29 00:40:35	Chucky Ellison	bug			added attachment 'dmesg.2.6.29.txt' (dmesg.2.6.29.txt)
2009-03-29 21:35:48	Chucky Ellison	bug			added attachment 'proc.partitions.2.6.29.txt' (proc.partitions.2.6.29.txt)
2009-03-29 21:37:24	Chucky Ellison	bug			added attachment 'fdisk-l.2.6.29.txt' (fdisk-l.2.6.29.txt)
2009-04-28 23:54:26	Leann Ogasawara	linux-source-2.6.20 (Ubuntu): status	Confirmed	Won't Fix
2009-07-10 19:38:18	kernel-janitor	tags	dmraid	dmraid kj-comment
2011-01-12 21:29:02	Jeremy Foshee	linux (Ubuntu): assignee	TJ (intuitivenipple)
2011-01-19 10:32:17	Andy Whitcroft	linux-source-2.6.17 (Debian): status	New	Fix Released
2011-01-19 10:33:05	Andy Whitcroft	linux (Ubuntu): status	Confirmed	Fix Released
2011-02-03 17:20:39	Bug Watch Updater	linux: importance	Unknown	High
2011-09-16 16:40:48	Steve Conklin	linux: importance	High	Undecided
2011-09-16 16:40:48	Steve Conklin	linux: status	Confirmed	New
2011-09-16 16:40:48	Steve Conklin	linux: remote watch	Linux Kernel Bug Tracker #7912
2011-09-16 16:40:59	Steve Conklin	linux: status	New	Fix Released
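
Illustrative note on the check proposed in the 2007-01-03 description: the idea is to compare each partition table entry's start and end sectors with the capacity of the physical drive before probing it. The user-space Python sketch below is only an approximation of that idea, not the attached msdos.c patch. The function names (parse_mbr, fits_on_disk, partitions_safe_to_probe, make_mbr), the 117,187,500-sector capacity (60 GB at an assumed 512 bytes per sector), and the sample partition layout are all hypothetical and chosen for illustration.

```python
import struct
from typing import List, NamedTuple, Tuple

SECTOR_SIZE = 512               # assumed sector size
PARTITION_TABLE_OFFSET = 446    # classic MBR: four 16-byte entries start here
ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"
ENTRY_FORMAT = "<B3sB3sII"      # status, CHS start, type, CHS end, LBA start, sector count


class Partition(NamedTuple):
    index: int        # 1-based primary partition number
    type_id: int      # system-id byte
    start_lba: int    # first sector of the partition
    num_sectors: int  # length in sectors


def parse_mbr(sector0: bytes) -> List[Partition]:
    """Return the non-empty primary partition entries found in an MBR sector."""
    if len(sector0) < SECTOR_SIZE or sector0[510:512] != BOOT_SIGNATURE:
        raise ValueError("not a valid MBR sector")
    parts = []
    for i in range(4):
        offset = PARTITION_TABLE_OFFSET + i * ENTRY_SIZE
        _status, _chs_start, type_id, _chs_end, start, count = struct.unpack_from(
            ENTRY_FORMAT, sector0, offset
        )
        if type_id != 0 and count != 0:
            parts.append(Partition(i + 1, type_id, start, count))
    return parts


def fits_on_disk(part: Partition, capacity_sectors: int) -> bool:
    """True if the whole partition lies inside the physical drive."""
    return part.start_lba + part.num_sectors <= capacity_sectors


def partitions_safe_to_probe(sector0: bytes, capacity_sectors: int) -> List[Partition]:
    """Keep only entries that a probe could read without seeking past the drive."""
    return [p for p in parse_mbr(sector0) if fits_on_disk(p, capacity_sectors)]


def make_mbr(entries: List[Tuple[int, int, int]]) -> bytes:
    """Build a synthetic MBR from (type_id, start_lba, num_sectors) tuples (demo helper)."""
    sector = bytearray(SECTOR_SIZE)
    for i, (type_id, start, count) in enumerate(entries):
        struct.pack_into(
            ENTRY_FORMAT, sector, PARTITION_TABLE_OFFSET + i * ENTRY_SIZE,
            0, b"\x00\x00\x00", type_id, b"\x00\x00\x00", start, count,
        )
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    # Hypothetical layout: a table describing a striped logical drive,
    # read from one member disk that is only half as large.
    member_capacity = 117_187_500
    mbr = make_mbr([(0x07, 63, 100_000_000), (0x07, 100_000_063, 140_000_000)])
    for part in parse_mbr(mbr):
        verdict = "probe" if fits_on_disk(part, member_capacity) else "skip (beyond end of drive)"
        print(part.index, verdict)
```

Under the assumed capacity, the failing sector 238276076 from the error extract would also lie past the end of a single member drive. The sketch presumes the reported capacity can be trusted, which is the point questioned in the upstream reply quoted in the 2009-02-18 entry.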
