Disk Read Errors during boot-time caused by probe of invalid partitions

Bug #77734 reported by TJ
This bug affects 2 people
Affects                        Status        Importance  Assigned to
Linux                          Fix Released  Undecided   Unassigned
linux (Ubuntu)                 Fix Released  Undecided   Unassigned
linux-source-2.6.17 (Debian)   Fix Released  Undecided   Unassigned
linux-source-2.6.20 (Ubuntu)   Won't Fix    Undecided   Unassigned

Bug Description

I appear to have stumbled upon a bug in the kernel that can, in certain circumstances, both cause the kernel boot to get stuck in an endless loop and possibly damage the IDE drives over time (based on experience).

Using the Edgy Eft Desktop Live CD, preparing to install alongside an existing Windows system. This probably occurs during an installed system's boot too, but I've not got that far yet.

Scenario:

PC with a Promise FastTrak TX2000 SoftRAID controller and 4x 60GB IDE parallel ATA drives configured as RAID 1+0 (Mirror + Stripe) to provide one logical 120GB drive.
The PC already has Windows 2003 Server installed and booting from the RAID 1+0, with 2 NTFS partitions.

I wanted to shrink the 2nd partition to make room to install Ubuntu 6.10 from the Live CD.

See my Ubuntu forums article for a detailed explanation of my experience:

http://www.ubuntuforums.org/showthread.php?p=1958918

Bug:

When booting Edgy from the CD, the kernel loads the Promise FastTrak controller module "pdc202xx" and then probes each of the connected IDE hard drives (for a partition table?). dmraid is not loaded at this point, so it isn't dealing with the logical drive.

The RAID 1+0 120GB logical drive consists of hde+hdf mirrored to hdg+hdh, with the partition table on hde and hdg.

Large drives use LBA addressing to overcome the CHS limitations of partition tables.

If the probe finds a partition table on any drive, it then tries to seek to the starting sector of each partition (presumably to read its boot-sector system-id byte?), and also tries to seek into the last few sectors of the partition (looking for a superblock?).

On a RAID 0 array where the striping causes the partition table to represent a larger logical drive, the starting and ending sector numbers of some partitions are beyond the end of the physical drive the partition table is written on.

This causes the Disk Read Errors reported here.

The fix would be for the probe to compare the physical number of cylinders reported by the drive (as seen by e.g. fdisk /dev/hde or fdisk /dev/hdg) to the starting/ending sector numbers for the LBA device.

If the entries in the partition table are beyond the end of the physical disk, the probe should handle the situation gracefully (this could potentially be used as a cue to auto-load dmraid).

Once dmraid is loaded "fdisk /dev/mapper/raidarrayname" shows the correct total number of logical sectors.

-------- Short extract of repetitive disk errors - usually there are hundred or thousands ------
PDC202XX: Primary channel reset.
ide2: reset: success
hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
hde: task_in_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: unknown
end_request: I/O error, dev hde, sector 238276076
printk: 8 messages suppressed.
Buffer I/O error on device hde2, logical block 47279294
hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
hde: task_in_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: unknown
hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }

TJ (tj)
description: updated
Revision history for this message
TJ (tj) wrote :

I've been slowly working to isolate the source code at the root of this error. The biggest problem I faced was that the sheer number of errors reported at boot time swamped the kernel's log buffer (128KB by default), so I had no information about what was happening in the lead-up to them.

Yesterday I built a new kernel with the kernel log buffer size increased from 128KB to 1MB.

This is with Edgy Eft versions 2.6.17-10-generic and 2.6.17.14.

Using make menuconfig I changed the Kernel Hacking -> Kernel Log Buffer Size option, altering the log-buffer-shift parameter from 17 to 20. This is a bit-shift value, so the buffer size is 2^X bytes.

The entry in the kernel configuration file .config is CONFIG_LOG_BUF_SHIFT.

Now finally I have the kernel messages leading up to bug:

[17179576.612000] AMD7441: IDE controller at PCI slot 0000:00:07.1
[17179576.612000] AMD7441: chipset revision 4
[17179576.612000] AMD7441: not 100% native mode: will probe irqs later
[17179576.612000] AMD7441: 0000:00:07.1 (rev 04) UDMA100 controller
[17179576.612000] ide0: BM-DMA at 0xd800-0xd807, BIOS settings: hda:DMA, hdb:DMA
[17179576.612000] ide1: BM-DMA at 0xd808-0xd80f, BIOS settings: hdc:DMA, hdd:DMA
[17179576.612000] Probing IDE interface ide0...
[17179576.900000] hda: Maxtor 6Y120L0, ATA DISK drive
[17179577.180000] hdb: Maxtor 6Y060L0, ATA DISK drive
[17179577.236000] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
[17179577.236000] Probing IDE interface ide1...
[17179577.972000] hdc: PIONEER DVD-RW DVR-109, ATAPI CD/DVD-ROM drive
[17179578.756000] hdd: PIONEER DVD-RW DVR-103, ATAPI CD/DVD-ROM drive
[17179578.812000] ide1 at 0x170-0x177,0x376 on irq 15
[17179578.824000] hda: max request size: 128KiB
[17179578.832000] hda: 240121728 sectors (122942 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(100)
[17179578.836000] hda: cache flushes supported
[17179578.836000] hda: hda1 hda2 hda3 < hda5 hda6 hda7 hda8 >
[17179578.884000] hdb: max request size: 128KiB
[17179578.884000] hdb: 120103200 sectors (61492 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(100)
[17179578.884000] hdb: cache flushes supported
[17179578.884000] hdb: hdb1
[17179578.888000] hdc: ATAPI 40X DVD-ROM DVD-R CD-R/RW drive, 2000kB Cache, UDMA(66)
[17179578.888000] Uniform CD-ROM driver Revision: 3.20
[17179578.912000] hdd: ATAPI 24X DVD-ROM DVD-R CD-R/RW drive, 2000kB Cache, DMA
[17179579.280000] PDC20271: IDE controller at PCI slot 0000:00:08.0
[17179579.280000] ACPI: PCI Interrupt 0000:00:08.0[A] -> GSI 20 (level, low) -> IRQ 169
[17179579.280000] PDC20271: chipset revision 2
[17179579.280000] PDC20271: ROM enabled at 0x88000000
[17179579.280000] PDC20271: 100% native mode on irq 169
[17179579.280000] ide2: BM-DMA at 0xb000-0xb007, BIOS settings: hde:pio, hdf:pio
[17179579.280000] ide3: BM-DMA at 0xb008-0xb00f, BIOS settings: hdg:pio, hdh:pio
[17179579.280000] Probing IDE interface ide2...
[17179579.572000] hde: Maxtor 6Y060L0, ATA DISK drive
[17179579.852000] hdf: Maxtor 6Y060L0, ATA DISK drive
[17179579.908000] ide2 at 0xd400-0xd407,0xd002 on irq 169
[17179579.908000] hde: max request size: 128KiB
[17179579.924000] hde: 120103200 sectors (61492 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(133)
[17179579.924...


Revision history for this message
TJ (tj) wrote :

The error reports continue for 20-30 seconds, the drive heads banging like mad the whole time, to the extent that even after a cold boot the drives sometimes won't initialize.

Note the difference in the kernel time of these last entries compared to where the errors started being reported in the previous post.

[17179603.416000] hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[17179603.416000] hde: task_in_intr: error=0x04 { DriveStatusError }
[17179603.416000] ide: failed opcode was: unknown
[17179603.416000] PDC202XX: Primary channel reset.
[17179603.464000] ide2: reset: success
[17179603.524000] hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[17179603.524000] hde: task_in_intr: error=0x04 { DriveStatusError }
[17179603.524000] ide: failed opcode was: unknown
[17179603.524000] end_request: I/O error, dev hde, sector 135203040
[17179603.612000] usbcore: registered new driver usbfs
[17179603.612000] usbcore: registered new driver hub
[17179603.612000] ohci_hcd: 2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
[17179603.612000] ACPI: PCI Interrupt 0000:03:00.0[D] -> GSI 19 (level, low) -> IRQ 177
[17179603.612000] ohci_hcd 0000:03:00.0: OHCI Host Controller
[17179603.612000] ohci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 1
[17179603.628000] ohci_hcd 0000:03:00.0: irq 177, io mem 0xd6800000
[17179603.628000] ieee1394: Initialized config rom entry `ip1394'
[17179603.684000] usb usb1: configuration #1 chosen from 1 choice
[17179603.684000] hub 1-0:1.0: USB hub found
[17179603.684000] hub 1-0:1.0: 4 ports detected

Revision history for this message
TJ (tj) wrote :

Assigned to more appropriate package

Revision history for this message
TJ (tj) wrote :

Narrowing down the scope of the bug, the following two lines from the kernel log confirm the reason:

[17179581.096000] end_request: I/O error, dev hde, sector 135202804
[17179581.096000] Buffer I/O error on device hde2, logical block 21510976

Bearing in mind the partition table on hde is actually the partition table for the stripe-set hde+hdf, some entries in it will likely have sector numbers that are larger than the number of sectors on the physical drive hde, as can be seen from the partition table of the RAID device after dmraid has loaded:

# fdisk -u -l /dev/mapper/pdc_biieicaii

Disk /dev/mapper/pdc_biieicaii: 121.9 GB, 121999982592 bytes
255 heads, 63 sectors/track, 14832 cylinders, total 238281216 sectors
Units = sectors of 1 * 512 = 512 bytes

                     Device Boot      Start        End     Blocks  Id System
/dev/mapper/pdc_biieicaii1    *          63   49158899  24579418+  7  HPFS/NTFS
/dev/mapper/pdc_biieicaii2        49158900  135203039   43022070   7  HPFS/NTFS
/dev/mapper/pdc_biieicaii3       135203040  238276079   51536520   5  Extended
/dev/mapper/pdc_biieicaii5       135203103  141082829   2939863+  82  Linux swap / Solaris
/dev/mapper/pdc_biieicaii6       141082893  142094924     506016  83  Linux
/dev/mapper/pdc_biieicaii7       142094988  181181069   19543041  83  Linux
/dev/mapper/pdc_biieicaii8       181181133  238276079   28547473+ 83  Linux

So the logical sector 21,510,976 in hde2 is being sought, which is physical sector 135,202,804, but hde only has 120,103,200 sectors.

The report of the logical sector number in 'hde2' suggests the issue is caused by an fsck, although it doesn't explain why so many thousands of errors are being generated.

There aren't an overwhelming number of seek errors, as shown by doing:

# grep 'end_request: I/O error, dev hde' dmesg.diskerrors.txt | sed 's/\(\[[0-9\.]*\]\) \([a-z]*\)/\2/' | sort | uniq

end_request: I/O error, dev hde, sector 135202804
end_request: I/O error, dev hde, sector 135202808
end_request: I/O error, dev hde, sector 135202972
end_request: I/O error, dev hde, sector 135202976
end_request: I/O error, dev hde, sector 135203028
end_request: I/O error, dev hde, sector 135203032
end_request: I/O error, dev hde, sector 135203036
end_request: I/O error, dev hde, sector 135203040

So, all but sector 135,203,040 are part of the NTFS file-system in partition 2. That sector is the first sector of partition 3, which is an extended partition-table entry.

Revision history for this message
TJ (tj) wrote :

With the 1MB kernel log buffer still configured, I edited drivers/ide/ide-disk.c, adding some debug code that would do a stack trace when one of the 8 sectors listed above was requested:

/* required for call-tracing */
#include <linux/sched.h>

....

static ide_startstop_t __ide_do_rw_disk(
...
 if (drive->select.b.lba) {
  if (lba48) {
...
  } else {
/* report when suspect sector numbers are addressed */
 unsigned int bad_index;
 if ( strcmp(drive->name,"hde") == 0) { /* is it one of the affected drives? */
  sector_t bad_address[8]; /* make it easy to compare the known sector addresses */
  bad_address[0] = 135202804ULL;
  bad_address[1] = 135202808ULL;
  bad_address[2] = 135202972ULL;
  bad_address[3] = 135202976ULL;
  bad_address[4] = 135203028ULL;
  bad_address[5] = 135203032ULL;
  bad_address[6] = 135203036ULL;
  bad_address[7] = 135203040ULL;
  /* simple loop to compare current sector request with known bad sector addresses */
  for (bad_index = 0; bad_index < 8; bad_index++) {
   if (block == bad_address[bad_index]) { /* force a stack-trace */
    printk("IDE: %s attempting to read sector address %lld\n",drive->name, bad_address[bad_index]);
    printk("Trace: Trying to force a stack trace\n");
    dump_stack();
    break;
   }
  }
 }
/* end of call-trace */
...

After starting the debugging kernel, the log typically shows:

IDE: hde attempting to read sector address 135203032
Trace: Trying to force a stack trace
<f883df91> ide_do_rw_disk+0x681/0x6b0 [ide_disk] <c02525b1> ide_do_request+0x6a1/0x8a0
<c011b22d> find_busiest_group+0xcd/0x2f0 <c012bb70> lock_timer_base+0x20/0x50
<c0252abe> ide_intr+0x1de/0x1f0 <c012b819> do_timer+0x39/0x360
<c010f142> mark_offset_pmtmr+0x22/0xee0 <c0149323> handle_IRQ_event+0x33/0x60
<c01493ed> __do_IRQ+0x9d/0x110 <c0105c89> do_IRQ+0x19/0x30
<c010408a> common_interrupt+0x1a/0x20 <c0102080> default_idle+0x0/0x60
<c01020aa> default_idle+0x2a/0x60 <c0102122> cpu_idle+0x42/0xb0
<c03f07a1> start_kernel+0x321/0x3a0 <c03f0210> unknown_bootoption+0x0/0x270
hde: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
hde: task_in_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: unknown
end_request: I/O error, dev hde, sector 135203032

Revision history for this message
TJ (tj) wrote :

I've slightly modified the trace code to try to report the name of the callback function in the block I/O request (struct request.end_io), but unfortunately it's null.

Any suggestions on how to find out what process/function put the job in the queue, hopefully without logging *every* job added, would be useful.

The trace shows:

Trace: hde attempting to read sector address 135202804
Trace: Call back function address end_io 00000000
Trace: Call back function name 0x0<7>Trace: Trying to force a stack trace
 [<c010482d>] show_trace+0xd/0x10
 [<c0104ed7>] dump_stack+0x17/0x20
 [<f882cab4>] ide_do_rw_disk+0x1c4/0x690 [ide_disk]
 [<c02504a7>] ide_do_request+0x6c7/0x8c0
 [<c02509b9>] do_ide_request+0x19/0x20
 [<c01ddcbc>] cfq_kick_queue+0x5c/0xc0
 [<c0132652>] run_workqueue+0x72/0xf0
 [<c0133288>] worker_thread+0x118/0x140
 [<c0135fd7>] kthread+0xa7/0xd0
 [<c0101005>] kernel_thread_helper+0x5/0x10
hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hde: dma_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: unknown

and the code in drivers/ide/ide-disk.c now reads:

/* report when suspect sector numbers are addressed */
 unsigned int bad_index;
 if ( strcmp(drive->name,"hde") == 0) { /* is it one of the affected drives? */
  /* make it easy to compare the known sector addresses */
  sector_t bad_address[8] = {135202804ULL, 135202808ULL, 135202972ULL, 135202976ULL,
                             135203028ULL, 135203032ULL, 135203036ULL, 135203040ULL };

  /* simple loop to compare current sector request with known bad sector addresses */
  for (bad_index = 0; bad_index < 8; bad_index++) {
   if (block == bad_address[bad_index]) { /* force a stack-trace */
    if (printk_ratelimit()) {
     printk(KERN_DEBUG "Trace: %s attempting to read sector address %lld\n",drive->name, bad_address[bad_index]);
     printk(KERN_DEBUG "Trace: Call back function address end_io %08lx\n",(unsigned long) rq->end_io);
     print_symbol("Trace: Call back function name %s\n", (unsigned long) rq->end_io);
     printk(KERN_DEBUG "Trace: Trying to force a stack trace\n");
     dump_stack();
     break;
    }
   }
  }
 }
/* end of call-trace */

Revision history for this message
TJ (tj) wrote :

Managed to get a more complete trace on the very first disk access *after* DMA was disabled by the IDE probing. The trace routine added to submit_bio() didn't trigger (the disk device name didn't match what I was expecting), but that function is on the call stack.

I suspect this is because after DMA was disabled the device-unplug process was triggered.

 __ide_do_rw_disk: hde attempting to read sector address 135203040
 __ide_do_rw_disk: rq->rq_disk->devfs_name=ide/host2/bus0/target0/lun0
 __ide_do_rw_disk: Trying to force a stack trace
  [<c010482d>] show_trace+0xd/0x10
  [<c0104ed7>] dump_stack+0x17/0x20
  [<f882caa1>] ide_do_rw_disk+0x1b1/0x680 [ide_disk]
  [<c02505a7>] ide_do_request+0x6c7/0x8c0
  [<c0250ab9>] do_ide_request+0x19/0x20
  [<c01d3f40>] __generic_unplug_device+0x20/0x30
  [<c01dd29d>] cfq_start_queueing+0x1d/0x30
  [<c01dd66c>] cfq_insert_request+0x3bc/0x560
  [<c01d124b>] elv_insert+0xfb/0x170
  [<c01d131b>] __elv_add_request+0x5b/0xb0
  [<c01d544c>] __make_request+0xdc/0x3b0
  [<c01d3036>] generic_make_request+0x156/0x210
  [<c01d5038>] submit_bio+0x158/0x1f0
  [<c016bbbc>] submit_bh+0xcc/0x130
  [<c016ebe8>] block_read_full_page+0x2b8/0x320
  [<c0171c2f>] blkdev_readpage+0xf/0x20
  [<c0152a80>] __do_page_cache_readahead+0x190/0x240
  [<c0152b92>] blockable_page_cache_readahead+0x62/0xc0
  [<c0152ddf>] page_cache_readahead+0x12f/0x1f0
  [<c014c5b3>] do_generic_mapping_read+0x4e3/0x530
  [<c014cf77>] __generic_file_aio_read+0xe7/0x240
  [<c014e37e>] generic_file_read+0x8e/0xb0
  [<c016a6df>] vfs_read+0xaf/0x180
  [<c016ac3d>] sys_read+0x3d/0x70
  [<c0103087>] syscall_call+0x7/0xb
 hde: dma_intr: status=0x51 { DriveReady SeekComplete Error }
 hde: dma_intr: error=0x04 { DriveStatusError }
 ide: failed opcode was: unknown

Revision history for this message
TJ (tj) wrote :

I've finally pinned down the cause of this bug and created a kernel-patch that will be posted to the kernel mailing list later.

I'll attach the patch file to this bug report in a separate comment.

The bug occurs because no bounds checking is done in the partition-table checking routines that look for PC-style partitions (linux, bsd, solaris, msdos, extended, etc.)

In fs/partitions/msdos.c, in the function msdos_partition(), the raw partition-table sector values are read and block_devices are created based on them, but no bounds (sanity) checking is done to ensure the sector numbers are within the limits of the disk.

As a result, the subsequent attempts by various parts of the kernel to scan for file-systems cause the repeated errors shown in this bug report.

This will only occur if the partition table contains sector numbers that are invalid, or is part of a logical RAID stripe such as that managed by the dmraid module.

Because the block_devices underlying the RAID device are still exposed to the kernel, and because they contain 'valid' partition table structures, they get scanned twice during boot, and again when the Display/Window manager loads.

Each period of errors usually lasts about 10 seconds-per-disk, and is repeated 3 times during start-up. With a mirror+stripe RAID 1+0 array this is doubled.
The banging of the disk heads against the end of the disk sounds scary and does, on occasion, *cause physical problems* that result in the disks failing to initialise until they have been left switched off a while.

My patch creates a new function in fs/partitions/msdos.c "check_sane_values()" and calls that function from within msdos_partition() *before* it enters the partition-table build loop.

If insane values are found the function prints a detailed report of each error to the kernel log (printk() to dmesg) and returns an error which results in the partition table scan being aborted and msdos_partition() reporting "unable to read partition table".

The new code in msdos_partition() is:

 if (check_sane_values(p, bdev) == -1) {
  put_dev_sector(sect); /* release to cache */
  return -1; /* report invalid partition table */
 }

And the new function check_sane_values() is:

/*
 * Check that *all* sector offsets are valid before actually building the partition structure.
 *
 * This prevents physical damage to disks and boot-time problems caused by an apparently valid
 * partition table causing attempts to read sectors beyond the end of the physical disk.
 *
 * This is especially important where this is the first physical disk in a striped RAID array
 * and the partition table contains sector offsets into the larger logical disk (beyond the end
 * of this physical disk).
 *
 * The RAID module will correctly manage the disks.
 *
 * The function is re-entrant so it can call itself to check extended partitions.
 *
 * @param p partition table
 * @param bdev block device
 * @returns -1 if insane values found; 0 otherwise
 * @copy Copyright 31 January 2007
 * @author TJ <email address hidden>
 */
int check_sane_values(struct partition *p, struct block_device *bdev) {
 unsigned char *data;
 struct partition *ext;
 Sector sect;
 int slot;
 int insa...


Revision history for this message
TJ (tj) wrote :

Patch attached.

Valid for (at least) kernels 2.6.17 - 2.6.20 and probably going back to kernel 2.4 and prior.

Revision history for this message
TJ (tj) wrote :

Update to suppress unnecessary kernel logging for empty (zero-length) partition table entries.

Revision history for this message
TJ (tj) wrote :

Updated status to "In Progress" to reflect the availability of a universal patch for testing. Needs to be tested in systems that don't have this issue to ensure it doesn't cause any regressions.

Changed in linux-source-2.6.17:
assignee: nobody → intuitive-nipple
status: Unconfirmed → In Progress
Revision history for this message
TJ (tj) wrote :

Bug and patch have been posted to kernel.org

http://bugzilla.kernel.org/show_bug.cgi?id=7912

Changed in linux:
status: Unconfirmed → In Progress
Revision history for this message
TJ (tj) wrote :

Revised patch as provided to kernel maintainers.

Revision history for this message
TJ (tj) wrote :

Fix applied to Andrew Morton's -mm tree in January 2007

Changed in linux-source-2.6.20:
status: In Progress → Fix Released
Changed in linux:
status: In Progress → Fix Released
TJ (tj)
Changed in linux-source-2.6.20:
assignee: intuitivenipple → nobody
status: Fix Released → Confirmed
TJ (tj)
Changed in linux:
importance: Undecided → Unknown
status: Fix Released → Unknown
Changed in linux:
status: Unknown → Confirmed
Revision history for this message
TJ (tj) wrote :

Confirmed as still affecting Jaunty by report in bug #329880.

It appears Linus Torvalds rejected my patch when it was pushed from Andrew Morton's -mm tree to mainline in May 2007:

-----------------------------
From: <email address hidden>
To: <email address hidden>, <email address hidden>
Subject: - filesystem-disk-errors-at-boot-time-caused-by-probe.patch removed from -mm tree
Date: Tue, 08 May 2007 19:34:23 -0700 (Wed, 03:34 BST)

The patch titled
     filesystem: Disk Errors at boot-time caused by probe of partitions
has been removed from the -mm tree. Its filename was
     filesystem-disk-errors-at-boot-time-caused-by-probe.patch

This patch was dropped because it was nacked

-----------------------------
From: Linus Torvalds <email address hidden>
To: <email address hidden>
Cc: <email address hidden>, <email address hidden>, Jens Axboe <email address hidden>
Subject: Re: [patch 012/455] filesystem: Disk Errors at boot-time caused by probe of partitions
Date: Tue, 8 May 2007 09:19:32 -0700 (PDT) (17:19 BST)

On Tue, 8 May 2007, <email address hidden> wrote:
>
> From: TJ <email address hidden>

I don't really like these kinds of addresses. Who is TJ? When I google for
that name, I find a lot of hits, but all the links to tjworld.net are
down.

I also think the patch is wrong.

IIRC, we cannot trust the "capacity" data, because not all disks report it
correctly. If we did, we'd just do the check in read_dev_sector() instead.

So I'm dropping this. I might be wrong about the capacity thing, we may
have fixed it (Jens cc'd). But if the capacity is trustworthy, why not
just do the trivial check in read_dev_sector to protect against invalid
extended ones? And in add_partitions()?

                Linus
-----------------------------

Changed in linux:
assignee: nobody → intuitivenipple
status: New → Confirmed
Revision history for this message
Chucky Ellison (ellisonch) wrote :

I don't exactly follow all of the above discussion, but I am attaching my dmesg, which I believe shows a manifestation of this bug (although I found this bug through bug #329880). See errors around 6.114387, 6.821079, 27.105417, and 37.177262.

dmraid is all set up correctly, so once the system is booted, I can access the raid through /dev/mapper/. My sda-sdd are drives that are set up in a 1+0 raid using the nvidia raid on my motherboard. I boot to /dev/sde1 which is not part of a raid. If I can supply any more information or do any testing, please let me know.

$ lsb_release -rd
Description: Ubuntu 8.10
Release: 8.10

Revision history for this message
TJ (tj) wrote :

Chucky, yes, it looks like that system is suffering the same issue. Are you able to try the latest Jaunty kernel or, even better, one of the mainline 2.6.29 kernels? If the same delays and attempts to access the device are shown, it'll have more impact.

Revision history for this message
Chucky Ellison (ellisonch) wrote :

I have compiled many kernels on other linux systems, but never in Ubuntu. Should I follow https://wiki.ubuntu.com/KernelTeam/GitKernelBuild ?

Revision history for this message
Chucky Ellison (ellisonch) wrote :

I've compiled 2.6.29 from git, and I have attached my new dmesg. For whatever reason, the new kernel places my boot drive at sda, and the raid drives are now sdb, sdc, sdd, sde.

Relevant messages seem to start around 10.132601. There are definitely many fewer messages, but it still says things like "sdc: unknown partition table" and "sdd: p1 size 1250258562 limited to end of disk".

It really freaks me out that it says "write protect is off" for those drives. It's not actually writing to those drives, is it?

Revision history for this message
TJ (tj) wrote : Re: [Bug 77734] Re: Disk Read Errors during boot-time caused by probe of invalid partitions

On Sat, 2009-03-28 at 20:35 +0000, Chucky Ellison wrote:
> I have compiled many kernels on other linux systems, but never in
> Ubuntu. Should I follow
> https://wiki.ubuntu.com/KernelTeam/GitKernelBuild ?

No, there's nothing you can or need to do right now. When there's a
patch available and a test kernel the bug will be updated.

Revision history for this message
TJ (tj) wrote :

Compiled from git - oh *grins* ... there are mainline kernels available as packages:

http://kernel.ubuntu.com/~kernel-ppa/mainline/

"Write Protect is off" is a drive status message and the expected value - if you want to actually write data to the drives at some point :)

The messages are expected since at the point the disks are scanned dmraid hasn't loaded and so the 'plain' disks do appear to have problems.

One thing that interests me is the "sdd: p1 size 1250258562 limited to end of disk" because the kernel keeps track of all partitions and so it may store an 'incorrect' entry. With the 2.6.29 kernel can you report this output after dmraid has loaded:

cat /proc/partitions

Revision history for this message
Chucky Ellison (ellisonch) wrote :
Revision history for this message
Chucky Ellison (ellisonch) wrote :
Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

Just closing the linux-source-2.6.20 task as the 2.6.20 Feisty Fawn kernel reached its end of life a while ago:

https://lists.ubuntu.com/archives/ubuntu-announce/2008-September/000113.html

This issue will remain open against the actively developed "linux (Ubuntu)" kernel package. Thanks.

Changed in linux-source-2.6.20 (Ubuntu):
status: Confirmed → Won't Fix
Revision history for this message
kernel-janitor (kernel-janitor) wrote :

[This is an automated message. Apologies if it has reached you inappropriately.]

This bug was flagged as having a patch attached. The Ubuntu Kernel Team's preferred policy is for all patches to be submitted and accepted into the upstream kernel before agreeing to merge them into the Ubuntu kernel. The goal for the Ubuntu kernel is to have little to no divergence from the upstream linux kernel source.

https://wiki.ubuntu.com/KernelTeam/KernelPatches has been written to document the suggested policy and procedures for helping get a patch merged upstream and subsequently into the Ubuntu kernel. Please take the time to review that wiki if this patch should be considered for inclusion into the upstream and Ubuntu kernel. Let us know if you have any questions or need any help via the Ubuntu Kernel Team mailing list. Thanks in advance.

tags: added: kj-comment
Changed in linux (Ubuntu):
assignee: TJ (intuitivenipple) → nobody
Revision history for this message
Andy Whitcroft (apw) wrote :

It seems that this is fixed by upstream changes, by the commit below:

  commit ac0d86f5809598ddcd6bfa0ea8245ccc910e9eac
  Author: Kay Sievers <email address hidden>
  Date: Wed Oct 15 22:04:21 2008 -0700

    block: sanitize invalid partition table entries

This was included in v2.6.38 and later kernels. I am going to close this one off against that commit.

Changed in linux-source-2.6.17 (Debian):
status: New → Fix Released
Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in linux:
importance: Unknown → High
Steve Conklin (sconklin)
Changed in linux:
importance: High → Undecided
status: Confirmed → New
status: New → Fix Released