udisks-daemon hangs system at boot when bad GPT sector on HD

Bug #946565 reported by Whit Blauvelt
This bug affects 1 person
Affects: udisks (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

This is on an ASUS Eee 1001P with a hard drive and an SD card (and USB), running Ubuntu 10.04.4 with udisks 1.0.1-1ubuntu1. The hard drive developed a bad sector in the area reserved for the secondary GPT, the second-to-last sector in this case. The drive does not use GPT. Nonetheless, once udisks-daemon is started in the boot sequence (because the SD card is found and invokes udisks through udev), udisks starts aggressively polling the hard drive too, including repeated requests to read the (nonexistent) secondary GPT. This looks like so:

Mar 2 10:23:04 boot2 kernel: [ 3.703401] sd 5:0:0:0: [sdc] Attached SCSI removable disk
Mar 2 10:23:04 boot2 kernel: [ 4.474634] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Mar 2 10:23:04 boot2 kernel: [ 4.474796] ata1.00: irq_stat 0x40000008
Mar 2 10:23:04 boot2 kernel: [ 4.474882] ata1.00: failed command: READ FPDMA QUEUED
Mar 2 10:23:04 boot2 kernel: [ 4.474996] ata1.00: cmd 60/08:00:a8:9e:a1/00:00:12:00:00/40 tag 0 ncq 4096 in
Mar 2 10:23:04 boot2 kernel: [ 4.474999] res 41/40:08:af:9e:a1/00:00:12:00:00/00 Emask 0x409 (media error) <F>
Mar 2 10:23:04 boot2 kernel: [ 4.475301] ata1.00: status: { DRDY ERR }
Mar 2 10:23:04 boot2 kernel: [ 4.475385] ata1.00: error: { UNC }
Mar 2 10:23:04 boot2 kernel: [ 4.479702] ata1.00: configured for UDMA/133
Mar 2 10:23:04 boot2 kernel: [ 4.479728] ata1: EH complete
Mar 2 10:23:04 boot2 kernel: [ 6.668024] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
Mar 2 10:23:04 boot2 kernel: [ 6.668189] ata1.00: irq_stat 0x40000008
Mar 2 10:23:04 boot2 kernel: [ 6.668275] ata1.00: failed command: READ FPDMA QUEUED
Mar 2 10:23:04 boot2 kernel: [ 6.668389] ata1.00: cmd 60/08:00:a8:9e:a1/00:00:12:00:00/40 tag 0 ncq 4096 in
Mar 2 10:23:04 boot2 kernel: [ 6.668392] res 41/40:08:af:9e:a1/00:00:12:00:00/00 Emask 0x409 (media error) <F>
Mar 2 10:23:04 boot2 kernel: [ 6.668694] ata1.00: status: { DRDY ERR }
Mar 2 10:23:04 boot2 kernel: [ 6.668778] ata1.00: error: { UNC }
Mar 2 10:23:04 boot2 kernel: [ 6.743082] ata1.00: configured for UDMA/133
Mar 2 10:23:04 boot2 kernel: [ 6.743105] ata1: EH complete

and that error keeps repeating so rapidly that it slows the boot sequence by several minutes and slows operation of the machine beyond that. Occasionally the log also identifies the bad sector:

Mar 2 10:23:04 boot2 kernel: [ 15.590221] sd 0:0:0:0: [sda] Unhandled sense code
Mar 2 10:23:04 boot2 kernel: [ 15.590227] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Mar 2 10:23:04 boot2 kernel: [ 15.590234] sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
Mar 2 10:23:04 boot2 kernel: [ 15.590244] Descriptor sense data with sense descriptors (in hex):
Mar 2 10:23:04 boot2 kernel: [ 15.590249] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Mar 2 10:23:04 boot2 kernel: [ 15.590268] 12 a1 9e af
Mar 2 10:23:04 boot2 kernel: [ 15.590276] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
Mar 2 10:23:04 boot2 kernel: [ 15.590286] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 12 a1 9e a8 00 00 08 00
Mar 2 10:23:04 boot2 kernel: [ 15.590304] end_request: I/O error, dev sda, sector 312581807
Mar 2 10:23:04 boot2 kernel: [ 15.593116] Buffer I/O error on device sda, logical block 39072725
Mar 2 10:23:04 boot2 kernel: [ 15.595932] ata1: EH complete

That sector is outside of all partitions, so fsck of course can't fix it. hdparm can, once you figure out what's going on.
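
For anyone checking the numbers, here is a minimal Python sketch of how the sector in the log above can be recovered from the Read(10) CDB and the last four bytes of the sense data. The helper name decode_read10 is hypothetical, and the 512-byte sector and 4096-byte block sizes are assumptions inferred from the log (the 4096-byte NCQ request and the ratio between the reported sector and logical block), not values the kernel prints directly.

```python
SECTOR_SIZE = 512   # assumption: 512-byte sectors
BLOCK_SIZE = 4096   # assumption: 4096-byte buffer blocks


def decode_read10(cdb_hex):
    """Return (start_lba, sector_count) from a SCSI READ(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    start_lba = int.from_bytes(cdb[2:6], "big")   # bytes 2-5: logical block address
    count = int.from_bytes(cdb[7:9], "big")       # bytes 7-8: transfer length in sectors
    return start_lba, count


# CDB copied from the "CDB: Read(10)" log line.
start_lba, count = decode_read10("28 00 12 a1 9e a8 00 00 08 00")

# LBA copied from the second line of the sense data dump ("12 a1 9e af").
failed_sector = int.from_bytes(bytes.fromhex("12a19eaf"), "big")

# The "logical block" in the Buffer I/O error line, under the size assumptions above.
logical_block = failed_sector * SECTOR_SIZE // BLOCK_SIZE

print(start_lba, count)   # 312581800 8
print(failed_sector)      # 312581807
print(logical_block)      # 39072725
```

Under those assumptions the 8-sector read covers sectors 312581800 through 312581807, and the failing sector is the last one in that request, which matches the "end_request" and "Buffer I/O error" lines.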

Questions:

A. Why, when invoked to help with removable media, is udisks-daemon worrying about a fixed drive?

B. Why, when the fixed drive has a conventional MBR and no GPT - primary or secondary - does udisks-daemon persistently try to read the secondary GPT there?

C. Why, when encountering the failure to read a bad sector where the secondary GPT might be, is udisks-daemon so aggressively retrying as to cripple the system?

D. Is there a way to configure udisks-daemon to make no attempt to read GPT tables, since the vast majority of systems don't use them, and its method of looking for those tables introduces a novel point of failure with no corresponding benefit for them? If so, shouldn't it be the default? For instance, running gptsync on the same system, before fixing the sector, quickly produced:

Current GPT partition table:
 No GPT partition table present!

Shouldn't this be just as evident to udisks? Possibly gptsync is looking only for the primary table. But when there's no trace of a primary table, let alone a corrupted one, should the secondary table be so aggressively pursued?
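
As an illustration of how cheap a primary-table check can be: a GPT disk normally carries a protective MBR entry of type 0xEE in sector 0 and a primary header that starts with the signature "EFI PART" in sector 1. The following is a minimal Python sketch under the assumption of 512-byte sectors; the function names primary_gpt_hints and make_image are hypothetical, and this is not a claim about how gptsync or udisks actually probe a disk.

```python
import io

SECTOR_SIZE = 512            # assumption: 512-byte sectors
GPT_SIGNATURE = b"EFI PART"  # first 8 bytes of a GPT header
PROTECTIVE_MBR_TYPE = 0xEE   # partition type used by a protective MBR


def primary_gpt_hints(disk):
    """Return (has_protective_mbr, has_gpt_signature) for a binary file object."""
    disk.seek(0)
    mbr = disk.read(SECTOR_SIZE)
    disk.seek(SECTOR_SIZE)
    header = disk.read(SECTOR_SIZE)
    # The MBR holds four 16-byte partition entries starting at offset 446;
    # the partition type is the byte at offset 4 within each entry.
    types = [mbr[446 + 16 * i + 4] for i in range(4)] if len(mbr) == SECTOR_SIZE else []
    return PROTECTIVE_MBR_TYPE in types, header[:8] == GPT_SIGNATURE


def make_image(part_type, with_gpt_header):
    """Build a two-sector in-memory disk image for demonstration only."""
    mbr = bytearray(SECTOR_SIZE)
    mbr[446 + 4] = part_type          # type byte of the first partition entry
    mbr[510:512] = b"\x55\xaa"        # MBR boot signature
    header = bytearray(SECTOR_SIZE)
    if with_gpt_header:
        header[:8] = GPT_SIGNATURE
    return io.BytesIO(bytes(mbr + header))


print(primary_gpt_hints(make_image(0x83, False)))  # (False, False)
print(primary_gpt_hints(make_image(0xEE, True)))   # (True, True)
```

On a real machine the same function could be pointed at a device opened read-only in binary mode (which needs root). When both values are False, as they would be for a conventional MBR disk like the one in this report, neither sector 0 nor sector 1 gives any hint that a backup table exists at the end of the disk.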

Perhaps udisks has good reason to monitor the fixed drive (A), but the aggressive pursuit of data from GPT, on a system where there's no GPT at all, seems like bad design. I even wonder if udisks' intense interest in reading that sector was what led to its failure, since there was no other reason for the system to ever go to that drive address.

Also, could this be the cause of other reports of similar cycles of READ FPDMA QUEUED errors using Ubuntu?

Whit Blauvelt (whit-launchpad) wrote :

I'd speculate that a few of the reports here might be related:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/550559

Nemo_bis (nemobis) wrote :