When the SCSI bus hangs, the SES driver indefinitely blocks any process accessing LED status for devices

Bug #1454158 reported by George Shuklin
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Ubuntu 14.04, 3.13.0-40-generic

Configuration:

SCSI (mpt2sas, 16.100.00.00) with a few enclosures containing SATA disks.

Situation:
One of the enclosures has hung and does not reply to any requests (including reset). All sg_* commands to any device behind the hung enclosure, including the enclosure itself, are stuck in the 'D' state.

Problem:
Accessing sysfs puts processes in the 'D' state:

cat /sys/class/enclosure/5\:0\:46\:0/Slot\ 01/locate

root 588 0.0 0.0 7152 612 pts/6 D+ 08:40 0:00 cat /sys/class/enclosure/5:0:46:0/Slot 01/locate
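To see which processes are stuck like the `cat` above, the state field in /proc/<pid>/stat can be scanned. The following is a minimal Python sketch, not part of the original report; the helper names `parse_state` and `uninterruptible_pids` are hypothetical, and it assumes a Linux-style procfs layout.

```python
import os


def parse_state(stat_text):
    """Return the one-letter process state from /proc/<pid>/stat content.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split on the last ')' rather than on whitespace.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def uninterruptible_pids(proc_root="/proc"):
    """List PIDs currently in uninterruptible sleep ('D' state)."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'sys'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                if parse_state(fh.read()) == "D":
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited while we were scanning
    return sorted(pids)


if __name__ == "__main__":
    print(uninterruptible_pids())
```

A process can be in 'D' state briefly during normal disk I/O, so a single hit is not proof of a hang; a PID that stays in the list across repeated scans is the interesting case.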

Proposed solution: Add a timeout to SES devices that is independent of the HBA driver.

Rationale: Access to sysfs is not expected to be 'real I/O' that can hang in the 'D' state forever.
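Until the kernel enforces such a timeout, a monitoring script can at least keep itself from blocking by doing the read in a child process and giving up after a deadline. This is only a userspace sketch under stated assumptions, not a fix: the function name `read_sysfs_attr` and the 5-second default are hypothetical, and a child that is already in 'D' state cannot be killed, so it stays behind until the bus recovers or the machine is rebooted.

```python
import subprocess


def read_sysfs_attr(path, timeout_s=5.0):
    """Read an attribute via a child 'cat'; return None on timeout or error."""
    proc = subprocess.Popen(
        ["cat", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: SIGKILL is not acted on while the child is in
        # uninterruptible sleep. Deliberately do NOT wait() for it here,
        # otherwise the caller would block just like the child.
        proc.kill()
        return None
    if proc.returncode != 0:
        return None
    return out.decode().strip()


if __name__ == "__main__":
    # Path taken from the report above; adjust to the local enclosure and slot.
    value = read_sysfs_attr("/sys/class/enclosure/5:0:46:0/Slot 01/locate")
    print("timed out or unreadable" if value is None else value)
```

`Popen.communicate(timeout=...)` is used instead of `subprocess.run(..., timeout=...)` because `run()` waits for the child after killing it, which would block again if the child is stuck in 'D' state.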

Kernel bugzilla bug: https://bugzilla.kernel.org/show_bug.cgi?id=98121

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1454158

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu):
status: Incomplete → Opinion
status: Opinion → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.1 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.1-rc3-vivid/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
George Shuklin (george-shuklin) wrote :

I'm afraid that isn't possible. An expander hang is a rare event, and after a reboot the expander returns to normal operation, so this is very much an edge case.

I believe the problem is not fixed in the upstream kernel, because ses.c has not changed much.

Btw, I dug into the source code a little and found that ses.c actually ends up calling (through many wrapper functions) blk_execute_rq(), so the timeout violation is clearly not a ses bug...

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
George Shuklin (george-shuklin) wrote :

It is still valid and not fixed.

Changed in linux (Ubuntu):
status: Expired → New
Brad Figg (brad-figg) wrote :

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1454158

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu):
status: Incomplete → Confirmed