Comment 31 for bug 1783906

Michał Wadowski (wadosm) wrote :

OK, I found out what is happening.

During standard SATA setup, this chain of functions is called: ... -> ata_eh_recover -> ata_eh_reset. If the SATA link is not initialized yet, a hard reset is performed (function ata_do_reset()).

In both drivers I have, ata_do_reset() returns 0 even if the device is not working after the reset (like my second drive, an HDD).

There are no sata_down_spd_limit() calls at all. After the hard reset the second device is not working, and there are no attempts to recover it.

The first time, and every time I physically hotplug the device or run "echo "- - -" > /sys/class/scsi_host/host1/scan", this chain of functions is called: ata_eh_recover -> ata_eh_schedule_probe, and the variable trials is incremented. After a few hotplugs, once trials > ATA_EH_PROBE_TRIALS, sata_down_spd_limit(link, 1) is called and lowers the SATA link speed. After the speed is limited, a hard reset is performed and then the device works.
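The counting described above can be modelled in user space roughly like this. It is only a toy sketch, not the kernel code: the struct, the function name and the limit of 2 are assumptions for illustration, and the real ata_eh_schedule_probe() does more than this.

```c
#include <stdbool.h>

/* Assumed value for illustration; the kernel constant is ATA_EH_PROBE_TRIALS. */
#define TOY_PROBE_TRIALS 2

/* Hypothetical stand-in for the per-link state tracked by error handling. */
struct toy_probe_state {
	int trials;          /* probe attempts seen so far */
	bool speed_limited;  /* set once the link speed has been lowered */
};

/*
 * Models one hotplug or manual rescan. Returns true when this probe
 * pushed the counter over the limit and the speed was lowered, which is
 * the point where sata_down_spd_limit(link, 1) would be called.
 */
bool toy_schedule_probe(struct toy_probe_state *st)
{
	st->trials++;
	if (st->trials > TOY_PROBE_TRIALS && !st->speed_limited) {
		st->speed_limited = true;
		return true;
	}
	return false;
}
```

With these assumed values the first two probes change nothing and only the third one lowers the speed, which matches the "after a few hotplugs" behavior described above.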

I think it is wrong behavior that it tries to limit the bandwidth only after many hotplugs and hard resets. It could try that within a single ata_eh_recover() call.

For my own use, I slightly changed the code of ata_eh_reset() to check whether the device is online after the reset:

rc = ata_do_reset(link, reset, classes, deadline, true);
if (ata_link_offline(link))
	rc = -EPIPE;

At the bottom of ata_eh_reset(), if rc == -EPIPE, sata_down_spd_limit() is called and after that the reset is retried. This completely fixes my problem with the non-working drive: I no longer have to reconnect the device manually to make it work. The only issue is a delay before the next reset (the schedule_timeout_uninterruptible() call).
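The effect of that extra check can be illustrated with a small user-space model. Again this is a sketch under assumptions, not the kernel implementation: all toy_* names are hypothetical, the "device" is assumed to link up only after the speed has been lowered, and the retry delay is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical link: the device only comes online at reduced speed. */
struct toy_reset_link {
	bool speed_limited;
	bool online;
};

/*
 * Stand-in for ata_do_reset(): always reports success, like the drivers
 * described above, whether or not the device actually came back.
 */
int toy_do_reset(struct toy_reset_link *link)
{
	link->online = link->speed_limited;
	return 0;
}

/*
 * Stand-in for the retry loop in ata_eh_reset(). Returns the number of
 * resets needed to bring the device online, or -1 if it never came up.
 * check_offline selects whether the extra offline check is applied.
 */
int toy_eh_reset(struct toy_reset_link *link, int max_tries, bool check_offline)
{
	for (int try = 1; try <= max_tries; try++) {
		int rc = toy_do_reset(link);

		/* The added check: "success" with a dead link counts as failure. */
		if (check_offline && !link->online)
			rc = -EPIPE;

		if (rc == 0)
			return link->online ? try : -1;

		/* Models sata_down_spd_limit() followed by a retry. */
		if (rc == -EPIPE)
			link->speed_limited = true;
	}
	return -1;
}
```

Without the check, the model stops after the first reset with the device still offline. With the check, the second reset inside the same call brings it online, with no hotplug needed.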

Maybe this conversation should be moved to Linux linux-ide mailing list, t