Unmounted USB drives wake up on 16.04

Bug #1733068 reported by paoletto
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu
Invalid
Undecided
Unassigned

Bug Description

On Ubuntu 16.04, if I plug in a USB drive, unmount it (eject button in nautilus), and then force it to sleep (hdparm -y), the drive reports that it is in standby:

hdparm -C /dev/sdd

/dev/sdd:
 drive state is: standby

However, after some time during which I have not tried to access the drive, it seems to spin up again and reports that it is active:

hdparm -C /dev/sdd

/dev/sdd:
 drive state is: active/idle

I have been researching on Google, Ask Ubuntu, and IRC, but apparently nobody has been able to tell what it is that wakes up the drive. Everybody says that if the drive is unmounted, it should stay asleep.

Note: I also tried adding the drive's partitions to fstab, so as to force gvfs to ignore the drive.
Even this failed to keep the drive sleeping.

I have not been able to test this on 14.04 yet, but 14.04 successfully keeps internal SATA drives asleep, even when mounted, unless I explicitly access them.

One additional question I have at this point: is there a tool to check which process is trying to access the device?
Some sort of tcpdump for SATA.

*UPDATE*
I managed to test this on 14.04 and, as expected, the disk spins down almost immediately and stays spun down; after approximately half an hour (maybe less) it goes into standby (disk LED blinking regularly: http://gph.is/2yKy20M)
and stays like that.
Also, unlike 16.04, dmesg stays clean, without repeating messages about meaningless driver sense errors.

paoletto (paoletto)
description: updated
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1733068/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Revision history for this message
paoletto (paoletto) wrote :

Yes, of course there's no source package to blame. Or at least I don't know which one it would be.

tags: added: xenial
affects: ubuntu → nautilus (Ubuntu)
Revision history for this message
paoletto (paoletto) wrote :

Is nobody even able to reproduce this?

Revision history for this message
Sebastien Bacher (seb128) wrote :

Brian, that doesn't seem like a nautilus issue.

affects: nautilus (Ubuntu) → ubuntu
Revision history for this message
Phillip Susi (psusi) wrote :

You can use blktrace to see who is accessing the drive. My first suspect would be udisksd so you might try killing that process and see if that makes it go away.
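As a sketch of that suggestion (the device name sdd and the sample event line below are illustrative, not taken from this report): blkparse prints the task responsible for each request in square brackets at the end of every event line, so the process waking the drive can be read straight off the trace.

```shell
#!/bin/sh
# Run (as root; /dev/sdd is an example device):
#   blktrace -d /dev/sdd -o - | blkparse -i -
# blkparse ends each event line with the issuing task's name in
# square brackets; this helper extracts it.
task_name() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\]$/\1/p'
}

# Illustrative blkparse-style line (not real output from this bug):
sample='8,48   0        1     0.000000000  1234  Q   R 0 + 8 [pool]'
task_name "$sample"
```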

Revision history for this message
paoletto (paoletto) wrote :

Thank you for the suggestion!

So I tried it out on the unmounted USB drive, and this is what I got:

http://paste.ubuntu.com/26128246/

My fear is that it might not (only) be some userspace daemon keeping the drive alive.
But it doesn't seem easy to understand what's going on, even with btrace/blktrace.

Revision history for this message
paoletto (paoletto) wrote :

For comparison, this is what I got on the 14.04 machine, where unmounted HDs stay in standby:

http://paste.ubuntu.com/26128324/

Revision history for this message
paoletto (paoletto) wrote :

Oh, and the 14.04 disks stay mounted (just not accessed).

Revision history for this message
paoletto (paoletto) wrote :

OK, so I managed to start btrace when I saw the disk sleeping, and stopped it after the disk woke up:
http://paste.ubuntu.com/26128577/

Revision history for this message
paoletto (paoletto) wrote :

Updated the description: I tested with 14.04 and, as expected, everything goes to sleep when it should and stays asleep.

I seriously fear this is a regression in the kernel SATA subsystem.
HOWEVER, I'm willing to test it more on 16.04.

Could someone point me to a super stripped-down version of 16.04 where one can be absolutely certain there is no userspace daemon waking up the drives?
That would make it possible to determine whether the issue is a kernel regression or not.

description: updated
Revision history for this message
paoletto (paoletto) wrote :

And, by the way, this is the kind of message that floods dmesg:

[ 118.189091] sd 7:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE
[ 118.189101] sd 7:0:0:0: [sdc] tag#0 Sense Key : Hardware Error [current] [descriptor]
[ 118.189106] sd 7:0:0:0: [sdc] tag#0 Add. Sense: No additional sense information
[ 118.189112] sd 7:0:0:0: [sdc] tag#0 CDB: ATA command pass through(16) 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
[ 118.190080] sd 7:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE
[ 118.190086] sd 7:0:0:0: [sdc] tag#0 Sense Key : Hardware Error [current] [descriptor]
[ 118.190090] sd 7:0:0:0: [sdc] tag#0 Add. Sense: No additional sense information
[ 118.190095] sd 7:0:0:0: [sdc] tag#0 CDB: ATA command pass through(16) 85 06 20 00 00 00 00 00 00 00 00 00 00 40 98 00

Revision history for this message
paoletto (paoletto) wrote :

Also, on 16.04, in addition to the above messages, the drive not only does not go into standby (with the LED blinking regularly), it also does not spin down. Tested with three different 16.04 installations (two Ubuntus and one KDE neon) on two different systems.

Revision history for this message
paoletto (paoletto) wrote :

OK, I think I gathered some relevant evidence.
I had the drive plugged into a 16.04 machine, and at some point I noticed it went into standby (LED blinking intermittently).

The last dmesg lines were:
[ 6217.720607] perf interrupt took too long (5053 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[ 6631.699961] sd 7:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE
[ 6631.699969] sd 7:0:0:0: [sdc] tag#0 Sense Key : Hardware Error [current] [descriptor]
[ 6631.699973] sd 7:0:0:0: [sdc] tag#0 Add. Sense: No additional sense information
[ 6631.699978] sd 7:0:0:0: [sdc] tag#0 CDB: ATA command pass through(16) 85 06 20 00 00 00 00 00 00 00 00 00 00 00 e5 00
[ 6632.179875] sd 7:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE
[ 6632.179884] sd 7:0:0:0: [sdc] tag#0 Sense Key : Hardware Error [current] [descriptor]
[ 6632.179889] sd 7:0:0:0: [sdc] tag#0 Add. Sense: No additional sense information
[ 6632.179895] sd 7:0:0:0: [sdc] tag#0 CDB: ATA command pass through(12)/Blank a1 06 20 da 00 00 4f c2 00 b0 00 00

I checked uptime and it was 1:55. So I gave it 5 minutes and waited until the next DRIVER_SENSE error appeared in dmesg, monitoring the situation.

And guess what, the drive woke up right before the next error appeared in dmesg!
That is, it spun up (LED blinking intermittently but much faster), and after 5-7 seconds the drive was ready and the error showed up in dmesg.

 21:14:06 up 2:00, 1 user, load average: 0,11, 0,41, 1,03
[ 6032.254264] sd 7:0:0:0: [sdc] tag#0 Sense Key : Hardware Error [current] [descriptor]
[ 6032.254266] sd 7:0:0:0: [sdc] tag#0 Add. Sense: No additional sense information
[ 6032.254270] sd 7:0:0:0: [sdc] tag#0 CDB: ATA command pass through(12)/Blank a1 06 20 da 00 00 4f c2 00 b0 00 00
[ 6217.720607] perf interrupt took too long (5053 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[ 6631.699961] sd 7:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE
[ 6631.699969] sd 7:0:0:0: [sdc] tag#0 Sense Key : Hardware Error [current] [descriptor]
[ 6631.699973] sd 7:0:0:0: [sdc] tag#0 Add. Sense: No additional sense information
[ 6631.699978] sd 7:0:0:0: [sdc] tag#0 CDB: ATA command pass through(16) 85 06 20 00 00 00 00 00 00 00 00 00 00 00 e5 00
[ 6632.179875] sd 7:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE
[ 6632.179884] sd 7:0:0:0: [sdc] tag#0 Sense Key : Hardware Error [current] [descriptor]
[ 6632.179889] sd 7:0:0:0: [sdc] tag#0 Add. Sense: No additional sense information
[ 6632.179895] sd 7:0:0:0: [sdc] tag#0 CDB: ATA command pass through(12)/Blank a1 06 20 da 00 00 4f c2 00 b0 00 00
[ 7238.669972] sd 7:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE
[ 7238.669981] sd 7:0:0:0: [sdc] tag#0 Sense Key : Hardware Error [current] [descriptor]
[ 7238.669985] sd 7:0:0:0: [sdc] tag#0 Add. Sense: No additional sense information
[ 7238.669990] sd 7:0:0:0: [sdc] tag#0 CDB: ATA command pass through(16) 85 06 20 00 00 00 00 00 00 00 00 00 00 00 e5 00
[ 7238.794856] sd 7:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_SENSE
[ 7238.794863] sd 7:0:0:0: [sdc] tag#0 Sense Key : Hardware Erro...


Revision history for this message
Alan Stern (stern) wrote :

I have no idea which kernel versions Ubuntu uses in its releases, but the recurring Hardware Error bug should be fixed in the 4.14 kernel. At least I think it should -- nobody has reported on whether the fix actually works, so I have no way of knowing.

As for the drive spontaneously waking up... You may not be aware of this, but the kernel periodically probes for drive events, even when no user process is accessing the drive. You can control the probe frequency by writing to the /sys/block/sd?/events_poll_msecs file (fill in the ? with the drive letter). -1 means to use the system default probing interval and 0 means to disable polling entirely.

Now, I'm not saying that the kernel polling is what is waking up your drive. But it is something you need to know about.
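A minimal sketch of checking and disabling that polling, assuming the drive is sdd (substitute your own device; writing the sysfs file requires root):

```shell
#!/bin/sh
# Path of the per-device polling knob: -1 = system default interval,
# 0 = polling disabled, N > 0 = poll every N milliseconds.
events_poll_file() {
    printf '/sys/block/%s/events_poll_msecs' "$1"
}

# As root, one would inspect and then disable polling like this:
#   cat "$(events_poll_file sdd)"
#   echo 0 > "$(events_poll_file sdd)"
events_poll_file sdd
```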

Revision history for this message
paoletto (paoletto) wrote :

Thank you, Alan, for providing this valuable info.
Actually, I noticed the polling by looking at dmesg. I suspected that the errors in my distribution's kernel (4.4.0 in 16.04) were somehow related to a kernel polling process with a default interval of 10 minutes.

I will try both things you suggest: trying out a 4.14+ kernel and disabling the polling, hoping that the first will work.

FYI, 14.04 with 3.19 (IIRC) doesn't wake up the drives out of the box. Is the polling something that was added later? Or maybe there is some additional operation performed in the polling that now wakes up the drive (and is hopefully related to the fixed bug you mention).

I will post my findings as soon as I have them!

Revision history for this message
paoletto (paoletto) wrote :

First finding:

cat /sys/block/sdb/events_poll_msecs
-1

I will now proceed to test a newer kernel to check whether the problem has been fixed in the meantime.

Revision history for this message
paoletto (paoletto) wrote :

Working on it, but I was also thinking: the same value is returned on 14.04.
Can it be that the 16.04 kernel does not honor that value and keeps polling at a 10-minute interval?

These driver sense errors always come up at a 600-second interval, and they are definitely generated by the same process that also wakes up the drives.

Revision history for this message
paoletto (paoletto) wrote :

Trying 4.14.4 (from the mainline debs).
So far it looks promising: hdparm now seems able to query and set the power status (before, it returned an error with this drive), and it has been about an hour since I forced the drive into standby; it is staying in standby.

Revision history for this message
paoletto (paoletto) wrote :

Spoke too early. The drive just woke up out of the blue, after having been manually unmounted in nautilus and forced into standby with hdparm -y (now that it works), without me remounting it or anything.

dmesg does not report anything relevant:
[ 641.041831] EXT4-fs (sde1): mounted filesystem with ordered data mode. Opts: (null)
[ 641.142793] EXT4-fs (sde2): mounted filesystem with ordered data mode. Opts: (null)
[ 1277.415515] perf: interrupt took too long (2538 > 2500), lowering kernel.perf_event_max_sample_rate to 78750
[ 2388.046183] perf: interrupt took too long (3176 > 3172), lowering kernel.perf_event_max_sample_rate to 62750

/sys/block/sde/events_poll_msecs seems to also default to -1

I keep monitoring to see what happens

Revision history for this message
Alan Stern (stern) wrote :

It wouldn't be surprising to find that udisks2 is the program waking up your drive.

Revision history for this message
paoletto (paoletto) wrote :

It seems that with the new kernel the disk stays awake and never even tries to go into standby:

http://codepad.org/lIRqEBOW

Edit:
I did stop this script for a while, and the disk went to sleep (not just standby, but I think suspend or deeper: the drive was not only spun down, but the light was blinking, so some deep sleep state).

Then I restarted the script, and hdparm woke the drive up! (After which it reported active/idle.)
I have never before seen hdparm -C wake up a drive.
Can it be that the subsystem now deep-probes the drive in some way?

The next (and possibly last) test I could do is trying 4.14 on a 14.04, assuming it will work, to check whether the kernel breaks the expected behavior currently provided by 14.04.

Revision history for this message
paoletto (paoletto) wrote :

So far my feeling is that:

-) 4.14 seems to support this drive better, as it appears able to retrieve its power status and, more importantly, to set it, unlike the previous kernel (4.4).
-) 4.14 also seems to be more invasive. Previously, hdparm -C would return an error but never wake up the drive; on internal SATA drives it would just work, also without waking the drive. Here it seems that if the drive is in standby the command works as expected, but as soon as the drive enters some deeper sleep (not just spun down, but also LED blinking intermittently), the command wakes up the drive, which spins up before hdparm returns its output.
-) udisks2 seems to be doing something too (note: it is also present on 14.04, but it somehow doesn't get in the way in that release). I killed it manually, but at some point it respawned and automounted the drive, which spun up.

I'm still working on getting 4.14 onto 14.04, but my fear is that multiple things have regressed, which might be hard to track down. I might have to go back to 14.04, although, given the current Linux situation, I doubt this bug would even get investigated in the next couple of years.

Revision history for this message
Alan Stern (stern) wrote :

If you want to get more information about what's happening at the USB level, you should use usbmon (see Documentation/usb/usbmon.txt in the kernel source for instructions). It will tell you what commands are being sent to the drive, but not which program is responsible for sending them. Still, you can use it to compare the different Ubuntu releases.

(Hint: If you do collect a usbmon trace, it helps to unplug beforehand as many of the other USB devices as you can, to prevent the data streams from getting all mixed together.)
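For reference, a sketch of starting such a capture via usbmon's text interface (bus number 2 is an example; find the drive's bus with lsusb, and use 0u to capture all buses at once):

```shell
#!/bin/sh
# usbmon's text interface lives under debugfs, one node per bus,
# named <bus>u (0u captures every bus).
usbmon_node() {
    printf '/sys/kernel/debug/usb/usbmon/%su' "$1"
}

# As root:
#   mount -t debugfs none /sys/kernel/debug   # if not already mounted
#   modprobe usbmon                           # if built as a module
#   cat "$(usbmon_node 2)" > usbmon-16.04.log
usbmon_node 2
```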

Revision history for this message
Phillip Susi (psusi) wrote :

I'm not sure why you were getting those errors, but definitely stop the udisks service. Your blktrace shows it being accessed by the "pool" task. For some silly reason, the glib thread pool library that udisks uses decides to rename its threads to "pool" instead of the application name.

With udisks shut down, use hdparm -y to put the drive into suspend, then try checking its status with hdparm -C.
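A sketch of that test sequence, printed as commands so it can be reviewed before running (sdd is an example device; the sequence needs root, and udisks2.service is the unit name on systemd-based releases):

```shell
#!/bin/sh
# Print the suggested test sequence for a given device.
standby_test_cmds() {
    dev="/dev/$1"
    printf 'systemctl stop udisks2.service\n'   # stop the udisks daemon
    printf 'hdparm -y %s\n' "$dev"              # force the drive into standby
    printf 'sleep 30\n'                         # give it a moment
    printf 'hdparm -C %s\n' "$dev"              # should still report "standby"
}

standby_test_cmds sdd
# To actually run it:  standby_test_cmds sdd | sudo sh
```

Note that udisks2 can be D-Bus activated, so it may respawn (as observed earlier in this thread); re-check with ps after the test.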

Revision history for this message
Paul White (paulw2u) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. We are sorry that we do not always have the capacity to review all reported bugs in a timely manner.

Ubuntu 16.04 (xenial) reached end-of-standard-support on April 29, 2021.

I'm setting the status of this bug to 'Incomplete' as it has not seen any activity for some time. If this is still an issue when using a currently maintained release of Ubuntu then please let us know which one(s); otherwise this bug report can be left to expire in approximately 60 days' time.

Changed in ubuntu:
status: New → Incomplete
Revision history for this message
Paul White (paulw2u) wrote :

It seems that this issue won't expire due to the bug watch relating to a Red Hat bug report that was closed many years ago. With no comments relating to the Ubuntu issue since 2017, I'm closing this as 'Invalid'.

Changed in ubuntu:
status: Incomplete → Invalid