[regression] Changed CONFIG_SATA_MOBILE_LPM_POLICY=3 default causes I/O errors

Bug #1889968 reported by Tore Anderson
This bug affects 2 people
Affects: linux-signed-hwe-5.4 (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Somewhere between 4.15.0-112-generic and 5.3.0-62-generic the kernel config option SATA_MOBILE_LPM_POLICY was changed from 0 (the upstream default) to 3. This is causing frequent SATA link resets, resulting in I/O stalls and errors. For example:

ata1.00: exception Emask 0x0 SAct 0xdc0000 SErr 0x50000 action 0x6 frozen
ata1: SError: { PHYRdyChg CommWake }
ata1.00: failed command: WRITE FPDMA QUEUED
ata1.00: cmd 61/20:90:d8:62:c6/00:00:24:00:00/40 tag 18 ncq dma 16384 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/20:98:60:26:1e/00:00:00:00:00/40 tag 19 ncq dma 16384 in
         res 40/00:01:06:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: WRITE FPDMA QUEUED
ata1.00: cmd 61/08:a0:78:85:11/00:00:03:00:00/40 tag 20 ncq dma 4096 out
         res 40/00:00:00:4f:c2/00:01:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: WRITE FPDMA QUEUED
ata1.00: cmd 61/10:b0:80:60:c6/00:00:24:00:00/40 tag 22 ncq dma 8192 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/08:b8:d0:13:ac/00:00:02:00:00/40 tag 23 ncq dma 4096 in
         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: hard resetting link
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
ata1.00: device reported invalid CHS sector 0
ata1.00: device reported invalid CHS sector 0
sd 0:0:0:0: [sda] tag#23 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 0:0:0:0: [sda] tag#23 Sense Key : Illegal Request [current]
sd 0:0:0:0: [sda] tag#23 Add. Sense: Unaligned write command
sd 0:0:0:0: [sda] tag#23 CDB: Read(10) 28 00 02 ac 13 d0 00 00 08 00
blk_update_request: I/O error, dev sda, sector 44831696 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
ata1: EH complete
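
As a cross-check on the log above, the failed SCSI command can be decoded by hand. The sketch below is illustrative only: the helper name `decode_read10` is made up for this example, and the 512-byte logical sector size is an assumption (it agrees with the "dma 4096" figure logged for tag 23). It unpacks the READ(10) CDB printed by the sd driver and shows that it addresses the same sector that blk_update_request reports.

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    # Bytes 2..5: logical block address, big-endian.
    lba = int.from_bytes(cdb[2:6], "big")
    # Bytes 7..8: transfer length in logical blocks, big-endian.
    blocks = int.from_bytes(cdb[7:9], "big")
    # Assumption: 512-byte logical sectors.
    return {"lba": lba, "blocks": blocks, "bytes": blocks * 512}


# CDB copied from the "[sda] tag#23 CDB: Read(10)" line in the log.
info = decode_read10("28 00 02 ac 13 d0 00 00 08 00")
print(info)  # {'lba': 44831696, 'blocks': 8, 'bytes': 4096}
```

The decoded LBA 44831696 matches the sector in the "I/O error, dev sda" line, and the 8-block length matches the 4096-byte READ FPDMA QUEUED on tag 23, so the I/O error corresponds to one of the commands that timed out before the link reset.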

Available workarounds:

1) downgrading to 4.15.0-*-generic
2) appending 'ahci.mobile_lpm_policy=n' to the kernel command line, where 'n' is either 0, 1 or 2.

The meanings of policy numbers can be found at https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/Kconfig?h=v5.8-rc7#n118:

 0 => Keep firmware settings
 1 => Maximum performance
 2 => Medium power
 3 => Medium power with Device Initiated PM enabled
 4 => Minimum power
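
For anyone applying workaround 2, the following is a minimal sketch of a helper that builds the kernel command-line argument and lists what each SATA host currently reports. The names `POLICIES`, `kernel_arg` and `current_host_policies` are hypothetical. The sysfs path `/sys/class/scsi_host/host*/link_power_management_policy` is assumed to be present on the affected system; if it is not, the function simply returns an empty dict.

```python
from pathlib import Path

# Policy meanings as listed in drivers/ata/Kconfig (see the table above).
POLICIES = {
    0: "Keep firmware settings",
    1: "Maximum performance",
    2: "Medium power",
    3: "Medium power with Device Initiated PM enabled",
    4: "Minimum power",
}

# Values named as workarounds in this report.
WORKAROUND_VALUES = (0, 1, 2)


def kernel_arg(policy: int) -> str:
    """Return the kernel command-line argument for a workaround policy."""
    if policy not in WORKAROUND_VALUES:
        raise ValueError(f"policy {policy} is not one of {WORKAROUND_VALUES}")
    return f"ahci.mobile_lpm_policy={policy}"


def current_host_policies(root: str = "/sys/class/scsi_host") -> dict:
    """Read the link power management policy each host reports (assumed sysfs path)."""
    result = {}
    for f in sorted(Path(root).glob("host*/link_power_management_policy")):
        try:
            result[f.parent.name] = f.read_text().strip()
        except OSError:
            continue  # skip hosts whose attribute cannot be read
    return result


print(kernel_arg(0), "->", POLICIES[0])
print(current_host_policies())
```

The argument string still has to be added to the boot loader configuration by hand; the sketch only validates the value and reports the current state.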

The computer in question is an Intel NUC DN2820FYK (running the latest system firmware version) with an embedded Intel Corporation Atom Processor E3800 Series SATA AHCI Controller (rev 0e). The hard drive is a HITACHI HTS723232L9SA60.

I have confirmed that the issue persists in the latest mainline kernel build (5.8.0-050800rc7-generic).

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-5.4.0-42-generic 5.4.0-42.46~18.04.1
ProcVersionSignature: Ubuntu 5.4.0-42.46~18.04.1-generic 5.4.44
Uname: Linux 5.4.0-42-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.15
Architecture: amd64
Date: Sat Aug 1 10:51:29 2020
SourcePackage: linux-signed-hwe-5.4
UpgradeStatus: Upgraded to bionic on 2020-06-28 (33 days ago)

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-signed-hwe-5.4 (Ubuntu):
status: New → Confirmed