amd_iommu conflict with Marvell 88SE9230 SATA Controller
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Linux | Fix Released | Medium | | |
linux (Debian) | Fix Released | Unknown | | |
linux (Fedora) | Unknown | Unknown | | |
linux (Ubuntu) | Incomplete | Low | Unassigned | |
Bug Description
Booting with kernel 4.18.0-
https:/
WORKAROUND: Use kernel boot parameter:
amd_iommu=off
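To confirm after a reboot that the workaround took effect, one can check the running kernel's command line for the parameter. The following is a minimal Python sketch; the helper name `iommu_params` and the sample command line are made up for illustration, and on a real system the string would come from `/proc/cmdline`.

```python
def iommu_params(cmdline: str) -> dict:
    """Return the IOMMU-related key=value parameters found in a kernel command line."""
    wanted = ("amd_iommu", "intel_iommu", "iommu")
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in wanted:
            found[key] = value
    return found


# Hypothetical command line; on a live system read the text of /proc/cmdline instead.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet amd_iommu=off"
print(iommu_params(sample))  # {'amd_iommu': 'off'}
```

An empty result means none of the three parameters was passed, so the IOMMU runs with the kernel's defaults.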
---
ProblemType: Bug
ApportVersion: 2.20.9-0ubuntu7.5
Architecture: amd64
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/
CurrentDesktop: XFCE
DistroRelease: Ubuntu 18.04
HibernationDevice: RESUME=
InstallationDate: Installed on 2016-07-02 (913 days ago)
InstallationMedia: Mythbuntu 16.04 LTS "Xenial Xerus" - Release amd64 (20160420.1)
IwConfig:
lo no wireless extensions.
enp9s0 no wireless extensions.
MachineType: Gigabyte Technology Co., Ltd. B450M S2H
Package: linux (not installed)
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=
ProcVersionSign
RelatedPackageV
linux-
linux-
linux-firmware 1.173.2
RfKill:
Tags: bionic
Uname: Linux 4.18.0-13-generic x86_64
UpgradeStatus: Upgraded to bionic on 2018-07-27 (158 days ago)
UserGroups: adm cdrom dip lpadmin mythtv nopasswdlogin plugdev sambashare sudo video
_MarkForUpload: True
dmi.bios.date: 12/04/2018
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: F2c
dmi.board.
dmi.board.name: B450M S2H
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.
dmi.modalias: dmi:bvnAmerican
dmi.product.family: Default string
dmi.product.name: B450M S2H
dmi.product.sku: Default string
dmi.product.
dmi.sys.vendor: Gigabyte Technology Co., Ltd.
In Linux Kernel Bug Tracker #42679, pawel.zaq (pawel.zaq-linux-kernel-bugs) wrote : | #17 |
In Linux Kernel Bug Tracker #42679, pawel.zaq (pawel.zaq-linux-kernel-bugs) wrote : | #18 |
Created attachment 72218
Output of `lspci -knnv' command
In Linux Kernel Bug Tracker #42679, pawel.zaq (pawel.zaq-linux-kernel-bugs) wrote : | #19 |
Created attachment 72219
Kernel config
In Linux Kernel Bug Tracker #42679, public (public-linux-kernel-bugs) wrote : | #20 |
The same problem occurs on a Z68A-GD65 MSI G3 system Marvell 88SE91xx.
grep DMAR:
ACPI: DMAR beaff508 000B0 (v01 ALASKA A M I 00000001 INTL 00000001)
DMAR: Host address width 36
DMAR: DRHD base: 0x000000fed91000 flags: 0x1
DMAR: RMRR base: 0x000000bf4cc000 end: 0x000000bf4eefff
DMAR: No ATSR found
DMAR:[DMA Read] Request device [03:00.1] fault addr fffc0000
DMAR:[fault reason 02] Present bit in context entry is clear
grep IOMMU:
Intel-IOMMU: enabled
IOMMU 0: reg_base_addr fed91000 ver 1:0 cap c9008020660262 ecap f0105a
IOMMU 0 0xfed91000: using Queued invalidation
IOMMU: Setting RMRR:
IOMMU: Setting identity map for device 0000:00:1d.0 [0xbf4cc000 - 0xbf4eefff]
IOMMU: Setting identity map for device 0000:00:1a.0 [0xbf4cc000 - 0xbf4eefff]
IOMMU: Prepare 0-16MiB unity mapping for LPC
IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
Intel-IOMMU: enabled
IOMMU 0: reg_base_addr fed91000 ver 1:0 cap c9008020660262 ecap f0105a
IOMMU 0 0xfed91000: using Queued invalidation
IOMMU: Setting RMRR:
IOMMU: Setting identity map for device 0000:00:1d.0 [0xbf4cc000 - 0xbf4eefff]
IOMMU: Setting identity map for device 0000:00:1a.0 [0xbf4cc000 - 0xbf4eefff]
IOMMU: Prepare 0-16MiB unity mapping for LPC
IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
grep ata8:
ata8: SATA max UDMA/133 abar m2048@0xfa310000 port 0xfa310180 irq 48
ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata8.00: qc timeout (cmd 0xec)
ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata8.00: qc timeout (cmd 0xec)
ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata8: limiting SATA link speed to 3.0 Gbps
ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
ata8.00: qc timeout (cmd 0xec)
ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
In Linux Kernel Bug Tracker #42679, public (public-linux-kernel-bugs) wrote : | #21 |
Created attachment 72419
config file / kernel 3.2.2
Kernel config
In Linux Kernel Bug Tracker #42679, public (public-linux-kernel-bugs) wrote : | #22 |
Created attachment 72420
Output of `lspci -knnv' command
In Linux Kernel Bug Tracker #42679, listenmitglied (listenmitglied-linux-kernel-bugs) wrote : | #23 |
I confirm this bug with kernel 3.2.6: same error with VT-d enabled in bios.
With mainboard "Asus Rampage III Gene", Z68, onboard Marvell; CPU Xeon L5520; 3x4GB Ram. Logs/Printouts follow this evening.
In Linux Kernel Bug Tracker #42679, listenmitglied (listenmitglied-linux-kernel-bugs) wrote : | #24 |
Created attachment 72733
kernel config
above bug confirmed with 3.2.13
In Linux Kernel Bug Tracker #42679, listenmitglied (listenmitglied-linux-kernel-bugs) wrote : | #25 |
Created attachment 72734
dmesg intel z68, asus rampage III gene, vt-d enable
In Linux Kernel Bug Tracker #42679, listenmitglied (listenmitglied-linux-kernel-bugs) wrote : | #26 |
Created attachment 72735
lspci, asus rampage III gene, z68, vt-d enable, 3.2.13
In Linux Kernel Bug Tracker #42679, listenmitglied (listenmitglied-linux-kernel-bugs) wrote : | #27 |
(In reply to comment #6)
> I confirm this bug with kernel 3.2.6: same error with VT-d enabled in bios.
>
> With mainboard "Asus Rampage III Gene", Z68, onboard Marvell; CPU Xeon L5520;
> 3x4GB Ram. Logs/Printouts follow this evening.
Also confirmed for current kernel 3.2.13.
In Linux Kernel Bug Tracker #42679, acooks (acooks-linux-kernel-bugs) wrote : | #28 |
From a pdf file by Intel with title "Intel® Virtualization Technology for Directed I/O
Architecture Specification":
--snip--
3.6.1.4 PCI Express Devices Using Phantom Functions
To increase the maximum possible number of outstanding requests requiring completion, PCI Express allows a device to use function numbers not assigned to implemented functions to logically extend the Tag identifier. Unclaimed function numbers are referred to as Phantom Function Numbers (PhFN). A device reports its support for phantom functions through the Device Capability configuration register, and requires software to explicitly enable use of phantom functions through the Device Control configuration register.
Since the function number is part of the requester-id used to locate the context-entry for processing a DMA request, when assigning PCI Express devices with phantom functions enabled, software must program multiple context entries, each corresponding to the PhFN enabled for use by the device function. Each of these context-entries must be programmed identically to ensure the DMA requests with any of these requester-ids are processed identically.
--snip--
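To make the quoted passage concrete: a PCI Express requester ID packs the bus number (8 bits), device number (5 bits) and function number (3 bits) into 16 bits, and phantom functions reuse the upper bits of the function number. The Python sketch below illustrates that layout only; the function names are hypothetical and nothing here is taken from the kernel.

```python
def encode_rid(bus: int, dev: int, fn: int) -> int:
    """Pack bus (8 bits), device (5 bits) and function (3 bits) into a 16-bit requester ID."""
    return (bus << 8) | (dev << 3) | fn


def format_rid(rid: int) -> str:
    """Render a requester ID in the bus:device.function form used by lspci and dmesg."""
    return f"{rid >> 8:02x}:{(rid >> 3) & 0x1f:02x}.{rid & 0x7}"


def phantom_rids(bus: int, dev: int, fn: int, phantom_bits: int) -> list:
    """List every requester ID a function may emit when the top `phantom_bits`
    bits of the 3-bit function number are used as phantom function bits."""
    real_bits = 3 - phantom_bits
    base_fn = fn & ((1 << real_bits) - 1)
    return [encode_rid(bus, dev, base_fn | (p << real_bits))
            for p in range(1 << phantom_bits)]


# One phantom bit on function 0 of device 03:00 yields functions 0 and 4.
print([format_rid(r) for r in phantom_rids(0x03, 0x00, 0, 1)])  # ['03:00.0', '03:00.4']
```

Each requester ID returned by `phantom_rids` would need its own, identically programmed context entry. Under this layout a single phantom bit on function 0 produces function 4, not function 1.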
grep -ri phant says pci_regs.h knows about the capability, but it doesn't appear anywhere else in the kernel as far as I can see. Look for PCI_EXP_
Unfortunately, lspci indicates that the Marvell chip is not using phantom functions (lspci upload to follow), so at this point I can't tell if I'm on the right trail.
Caveat lector: I don't have any previous experience with low-level PCI stuff.
In Linux Kernel Bug Tracker #42679, acooks (acooks-linux-kernel-bugs) wrote : | #29 |
Created attachment 73265
lspci output including device capabilities
In Linux Kernel Bug Tracker #42679, grythumn (grythumn-linux-kernel-bugs) wrote : | #30 |
I'm seeing similar errors with AMD-Vi (AMD's IOMMU implementation) and a couple of Marvell 88SE9128-based cards, and can confirm that it is still present in 3.7.0 builds.
https:/
In Linux Kernel Bug Tracker #42679, stijn+bugs (stijn+bugs-linux-kernel-bugs) wrote : | #31 |
This problem happens here as well. Asus P9X79 WS, BIOS 3306, X79, i7-3930K. Running kernel 3.7.3. In addition to being unable to use the Marvell SATA controller ports, this causes a ~40s hang during boot.
I tried contacting Asus about this, as I think this could be fixed by a BIOS update, but they replied to me in horrible English they do not support Linux. I'll think twice before buying Asus again in the future, but it would be nice if a workaround could be implemented in the kernel.
In Linux Kernel Bug Tracker #42679, stijn+bugs (stijn+bugs-linux-kernel-bugs) wrote : | #32 |
Created attachment 91521
dmesg on Asus P9X79 WS, kernel 3.7.3
In Linux Kernel Bug Tracker #42679, stijn+bugs (stijn+bugs-linux-kernel-bugs) wrote : | #33 |
Created attachment 91531
lspci -knvv on Asus P9X79 WS, kernel 3.7.3
In Linux Kernel Bug Tracker #42679, stijn+bugs (stijn+bugs-linux-kernel-bugs) wrote : | #34 |
FWIW, I still have this issue with 3.7.8 and 3.8-rc7. BIOS update 3401 for the P9X79 WS didn't help. Additionally the hang during boot becomes worse (up to ~65 seconds), when a hard drive is connected. Since the drive is unusable anyway, I hacked the AHCI driver to ignore the Marvell controller. While no solution to this problem, at least my boot time is back to normal (<30s).
In Linux Kernel Bug Tracker #42679, tradofox (tradofox-linux-kernel-bugs) wrote : | #35 |
Same problem with Marvell 88SE9172 SATA Controller.
I have a Gigabyte GA-Z77X-UD5H with two Marvell 88SE9172 SATA controllers and an Intel E3-1245v2 CPU. VT-d is enabled. When running normal Debian 7 or >Ubuntu 12.04, I can see the HDDs and SSDs connected to the Marvell ports. After installing XenServer 6.1 and Xen Cloud Platform 1.6, the HDDs and SSDs are not detected, although lspci shows that the Marvell 88SE9172 controllers are detected.
In Linux Kernel Bug Tracker #42679, lizhenhua (lizhenhua-linux-kernel-bugs) wrote : | #36 |
The root cause of this bug seems to be that the device illegally accessed memory that should be reserved for the IOMMU module, and this changed the IOMMU registers.
In Linux Kernel Bug Tracker #42679, bhelgaas (bhelgaas-linux-kernel-bugs) wrote : | #37 |
ZhenHua, can you elaborate on this? Do you mean a device accessed the MMIO space used to program the IOMMU itself? If so, how did you conclude that? I doubt the IOMMU space is at address 0xfff00000.
Based on the following data:
Paweł:
DMAR:[DMA Read] Request device [0b:00.1] fault addr fff00000
DMAR:[fault reason 02] Present bit in context entry is clear
0b:00.0 [0106]: Marvell [1b4b:9123]
Korneliusz:
DMAR:[DMA Read] Request device [03:00.1] fault addr fffc0000
DMAR:[fault reason 02] Present bit in context entry is clear
03:00.0 [0106]: Marvell 88SE9123 SATA [1b4b:9123]
Daniel:
IOMMU identity map errors (assuming unrelated for now)
DMAR:[DMA Read] Request device [01:00.1] fault addr fff00000
DMAR:[fault reason 02] Present bit in context entry is clear
01:00.0 [0106]: Marvell 88SE9123 SATA [1b4b:9123]
Stijn:
dmar: DMAR:[DMA Read] Request device [07:00.1] fault addr fff00000
DMAR:[fault reason 02] Present bit in context entry is clear
07:00.0 0106: 1b4b:9130 (rev 11) (prog-if 01 [AHCI 1.0])
In each case the IOMMU saw a DMA read to an address that wasn't mapped for the requesting device. In each case, the requester is function .1, the kernel doesn't know about a .1 function, and there is a Marvell 912x SATA controller at the corresponding .0 function.
Andrew's Phantom Function theory seems like a good direction to explore. Maybe these devices incorrectly report Phantom Function support in the Device Capability & Control, and we just need some sort of quirk to work around that.
It would be interesting to know whether the .0 Marvell function has valid IOMMU mappings for the fault addresses (0xfff00000 or 0xfffc0000), or whether there is really anything at those addresses. They seem like dubious targets for DMA.
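The pattern described in this comment (a faulting requester at function .1 that the kernel never enumerated, next to a known .0 function) can be checked mechanically against a log. Below is a rough Python sketch that assumes the DMAR message format quoted in this thread; the function name `unknown_requesters` is hypothetical, and the set of known functions would normally be taken from lspci.

```python
import re

# Matches the DMAR fault lines quoted in this thread.
FAULT_RE = re.compile(
    r"DMAR:\[DMA (Read|Write)\] Request device "
    r"\[([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])\] fault addr ([0-9a-f]+)"
)


def unknown_requesters(log: str, known_functions: set) -> list:
    """Return (requester, sibling, fault_addr) for faults raised by a function
    that is not enumerated but whose function 0 sibling is."""
    hits = []
    for m in FAULT_RE.finditer(log):
        _, bus, dev, fn, addr = m.groups()
        requester = f"{bus}:{dev}.{fn}"
        sibling = f"{bus}:{dev}.0"
        if requester not in known_functions and sibling in known_functions:
            hits.append((requester, sibling, int(addr, 16)))
    return hits


log = """\
DMAR:[DMA Read] Request device [0b:00.1] fault addr fff00000
DMAR:[fault reason 02] Present bit in context entry is clear
dmar: DMAR:[DMA Read] Request device [07:00.1] fault addr fff00000
"""
known = {"0b:00.0", "07:00.0"}  # functions listed by lspci in the reports above
for requester, sibling, addr in unknown_requesters(log, known):
    print(f"{requester} faulted at {addr:#x}; enumerated sibling is {sibling}")
```

A non-empty result points at the requester-ID problem discussed here; an empty one suggests the faults, if any, have a different cause.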
In Linux Kernel Bug Tracker #42679, zhen-hual (zhen-hual-linux-kernel-bugs) wrote : | #38 |
Hi guys,
1. Since the only lspci output so far was captured with "intel_iommu=on", could you paste the output of lspci -vvv, lspci -t and lspci -n from a boot where intel_iommu is not set to on?
Thanks
ZhenHua
In Linux Kernel Bug Tracker #42679, acooks (acooks-linux-kernel-bugs) wrote : | #39 |
Created attachment 109981
Patch with quirk for incorrect PCI requester IDs
Here's a patch that provides a quirk for what I believe to be the root cause: devices that use incorrect PCI requester IDs, including Marvell 91xx controllers.
Various revisions have been sent to LKML and IOMMU-list in the past and a number of people have reported that it solved their problem and I've been running this on two boxes for months. I'm not sure why it hasn't been accepted.
Note that there are several devices that suffer from the same affliction, i.e., using incorrect PCI requester IDs in their transactions. The Marvell devices use both xx:yy.0 and xx:yy.1, possibly related to the SATA port number. Other devices, like Ricoh's R5C832 PCIe IEEE 1394 Controller commonly found in T410 and T420 Thinkpads, use a single incorrect requester ID.
Please try this patch and let me know if it works for you.
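The effect of such a quirk can be pictured with a toy model: the IOMMU looks up a context entry by requester ID, and a request tagged with an ID that has no entry faults. The Python sketch below is not the patch's code; the class `ToyContextTable`, its methods and the domain name are hypothetical, and the fault text mirrors the "fault reason 02" message reported in this thread.

```python
class ToyContextTable:
    """Toy model of an IOMMU context table keyed by requester ID string."""

    def __init__(self):
        self.present = {}  # requester ID -> domain name

    def attach(self, rid: str, domain: str, aliases=()):
        # Program the device's own entry and, if a quirk lists aliases,
        # identical entries for every alias requester ID.
        for r in (rid, *aliases):
            self.present[r] = domain

    def translate(self, rid: str) -> str:
        if rid not in self.present:
            return "fault reason 02: present bit in context entry is clear"
        return f"ok (domain {self.present[rid]})"


# Without a quirk, only 03:00.0 has an entry, so DMA tagged 03:00.1 faults.
plain = ToyContextTable()
plain.attach("03:00.0", "ahci")
print(plain.translate("03:00.1"))

# With a quirk that declares 03:00.1 an alias, the same request is translated.
quirked = ToyContextTable()
quirked.attach("03:00.0", "ahci", aliases=("03:00.1",))
print(quirked.translate("03:00.1"))
```

The model only captures the lookup behaviour; how the real patch identifies affected devices and their aliases is described in the attachment, not here.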
In Linux Kernel Bug Tracker #42679, zhen-hual (zhen-hual-linux-kernel-bugs) wrote : | #40 |
Each context_entry has a present bit. If a context entry is used for a device, but its present bit is not set to 1, an error with fault number 2 will occur.
I tested on my PC, comment a line "context_
In Linux Kernel Bug Tracker #42679, alex.williamson (alex.williamson-linux-kernel-bugs) wrote : | #127 |
(In reply to frollic from comment #109)
> Is there any progress ?
>
> I'm hitting this error on Fedora 3.17.8-200.fc20 kernel, which makes my
> system pretty much unusable :(
>
> 07:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9230 PCIe
> SATA 6Gb/s Controller [1b4b:9230] (rev 10) (prog-if 01 [AHCI 1.0])
It should have been fixed in v3.16 by cc346a4714 for this device. Are you sure you're seeing the same error? What are the symptoms?
In Linux Kernel Bug Tracker #42679, alex.williamson (alex.williamson-linux-kernel-bugs) wrote : | #128 |
Actually, refreshing my memory in the comments here, others are also reporting that issues for 1b4b:9230 persist, but they're different than the problem we're trying to fix here and suggest either broken hardware or broken driver (or both). As suggested previously, if you're not getting DMAR faults, file a new bug.
In Linux Kernel Bug Tracker #42679, frollic (frollic-linux-kernel-bugs) wrote : | #129 |
Indeed, I don't have DMAR errors in my syslog.
Drives are 3 * WDC WD20EFRX-68EUZN0, 82.00A82, max UDMA/133 running
soft-RAID5.
One SAMSUNG SSD SM841 mSATA 128GB, DXM43D0Q, max UDMA/133 in a mSATA->SATA case/converter.
Feb 4 19:09:43 atlantis kernel: [ 464.228813] ata3: failed to read log page 10h (errno=-5)
Feb 4 19:09:43 atlantis kernel: [ 464.231988] ata3.00: exception Emask 0x1 SAct 0xc000 SErr 0x0 action 0x0
Feb 4 19:09:43 atlantis kernel: [ 464.235233] ata3.00: irq_stat 0x40000008
Feb 4 19:09:43 atlantis kernel: ata3: failed to read log page 10h (errno=-5)
Feb 4 19:09:43 atlantis kernel: ata3.00: exception Emask 0x1 SAct 0xc000 SErr 0x0 action 0x0
Feb 4 19:09:43 atlantis kernel: ata3.00: irq_stat 0x40000008
Feb 4 19:09:43 atlantis kernel: [ 464.238596] ata3.00: failed command: READ FPDMA QUEUED
Feb 4 19:09:43 atlantis kernel: [ 464.242000] ata3.00: cmd 60/00:70:
Feb 4 19:09:43 atlantis kernel: [ 464.242000] res 50/00:00:
Feb 4 19:09:43 atlantis kernel: ata3.00: failed command: READ FPDMA QUEUED
Feb 4 19:09:43 atlantis kernel: ata3.00: cmd 60/00:70:
res 50/00:00:
Feb 4 19:09:43 atlantis kernel: [ 464.248733] ata3.00: status: { DRDY }
Feb 4 19:09:43 atlantis kernel: ata3.00: status: { DRDY }
Feb 4 19:09:43 atlantis kernel: [ 464.252192] ata3.00: failed command: READ FPDMA QUEUED
Feb 4 19:09:43 atlantis kernel: [ 464.255558] ata3.00: cmd 60/00:78:
Feb 4 19:09:43 atlantis kernel: [ 464.255558] res 50/00:00:
Feb 4 19:09:43 atlantis kernel: ata3.00: failed command: READ FPDMA QUEUED
Feb 4 19:09:43 atlantis kernel: ata3.00: cmd 60/00:78:
res 50/00:00:
Feb 4 19:09:43 atlantis kernel: [ 464.262523] ata3.00: status: { DRDY }
Feb 4 19:09:43 atlantis kernel: ata3.00: status: { DRDY }
Feb 4 19:09:43 atlantis kernel: [ 464.272877] ata3.00: revalidation failed (errno=-2)
Feb 4 19:09:43 atlantis kernel: [ 464.276284] ata3: hard resetting link
Feb 4 19:09:43 atlantis kernel: ata3.00: revalidation failed (errno=-2)
Feb 4 19:09:43 atlantis kernel: ata3: hard resetting link
Feb 4 19:09:44 atlantis kernel: [ 464.586712] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 4 19:09:44 atlantis kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 4 19:09:44 atlantis kernel: [ 464.593370] ata3.00: configured for UDMA/133
Feb 4 19:09:44 atlantis kernel: [ 464.596855] ata3: EH complete
Feb 4 19:09:44 atlantis kernel: ata3.00: configured for UDMA/133
Feb 4 19:09:44 atlantis kernel: ata3: EH complete
Feb 4 19:10:03 atlantis kernel: [ 484.234979] ata3: failed to read log page 10h (errno=-5)
Feb 4 19:10:03 atlantis kernel: [ 484.238484] ata3.00: exception Emask 0x1 SAct 0xc000000 SErr 0x0 action 0x0
Feb 4 19:10:03 atlantis kernel: [ 484.242039] ata3.00: irq_stat 0x40000008
Fe...
In Linux Kernel Bug Tracker #42679, frollic (frollic-linux-kernel-bugs) wrote : | #130 |
In addition, mobo is brand new (doesn't mean it can't be faulty), WDC drives are 2 months old (installed just before X-mas last year). The SSD was purchased used, so I can't tell you how old that is.
All of the hardware, except for the Samsung SSD, ran just fine on my Supermicro X7SPA-H, before I swapped mobo just two days ago.
In Linux Kernel Bug Tracker #42679, kernel (kernel-linux-kernel-bugs) wrote : | #131 |
(In reply to Alex Williamson from comment #95)
I encountered the same problem on a PX-G128M6e (Plextor M6e series SSD) and resolved it with the patch.
(actually, I used the 4.0.5 kernel patched with the code described in https:/
Booting with the SSD and passing the SSD through to a guest OS both work correctly.
My system is Asus H97M-PLUS with Bios 2501 and PX-G128M6e with firmware revision 1.06.
The kernel .config is Arch's linux 4.0.5-1 package.
In Linux Kernel Bug Tracker #42679, kernel (kernel-linux-kernel-bugs) wrote : | #132 |
Created attachment 179951
dmesg of 4.0.5 vanilla kernel with iommu=on
`grep -i -e dmar -e iommu` is below
[ 0.000000] Command line: BOOT_IMAGE=
[ 0.000000] ACPI: DMAR 0x00000000DAC6CED0 0000B8 (v01 INTEL BDW 00000001 INTL 00000001)
[ 0.000000] Kernel command line: BOOT_IMAGE=
[ 0.000000] Intel-IOMMU: enabled
[ 0.107086] dmar: Host address width 39
[ 0.107098] dmar: DRHD base: 0x000000fed90000 flags: 0x0
[ 0.107123] dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 ecap f0101a
[ 0.107138] dmar: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.107154] dmar: IOMMU 1: reg_base_addr fed91000 ver 1:0 cap d2008c20660462 ecap f010da
[ 0.107169] dmar: RMRR base: 0x000000dbe7b000 end: 0x000000dbe89fff
[ 0.107179] dmar: RMRR base: 0x000000dd000000 end: 0x000000df1fffff
[ 0.107191] IOAPIC id 8 under DRHD base 0xfed91000 IOMMU 1
[ 0.685402] DMAR: No ATSR found
[ 0.685642] IOMMU: dmar0 using Queued invalidation
[ 0.685651] IOMMU: dmar1 using Queued invalidation
[ 0.685662] IOMMU: Setting RMRR:
[ 0.685694] IOMMU: Setting identity map for device 0000:00:02.0 [0xdd000000 - 0xdf1fffff]
[ 0.686154] IOMMU: Setting identity map for device 0000:00:14.0 [0xdbe7b000 - 0xdbe89fff]
[ 0.686215] IOMMU: Setting identity map for device 0000:00:1a.0 [0xdbe7b000 - 0xdbe89fff]
[ 0.686268] IOMMU: Setting identity map for device 0000:00:1d.0 [0xdbe7b000 - 0xdbe89fff]
[ 0.686308] IOMMU: Prepare 0-16MiB unity mapping for LPC
[ 0.686329] IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[ 0.847930] dmar: DRHD: handling fault status reg 2
[ 0.848264] dmar: DMAR:[DMA Write] Request device [02:00.1] fault addr fffe0000
[ 1.161006] dmar: DRHD: handling fault status reg 3
[ 1.161963] dmar: DMAR:[DMA Write] Request device [02:00.1] fault addr fffe0000
[ 6.159656] dmar: DRHD: handling fault status reg 2
[ 6.160750] dmar: DMAR:[DMA Write] Request device [02:00.1] fault addr fffe0000
[ 6.472980] dmar: DRHD: handling fault status reg 3
[ 6.473513] dmar: DMAR:[DMA Write] Request device [02:00.1] fault addr fffe0000
[ 11.471329] dmar: DRHD: handling fault status reg 2
[ 11.471661] dmar: DMAR:[DMA Write] Request device [02:00.1] fault addr fffe0000
[ 11.784476] dmar: DRHD: handling fault status reg 3
[ 11.785472] dmar: DMAR:[DMA Write] Request device [02:00.1] fault addr fffe0000
[ 16.783038] dmar: DRHD: handling fault status reg 2
[ 16.783646] dmar: DMAR:[DMA Write] Request device [02:00.1] fault addr fffe0000
In Linux Kernel Bug Tracker #42679, kernel (kernel-linux-kernel-bugs) wrote : | #133 |
Created attachment 179961
dmesg of 4.0.5 patched kernel with iommu=on
`grep -i -e dmar -e iommu` is below
[ 0.000000] Command line: BOOT_IMAGE=
[ 0.000000] ACPI: DMAR 0x00000000DAC6CED0 0000B8 (v01 INTEL BDW 00000001 INTL 00000001)
[ 0.000000] Kernel command line: BOOT_IMAGE=
[ 0.000000] Intel-IOMMU: enabled
[ 0.107025] dmar: Host address width 39
[ 0.107037] dmar: DRHD base: 0x000000fed90000 flags: 0x0
[ 0.107060] dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 ecap f0101a
[ 0.107075] dmar: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.107092] dmar: IOMMU 1: reg_base_addr fed91000 ver 1:0 cap d2008c20660462 ecap f010da
[ 0.107107] dmar: RMRR base: 0x000000dbe7b000 end: 0x000000dbe89fff
[ 0.107117] dmar: RMRR base: 0x000000dd000000 end: 0x000000df1fffff
[ 0.107129] IOAPIC id 8 under DRHD base 0xfed91000 IOMMU 1
[ 0.688999] DMAR: No ATSR found
[ 0.689240] IOMMU: dmar0 using Queued invalidation
[ 0.689249] IOMMU: dmar1 using Queued invalidation
[ 0.689259] IOMMU: Setting RMRR:
[ 0.689292] IOMMU: Setting identity map for device 0000:00:02.0 [0xdd000000 - 0xdf1fffff]
[ 0.689754] IOMMU: Setting identity map for device 0000:00:14.0 [0xdbe7b000 - 0xdbe89fff]
[ 0.689816] IOMMU: Setting identity map for device 0000:00:1a.0 [0xdbe7b000 - 0xdbe89fff]
[ 0.689868] IOMMU: Setting identity map for device 0000:00:1d.0 [0xdbe7b000 - 0xdbe89fff]
[ 0.689908] IOMMU: Prepare 0-16MiB unity mapping for LPC
[ 0.689930] IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[ 66.222474] [drm] DMAR active, disabling use of stolen memory
In Linux Kernel Bug Tracker #42679, kernel (kernel-linux-kernel-bugs) wrote : | #134 |
`lspci -nnvv`
02:00.0 SATA controller [0106]: Lite-On IT Corp. / Plextor M6e PCI Express SSD [Marvell 88SS9183] [1c28:0122] (rev 14) (prog-if 01 [AHCI 1.0])
Subsystem: Marvell Technology Group Ltd. Device [1b4b:9183]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 30
Region 0: I/O ports at e050 [size=8]
Region 1: I/O ports at e040 [size=4]
Region 2: I/O ports at e030 [size=8]
Region 3: I/O ports at e020 [size=4]
Region 4: I/O ports at e000 [size=32]
Region 5: Memory at f7c20000 (32-bit, non-prefetchable) [size=512]
Expansion ROM at f7c00000 [disabled] [size=128K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: fee00378 Data: 0000
Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x2, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCo
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationCom
Equalizatio
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP+ Rollover- Timeout+ NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
Kernel driver in use: ahci
Kernel modules: ahci
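Since the phantom function theory came up earlier in this bug, the two relevant fields can be pulled out of a dump like the one above. This is a small Python sketch that assumes the `lspci -vv` text layout shown here ("PhantFunc <n>" on the DevCap line, "PhantFunc+" or "PhantFunc-" among the DevCtl flags); the helper name `phantom_function_state` is hypothetical.

```python
import re


def phantom_function_state(lspci_text: str) -> dict:
    """Extract the phantom function fields from `lspci -vv` text for one device.

    'supported_bits' is read from "PhantFunc <n>" (DevCap) and 'enabled'
    from "PhantFunc+" / "PhantFunc-" (DevCtl). A missing field gives None."""
    cap = re.search(r"PhantFunc (\d+)", lspci_text)
    ctl = re.search(r"PhantFunc([+-])", lspci_text)
    return {
        "supported_bits": int(cap.group(1)) if cap else None,
        "enabled": (ctl.group(1) == "+") if ctl else None,
    }


# The two lines below are copied from the dump above.
dump = """\
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 unlimited
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
"""
print(phantom_function_state(dump))  # {'supported_bits': 0, 'enabled': False}
```

For this device both fields say that phantom functions are neither supported nor enabled, which matches the earlier observation that the faulting .1 requester is not a declared phantom function.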
----
`lspci -nnvv` on the host while passing the SSD through to a guest OS
02:00.0 SATA controller [0106]: Lite-On IT Corp. / Plextor M6e PCI Express SSD [Marvell 88SS9183] [1c28:0122] (rev 14) (prog-if 01 [AHCI 1.0])
Subsystem: Marvell Technology Group Ltd. Device [1b...
In Linux Kernel Bug Tracker #42679, tasos (tasos-linux-kernel-bugs) wrote : | #135 |
I believe I am affected by the same bug with the Marvell 88SE9120 controller on an ASRock 990FX Extreme 4 motherboard.
Although there are no DMAR errors in dmesg, when AMD's IOMMU is enabled in the bios I get the following a couple of times, before it gives up
[ 117.616423] ata9: hard resetting link
[ 117.632972] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.1 domain=0x0000 address=
[ 117.632982] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.1 domain=0x0000 address=
[ 118.340472] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.1 domain=0x0000 address=
[ 122.616621] ata9: softreset failed (1st FIS failed)
[ 122.616632] ata9: reset failed, giving up
[ 122.616640] ata9: EH complete
Once the controller's dev ID was added to drivers/
[ 1520.100391] ata9: hard resetting link
[ 1526.038156] ata9: SATA link up 3.0 Gbps (SStatus 123 SControl 330)
[ 1526.044554] ata9.00: ATA-7: SAMSUNG HD502IJ, 1AA01112, max UDMA7
[ 1526.044559] ata9.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[ 1526.050996] ata9.00: configured for UDMA/133
[ 1526.051007] ata9: EH complete
And here is the patch
--- a/drivers/
+++ b/drivers/
@@ -3589,6 +3589,8 @@ DECLARE_
/* https:/
DECLARE_
quirk_
+DECLARE_
+ quirk_dma_
DECLARE_
quirk_
/* https:/
Could this device id be added to the list of affected devices?
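On AMD systems the symptom shows up as AMD-Vi IO_PAGE_FAULT events instead of DMAR faults. A short Python sketch for counting those events per requesting device follows; it assumes the message format quoted in this comment, and the function name `count_amd_vi_faults` is hypothetical.

```python
import re
from collections import Counter

# Only the device= field is captured; the rest of each event is ignored.
AMD_VI_RE = re.compile(r"AMD-Vi: Event logged \[IO_PAGE_FAULT device=([0-9a-f:.]+)")


def count_amd_vi_faults(log: str) -> Counter:
    """Count AMD-Vi IO_PAGE_FAULT events per requesting device."""
    return Counter(AMD_VI_RE.findall(log))


# Lines as quoted above (the address field is cut off in the report).
log = """\
[ 117.632972] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.1 domain=0x0000 address=
[ 117.632982] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.1 domain=0x0000 address=
[ 118.340472] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.1 domain=0x0000 address=
[ 122.616621] ata9: softreset failed (1st FIS failed)
"""
print(count_amd_vi_faults(log))  # Counter({'02:00.1': 3})
```

As with the Intel reports, the faulting requester here is function .1 of the controller's slot.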
In Linux Kernel Bug Tracker #42679, alex.williamson (alex.williamson-linux-kernel-bugs) wrote : | #136 |
(In reply to Tasos Sahanidis from comment #118)
>
> Could this device id be added to the list of affected devices?
It's already queued in the pull request for v4.2:
In Linux Kernel Bug Tracker #42679, tasos (tasos-linux-kernel-bugs) wrote : | #137 |
(In reply to Alex Williamson from comment #119)
> It's already queued in the pull request for v4.2:
>
> http://
> pci/quirks.
Apologies for that, did not see it.
Thank you for your time!
In Linux Kernel Bug Tracker #42679, bill.hudacek (bill.hudacek-linux-kernel-bugs) wrote : | #138 |
Hi. Old Newbie to kernel things here. I see from Alex's (initial?) patch at https:/
However, exploring at https:/
So - I'm probably looking in all the wrong places.
I've just set up Fedora 22 4.1.3-200.
ata10.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata10.00: failed command: WRITE DMA
ata10.00: cmd ca/00:01:
res 40/00:00:
ata10.00: status: { DRDY }
ata10: hard resetting link
ata10: link is slow to respond, please be patient (ready=0)
ata10: COMRESET failed (errno=-16)
ata10: hard resetting link
ata10: link is slow to respond, please be patient (ready=0)
ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata10.00: qc timeout (cmd 0xec)
ata10.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata10.00: revalidation failed (errno=-5)
ata10: hard resetting link
ata11.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata11.00: failed command: READ DMA EXT
ata11.00: cmd 25/00:10:
res 40/00:00:
ata11.00: status: { DRDY }
ata11: hard resetting link
ata10: link is slow to respond, please be patient (ready=0)
ata11: link is slow to respond, please be patient (ready=0)
ata10: COMRESET failed (errno=-16)
ata10: hard resetting link
ata11: COMRESET failed (errno=-16)
ata11: hard resetting link
This is a StarTech PEXSAT31E1 add-on card, so the system does not boot from it. It's connected to an external cabinet, and I'm using mdadm for RAID-5. All drives report the same issues (logging not included here), which is what had me looking at the controller.
I am really hoping it's not included yet - which would both explain the issue and the fact that 'the fix is in'.
I've not built a kernel since - well, a long time ago - Ubuntu 6.10 or so. Now I might get a chance to try it on Fedora.
Please let me know if it would help if I provided more info. Sure looks like I'm just like most others here...
Can anyone Help?
Many Thanks :-)
/Bill
In Linux Kernel Bug Tracker #42679, bill.hudacek (bill.hudacek-linux-kernel-bugs) wrote : | #139 |
*bump*
I'm down here. I'm contemplating getting a 3ware and going the hardware route. I've had pretty horrid experience with Highpoint support (non-existent) and the Marvell controllers seem to be dysfunctional. Vendor who sold me the card could not provide any drivers or firmware updates, so this is my only possible path to a solution using this type of controller - the kernel patch(es).
Thanks.
In Linux Kernel Bug Tracker #42679, frollic (frollic-linux-kernel-bugs) wrote : | #140 |
For the 9230 you might want to check the updated BIOS we've discussed at:
http://
In Linux Kernel Bug Tracker #42679, oh-itsme (oh-itsme-linux-kernel-bugs) wrote : | #141 |
(In reply to frollic from comment #123)
> For the 9230 you might want to check the updated BIOS we've discussed at:
> http://
> updates-and-such/
I had found that thread in a websearch as I have encountered similar issues as you had, also using a Supermicro X10SBA. I had contacted Supermicro about this, but support did not really seem to be aware of this issue, and no update for the controller was sent to me. The thread you refer to does not state the outcome of applying the firmware to the X10SBA, does it solve the issue?
In Linux Kernel Bug Tracker #42679, frollic (frollic-linux-kernel-bugs) wrote : | #142 |
(In reply to oh-itsme from comment #124)
> I had found that thread in a websearch as I have encountered similar issues
> as you had, also using a Supermicro X10SBA. I had contacted Supermicro about
> this, but support did not really seem to be aware of this issue, and no
> update for the controller was sent to me.
I was in touch with the Dutch support of Supermicro. They were very helpful, and it took them about 10 days to obtain the update from Marvell.
The person I was in contact with wrote that the update would be posted along with the next BIOS update for the motherboard, but I don't think it actually happened :(
> The thread you refer to does not state the outcome of applying the firmware
> to the X10SBA, does it solve the issue?
Yes, it helped me. The soft-RAID is running fine now, even though I get an occasional "mismatch_cnt is not 0 on /dev/mdXXX" when running raid-check.
In Linux Kernel Bug Tracker #42679, tasos (tasos-linux-kernel-bugs) wrote : | #143 |
There seems to have been a regression sometime after the 4.3 tag (6a13feb9c82803
This results in the drives attached to the controller becoming inaccessible.
Please note that this time the quirk for my device is present in drivers/
In Linux Kernel Bug Tracker #42679, kevosev23194 (kevosev23194-linux-kernel-bugs) wrote : | #144 |
Hi There
Just want to address a problem with Asrock Extreme 9 X79 with BIOS P4.00 platform and its Marvell 88SE9220 controller.
I experience the same DMAR faults as described above when this controller is enabled.
However the problem appears to be resolved by adding a new entry in quirks.c
DECLARE_
Let me know if you need me to attach any logs of faults, at the moment I'm using a custom compiled kernel with the above fix on Arch Linux but can switch to a standard kernel.
Kind Regards,
In Linux Kernel Bug Tracker #42679, alan (alan-linux-kernel-bugs) wrote : | #145 |
If you've got the quirk fix and done the testing then I would see Documentation/
Send it to <email address hidden> and it should get reviewed and merged
Alan
In Linux Kernel Bug Tracker #42679, microsoftenator (microsoftenator-linux-kernel-bugs) wrote : | #146 |
I can confirm that this issue occurs with the Marvell 88SE9128 controller on my Gigabyte GA-X59A-UD7 (rev2.0) motherboard. As with Kevin Hunt above, adding a new entry in quirks.c appears to resolve the issue.
Given the name of this bug, I was surprised that the 9128 wasn't in there.
In Linux Kernel Bug Tracker #42679, microsoftenator (microsoftenator-linux-kernel-bugs) wrote : | #147 |
Addendum to the above:
The 9128 *does* appear to be in the quirks file for mainline, but not in the kernel provided by Arch Linux (4.15.15). It seems it was either added in 4.16 or removed by Arch's patches for some reason.
In Linux Kernel Bug Tracker #42679, bhelgaas (bhelgaas-linux-kernel-bugs) wrote : | #148 |
http://
http://
Are there any devices that are still broken in v4.17-rc1? If not, maybe we can close this bug?
In Linux Kernel Bug Tracker #42679, k8wtaylnuuz7 (k8wtaylnuuz7-linux-kernel-bugs) wrote : | #149 |
(In reply to Bjorn Helgaas from comment #131)
> http://
> ?id=aa0082066343 for Marvell 9128 appeared in v4.16-rc1.
>
> http://
> ?id=832e4e1f76b8 for Marvell 88SE9220 appeared in v4.17-rc1.
>
> Are there any devices that are still broken in v4.17-rc1? If not, maybe we
> can close this bug?
I still have this issue with a Marvell 88SE9230 and kernel v4.16.8 under Arch Linux. It's probably worth checking all their SATA Controllers before closing this bug: https:/
In Linux Kernel Bug Tracker #42679, bhelgaas (bhelgaas-linux-kernel-bugs) wrote : | #150 |
v4.16 already contains a quirk for the Marvell 88SE9230 (added by cc346a4714a5 ("PCI: Add function 1 DMA alias quirk for Marvell devices") way back in v3.16).
But from comment #44 and comments #49-#58, it sounds like the 9230 has other problems in addition to this one, so I suspect you're seeing those other problems. If so, can you open a new bug for that and copy Joshua and Alex? I took a quick look and didn't see a definitive resolution for the problems Joshua reported.
I'm going to close this one and if people see more problems that are resolved by quirk_dma_
In Linux Kernel Bug Tracker #42679, f.bluethner (f.bluethner-linux-kernel-bugs) wrote : | #151 |
I have this issue with "Marvell Technology Group Ltd. 88SS9183 PCIe SSD Controller" in my "Asus Rog Strix Z370-F Gaming" and solved it by adding "DECLARE_
quirk_dma_
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs. | #1 |
This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 1810239
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.
Steven Ellis (steven-openmedia) wrote : Re: amd_iommu conflict with Marvell Sata controller | #2 |
root@mythfe-amd:~# lspci -knnv -s 01:00.0
01:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller [1b4b:9230] (rev 11) (prog-if 01 [AHCI 1.0])
Subsystem: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller [1b4b:9230]
Flags: bus master, fast devsel, latency 0, IRQ 56
I/O ports at f050 [size=8]
I/O ports at f040 [size=4]
I/O ports at f030 [size=8]
I/O ports at f020 [size=4]
I/O ports at f000 [size=32]
Memory at f7d10000 (32-bit, non-prefetchable) [size=2K]
Expansion ROM at f7d00000 [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [70] Express Legacy Endpoint, MSI 00
Capabilities: [e0] SATA HBA v0.0
Capabilities: [100] Advanced Error Reporting
Kernel driver in use: ahci
Kernel modules: ahci
Changed in linux (Ubuntu): | |
status: | New → Incomplete |
tags: | added: cosmic |
Steven Ellis (steven-openmedia) wrote : AlsaInfo.txt | #3 |
tags: | added: apport-collected bionic |
description: | updated |
Steven Ellis (steven-openmedia) wrote : CRDA.txt | #4 |
Steven Ellis (steven-openmedia) wrote : CurrentDmesg.txt | #5 |
Steven Ellis (steven-openmedia) wrote : Lspci.txt | #6 |
Steven Ellis (steven-openmedia) wrote : Lsusb.txt | #7 |
Steven Ellis (steven-openmedia) wrote : ProcCpuinfo.txt | #8 |
Steven Ellis (steven-openmedia) wrote : ProcCpuinfoMinimal.txt | #9 |
Steven Ellis (steven-openmedia) wrote : ProcEnviron.txt | #10 |
Steven Ellis (steven-openmedia) wrote : ProcInterrupts.txt | #11 |
Steven Ellis (steven-openmedia) wrote : ProcModules.txt | #12 |
Steven Ellis (steven-openmedia) wrote : PulseList.txt | #13 |
Steven Ellis (steven-openmedia) wrote : UdevDb.txt | #14 |
Steven Ellis (steven-openmedia) wrote : WifiSyslog.txt | #15 |
Kai-Heng Feng (kaihengfeng) wrote : Re: amd_iommu conflict with Marvell Sata controller | #16 |
Would it be possible for you to test the latest upstream kernel? Refer
to https:/
v4.20 kernel[0].
If this bug is fixed in the mainline kernel, please add the following
tag 'kernel-
If the mainline kernel does not fix this bug, please add the tag:
'kernel-
Once testing of the upstream kernel is complete, please mark this bug as
"Confirmed".
Thanks in advance.
Changed in linux: | |
importance: | Unknown → Medium |
status: | Unknown → Fix Released |
Steven Ellis (steven-openmedia) wrote : Re: amd_iommu conflict with Marvell Sata controller | #152 |
Looks like there is a new upstream issue with
- https:/
Steven Ellis (steven-openmedia) wrote : | #153 |
I attempted a boot with the following upstream kernel packages
linux-
linux-
On boot I see the following errors
Jan 02 22:09:23 mythfe-amd kernel: ata4.00: qc timeout (cmd 0xec)
Jan 02 22:09:23 mythfe-amd kernel: ata2.00: qc timeout (cmd 0xec)
Jan 02 22:09:23 mythfe-amd kernel: ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Jan 02 22:09:23 mythfe-amd kernel: ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Jan 02 22:09:23 mythfe-amd kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jan 02 22:09:23 mythfe-amd kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 02 22:09:23 mythfe-amd kernel: ata8.00: qc timeout (cmd 0xef)
Jan 02 22:09:23 mythfe-amd kernel: ata8.00: failed to set xfermode (err_mask=0x4)
Jan 02 22:09:23 mythfe-amd kernel: ata2.00: qc timeout (cmd 0xec)
Jan 02 22:09:23 mythfe-amd kernel: ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Jan 02 22:09:23 mythfe-amd kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jan 02 22:09:23 mythfe-amd kernel: ata4.00: qc timeout (cmd 0xec)
Jan 02 22:09:23 mythfe-amd kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jan 02 22:09:23 mythfe-amd kernel: ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Jan 02 22:09:23 mythfe-amd kernel: ata4: limiting SATA link speed to 1.5 Gbps
Jan 02 22:09:23 mythfe-amd kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
Jan 02 22:09:23 mythfe-amd kernel: ata8.00: qc timeout (cmd 0xa1)
Jan 02 22:09:23 mythfe-amd kernel: ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Jan 02 22:09:23 mythfe-amd kernel: ata8.00: revalidation failed (errno=-5)
Jan 02 22:09:23 mythfe-amd kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jan 02 22:09:23 mythfe-amd kernel: ata8.00: qc timeout (cmd 0xa1)
Jan 02 22:09:23 mythfe-amd kernel: ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Jan 02 22:09:23 mythfe-amd kernel: ata8.00: revalidation failed (errno=-5)
Jan 02 22:09:23 mythfe-amd kernel: ata8.00: disabled
Jan 02 22:09:23 mythfe-amd kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jan 02 22:09:23 mythfe-amd kernel: ata2.00: qc timeout (cmd 0xec)
Jan 02 22:09:23 mythfe-amd kernel: ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Jan 02 22:09:23 mythfe-amd kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jan 02 22:09:23 mythfe-amd kernel: ata4.00: qc timeout (cmd 0xec)
Jan 02 22:09:23 mythfe-amd kernel: ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Jan 02 22:09:23 mythfe-amd kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
Jan 02 22:09:23 mythfe-amd kernel: ata10: SATA link down (SStatus 0 SControl 330)
Jan 02 22:09:23 mythfe-amd kernel: ata13: SATA link down (SStatus 0 SControl 330)
Jan 02 22:09:23 mythfe-amd kernel: ata14: SATA link down (SStatus 0 SControl 330)
Steven Ellis (steven-openmedia) wrote : | #154 |
Rebooted with the 4.20.0-
Jan 02 22:10:52 mythfe-amd kernel: ata8.00: ATAPI: MARVELL VIRTUAL, 1.09, max UDMA/66
Jan 02 22:10:52 mythfe-amd kernel: ata8.00: configured for UDMA/66
Jan 02 22:10:52 mythfe-amd kernel: ata4.00: ATA-8: ST3500418AS, CC46, max UDMA/133
Jan 02 22:10:52 mythfe-amd kernel: ata4.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 32)
Jan 02 22:10:52 mythfe-amd kernel: ata4.00: configured for UDMA/133
Jan 02 22:10:52 mythfe-amd kernel: ata2.00: ATA-7: ST3250820AS, 3.AAE, max UDMA/133
Jan 02 22:10:52 mythfe-amd kernel: ata2.00: 488397168 sectors, multi 0: LBA48 NCQ (depth 32)
Jan 02 22:10:52 mythfe-amd kernel: ata2.00: configured for UDMA/133
Jan 02 22:10:52 mythfe-amd kernel: scsi 1:0:0:0: Direct-Access ATA ST3250820AS E PQ: 0 ANSI: 5
Jan 02 22:10:52 mythfe-amd kernel: sd 1:0:0:0: Attached scsi generic sg0 type 0
Jan 02 22:10:52 mythfe-amd kernel: sd 1:0:0:0: [sda] 488397168 512-byte logical blocks: (250 GB/233 GiB)
Jan 02 22:10:52 mythfe-amd kernel: scsi 3:0:0:0: Direct-Access ATA ST3500418AS CC46 PQ: 0 ANSI: 5
Jan 02 22:10:52 mythfe-amd kernel: sd 1:0:0:0: [sda] Write Protect is off
Changed in linux (Debian): | |
status: | Unknown → New |
description: | updated |
tags: | added: kernel-bug-exists-upstream-4.20 latest-bios-f2 |
summary: |
- amd_iommu conflict with Marvell Sata controller
+ amd_iommu conflict with Marvell 88SE9230 SATA Controller |
penalvch (penalvch) wrote : | #155 |
Steven Ellis, for you personally:
1) Did this problem not occur in a prior Ubuntu or kernel release, and if so which?
2) If this issue has always occurred, could you please advise the earliest kernel you tested?
3) To keep this relevant to upstream, one will want to test the latest mainline kernel as it is released (now 5.0-rc2). Could you please advise?
Changed in linux (Ubuntu): | |
importance: | Undecided → Low |
Steven Ellis (steven-openmedia) wrote : | #156 |
I've only recently traced the issue to the iommu kernel option. This device has been unstable since I bought it and I pull it out occasionally to see if the driver issues have been addressed.
I'm afraid that the test system I'm using is currently unavailable. I'll post an update when I have a chance for fresh testing.
piktogramm (piktogramm) wrote : | #157 |
Hi,
I had similar problems with my Marvell 88SE9230. I was able to improve the situation considerably by updating the firmware of the controller itself. In general, all firmware versions beyond 2.3.xxx helped. The remaining problem is that I get failures on ata6, which is the only port not connected to any drive. Every drive connected to the Marvell controller is perfectly stable (24/7 for 400+ days).
Source for Firmwares: https:/
May 05 03:16:10 doomsdaydevice kernel: ata6.00: exception Emask 0x0 SAct 0x6 SErr 0x0 action 0x6 frozen
May 05 03:16:11 doomsdaydevice kernel: ata6.00: failed command: WRITE FPDMA QUEUED
May 05 03:16:11 doomsdaydevice kernel: ata6.00: cmd 61/10:08:
May 05 03:16:11 doomsdaydevice kernel: ata6.00: status: { DRDY }
May 05 03:16:11 doomsdaydevice kernel: ata6.00: failed command: WRITE FPDMA QUEUED
May 05 03:16:11 doomsdaydevice kernel: ata6.00: cmd 61/10:10:
May 05 03:16:11 doomsdaydevice kernel: ata6.00: status: { DRDY }
May 05 03:16:11 doomsdaydevice kernel: ata6.00: supports DRM functions and may not be fully accessible
May 05 03:16:11 doomsdaydevice kernel: ata6.00: supports DRM functions and may not be fully accessible
May 05 06:01:33 doomsdaydevice kernel: ata6.00: exception Emask 0x0 SAct 0x60 SErr 0x0 action 0x6 frozen
May 05 06:01:33 doomsdaydevice kernel: ata6.00: failed command: WRITE FPDMA QUEUED
May 05 06:01:33 doomsdaydevice kernel: ata6.00: cmd 61/08:28:
May 05 06:01:33 doomsdaydevice kernel: ata6.00: status: { DRDY }
May 05 06:01:33 doomsdaydevice kernel: ata6.00: failed command: WRITE FPDMA QUEUED
May 05 06:01:33 doomsdaydevice kernel: ata6.00: cmd 61/08:30:
May 05 06:01:33 doomsdaydevice kernel: ata6.00: status: { DRDY }
May 05 06:01:33 doomsdaydevice kernel: ata6.00: supports DRM functions and may not be fully accessible
May 05 06:01:33 doomsdaydevice kernel: ata6.00: supports DRM functions and may not be fully accessible
May 05 06:37:03 doomsdaydevice kernel: ata6.00: exception Emask 0x0 SAct 0x30 SErr 0x0 action 0x6 frozen
May 05 06:37:03 doomsdaydevice kernel: ata6.00: failed command: WRITE FPDMA QUEUED
May 05 06:37:03 doomsdaydevice kernel: ata6.00: cmd 61/08:20:
May 05 06:37:03 doomsdaydevice kernel: ata6.00: status: { DRDY }
May 05 06:37:03 doomsdaydevice kernel: ata6.00: fail...
penalvch (penalvch) wrote : | #158 |
Johannes (piktogrammdd+
ubuntu-bug linux
Please feel free to subscribe me to it.
tags: |
added: bios-outdated-f.40 removed: latest-bios-f2 |
tags: | added: needs-upstream-testing |
piktogramm (piktogramm) wrote : | #159 |
Christopher, I filed the bug. However, I made a mistake: I took the output from lshw, where scsi@6 was not populated, and took for granted that ata6 equals scsi@6, which isn't the case. So the errors I mentioned are actually on my boot drive.
https:/
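The ataN versus SCSI host mix-up described above can be checked from sysfs instead of assumed. Below is a minimal Python sketch. It assumes the usual libata sysfs layout, in which the resolved path of each entry under `/sys/class/scsi_host` contains an `ataN` directory directly above `hostM`. The helper names and the sample path in the usage note are illustrative and not taken from this report.

```python
import os
import re
from typing import Iterator, Optional, Tuple

# Matches ".../ata2/host1/..." in a resolved sysfs path (assumed libata layout).
_ATA_HOST_RE = re.compile(r"/ata(\d+)/host(\d+)(?:/|$)")


def ata_port_for_host_path(path: str) -> Optional[Tuple[int, int]]:
    """Return (ata_port, scsi_host) parsed from a sysfs path, or None."""
    m = _ATA_HOST_RE.search(path)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def list_ata_scsi_pairs(base: str = "/sys/class/scsi_host") -> Iterator[Tuple[int, int]]:
    """Yield (ata_port, scsi_host) for every SCSI host backed by libata."""
    if not os.path.isdir(base):
        return
    for name in sorted(os.listdir(base)):
        resolved = os.path.realpath(os.path.join(base, name))
        pair = ata_port_for_host_path(resolved)
        if pair is not None:
            yield pair


if __name__ == "__main__":
    for ata, host in list_ata_scsi_pairs():
        print(f"ata{ata} -> scsi host{host}")
```

For a hypothetical path such as `/sys/devices/pci0000:00/0000:01:00.0/ata2/host1/scsi_host/host1`, the helper returns `(2, 1)`. That matches the boot log in comment #154, where ata2.00 and scsi 1:0:0:0 both report the ST3250820AS drive: ata ports are numbered from 1 and SCSI hosts from 0.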
In Linux Kernel Bug Tracker #42679, LK7S2ED64JHGLKj75shg9klejHWG49h5hk (lk7s2ed64jhglkj75shg9klejhwg49h5hk-linux-kernel-bugs) wrote : | #160 |
"Marvell Technology Group Ltd. 88SS9215 PCIe SSD Controller" have the same bug.
Fixed by:
DECLARE_
quirk_
Changed in linux (Debian): | |
status: | New → Fix Released |
In Linux Kernel Bug Tracker #42679, sam (sam-linux-kernel-bugs) wrote : | #161 |
Also "Marvell Technology Group Ltd. 88SE9125 PCIe SATA 6.0 Gb/s controller [1b4b:9125]" - fixed with:
DECLARE_
quirk_
Is this sufficient or should I open a new bug?
In Linux Kernel Bug Tracker #42679, alan (alan-linux-kernel-bugs) wrote : | #162 |
Even better would be to make a git diff of it and then submit it with explanation to
<email address hidden> and cc <email address hidden>
See:
https:/
In Linux Kernel Bug Tracker #42679, biergaizi2009 (biergaizi2009-linux-kernel-bugs) wrote : | #163 |
(In reply to sbingner from comment #136)
> Also "Marvell Technology Group Ltd. 88SE9125 PCIe SATA 6.0 Gb/s controller
> [1b4b:9125]" - fixed with:
>
> DECLARE_
> quirk_dma_
>
> Is this sufficient or should I open a new bug?
I have the same hardware and was able to test and confirm the bug. I just submitted the patch to the Linux kernel maintainers. Hopefully it'll be accepted soon.
https:/
In Linux Kernel Bug Tracker #42679, biergaizi2009 (biergaizi2009-linux-kernel-bugs) wrote : | #164 |
(In reply to Tom Li from comment #138)
> (In reply to sbingner from comment #136)
> > Also "Marvell Technology Group Ltd. 88SE9125 PCIe SATA 6.0 Gb/s controller
> > [...]
> > Is this sufficient or should I open a new bug?
>
> I have the same hardware and was able to test and confirm the bug. I just
> submitted the patch to the Linux kernel maintainers. Hopefully it'll be
> accepted soon.
>
> https:/
Patch for 88SE9125 has been merged into the upstream kernel since Linux v5.17-rc1.
Greg K.H. has also queued this patch for Linux 4.4, 4.9, 4.14, 5.4, 5.10, 5.15, 5.16. The patch should appear in the next stable kernel release in each branch.
In Linux Kernel Bug Tracker #42679, biergaizi2009 (biergaizi2009-linux-kernel-bugs) wrote : | #165 |
(In reply to Tom Li from comment #139)
> Patch for 88SE9125 has been merged into the upstream kernel since Linux
> v5.17-rc1.
>
> https:/
> ?id=e4453758828
>
> Greg K.H. has also queued this patch for Linux 4.4, 4.9, 4.14, 5.4, 5.10,
> 5.15, 5.16. The patch should appear in the next stable kernel release in
> each branch.
My patch has just been included in Linux 4.4.300, 4.9.298, 4.14.263, 4.19.226, 5.4.174, 5.10.94, 5.15.17, and 5.16.3.
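To check whether a given upstream release should already carry the 88SE9125 quirk, the version list above can be turned into a small lookup. This is only a sketch: the table is copied from this comment, the function names are made up for illustration, and the answer applies to kernel.org releases only. Distribution kernels may carry their own backports, so a `False` here is not conclusive for them.

```python
import re
from typing import Optional, Tuple

# First stable release per branch that includes the 88SE9125 quirk,
# as listed in the comment above. Mainline has it since v5.17-rc1.
FIXED_IN_STABLE = {
    (4, 4): 300,
    (4, 9): 298,
    (4, 14): 263,
    (4, 19): 226,
    (5, 4): 174,
    (5, 10): 94,
    (5, 15): 17,
    (5, 16): 3,
}


def parse_release(release: str) -> Optional[Tuple[int, int, int]]:
    """Parse 'X.Y' or 'X.Y.Z' (any suffix ignored) into a 3-tuple."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2)), int(m.group(3) or 0)


def has_88se9125_quirk(release: str) -> Optional[bool]:
    """True/False for an upstream release string, None if it cannot be parsed."""
    parsed = parse_release(release)
    if parsed is None:
        return None
    major, minor, patch = parsed
    if (major, minor) >= (5, 17):
        return True  # merged in mainline for v5.17-rc1
    first_fixed = FIXED_IN_STABLE.get((major, minor))
    if first_fixed is None:
        return False  # branch not in the backport list above
    return patch >= first_fixed
```

For example, `has_88se9125_quirk("5.15.16")` is `False` while `has_88se9125_quirk("5.15.17")` is `True`.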
kogiyuuki (kogichan) wrote : | #166 |
I have encountered a similar problem on Linux 6.5.0-25-generic #25~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Feb 20 16:09:15 UTC 2 x86_64 x86_64 x86_64 GNU/Linux.
Here is my hardware information:
```
$ sudo lspci -knnv -s 3:00.0
03:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9230 PCIe 2.0 x2 4-port SATA 6 Gb/s RAID Controller [1b4b:9230] (rev 11) (prog-if 01 [AHCI 1.0])
Subsystem: Marvell Technology Group Ltd. 88SE9230 PCIe 2.0 x2 4-port SATA 6 Gb/s RAID Controller [1b4b:9230]
Flags: bus master, fast devsel, latency 0, IRQ 39, IOMMU group 13
I/O ports at e050 [size=8]
I/O ports at e040 [size=4]
I/O ports at e030 [size=8]
I/O ports at e020 [size=4]
I/O ports at e000 [size=32]
Memory at fc410000 (32-bit, non-prefetchable) [size=2K]
Expansion ROM at fc400000 [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [70] Express Legacy Endpoint, MSI 00
Capabilities: [e0] SATA HBA v0.0
Capabilities: [100] Advanced Error Reporting
Kernel driver in use: ahci
Kernel modules: ahci
```
and my syslog (snipped):
```
kernel: [ 2.076538] ata14.00: ATAPI: MARVELL VIRTUAL, 1.09, max UDMA/66
kernel: [ 2.082784] ahci 0000:03:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000e address=0xcffe3840 flags=0x0050]
kernel: [ 74.990910] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
kernel: [ 80.032244] ata14.00: qc timeout after 5000 msecs (cmd 0xa1)
kernel: [ 80.532224] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
```
I haven't reproduced this in Linux 5.15.0.
Does anyone have information about this?
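For anyone comparing such faults across kernels, the AMD-Vi event can be pulled out of a log mechanically. The Python sketch below assumes the exact line format quoted in the syslog above. Other kernel versions print this event differently, so the pattern would need adjusting. The class and function names are illustrative.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IoPageFault:
    device: str   # PCI address the fault is attributed to, e.g. "0000:03:00.0"
    domain: int
    address: int
    flags: int


# Pattern derived only from the single log line quoted above (an assumption).
_FAULT_RE = re.compile(
    r"(?P<device>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): AMD-Vi: "
    r"Event logged \[IO_PAGE_FAULT domain=(?P<domain>0x[0-9a-f]+) "
    r"address=(?P<address>0x[0-9a-f]+) flags=(?P<flags>0x[0-9a-f]+)\]"
)


def parse_io_page_fault(line: str) -> Optional[IoPageFault]:
    """Return the parsed fault, or None if the line is not an IO_PAGE_FAULT."""
    m = _FAULT_RE.search(line)
    if m is None:
        return None
    return IoPageFault(
        device=m.group("device"),
        domain=int(m.group("domain"), 16),
        address=int(m.group("address"), 16),
        flags=int(m.group("flags"), 16),
    )
```

Note that the fault quoted above is attributed to function 0 (`0000:03:00.0`), the controller itself, whereas the DMAR fault in the original 88SE9128 report came from function 1.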
Created attachment 72217
Output of `dmesg' command
I have an MSI Z68A-GD80 B3 motherboard, and when I try to enable Intel's IOMMU (kernel booted with intel_iommu=on), the integrated Marvell 88SE9128 SATA controller doesn't work.
To reproduce:
1. Compile and prepare kernel with Intel IOMMU support enabled (CONFIG_INTEL_IOMMU=y).
2. Reboot the computer.
3. Enter BIOS and enable VT-d.
4. Boot the kernel with intel_iommu=on parameter.
Right after boot, kernel reports the following errors (SATA controller is at 0b:00.0):
[ 2.639774] DRHD: handling fault status reg 3
[ 2.639782] DMAR:[DMA Read] Request device [0b:00.1] fault addr fff00000
[ 2.639783] DMAR:[fault reason 02] Present bit in context entry is clear
After a while these entries appear:
[ 7.625837] ata14.00: qc timeout (cmd 0xa1)
[ 7.628341] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 7.935483] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 17.908407] ata14.00: qc timeout (cmd 0xa1)
[ 17.910935] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 17.912276] ata14: limiting SATA link speed to 1.5 Gbps
[ 18.219077] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 48.134607] ata14.00: qc timeout (cmd 0xa1)
[ 48.137508] ata14.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 48.444646] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
When a disk is connected to the controller, it does not work. When none is connected, the computer starts normally, apart from a long delay presumably caused by probing the device.
Since this is the secondary controller on these motherboards, you can eliminate the symptoms by plugging the disk into one of the available ports of the built-in Intel SATA controller and disabling the Marvell controller in the BIOS. The other workaround, if you need the eSATA capabilities of the Marvell controller, is to disable VT-d in the BIOS.
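The telling detail in the DMAR output above is that the fault is requested by 0b:00.1 while the controller sits at 0b:00.0. A rough Python sketch for spotting this pattern in a dmesg capture follows. It assumes the DMAR line format shown in this report, and the function names are hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple


class DmarFault(NamedTuple):
    direction: str  # "Read" or "Write"
    bus: int
    device: int
    function: int
    addr: int


# Pattern taken from the DMAR lines quoted in this report (an assumption
# about format; other kernel versions may word the message differently).
_DMAR_RE = re.compile(
    r"DMAR:\[DMA (Read|Write)\] Request device "
    r"\[([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])\] fault addr ([0-9a-f]+)"
)


def parse_dmar_faults(lines: Iterable[str]) -> List[DmarFault]:
    """Collect every DMAR fault found in the given log lines."""
    faults = []
    for line in lines:
        m = _DMAR_RE.search(line)
        if m:
            faults.append(DmarFault(
                m.group(1),
                int(m.group(2), 16),
                int(m.group(3), 16),
                int(m.group(4)),
                int(m.group(5), 16),
            ))
    return faults


def sibling_function_faults(faults: Iterable[DmarFault], controller: str) -> List[DmarFault]:
    """Faults on the controller's bus/device but from a different function.

    `controller` is a short PCI address such as "0b:00.0".
    """
    bus_s, rest = controller.split(":")
    dev_s, fn_s = rest.split(".")
    bus, dev, fn = int(bus_s, 16), int(dev_s, 16), int(fn_s)
    return [f for f in faults if (f.bus, f.device) == (bus, dev) and f.function != fn]
```

Feeding the three DMAR lines above to `parse_dmar_faults` and then calling `sibling_function_faults(faults, "0b:00.0")` returns the single read fault from function 1.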