Booting hangs when USB 3.0 Etron EJ168 PCI card is detected
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
linux (Ubuntu) | Fix Released | Undecided | Unassigned | |
Bug Description
I have installed an Etron EJ168 PCI USB 3.0 card [1B6F:7023], which works brilliantly in any Windows OS (I use a multiboot PC), in my Conroe865PE AGP/PCI system with an Intel Kentsfield QX6800 and a classic (non-UEFI) BIOS.
No devices are connected to the PCI USB card during boot.
The desktop Ubuntu 18.04 kernel 4.15.0-46-generic freezes during boot with this card installed (see the attached "photo log" with boot and GRUB options). I can't log in to Ubuntu or reach a console. It was checked and confirmed on ubuntu-
Previously I used, and then removed, a USB 3.0 NEC Renesas upd720200 PCI card [1033:0194], which partially worked under my installation of Ubuntu 18.04 and under Windows. From time to time it had problems loading firmware from the card at start, and it failed to detect connected USB devices even when the firmware was present.
I also tried, and removed, a USB 3.0 NEC Renesas upd720202 PCI card [1912:0015], which partially worked in the same way (different card BIOS and a newer NEC chip). The system was fully bootable with the NEC cards.
The Etron EJ168 works superbly under Windows with USB devices connected either directly or through a hub.
As an IT professional, I can assist in any way to solve this bug.
Mike (michael-ubu-pan) wrote : | #1 |
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs. | #2 |
Changed in linux (Ubuntu): | |
status: | New → Incomplete |
tags: | added: bionic |
Mike (michael-ubu-pan) wrote : | #3 |
I can't boot to collect logs, with any kernel PCI options, while the Etron PCI USB 3.0 card is in the slot.
Changed in linux (Ubuntu): | |
status: | Incomplete → Confirmed |
Mike (michael-ubu-pan) wrote : | #4 |
The same hanging behaviour occurs with the CentOS 7 DVD:
name = CentOS-7
family = CentOS
timestamp = 1543162874.22
variant =
version = 7
packagedir =
arch = x86_64
Photos of the card hardware can be found, for example, here:
https:/
Kai-Heng Feng (kaihengfeng) wrote : | #5 |
Please try latest daily image which uses Linux v5.0:
http://
Kai-Heng Feng (kaihengfeng) wrote : | #6 |
According to LP: #1050778 it used to work on Ubuntu 11.10.
11.10 uses linux kernel v3.0, can you also test the kernel?
Mike (michael-ubu-pan) wrote : | #7 |
11.10 works without any problem with the card and devices connected. disco-desktop-
1) skipping the error when it can't be resolved during boot (it shouldn't prevent Ubuntu 18.04-19.04 from loading)
2) the xhci driver should be "debugged", because it works in 11.10.
Mike (michael-ubu-pan) wrote : | #8 |
lspci from Ubuntu 11.10 working kernel
Linux ubuntu 3.0.0-12-generic #20-Ubuntu SMP Fri Oct 7 14:56:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
03:00.0 0c03: 1b6f:7023 (rev 01) (prog-if 30 [XHCI])
Subsystem: 1b6f:7023
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 40
Region 0: Memory at fe8f8000 (64-bit, non-prefetchable) [size=32K]
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [70] MSI: Enable+ Count=1/4 Maskable+ 64bit+
Address: 00000000fee0100c Data: 4169
Masking: 0000000e Pending: 00000000
Capabilities: [a0] Express (v2) Endpoint, MSI 01
DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <1us, L1 <64us
ClockPM+ Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCo
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB
Kernel driver in use: xhci_hcd
Kernel modules: xhci-hcd
Kai-Heng Feng (kaihengfeng) wrote : | #9 |
Would it be possible for you to do a kernel bisection?
First, find the last good -rc kernel and the first bad -rc kernel between v3.0 and v3.2, from http://
Then,
$ sudo apt build-dep linux
$ git clone git://git.
$ cd linux
$ git bisect start
$ git bisect good <the good version you found>
$ git bisect bad <the bad version you found>
$ make localmodconfig
$ make -j`nproc` deb-pkg
Install the newly built kernel, then reboot with it.
If the issue still happens,
$ git bisect bad
Otherwise,
$ git bisect good
Repeat from "make -j`nproc` deb-pkg" until you find the commit that causes the regression.
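The loop described above (build, boot, mark good or bad, repeat) can be rehearsed on a throwaway repository before spending hours on kernel builds. What follows is only a sketch under stated assumptions: the ten-commit repository, the BROKEN marker file and the function name demo_bisect are made up for illustration, and `git bisect run` stands in for the manual build-and-reboot step, which cannot be automated like this when the test is whether a machine boots.

```shell
#!/bin/sh
# Rehearse the bisect workflow on a toy repository (hypothetical data).
demo_bisect() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        # Identity used for the throwaway commits only.
        export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
        export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
        git init -q .
        i=1
        while [ "$i" -le 10 ]; do
            echo "$i" > version.txt
            # Commit 7 plays the role of the regression.
            if [ "$i" -eq 7 ]; then touch BROKEN; fi
            git add -A
            git commit -q -m "commit $i"
            i=$((i + 1))
        done
        first=$(git rev-list --max-parents=0 HEAD)
        # Arguments are: bad revision (newest), then good revision (oldest).
        git bisect start HEAD "$first" >/dev/null
        # The test command exits 0 for "good" and 1 for "bad".
        git bisect run sh -c 'test ! -e BROKEN' >/dev/null
        # refs/bisect/bad now points at the first bad commit.
        git log -1 --format=%s refs/bisect/bad
        git bisect reset >/dev/null 2>&1
    )
    rm -rf "$repo"
}

demo_bisect
```

The function prints the subject line of the commit that bisect identifies as the first bad one. With a real kernel tree each step is the manual build, install and reboot cycle instead of the one-line test command.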
Mike (michael-ubu-pan) wrote : | #10 |
I found that Ubuntu 15.10 amd64 with Linux kernel 4.2.0-16 is the last version where the EJ168 card works and the system boots from DVD. I have to find a way to access my installed 18.10 with the card present (a way to disable the xhci driver or the PCI card during boot), because I cannot access the installed system to change or test anything.
Kai-Heng Feng (kaihengfeng) wrote : | #11 |
Would it be possible to find the last good -rc kernel between v4.2 and v4.4?
Mike (michael-ubu-pan) wrote : | #12 |
I temporarily installed Ubuntu 15.10 amd64 as a base for testing kernels, because it worked with the card. After many retries I discovered that v4.9 (https:/
Kai-Heng Feng (kaihengfeng) wrote : | #13 |
One more step: please find the last good 4.10-rc* kernel.
Mike (michael-ubu-pan) wrote : | #14 |
Are you sure I should look in 4.10-rc* and not in 4.9.XXX?
Kai-Heng Feng (kaihengfeng) wrote : | #15 |
Yes, release candidate (rc) kernels are released between major releases.
Mike (michael-ubu-pan) wrote : | #16 |
- IMG_20190403_121434.jpg (811.8 KiB, image/jpeg)
Starting with kernel 4.10-rc1, boot gets stuck after the Intel SSDT line (captured in the attached picture). This is probably a different problem from the one in v4.10.
Mike (michael-ubu-pan) wrote : | #17 |
- IMG_20190403_152406.jpg (659.0 KiB, image/jpeg)
Every kernel from 4.10-rc2 to 4.10-rc8 gets stuck just after "Loading initial ram disk..." (booted without the quiet splash options). No log.
Kai-Heng Feng (kaihengfeng) wrote : | #18 |
There are too many possible commits in PCI and USB between v4.9 and v4.10.
In a situation like this I would first do a bisection between v4.10-rc8 and v4.10 to find out which commit makes v4.10 able to boot.
Mike (michael-ubu-pan) wrote : | #19 |
@Kai-Heng Feng
The last stable version for me was v4.9. Neither v4.10 nor any v4.10-rc* is bootable!
Mike (michael-ubu-pan) wrote : | #20 |
Why is the xhci_hcd module, which is itself an unstable driver and has to work with many unstable (buggy) USB 3.0 controllers and devices, compiled into the kernel so that it cannot easily be disabled???
This is insane, script-kiddie, unprofessional kernel development.
Mike (michael-ubu-pan) wrote : | #21 |
Besides, although my BIOS is not UEFI, I also experience keyboard lag (keyboard connected to the PS/2 port) in GRUB:
https:/
Mike (michael-ubu-pan) wrote : | #22 |
Kai-Heng Feng
I switched to Ubuntu 16.04.2 (based on 4.8) as the base, since an LTS release is better. I tested kernel v4.9-rc8 as bisect good and v4.10-rc8 as bisect bad, but I get:
$ git bisect good v4.9-rc8
$ git bisect bad v4.10-rc8
Some good revs are not ancestor of the bad rev.
git bisect cannot work properly in this case.
Maybe you mistook good and bad revs?
Any ideas how to dig into it?
Mike (michael-ubu-pan) wrote : | #23 |
Under 16.04.2 LTS, kernel v4.10-rc8 boots until it hangs on this bug. I don't know why the same kernel does not get that far under 15.10, but that is a frozen release, so it is not important.
Mike (michael-ubu-pan) wrote : | #24 |
Are you waiting for this to expire without a fix? Will anyone take responsibility for it?
Kai-Heng Feng (kaihengfeng) wrote : | #25 |
I don't see that issue on my system:
$ git bisect good v4.9-rc8
You need to start by "git bisect start"
Do you want me to do it for you [Y/n]? y
$ git bisect bad v4.10-rc8
Bisecting: 7145 revisions left to test after this (roughly 13 steps)
[aa3ecf388adc90
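The "Some good revs are not ancestor of the bad rev" message quoted in comment #22 is git's complaint that the revision marked good is not in the history of the one marked bad. One way to check a pair of revisions before starting is `git merge-base --is-ancestor`. This is a small sketch only: the function name check_bisect_range is hypothetical, and the thread does not establish what actually caused the message in comment #22.

```shell
#!/bin/sh
# check_bisect_range GOOD BAD
# Succeeds if GOOD is an ancestor of BAD, i.e. the pair can be used as
# "git bisect good GOOD" and "git bisect bad BAD".
check_bisect_range() {
    if git merge-base --is-ancestor "$1" "$2"; then
        echo "ok: $1 is an ancestor of $2"
    else
        echo "not usable: $1 is not an ancestor of $2"
        return 1
    fi
}
```

Inside a kernel checkout it would be called as `check_bisect_range v4.9-rc8 v4.10-rc8`.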
Mike (michael-ubu-pan) wrote : | #26 |
OK, I have retried the bisect under the main 4.8.0 kernel for 16.04.02 LTS instead of the installed 4.9-rc8.
Using:
$ git bisect good v4.9-rc8
$ git bisect bad v4.10-rc8
I built the kernel under 4.8.0 with the default (Enter) options because I don't know their nuances yet.
The kernel build produced many files and I need advice on which to install for the test:
linux-4.
linux-4.
linux-firmware-
linux-headers-
linux-image-
linux-image-
linux-libc-
You-Sheng Yang (vicamo) wrote : | #27 |
Most of the time you need only "linux-
Mike (michael-ubu-pan) wrote : | #28 |
After the 3rd "git bisect", the built linux-image-
Kai-Heng Feng (kaihengfeng) wrote : | #29 |
When I encounter a situation like this, I always do another "bisection in bisection" to find the fix commit, then cherry-pick the fix to continue the original bisection.
Kai-Heng Feng (kaihengfeng) wrote : | #30 |
Also, to minimize the range of potential regressions, include only the necessary modules, e.g. PCI and xHCI. Things like graphics and network drivers can be excluded.
Mike (michael-ubu-pan) wrote : | #31 |
To sum up the situation: do I have to run git bisect bad at this step now?
Kai-Heng Feng (kaihengfeng) wrote : | #32 |
Do you mean bisection in bisection (blank screen bisection), or the original bisection (xHC bisection)?
For the blank screen one, the current commit is "bad".
For the xHC one, we don't know yet. Cherry pick the fix on top of the current commit then we can decide it's good or bad.
Mike (michael-ubu-pan) wrote : Re: [Bug 1822363] Re: Booting hangs when USB 3.0 Etron EJ168 PCI card is detected | #33 |
Kai-Heng Feng wrote:
> Cherry pick the fix on top of the
> current commit then we can decide it's good or bad
I don't know how to do it.
Mike (michael-ubu-pan) wrote : | #34 |
Kai-Heng Feng wrote:
> Do you mean bisection in bisection (blank screen bisection), or the
> original bisection (xHC bisection)?
>
> For the blank screen one, the current commit is "bad".
>
> For the xHC one, we don't know yet. Cherry pick the fix on top of the
> current commit then we can decide it's good or bad.
>
I found the git cherry-pick command, but the situation is that I have passed
2 "git bisect good" steps from "4.9-rc8" and I now have a no-boot problem on
4.10-rc2+3, which I know is somehow corrected in 4.10-rc8 because there the
system hangs on the USB 3.0 PCI card again. But 4.10-rc8 is 11 steps ahead of
where I am now (4.10-rc2+3). What should I pass to git cherry-pick?
Kai-Heng Feng (kaihengfeng) wrote : | #35 |
1. Pause the original bisection.
2. Start a new bisection for the blank screen issue. Find the regression commit.
3. Resume the original bisection. At each bisected point, cherry-pick the fix or revert the regression commit.
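The three steps above can be tried out on a toy repository. The sketch below is an assumption-laden illustration, not the procedure used on the real kernel tree: acpi.txt stands for "the machine boots at all", xhci.txt stands for "the USB card does not hang", and the commit numbers, the file names and the function name demo_bisect_with_fix are all hypothetical. At each bisection point the known fix is applied with `git cherry-pick --no-commit`, the test is made, and the temporary change is dropped again with `git reset --hard`.

```shell
#!/bin/sh
# Toy model of "bisection with a fix applied at every step".
demo_bisect_with_fix() {
    work=$(mktemp -d) || return 1
    (
        mkdir "$work/repo" && cd "$work/repo" || exit 1
        export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
        export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
        git init -q .
        echo works > acpi.txt    # stands for "machine boots at all"
        echo works > xhci.txt    # stands for "USB card does not hang"
        i=1
        while [ "$i" -le 10 ]; do
            case $i in
                3) echo hangs > acpi.txt ;;   # unrelated boot regression
                6) echo hangs > xhci.txt ;;   # the regression being hunted
                8) echo works > acpi.txt ;;   # fix for the unrelated one
            esac
            # Every commit except the fix also bumps version.txt.
            if [ "$i" -ne 8 ]; then echo "$i" > version.txt; fi
            git add -A
            git commit -q -m "commit $i"
            if [ "$i" -eq 8 ]; then FIX=$(git rev-parse HEAD); fi
            i=$((i + 1))
        done
        export FIX

        # Per-step test, run by "git bisect run" at each bisection point.
        cat > "$work/test.sh" <<'EOF'
#!/bin/sh
# If the unrelated regression is present, apply its fix without committing.
if grep -q hangs acpi.txt; then
    git cherry-pick --no-commit "$FIX" >/dev/null 2>&1 ||
        { git reset -q --hard; exit 125; }   # 125 asks bisect to skip
fi
grep -q works xhci.txt
status=$?
git reset -q --hard    # drop the temporary fix before the next step
exit "$status"
EOF
        first=$(git rev-list --max-parents=0 HEAD)
        git bisect start HEAD "$first" >/dev/null
        git bisect run sh "$work/test.sh" >/dev/null
        git log -1 --format=%s refs/bisect/bad
        git bisect reset >/dev/null 2>&1
    )
    rm -rf "$work"
}
```

Without the cherry-pick, every commit between the unrelated regression and its fix would be untestable, which is the situation described in this thread for the v4.10-rc kernels.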
Mike (michael-ubu-pan) wrote : | #36 |
1. I understand that this is done with the command "git bisect reset".
2. If I understand you correctly, I am already at the first regression commit (blank screen issue), because I can't boot on 4.10-rc2+3 but on 4.9.0+-2 it was possible.
Kai-Heng Feng (kaihengfeng) wrote : | #37 |
> 1. I understand that it is done by command "git bisect reset"
Before doing the reset, run
$ git bisect log > log
To record the current bisection steps.
> 2. If I understand You well I am already on first regression commit (blank screen issue) becouse I can't boot on 4.10-rc2+3 but on 4.9.0+-2 it was possible.
Yes.
After the regression commit was found, run
$ git bisect replay log
To resume the original bisection.
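The pause-and-resume sequence from this comment can be checked on a toy repository first. This is a sketch under assumptions: the sixteen-commit repository and the function name demo_pause_resume are invented for illustration, and the log file is kept outside the working tree only to keep the example tidy.

```shell
#!/bin/sh
# Pause a bisection with "git bisect log" and resume it with
# "git bisect replay" on a toy repository (hypothetical data).
demo_pause_resume() {
    work=$(mktemp -d) || return 1
    (
        mkdir "$work/repo" && cd "$work/repo" || exit 1
        export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
        export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
        git init -q .
        i=1
        while [ "$i" -le 16 ]; do
            echo "$i" > version.txt
            git add version.txt
            git commit -q -m "commit $i"
            i=$((i + 1))
        done
        first=$(git rev-list --max-parents=0 HEAD)
        git bisect start HEAD "$first" >/dev/null
        git bisect good >/dev/null          # one step of the original bisection
        before=$(git rev-parse HEAD)

        git bisect log > "$work/log"        # record the steps taken so far
        git bisect reset >/dev/null 2>&1    # pause: leave bisect mode
        # ... the second, unrelated bisection would be done here ...
        git bisect replay "$work/log" >/dev/null 2>&1
        after=$(git rev-parse HEAD)

        if [ "$before" = "$after" ]; then
            echo "resumed at the same commit"
        else
            echo "resumed at a different commit"
        fi
        git bisect reset >/dev/null 2>&1
    )
    rm -rf "$work"
}
```

The saved log is a plain text file of bisect commands, so it can also be edited by hand before replaying if a step was marked wrongly.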
Mike (michael-ubu-pan) wrote : | #38 |
I finally found the first commit (not connected with the USB problem) that prevents booting, but it is corrected much later, in 4.10-rc8. I don't know how to locate the commit that corrects the one below:
$ git bisect bad
174cc7187e6f088
commit 174cc7187e6f088
Author: Lv Zheng <email address hidden>
Date: Wed Dec 14 15:04:25 2016 +0800
ACPICA: Tables: Back port acpi_get_
ACPICA commit cac6790954d4d75
This patch back ports Linux acpi_get_
early_
The 2 APIs are used by Linux as table management APIs for long time, it
contains a hidden logic that during the early stage, the mapped tables
should be unmapped before the early stage ends.
During the early stage, tables are handled by the following sequence:
acpi_
parse the table
early_
During the late stage, tables are handled by the following sequence:
acpi_
parse the table
Linux uses acpi_gbl_
late stage.
The reasoning of introducing acpi_get_
remember the early mapped pointer in acpi_get_table() and Linux isn't able to
prevent ACPICA from using the wrong early mapped pointer during the late
stage as there is no API provided from ACPICA to be an inverse of
acpi_
But how ACPICA can work with the early/late stage requirement? Inside of
ACPICA, tables are ensured to be remained in "INSTALLED" state during the
early stage, and they are carefully not transitioned to "VALIDATED" state
until the late stage. So the same logic is in fact implemented inside of
ACPICA in a different way. The gap is only that the feature is not provided
to the OSPMs in an accessible external API style.
It then is possible to fix the gap by providing an inverse of
acpi_
combined:
acpi_
parse the table
acpi_
In order to work easier with the current Linux code, acpi_get_table() and
acpi_
1. When the usage count of the table is increased from 0 to 1, table is
mapped and .Pointer is set with the mapping address (VALIDATED);
2. When the usage count of the table is decreased from 1 to 0, .Pointer
is unset and the mapping address is unmapped (INVALIDATED).
So that we can deploy the new APIs to Linux with minimal effort by just
invoking acpi_get_table() in acpi_get_
acpi_
Link: https:/
Signed-off-by: Lv Zheng <email address hidden>
Signed-off-by: Bob Moore <email address hidden>
Signed-off-by: Rafael J. Wysock...
Kai-Heng Feng (kaihengfeng) wrote : | #39 |
Commit "174cc7187e6f" has several subsequent fixes; cherry-pick those fixes during bisection.
Kai-Heng Feng (kaihengfeng) wrote : | #40 |
Spotted this commit when I am doing an unrelated task:
commit f7fac17ca925faa
Author: Andrey Smirnov <email address hidden>
Date: Wed May 22 14:34:01 2019 +0300
xhci: Convert xhci_handshake() to use readl_poll_
Xhci_
readl_
avoid repetition.
Turned out this patch also fixes a bug on the AMD Stoneyridge platform
where usleep(1) sometimes takes over 10ms.
This means a 5 second timeout can easily take over 15 seconds which will
trigger the watchdog and reboot the system.
[Add info about patch fixing a bug to commit message -Mathias]
Signed-off-by: Andrey Smirnov <email address hidden>
Tested-by: Raul E Rangel <email address hidden>
Reviewed-by: Raul E Rangel <email address hidden>
Cc: <email address hidden>
Signed-off-by: Mathias Nyman <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>
Can you try this? It's in Linux kernel v5.2.
Mike (michael-ubu-pan) wrote : Re: [Bug 1822363] Re: Booting hangs when USB 3.0 Etron EJ168 PCI card is detected | #41 |
Thanks. I will return to this on Sunday because I am taking a break.
-------- Original message --------
From: Kai-Heng Feng <email address hidden>
Date: 08.07.2019 19:37 (GMT+01:00)
To: <email address hidden>
Subject: [Bug 1822363] Re: Booting hangs when USB 3.0 Etron EJ168 PCI card is detected
Spotted this commit when I am doing an unrelated task:
commit f7fac17ca925faa
Mike (michael-ubu-pan) wrote : Re: [Bug 1822363] Re: Booting hangs when USB 3.0 Etron EJ168 PCI card is detected | #42 |
I have checked v5.2 and it is working.
How can I install it on the 18.04 system that was upgraded from 16.04 and is
not bootable due to this USB bug?
Is it possible to install it from the bootable 16.04 (I have it on a
different partition)?
Kai-Heng Feng:
> Spotted this commit when I am doing an unrelated task:
> commit f7fac17ca925faa
> Author: Andrey Smirnov <email address hidden>
> Date: Wed May 22 14:34:01 2019 +0300
>
> xhci: Convert xhci_handshake() to use readl_poll_
>
> Xhci_handshake() implements the algorithm already captured by
> readl_poll_
> avoid repetition.
>
> Turned out this patch also fixes a bug on the AMD Stoneyridge platform
> where usleep(1) sometimes takes over 10ms.
> This means a 5 second timeout can easily take over 15 seconds which will
> trigger the watchdog and reboot the system.
>
> [Add info about patch fixing a bug to commit message -Mathias]
> Signed-off-by: Andrey Smirnov <email address hidden>
> Tested-by: Raul E Rangel <email address hidden>
> Reviewed-by: Raul E Rangel <email address hidden>
> Cc: <email address hidden>
> Signed-off-by: Mathias Nyman <email address hidden>
> Signed-off-by: Greg Kroah-Hartman <email address hidden>
>
> Can you try this? It's in Linux kernel v5.2.
>
Kai-Heng Feng (kaihengfeng) wrote : | #43 |
Yes I think it's possible. Mount it, chroot (e.g. systemd-nspawn) into the system and install the kernel packages.
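As a rough illustration of the mount-and-chroot route, the helper below only prints the commands so they can be reviewed before anything is run as root. It is a sketch under assumptions: it uses a plain chroot with bind mounts rather than systemd-nspawn, the function name plan_chroot_install is hypothetical, and the device, mount point and package file names passed to it are placeholders that must be replaced with the real ones.

```shell
#!/bin/sh
# plan_chroot_install DEVICE MOUNTPOINT DEB...
# Prints (does not run) the commands for installing kernel packages into
# an unbootable installation from another running system.
plan_chroot_install() {
    dev=$1
    mnt=$2
    shift 2
    echo "mount $dev $mnt"
    # The chroot needs the running system's device and kernel filesystems.
    for fs in dev proc sys; do
        echo "mount --bind /$fs $mnt/$fs"
    done
    pkgs=""
    for deb in "$@"; do
        echo "cp $deb $mnt/tmp/"
        pkgs="$pkgs /tmp/$(basename "$deb")"
    done
    # Install all packages in one dpkg call so they can satisfy each other.
    echo "chroot $mnt dpkg -i$pkgs"
    echo "chroot $mnt update-grub"
    for fs in sys proc dev; do
        echo "umount $mnt/$fs"
    done
    echo "umount $mnt"
}
```

For example, `plan_chroot_install /dev/sdXN /mnt/target linux-image-example.deb` prints the sequence for a placeholder partition and package; the printed lines would then be run one by one with root privileges.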
Scott (scothiam) wrote : | #44 |
Available to help test/try solutions.
I have a combo card (usb + sata) which has this chip as well, while I didn't have the boot-hang issue, no devices are seen (connected prior to boot or otherwise). Added comment on a thread specific to this card, but it is quite old, and marked as solved (gave up on card). https:/
Mac pro 2010, Ubuntu 18, kernel 5.2.2
Mike (michael-ubu-pan) wrote : | #45 |
You don't know PC architecture well. Your chip is natively connected to
PCIe.
My USB chip has a PCI/PCI-X bridge to PCI, and it is a completely different
initialization procedure, not similar to yours.
Scott Polley wrote:
> Available to help test/try solutions.
>
> I have a combo card (usb + sata) which has this chip as well, while I
> didn't have the boot-hang issue, no devices are seen (connected prior to
> boot or otherwise). Added comment on a thread specific to this card, but
> it is quite old, and marked as solved (gave up on card).
> https:/
>
> Mac pro 2010, Ubuntu 18, kernel 5.2.2
>
Scott (scothiam) wrote : | #46 |
Sorry if I misunderstood the issue you are having, good luck
Mike (michael-ubu-pan) wrote : | #47 |
Kai-Heng Feng wrote:
> Yes I think it's possible. Mount it, chroot (e.g. systemd-nspawn) into
> the system and install the kernel packages.
>
It's all done on kernel v5.2, which is fully bootable and working.
The conclusion is that Ubuntu from 16.04.3 until now, and all Live CDs, are
unusable, because they use buggy kernels >= 4.10 && < 5.2.
They should be patched.
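The version range stated above can be written as a small check. This is only a sketch: the function names are hypothetical, it relies on `sort -V` from GNU coreutils, and it compares mainline version numbers only. A distribution kernel may carry a backported fix (comment #49 reports 4.15.0-60-generic working), so the number alone is not conclusive.

```shell
#!/bin/sh
# version_lt A B: succeeds if version A sorts strictly before version B.
# Relies on "sort -V" (GNU coreutils version sort).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# in_reported_range VERSION: succeeds if 4.10 <= VERSION < 5.2,
# the range reported as affected in this thread.
in_reported_range() {
    ! version_lt "$1" 4.10 && version_lt "$1" 5.2
}
```

For example, `in_reported_range 4.15 && echo affected` prints "affected", while the same call with 4.9 or 5.2 prints nothing.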
Kai-Heng Feng (kaihengfeng) wrote : | #48 |
Eoan image uses Linux 5.2:
http://
Mike (michael-ubu-pan) wrote : | #49 |
The new supported Ubuntu 18.04 LTS kernel, 4.15.0-60-generic, is bootable again and USB 3.0 works.
Changed in linux (Ubuntu): | |
status: | Confirmed → Fix Released |
This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 1822363
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.