Acer Aspire A315 IOAPIC failure on Ubuntu 18.04, kernel hangs, can't load, kernel freeze (AMD Ryzen 5/Radeon/Raven) / AMDGPU Hybrid crash

Bug #1776563 reported by Richard Baka on 2018-06-12
This bug affects 6 people
Affects | Status | Importance | Assigned to | Milestone
Linux | Incomplete | Medium | |
amd | | Undecided | Unassigned |
linux (Ubuntu) | | Medium | Unassigned |
linux-firmware (Ubuntu) | | Undecided | Unassigned |

Bug Description

CPU: Ryzen 5 2500U
VGA: Radeon 535
Notebook: Acer Aspire A315

This is a brand new notebook on the market with Ryzen 5/Radeon.
The default kernel of Ubuntu (18.04) hangs while loading with this message:

tsc: Refined TSC clocksource calibration: 1996.250 MHz
clocksource: tsc: mask: 0xffffffffffffffff max_cycles: (...), max_idle_ns: (...)
Soft lockup

With the pci=noacpi kernel parameter the kernel loads without any problem, but my notebook produces more heat than on Win10. As far as I know, Acer notebooks need ACPI for correct power management.

The same thing happens on mainline 4.17 and 4.18-rc1/rc2.
Upgrading the BIOS to the latest version (1.08) hasn't helped.

This problem has been reported upstream:
https://bugzilla.kernel.org/show_bug.cgi?id=200087

The last kernel that worked correctly was 4.13.*, but the heat problem was present with it too.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1776563

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic

apport-collect 1776563 can't be run because the kernel does not load.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
summary: - Acer Aspire A315 (Ryzen5/Radeon/FHD) Ubuntu 18.04 kernel cant load
+ Ubuntu 18.04 kernel can't load kernel on Acer Aspire A315
+ (Ryzen5/Radeon/FHD)
summary: - Ubuntu 18.04 kernel can't load kernel on Acer Aspire A315
- (Ryzen5/Radeon/FHD)
+ Ubuntu 18.04 can't load kernel on Acer Aspire A315 (Ryzen5/Radeon/FHD)
no longer affects: bugzilla (Ubuntu)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in xserver-xorg-video-amdgpu (Ubuntu):
status: New → Confirmed
Freihut (freihut) wrote :

Had this on my A315 too, but I returned it to the vendor. Seems to be a UEFI bug, because it doesn't happen with my Ryzen 2500U from HP. Could also be related to that Ryzen/Radeon 535 combination (Vega/GCN 3).

In the GRUB menu, press E and add "pci=noacpi" as a kernel parameter (where "quiet splash" normally is). Then continue booting by pressing F10.
Sometimes (XFCE) it was also necessary to add "nomodeset" to boot; GNOME, for example, didn't need it (AFAIK).
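
To confirm which of these workaround parameters actually reached the kernel after booting, a small check against the kernel command line can help. This is a minimal sketch; `has_param` is a hypothetical helper name, not something from this thread.

```shell
# has_param PARAM [CMDLINE]
# Succeeds if PARAM appears as a whole word on the command line.
# CMDLINE defaults to the running kernel's /proc/cmdline.
has_param() {
    case " ${2-$(cat /proc/cmdline)} " in
        *" $1 "*) return 0 ;;   # quoted, so "[4]" etc. are matched literally
        *)        return 1 ;;
    esac
}
```

After a reboot, `has_param pci=noacpi && echo "booted with pci=noacpi"` shows whether the edit made in the GRUB menu was really used for this boot.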

I remember I also needed to install AMD's pro driver (for 18.04) via amdgpu-pro-install to get rid of "nomodeset". I was able to run amdgpu-pro-uninstall later and still didn't need "nomodeset". Could be specific to my system, but you may give it a try.
I was also using Kernel 4.17 (Mainline), which is available on http://kernel.ubuntu.com/~kernel-ppa/mainline/ or with UKUU https://www.omgubuntu.co.uk/2017/02/ukuu-easy-way-to-install-mainline-kernel-ubuntu

Richard Baka (bakarichard91) wrote :

Thanks Freihut, I will try this.

Richard Baka (bakarichard91) wrote :

It works, but very slowly. This could be an ACPI problem.

Richard Baka (bakarichard91) wrote :

I installed the new amdgpu pro driver and everything is very fast now. This bug should be reported to freedesktop, would you like somebody to do it? :D

Richard Baka (bakarichard91) wrote :

*Sorry correction: Who would like to do it? :D

Richard Baka (bakarichard91) wrote :

"The fact that ACPI was designed by a group of monkeys high on LSD, and is some of the worst designs in the industry obviously makes running it at any point pretty damn ugly."
Torvalds, Linus (2005-07-31). Message. linux-kernel mailing list. IU. Retrieved on 2006-08-28.

Richard Baka (bakarichard91) wrote :

Power management doesn't work well this way. It got a little hot. I've changed back to Win10. This should be fixed by kernel developers or with a downstream patch.

Created attachment 276583
dmesg after starting kernel with pci=noacpi

This is a brand new notebook on the market with Ryzen 5/Radeon. With ACPI disabled the kernel boots without any problem, but my notebook produces more heat than on Win10. Incidentally, the same happens when it is left on the BIOS screen for a while.

CPU: AMD Ryzen 5 2500U
GPU1: AMD Radeon Vega 8
GPU2: AMD Radeon 535

(I wrote to Acer to fix their bios problems but they said Linux is not supported. I don't think they are right but what can I do?)

Created attachment 276587
Soft lockup failure without noacpi

Nothing changes with disabled iommu.

Created attachment 276589
dmesg after amd_iommu_dump=1

[ 0.000000] AMD-Vi: Using IVHD type 0x11
[ 0.000000] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: b0 info 0000
[ 0.000000] AMD-Vi: mmio-addr: 00000000fd900000
[ 0.000000] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:01.0 flags: 00
[ 0.000000] AMD-Vi: DEV_RANGE_END devid: ff:1f.6
[ 0.000000] AMD-Vi: DEV_ALIAS_RANGE devid: ff:00.0 flags: 00 devid_to: 00:14.4
[ 0.000000] AMD-Vi: DEV_RANGE_END devid: ff:1f.7
[ 0.000000] AMD-Vi: DEV_SPECIAL(HPET[0]) devid: 00:14.0
[ 0.000000] AMD-Vi: DEV_SPECIAL(IOAPIC[33]) devid: 00:14.0
[ 0.000000] AMD-Vi: DEV_SPECIAL(IOAPIC[34]) devid: 00:00.1
[ 0.000000] [Firmware Bug]: AMD-Vi: No southbridge IOAPIC found
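
The dump above lists the IOAPIC entries that the firmware put into the IVRS table. A minimal sketch for pulling just those entries out of a dmesg capture follows; `list_ivrs_ioapics` is a hypothetical helper, and the pattern assumes the exact line format shown in this attachment.

```shell
# list_ivrs_ioapics
# Reads dmesg-style text on stdin (booted with amd_iommu_dump=1) and prints
# one "ID DEVID" pair per DEV_SPECIAL(IOAPIC[...]) line.
list_ivrs_ioapics() {
    sed -n 's/.*DEV_SPECIAL(IOAPIC\[\([0-9]*\)]) devid: \([0-9a-f:.]*\).*/\1 \2/p'
}
```

Running `dmesg | list_ivrs_ioapics` on this machine would print the IDs 33 and 34 with their device IDs, which makes it easy to compare them with the IOAPIC IDs the kernel itself enumerates.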

no longer affects: xserver-xorg-video-amdgpu (Ubuntu)

Created attachment 276591
Error message before freezing (without quiet splash)

Please try booting with linux 4.18-rc1 or later. Also, please try 4.18-rc1+ with/without ACPI

Hi Erik,

Absolutely the same thing on 4.18rc1 and on rc2 too.

Fedora loads without any additional parameters (mysterious).

[ 0.000000] Switched APIC routing to physical flat.
[ 0.002000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.007000] tsc: Fast TSC calibration using PIT
[ 0.008000] tsc: Detected 1996.299 MHz processor
[ 0.008000] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398d0c7513b, max_idle_ns: 881590744042 ns
[ 0.008000] Calibrating delay loop (skipped), value calculated using timer frequency.. 3992.59 BogoMIPS (lpj=1996299)

Heat production may still be present, but I can't measure it because there are no temperature values in "sensors" (there are 5 values in Win10).

Created attachment 277069
Fedora loads without noacpi

summary: - Ubuntu 18.04 can't load kernel on Acer Aspire A315 (Ryzen5/Radeon/FHD)
+ Acer Aspire A315 ACPI failure on Ubuntu 18.04 (Ryzen5/Radeon/FHD)
summary: - Acer Aspire A315 ACPI failure on Ubuntu 18.04 (Ryzen5/Radeon/FHD)
+ Acer Aspire A315 ACPI failure on Ubuntu 18.04 (Ryzen5/Radeon)
summary: - Acer Aspire A315 ACPI failure on Ubuntu 18.04 (Ryzen5/Radeon)
+ Acer Aspire A315 ACPI failure on Ubuntu, kernel hangs, can't load 18.04
+ (Ryzen5/Radeon)
summary: - Acer Aspire A315 ACPI failure on Ubuntu, kernel hangs, can't load 18.04
+ Acer Aspire A315 ACPI failure on Ubuntu 18.04, kernel hangs, can't load
(Ryzen5/Radeon)
description: updated
summary: Acer Aspire A315 ACPI failure on Ubuntu 18.04, kernel hangs, can't load
- (Ryzen5/Radeon)
+ (AMD Ryzen 5/Radeon/Raven)
summary: - Acer Aspire A315 ACPI failure on Ubuntu 18.04, kernel hangs, can't load
- (AMD Ryzen 5/Radeon/Raven)
+ Acer Aspire A315 ACPI failure on Ubuntu 18.04, kernel hangs, can't load,
+ kernel freeze (AMD Ryzen 5/Radeon/Raven)

Erik, I think this is in connection with clocksource calibration but I'm not an expert.

This works:
[ 0.007000] tsc: Fast TSC calibration using PIT
[ 0.008000] tsc: Detected 1996.299 MHz processor
[ 0.008000] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398d0c7513b, max_idle_ns: 881590744042 ns

This doesn't:
[...] tsc: Refined tsc clocksource calibration: ...
[...] clocksource: tsc: mask: 0xfff...f (...)

Changed in linux:
importance: Unknown → Medium
status: Unknown → Incomplete

Hi, I tried other kernel parameters and noapic seems to work. It is not necessary to disable the whole ACPI "service"; however, I don't know how important the APIC is. On kernel 4.18 even the temperature sensors appear.
Power management is almost perfect if the CPU governor is set to powersave.

However, amdgpu crashes now, so the kernel doesn't start without nomodeset. Could this be an ACPI problem, or should I ask the kernel firmware developers?

Hi,
amdgpu doesn't crash on my a315-41g-r40x (BIOS V1.08) with
  linux-next-next-20180713 compiled with VGA_SWITCHEROO=N
and with
  kernel parameters: ivrs_ioapic[4]=00:14.0 ivrs_ioapic[5]=00:00.2

gg71, where have you been till now? :D
Thanks, I will try it.

gg71, it works almost perfectly, thanks again. I have been working on this for ca one month. Please write a mail to me if you have any new info.

The solution for Acer A315-41G-* notebooks: (USE AT YOUR OWN RISK - PLS be very careful)

1. Load kernel with these parameters: ivrs_ioapic[4]=00:14.0 ivrs_ioapic[5]=00:00.2 nomodeset
This is how it can be done (1. answer/first half 1-4): https://askubuntu.com/questions/19486/how-do-i-add-a-kernel-boot-parameter

1/b. (If it is not installed yet) install Ubuntu and boot the installed kernel again using the parameters (see 1.)

2. Start a terminal and do these steps:
> cd ~
> mkdir kernelbuild
> cd kernelbuild
> wget -c https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.17.6.tar.xz
> tar -xvf linux-4.17.6.tar.xz
> cd linux-4.17.6
> sudo apt install git build-essential kernel-package fakeroot libncurses5-dev libssl-dev ccache bison flex
> make menuconfig
+> Save,OK,EXIT
> nano .config
+> ctrl+w and search for CONFIG_VGA_SWITCHEROO=y
+> replace y with n (this is not ideal and should be fixed later)
+> ctrl+o, enter
> make -j4 (this will take a while, be patient)
> sudo make modules_install
> sudo make install
> sudo nano /etc/default/grub
+> Edit the correct line and add the parameters: GRUB_CMDLINE_LINUX_DEFAULT="quiet splash ivrs_ioapic[4]=00:14.0 ivrs_ioapic[5]=00:00.2"
+> ctrl+o, enter
> sudo update-grub
+> reboot and start the correct kernel
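
The manual nano edit of .config in the steps above could also be scripted. The following is a minimal sketch, not part of the original instructions: `disable_kconfig` is a hypothetical helper, and it writes the conventional `# CONFIG_... is not set` form instead of `=n`. The kernel tree also ships a `scripts/config` helper with a `--disable` option that may be a better fit when run inside the source directory.

```shell
# disable_kconfig OPTION FILE
# Rewrites "OPTION=y" or "OPTION=m" in FILE to "# OPTION is not set".
# Other lines are left untouched.
disable_kconfig() {
    opt=$1
    file=$2
    sed "s/^$opt=[ym]\$/# $opt is not set/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}
```

Inside the linux-4.17.6 directory this would be called as `disable_kconfig CONFIG_VGA_SWITCHEROO .config` before running `make -j4`.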

If you install xsensors (sudo apt install xsensors) and start it (xsensors) you can monitor the temperature values of your notebook. (Recommended)

Richard Baka (bakarichard91) wrote :

Dear Ubuntu maintainers,

couldn't this be fixed by an Ubuntu kernel patch? The hardest part is disabling GPU switching at kernel load time. I think the APIC-fixing parameters could be hardcoded for these models, or a smart script could search for the correct PCI controller.

This was a hell of an investigation, never again. Thanks to gg71, he/she is a lifesaver.

Hi Richard:

This issue should be related to the buggy BIOS IVRS table.
The kernel panics when it finds no southbridge device ID.

Could you try booting the kernel with "amd_iommu_dump=1 amd_iommu=off" (remove the other kernel parameters you tried for this issue)?

If it works, please attach the dmesg here.
I will try to make a kernel patch so that the kernel boots with the irq map disabled instead of panicking.

Richard Baka (bakarichard91) wrote :

Hi AaronMa,

thanks for the response. I tried it but it didn't work. I think the IOMMU problem is not the main reason for the kernel hang. Besides, it can be disabled in the BIOS and nothing changes.

The main reason, as you can see in this picture (https://bugzilla.kernel.org/attachment.cgi?id=276587), is that IOAPIC[4] and IOAPIC[5] are not in the IVRS table, so we should find the correct PCI controllers using lspci and pass them to the kernel.

In this way:
LINUX_DEFAULT="quiet splash ivrs_ioapic[4]=00:14.0 ivrs_ioapic[5]=00:00.2"
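
Whether a given IOAPIC ID is missing from the IVRS table can be checked mechanically against an amd_iommu_dump=1 capture. This is a minimal sketch under that assumption; `missing_from_ivrs` is a hypothetical helper and only looks for the DEV_SPECIAL(IOAPIC[...]) line format seen in this report.

```shell
# missing_from_ivrs ID...
# Reads amd_iommu_dump=1 output on stdin and prints each given IOAPIC ID
# that has no DEV_SPECIAL(IOAPIC[ID]) entry in it.
missing_from_ivrs() {
    dump=$(cat)
    for id in "$@"; do
        case $dump in
            *"DEV_SPECIAL(IOAPIC[$id])"*) ;;            # listed by the firmware
            *) printf '%s\n' "$id" ;;                     # needs ivrs_ioapic[ID]=...
        esac
    done
}
```

For example, `dmesg | missing_from_ivrs 4 5` would print both IDs on this notebook, which are the ones then supplied by hand through ivrs_ioapic[4] and ivrs_ioapic[5].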

Kernel can be started even with noapic but two sensors will be missing and the advanced touchpad functions will not work. This is the reason of CONFIG_VGA_SWITCHEROO=n compile time kernel parameter.

There is another problem: this notebook has two GPUs and amdgpu (or the kernel, I don't know) cannot handle this correctly, so GPU switching has to be disabled.

Richard Baka (bakarichard91) wrote :

Kernel can be started even with noapic but two sensors will be missing and the advanced touchpad functions will not work.

!!!This line is not here: This is the reason of CONFIG_VGA_SWITCHEROO=n compile time kernel parameter.

There is another problem: this notebook has two GPUs and amdgpu (or the kernel, I don't know) cannot handle this correctly, so GPU switching has to be disabled.
!!!But here: This is the reason of CONFIG_VGA_SWITCHEROO=n compile time kernel parameter.

Richard Baka (bakarichard91) wrote :

AaronMa,

This is the iommu debug:

[ 0.000000] AMD-Vi: Using IVHD type 0x11
[ 0.000000] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: b0 info 0000
[ 0.000000] AMD-Vi: mmio-addr: 00000000fd900000
[ 0.000000] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:01.0 flags: 00
[ 0.000000] AMD-Vi: DEV_RANGE_END devid: ff:1f.6
[ 0.000000] AMD-Vi: DEV_ALIAS_RANGE devid: ff:00.0 flags: 00 devid_to: 00:14.4
[ 0.000000] AMD-Vi: DEV_RANGE_END devid: ff:1f.7
[ 0.000000] AMD-Vi: DEV_SPECIAL(HPET[0]) devid: 00:14.0
[ 0.000000] AMD-Vi: DEV_SPECIAL(IOAPIC[33]) devid: 00:14.0
[ 0.000000] AMD-Vi: DEV_SPECIAL(IOAPIC[34]) devid: 00:00.1
[ 0.000000] [Firmware Bug]: AMD-Vi: No southbridge IOAPIC found

I will give you the correct iommu "addresses" after dinner :).

Richard Baka (bakarichard91) wrote :

HOT NEWS!!

CONFIG_VGA_SWITCHEROO=n can be avoided using these kernel parameters amdgpu.runpm=0 radeon.modeset=0.
Further investigation is in progress...
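
To make such parameters permanent they have to end up in GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub. The sketch below only prints the line that would result, so nothing is changed by accident; `add_grub_params` is a hypothetical helper and assumes the file contains a single double-quoted GRUB_CMDLINE_LINUX_DEFAULT line.

```shell
# add_grub_params FILE PARAM...
# Prints a GRUB_CMDLINE_LINUX_DEFAULT line with each PARAM appended once.
# FILE itself is not modified.
add_grub_params() {
    file=$1
    shift
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
    for p in "$@"; do
        case " $current " in
            *" $p "*) ;;                                   # already present
            *) current="${current:+$current }$p" ;;        # append
        esac
    done
    printf 'GRUB_CMDLINE_LINUX_DEFAULT="%s"\n' "$current"
}
```

A call such as `add_grub_params /etc/default/grub amdgpu.runpm=0 radeon.modeset=0` shows the line to put into the file; `sudo update-grub` is still needed afterwards.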

Richard Baka (bakarichard91) wrote :

This could be the better solution because the notebook heats up the least this way, but I'm not sure.

Richard Baka (bakarichard91) wrote :

Hi all,

After a bit of testing, the power management seems to be better, but it is far from perfect. I don't see any anomaly in the temperature sensors (except for ath10k_hwmon-pci(?!??)), but my notebook is definitely warm if I hold it on my lap.
This is much better on Win10, I don't know why.

mosomaci@pc:~$ sensors
k10temp-pci-00c3
Adapter: PCI adapter
Tdie: +55.0°C (high = +70.0°C)
Tctl: +55.0°C

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx: +0.81 V
fan1: N/A
temp1: +50.0°C (crit = +104000.0°C, hyst = -273.1°C)
power1: 1.13 kW (cap = 28.00 W)

ath10k_hwmon-pci-0300
Adapter: PCI adapter
temp1: +91.0°C

amdgpu-pci-0400
Adapter: PCI adapter
vddgfx: N/A
vddnb: N/A
fan1: N/A
temp1: +55.0°C (crit = +80.0°C, hyst = +0.0°C)
power1: N/A
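
Some values in this output look like reporting glitches rather than real readings, for instance the crit limit of +104000.0°C and the 1.13 kW power figure against a 28 W cap. A minimal sketch for spotting such limits automatically is below; `flag_bad_limits` and its 150 degree default are assumptions made for illustration only.

```shell
# flag_bad_limits [MAX]
# Reads `sensors` output on stdin and prints every line whose "crit" limit
# is above MAX degrees (default 150).
flag_bad_limits() {
    awk -v max="${1:-150}" '
        match($0, /crit = \+[0-9.]+/) {
            # skip the 8 characters of "crit = +" to get the number
            crit = substr($0, RSTART + 8, RLENGTH - 8) + 0
            if (crit > max) print
        }'
}
```

With `sensors | flag_bad_limits`, only the first amdgpu temp1 line from the output above would be printed.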

Could it be that our APIC fix is not a perfect solution for this problem? I know that the DSDT is totally broken:

[ 0.088280] ACPI: Added _OSI(Module Device)
[ 0.088280] ACPI: Added _OSI(Processor Device)
[ 0.088280] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 0.088280] ACPI: Added _OSI(Processor Aggregator Device)
[ 0.088280] ACPI: Added _OSI(Linux-Dell-Video)
[ 0.092591] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
[ 0.100296] ACPI BIOS Error (bug): Failure creating [\_SB.PCI0.LPC0.EC0._Q46], AE_ALREADY_EXISTS (20180531/dswload2-316)
[ 0.100309] ACPI Error: AE_ALREADY_EXISTS, During name lookup/catalog (20180531/psobject-221)
[ 0.100313] ACPI Error: Ignore error and continue table load (20180531/psobject-604)
[ 0.100321] ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.LPC0.EC0.UX**], AE_NOT_FOUND (20180531/psargs-330)
[ 0.100326] ACPI Error: Ignore error and continue table load (20180531/psobject-604)
[ 0.100332] ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.LPC0.EC0.M000], AE_NOT_FOUND (20180531/psargs-330)
[ 0.100336] ACPI Error: Ignore error and continue table load (20180531/psobject-604)
[ 0.100343] ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.LPC0.EC0.M049], AE_NOT_FOUND (20180531/psargs-330)
[ 0.100347] ACPI Error: Ignore error and continue table load (20180531/psobject-604)
[ 0.100353] ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.LPC0.EC0.M280], AE_NOT_FOUND (20180531/psargs-330)
[ 0.100357] ACPI Error: Ignore error and continue table load (20180531/psobject-604)
[ 0.100364] ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.LPC0.EC0.M009], AE_NOT_FOUND (20180531/psargs-330)
[ 0.100369] ACPI Error: Ignore error and continue table load (20180531/psobject-604)
[ 0.100372] ACPI Error: Skipping While/If block (20180531/psloop-594)
[ 0.100378] ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.LPC0.EC0.M000], AE_NOT_FOUND (20180531/psargs-330)
[ 0.100383] ACPI Error: Ignore error and continue table load (20180531/psobject-604)
[ 0.100390] ACPI Error: Cannot release Mutex [QMUX], not acquired (20180531/exmutex-359)
[ 0.100394] ACPI Error: Ignore error and continue table load (20180531/psobject-604)
[ 0.100402] ACPI BIOS Error (bug): Could not resolve [\_SB.PCI0.GPP2.BCM5], AE_NOT_FOUND (20180531...

summary: - Acer Aspire A315 ACPI failure on Ubuntu 18.04, kernel hangs, can't load,
- kernel freeze (AMD Ryzen 5/Radeon/Raven)
+ Acer Aspire A315 IOAPIC failure on Ubuntu 18.04, kernel hangs, can't
+ load, kernel freeze (AMD Ryzen 5/Radeon/Raven) / AMDGPU Hybrid crash
tags: added: patch
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
Changed in linux-firmware (Ubuntu):
status: New → Confirmed

Created attachment 278985
diff good.config bad.config

Here is a diff of the two configs.
Created by doing as below, and then diff -aur good bad | grep -E '^(\+|-)'
cat .config-good | grep -Ev '^#' | grep -Ev '^\s*$' |sort > /tmp/.config-good
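
The same comparison can be wrapped into one reusable function. This is a sketch of the procedure described above, not the exact commands used for the attachment; `normalize_config` and `config_diff` are hypothetical names.

```shell
# normalize_config FILE
# Drops comment and blank lines from a kernel config and sorts the rest.
normalize_config() {
    grep -Ev '^(#|[[:space:]]*$)' "$1" | sort
}

# config_diff GOOD BAD
# Prints only the option lines that differ, prefixed with - (GOOD) or + (BAD).
config_diff() {
    a=$(mktemp)
    b=$(mktemp)
    normalize_config "$1" > "$a"
    normalize_config "$2" > "$b"
    diff -u "$a" "$b" | grep -E '^[+-][^+-]'   # skip the ---/+++ header lines
    rm -f "$a" "$b"
}
```

Because comment lines are dropped, an option that is "not set" in one config shows up as a single - or + line for the other config rather than as a pair.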

Created attachment 278987
Hacked AML tables vs good kernel compile config

Which is better?

suspend/resume on non graphical tty2 never crashes, suspend /resume on graphical x session sometimes causes screen freeze and you have to reboot the laptop.

i disabled polkit auth agent from openbox autostart and now the laptop sleeps like a baby, it was unrelated to acpi.

actually ignore my previous post i managed to crash suspend /sleep at a non graphical session, this could still be related to acpi or drivers.

Created attachment 279023
sleep dmesgs before/after working/crashed

different dmesg before/after successful suspend and suspend freeze, it seems like a cpu issue.

siyia: I see you only have rcu_nocbs=0-3 but there are 8 logical cores. Try seeing if using `rcu_nocbs=0-7 idle=nomwait` helps. Both those together fixed my system lockups. With just rcu_nocbs for all my cores I still got lockups (and you don't have it enabled for all cores). Ryzen Errata: https://support.amd.com/TechDocs/55449_Fam_17h_M_00h-0Fh_Rev_Guide.pdf
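
Since the rcu_nocbs range depends on the number of logical CPUs, it can be derived instead of typed by hand. A minimal sketch, assuming the CPUs are numbered contiguously from 0; `rcu_nocbs_param` is a hypothetical helper name.

```shell
# rcu_nocbs_param [NCPUS]
# Prints the two parameters suggested above for NCPUS logical CPUs.
# NCPUS defaults to the number of CPUs currently online.
rcu_nocbs_param() {
    n=${1:-$(getconf _NPROCESSORS_ONLN)}
    printf 'rcu_nocbs=0-%s idle=nomwait\n' "$((n - 1))"
}
```

On a machine with 8 logical cores this prints `rcu_nocbs=0-7 idle=nomwait`, matching the suggestion in the comment above.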

will post back soon after testing parameters

the cpu is ryzen 3 2200u 2 cores 4 threads.

unfortunately it didn't help.

idle=nomwait however activates the cpu power save feature, lol wtf?
here i had a bug report https://bugzilla.kernel.org/show_bug.cgi?id=201045

i only get lockups after/before suspend sometimes, otherwise the laptop is rock solid. however, thanks Samantha for the idle=nomwait boot parameter, it fixed the cpu power save feature.

siyia: Only thing else I'd think to try would be this: https://gist.github.com/60b73ff4e6ce901d09f9a8025826cb4a It must be run as root and you must have `msr-tools` installed.

I wrote it just now based on https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069799.html It sets some MSR registers that AMD specified as mitigations for some of the errata.

Let me know if that changes anything. (also I'm glad the kernel options somehow fixed the turbo issue for you).

your script didn't help. about the powersave: it was enabled by adding amdgpu to the modules in mkinitcpio.conf and then updating the initramfs with sudo mkinitcpio -p linux; that fixed it. however, after waking from suspend the cpu runs at turbo frequency again.

siyia: I seem to be getting freezes during suspend too (though you seem to be getting them more often than I do). Some people with Ryzen had their crashes fixed by disabling the C6 power state. Since I use a laptop that was mostly a non-option for me, but I wrote a script so systemd will disable C6 sleep before suspend and then enable it again after suspend (so the CPU doesn't happen to be in the C6 state around suspend time).

May or may not work. Since it has only happened every once in a while for me, I may not know for several days if the fix worked or not. You can test it out by putting https://gist.github.com/samcv/0b6a915aadcddc0e19640c20d9dd3164 as
/usr/lib/systemd/system-sleep/disable-enable-c6-state.sh and doing `chmod +x /usr/lib/systemd/system-sleep/disable-enable-c6-state.sh`. You will need to download https://github.com/r4m0n/ZenStates-Linux/blob/master/zenstates.py and then set the ZENSTATES variable in my `disable-enable-c6-state.sh` script to wherever you put zenstates.py. If the script is working you should get an output from `journalctl -b 0 | grep -Ei '(enabled|disabled)\s*c6'` after you have done a suspend/resume cycle. If that doesn't fix it, your issue (and possibly mine depending on how my results go) should probably have their own bug filed.
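
For readers who cannot open the gist, the core of such a sleep hook might look like the sketch below. It is not the gist's actual content. systemd runs hooks in system-sleep with `pre` or `post` as the first argument; the `--c6-disable` and `--c6-enable` flags are assumed to be what zenstates.py accepts and should be checked against its `--help` output. `c6_flag` is a hypothetical helper name.

```shell
# c6_flag PHASE
# Maps the sleep-hook phase passed by systemd to a zenstates.py flag.
c6_flag() {
    case $1 in
        pre)  printf '%s\n' --c6-disable ;;   # about to suspend: keep cores out of C6
        post) printf '%s\n' --c6-enable ;;    # resumed: allow C6 again
        *)    return 1 ;;                     # unknown phase: do nothing
    esac
}

# In a real hook the flag would then be handed to zenstates.py, e.g.:
#   flag=$(c6_flag "$1") && "$ZENSTATES" "$flag"
```

Keeping the mapping in a separate function makes it possible to test the hook logic without root and without actually suspending the machine.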

still freezes with c6 disabled, only anomaly I can detect is that after resuming from sleep cpupowersave is disabled and cpu runs at turbo frequency. only way to reverse this is to reboot and have the amdgpu module initiated early in km.

fedora 29 requires noapic only for installation, after which it boots without any parameters and no acpi errors; suspend-resume and cpu powersave work flawlessly.

Created attachment 279211
dmesg from fedora 29 with no acpi errors

everything works flawlessly on fedora 29

Hmm, maybe a different configuration of the kernel or a different version? Not sure what kernel Fedora uses.

BTW the latest 1.05 BIOS update on my Lenovo A485 fixes the underlying BIOS issue, so my system doesn't suffer from this issue anymore (doesn't mean a kernel fix wouldn't be a good idea, since not all OEM's are good about fixing issues for "unsupported platforms").

i just cannot understand how the fedora kernel config can fix the buggy acer bios. i mean the tables are completely broken, yet under fedora they load without any error. can this be replicated upstream? fedora 29 uses kernel 4.18.16.

Is there a way to build/use the Fedora kernel under Arch or Ubuntu? Could be a workaround for some time.

we could use the same config. if we use the same config and the problem persists, then it is probably a fedora kernel patch that fixes the issues with the Acer Aspire A315-41G series.

Created attachment 279239
fedora kernel config for linux 4.18.6

You can use it to build a kernel with the arch build system and test if it works in archlinux.

arch kernel should also be 4.18.6

sorry i meant kernel * 4.18.16

(In reply to siyia from comment #55)
> i just cannot understand how fedora kernel config can fix buggy acer bios?i
> mean the tables are completely broken,yet under fedora the load without any
> error,can this be replicated upstream,fedora 29 uses kernel 4.18.16.

Does Fedora 29 really not show any errors?
On 28 there were two lines in dmesg, "ioapic[4] not in ivrs table", which are present in the screenshot attached to this bug report.
And the third line was something like "switching irq routing to physical flat" (sorry, I don't remember exactly). I did not find anything useful about this "physical flat" irq mapping mode, but it seems similar to noapic.

no such errors. i've uploaded the fedora 29 dmesg, please check it for yourself just to be on the safe side.

the only real workaround is to use fedora 29, it's being released tomorrow

Compiled manjaro kernel with fedora config. Looks like fedora patches do something to kernel

what they did works only after installing and booting fedora on bare metal; the live install cd still requires noapic

installing fedora with ivrs_ioapic[4]=00:14.0 ivrs_ioapic[5]=00:00.2 instead of noapic produces the same soft lockup that ubuntu/arch gets. installing it with noapic allows you to boot without any custom parameters and acpi seems to work well.

ok, i figured this out. it's more like fedora has acpi working on these laptops with noapic. other distros with noapic cannot sleep but fedora can.

My device (a315-41-R19S) can sleep on Ubuntu 18.04 LTS with the default kernel 4.15. But it has no dGPU, maybe this is the reason...
It has the noapic kernel boot parameter and factory BIOS 1.03.

(In reply to Another User from comment #68)
> Mine device (a315-41-R19S) can sleep on Ububtu 18.04 LTS with default kernel
> 4.15. But it have no dGPU - maybe this is the reason...
> Have noapic kernel boot parameter and factory bios 1.03.

I'm using an A315-41-R8XR, which doesn't have a dGPU and yet can't sleep with noapic. Could you please specify which wi-fi adapter (lspci -k) you are using?

My wi-fi:
02:00.0 Network controller: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter (rev 31)
 Subsystem: Lite-On Communications Inc QCA9377 802.11ac Wireless Network Adapter
 Kernel driver in use: ath10k_pci
 Kernel modules: ath10k_pci

Also, in dmesg I have errors for this module (firmware load failed), but wifi and bluetooth work fine.

02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTL8411B PCI Express Card Reader (rev 01)
 Subsystem: Acer Incorporated [ALI] Device 1259
 Kernel driver in use: rtsx_pci
 Kernel modules: rtsx_pci
02:00.1 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 12)
 Subsystem: Acer Incorporated [ALI] Device 1259
 Kernel driver in use: r8169
 Kernel modules: r8169
03:00.0 Network controller: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter (rev 31)
 Subsystem: Lite-On Communications Inc Device 08a6
 Kernel driver in use: ath10k_pci
 Kernel modules: ath10k_pci

everything works as expected with fedora 29 and noapic,i can sleep resume without crashes and cpu powersave works,however i cannot reproduce the same behavior in other distros.

it is worth noting that i am using fedora xfce4 spin, not the workstation edition

you (bountou) wrote :

I never did a battery reset, but it has been the same since the day I bought it. I've tried different BIOS versions, resetting the BIOS, and reinstalling the entire SSD, and the battery still drains fast on Linux (so it should not be linked to my battery but to Linux).

Another User (another-user) wrote :

You may check the CPU frequency and try to cap it via cpupower. But the high power drain persists in Windows too, so this is not purely a Linux problem.

To be honest, I don't believe a battery reset helps, but who knows...
There was another (CPU-related) problem with another Acer laptop that was fixed that way. I found this while searching for a solution to the ACPI issue in this topic:
https://forums.gentoo.org/viewtopic-t-1081448.html
