r8169 no internet after suspending

Bug #1779817 reported by Danute
This bug affects 23 people
Affects             Status         Importance  Assigned to    Milestone
linux (Ubuntu)      Fix Committed  Medium      Kai-Heng Feng
  Bionic            Fix Released   Undecided   Unassigned
linux-oem (Ubuntu)  Fix Released   Undecided   Unassigned
  Bionic            Fix Released   Undecided   Unassigned

Bug Description

===SRU Justification===
[Impact]
r8169 fails to establish a connection after the fix for LP: #1752772
landed.

[Fix]
Accept BIOS WoL settings again, and disable MSI-X for certain chip
revisions.

[Test]
Users confirmed the fix worked for them.

[Regression Potential]
Low. This brings back the old behavior for both WoL and MSI.

===Original Bug Report===
When my computer wakes up from suspend, there's no internet. Unplugging and replugging the cable doesn't work, and restarting the network service doesn't work either. Internet only comes back after restarting the computer.
It only started happening after I freshly installed Ubuntu Budgie (18.04) and did all the system updates. Before that I was using Ubuntu with Unity (16.04), and there were no problems with my internet.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-24-generic 4.15.0-24.26
ProcVersionSignature: Ubuntu 4.15.0-24.26-generic 4.15.18
Uname: Linux 4.15.0-24-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.2
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC2: minihydra 2085 F.... pulseaudio
 /dev/snd/controlC0: minihydra 2085 F.... pulseaudio
 /dev/snd/controlC1: minihydra 2085 F.... pulseaudio
CurrentDesktop: Budgie:GNOME
Date: Tue Jul 3 10:13:19 2018
HibernationDevice: RESUME=UUID=3747bab8-c258-4600-bc24-5d1f56a642dd
InstallationDate: Installed on 2018-07-02 (1 days ago)
InstallationMedia: Ubuntu-Budgie 18.04 LTS "Bionic Beaver" - Release amd64 (20180426)
IwConfig:
 enp2s0 no wireless extensions.

 lo no wireless extensions.
MachineType: System manufacturer System Product Name
ProcFB: 0 nouveaufb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-24-generic root=UUID=a5ceb36e-76f7-4bd4-a37b-0b91b995c635 ro quiet splash vt.handoff=1
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-24-generic N/A
 linux-backports-modules-4.15.0-24-generic N/A
 linux-firmware 1.173.1
RfKill:

SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/28/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2103
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: M4A77T
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr2103:bd06/28/2010:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKComputerINC.:rnM4A77T:rvrRevX.0x:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.family: To Be Filled By O.E.M.
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Danute (d-informatika) wrote :
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote : Re: no internet after suspending

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.18 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18-rc3

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Danute (d-informatika) wrote :

I tested with kernel linux-modules-4.18.0-041800rc3-generic_4.18.0-041800rc3.201807012030_amd64 and it's fixed. Thank you.

tags: added: kernel-fixed-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Kai-Heng Feng (kaihengfeng) wrote :
Jan Rathmann (kaiserclaudius) wrote :

Kai-Heng:
I have tried the kernel under https://people.canonical.com/~khfeng/lp1779817/ and I see no change on my system (bug still there; reloading network driver after resume from suspend is still necessary).

Kai-Heng Feng (kaihengfeng) wrote :

Jan, is your issue fixed by mainline kernel v4.18-rc3?

Jan Rathmann (kaiserclaudius) wrote :

Kai-Heng, I have tested mainline kernel 4.18-rc3 from http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18-rc3/ and it seems not to fix the issue.

craig p hicks (craigphicks) wrote :

Search for "r8168 for 4.15 kernel" and you will see this problem reported across many Linux distributions.

I am trying to follow

https://www.unixblogger.com/2016/08/11/how-to-get-your-realtek-rtl8111rtl8168-working-updated-guide/

and change to the "more stable" r8168. However, neither the Ubuntu repo version of r8168 nor the Realtek download of r8168 compiles cleanly. It seems kernel 4.15 included changes to an interface:

what was previously "setup_timer" is now "timer_setup"

but it might not be as simple as a name change. I'm looking at this patch for a different project (not r8168) : http://patchwork.dpdk.org/patch/31739/ . You will see more logic there than a simple name change. (It could be an unrelated change, I can't tell.)

So maybe the 4.15 update for the r8169 driver didn't get the necessary logic changes and that is why there is a bug (?)

I'm trying to compile the Realtek r8168 version now, just using pointer casts to get it to compile.

Does @kaihengfeng have information about changes to r8169? Ubuntu has a repo r8168-dkms but (in my setup) it has many more compile errors than realtek's r8168 :(

Kai-Heng Feng (kaihengfeng) wrote :

I've finished backporting all r8169 changes from net-next to 18.04 kernel. Please give it a try:
https://people.canonical.com/~khfeng/lp1779817-2/

Jan Rathmann (kaiserclaudius) wrote :

I have tested https://people.canonical.com/~khfeng/lp1779817-2/ , and it still does not fix the bug for me.

Steve Dodd (anarchetic) wrote :

I'm not sure if I'm in the right bug thread, but 4.15.0-24 stopped my r8168/9 NIC working after suspend; it was fine in prior versions. I have tested Kai's lp1779817-2 kernel above but it does not help.

01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)

Kai-Heng Feng (kaihengfeng) wrote :

If we are sure that r8169 works on an older kernel but not on the latest one, we can use kernel bisection to find which commit causes the regression.

First, find the last good -rc kernel and the first bad -rc kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/

Then,
$ sudo apt build-dep linux
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
$ cd linux
$ git bisect start
$ git bisect good <the good version you found>
$ git bisect bad <the bad version you found>
$ make localmodconfig
$ make -j$(nproc) deb-pkg
Install the newly built kernel, then reboot with it.
If the issue still happens,
$ git bisect bad
Otherwise,
$ git bisect good
Repeat from "make -j$(nproc) deb-pkg" until you find the commit that causes the regression.

Mateusln (mateusln) wrote :

I'm in the same situation as https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779817/comments/12

Same Ethernet model, using Kubuntu 18.04. It broke after the kernel update to 4.15.0-24; if I boot into 4.15.0-23 it works.

Jan Rathmann (kaiserclaudius) wrote :

Over the last few days I tried to bisect the commit that caused the regression, but I'm not sure if the result is correct.

According to git the bad commit is b489141369f78ead6ed540cff29ac1974852cd7f

Can anyone confirm?

Kai-Heng Feng (kaihengfeng) wrote :

Alright, I'll build a Bionic kernel without this commit.

Changed in linux (Ubuntu):
assignee: nobody → Kai-Heng Feng (kaihengfeng)
summary: - no internet after suspending
+ r8169 no internet after suspending
Albert Astals Cid (aacid) wrote :
Kai-Heng Feng (kaihengfeng) wrote :

Bionic's kernel doesn't have commit b489141369f78ead6ed540cff29ac1974852cd7f.

Kai-Heng Feng (kaihengfeng) wrote :
Jan Rathmann (kaiserclaudius) wrote :

This patch did help; my network comes up again after suspend with that kernel version. Thanks a lot!

Erik Kallen (erikkallen) wrote :

I have tried both kernels (lp1779817 and mainline 4.18-rc6), but the problem is still there for me. Also, for me

modprobe -r r8169 && modprobe r8169

has not worked with any of the kernels.

My card is:

02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)

Please let me know if there is any information I can send that will help.

Kai-Heng Feng (kaihengfeng) wrote :

Erik, did the ethernet work in the previous kernel version?

Steve Dodd (anarchetic) wrote :

lp1779817 kernel from #19 doesn't help me, last working kernel is 4.15.0-23.

Jan Rathmann (kaiserclaudius) wrote :

I have to correct myself a bit: the network only comes up correctly after suspend some of the time. In the cases where it doesn't, the driver still has to be reloaded.
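
For anyone relying on the reload workaround until a fixed kernel lands, the reload can be automated with a systemd sleep hook. The following is only a minimal sketch and was not posted by anyone in this thread. It assumes the script is installed as an executable file under /lib/systemd/system-sleep/ (the file name is arbitrary), where systemd calls it with "pre" or "post" as the first argument. MODPROBE is a hypothetical override variable, added only so the logic can be dry-run without touching the driver.

```sh
#!/bin/sh
# Hypothetical systemd sleep hook: reload r8169 after resume.
# systemd-sleep runs hooks with $1 = pre|post and $2 = the sleep type.

# MODPROBE is an illustrative override, not a standard variable.
MODPROBE="${MODPROBE:-modprobe}"

reload_r8169_on_resume() {
    case "$1" in
        post)
            # Same two commands as the manual workaround used in this thread.
            "$MODPROBE" -r r8169 && "$MODPROBE" r8169
            ;;
        *)
            # Nothing to do before suspending.
            return 0
            ;;
    esac
}

# When systemd runs the hook it passes arguments; forward them.
if [ "$#" -gt 0 ]; then
    reload_r8169_on_resume "$@"
fi
```

This only papers over the problem: as reported above, on some machines reloading the driver does not bring the link back at all.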

Steve Dodd (anarchetic) wrote :

A bit off-topic, but: on a third machine of mine, booting into Bionic leaves the network card in a state where WoL won't work at all, even after rebooting back into Trusty (3.13.0). I have to power cycle the machine, then WoL is fine. (r8169 driver again, obviously.)

John (johndoe22)
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
status: Incomplete → Confirmed
Kai-Heng Feng (kaihengfeng) wrote :

Jan, so with 4.15.0-23, the NIC comes back every time the system suspends and resumes?

Steve, Erik, let's discuss your issue on LP: #1784542, since this bug is limited to systemd suspend.

asche (j-launchpad-u) wrote :

I have the same issue with r8169 and Kubuntu 18.04.
With kernel versions higher than 4.15.0-23 the network does not come back after suspend.

Jan Rathmann (kaiserclaudius) wrote :

@Kai-Heng
Weird, I reinstalled and tested 4.15.0-23 again, and it seems that with 4.15.0-23 it can also happen that network is not coming up after suspend and the driver has to be reloaded.
I'm going to further test this on another installation.

Jan Rathmann (kaiserclaudius) wrote :

Ok, I can confirm what I wrote in #28 on another installation.

This is very hard for me to track down, since bisecting obviously didn't work.

Here is a summary of what I found out about this bug:

- Artful (or any other previous Ubuntu version) was not affected (at least not to that extent, see below).
- Debian 9 is also not affected.
- On previous Ubuntu versions, I could observe something regarding networking and suspend that may also be a bug, but it had a much less severe impact on daily usage: in ~1 out of 10 cases, the network connection was down after suspend, but it could relatively easily be enabled again by deactivating the network and activating it again (either via the graphical NetworkManager icon or via the terminal commands 'nmcli networking off|on'). The difference from the current bug is that it was not necessary to reload the kernel driver.
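
The milder workaround from the last bullet can be wrapped in a small helper. This is just a sketch of the two nmcli commands named above; the NMCLI override variable and the pause between the two calls are assumptions added for illustration.

```sh
#!/bin/sh
# NMCLI is an illustrative override, not a standard variable.
NMCLI="${NMCLI:-nmcli}"

restart_networking() {
    # Equivalent to 'nmcli networking off' followed by 'nmcli networking on'.
    "$NMCLI" networking off || return 1
    # Short pause between the two calls; the 2-second default is arbitrary.
    sleep "${1:-2}"
    "$NMCLI" networking on
}

# Example: restart_networking
```

If this brings the connection back, the machine is probably showing the older, milder behaviour rather than the regression tracked here, which needs a driver reload or more.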

Steve Dodd (anarchetic) wrote : Re: [Bug 1779817] Re: r8169 no internet after suspending

I still think there are multiple bugs here. Some of us have a network that
absolutely will not come back up after suspend with kernel versions later
than 4.15.0-23, even after reloading the driver. So there is a very clear
regression that I would really like to see resolved / rolled back. Breaking
stuff in an LTS release for multiple users isn't much fun :(

Kai-Heng Feng (kaihengfeng) wrote :

The commits that cause the regression are fixes for another issue, LP: #1752772, so we can't simply revert them because that would regress other users.

Kai-Heng Feng (kaihengfeng) wrote :
Jan Rathmann (kaiserclaudius) wrote :

@Kai-Heng
With this kernel version, unfortunately my network doesn't come up at all (even after a fresh power on; reloading the driver does not help).

Kai-Heng Feng (kaihengfeng) wrote :
Jan Rathmann (kaiserclaudius) wrote :

@Kai-Heng
I did that. With net-next directly, it seems that everything is working again at least as well as in 4.15.0-23 (the network card comes up at boot, and also after suspend).

So hopefully this got fixed in net-next.

Erik Kallen (erikkallen) wrote :

@Kai-Heng, sorry for my late response to your question in #22 (holidays). For me everything worked on 4.14.0-23 and stopped working after that.

Erik Kallen (erikkallen) wrote :

I also tried the net-next kernel from Kai-Heng and I have the same issue: no network from boot.

I tried 4.17.12 using ukuu and I get the following firmware load error:

[ 43.570575] Bluetooth: hci0: rtl: examining hci_ver=06 hci_rev=000a lmp_ver=06 lmp_subver=8821
[ 43.570577] Bluetooth: hci0: rtl: loading rtl_bt/rtl8821a_config.bin
[ 43.570586] bluetooth hci0: Direct firmware load for rtl_bt/rtl8821a_config.bin failed with error -2
[ 43.570588] Bluetooth: hci0: rtl: loading rtl_bt/rtl8821a_fw.bin
[ 43.571398] done.
[ 43.571604] Bluetooth: hci0: rom_version status=0 version=1
[ 43.571610] Bluetooth: hci0: cfg_sz 0, total size 17428
[ 43.575798] rfkill: input handler enabled
[ 43.636062] PM: suspend exit
[ 43.738541] rfkill: input handler disabled
[ 43.742040] IPv6: ADDRCONF(NETDEV_UP): enp2s0: link is not ready
[ 43.816019] r8169 0000:02:00.0 enp2s0: link down
[ 43.816083] IPv6: ADDRCONF(NETDEV_UP): enp2s0: link is not ready

Steve Dodd (anarchetic) wrote :

I've just built net-next direct from git and still have the same issue - the NIC doesn't come back after suspend, even with rmmod / modprobe.

Is there any upstream awareness of this? Bug tracker and/or mailing list discussion?

Kai-Heng Feng (kaihengfeng) wrote :

Jan, I made a new backport based on net-next, please try it out:

https://people.canonical.com/~khfeng/bionic-r8169-backport2/

Erik Kallen (erikkallen) wrote :

@Kai-Heng your last kernel seems to have done the trick for me!
My network comes back up after standby.

Also, it installs nicely next to my current kernels. Awesome!

Jan Rathmann (kaiserclaudius) wrote :

@Kai-Heng
I can confirm what Erik says - my network card seems to come up properly after suspend with the backported kernel in #39.

Steve Dodd (anarchetic) wrote :

I can't even boot Kai's latest kernel! The system hard locks early in the
boot process. I have tried blacklisting r8169 and rebuilding the initramfs,
and I'm pretty sure it is not loaded by the point at which it hangs. My
hardware is cursed...

Michael Eischer (eischer) wrote :

@Kai-Heng your latest kernel completely breaks the network using r8169 for me. The system sort of manages to complete startup but fails to attach an address to the network card. This is accompanied by lots of console output similar to this:

```
Aug 7 14:46:28 kernel: [ 25.761050] r8169 0000:05:00.0 eth0: rtl_counters_cond == 1 (loop: 1000, delay: 10).
Aug 7 14:46:28 kernel: [ 25.771937] r8169 0000:05:00.0 eth0: rtl_counters_cond == 1 (loop: 1000, delay: 10).
Aug 7 14:46:28 kernel: [ 26.068062] r8169 0000:05:00.0 eth0: rtl_ocpar_cond == 1 (loop: 100, delay: 1000).
Aug 7 14:46:28 kernel: [ 26.168057] r8169 0000:05:00.0 eth0: rtl_ocpar_cond == 1 (loop: 100, delay: 1000).
Aug 7 14:46:28 kernel: [ 26.268060] r8169 0000:05:00.0 eth0: rtl_ocpar_cond == 1 (loop: 100, delay: 1000).
Aug 7 14:46:28 kernel: [ 26.368055] r8169 0000:05:00.0 eth0: rtl_ocpar_cond == 1 (loop: 100, delay: 1000).
```

Using the standard bionic kernel the network works as long as I run `sudo ethtool -s eth0 wol g` between reboots. Otherwise the network card just vanishes (not visible in BIOS and Linux) until I unplug the computer for a minute or so.
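
A minimal sketch of how that per-boot step could be made conditional: read the current Wake-on mode from the `ethtool <iface>` output and only issue `ethtool -s <iface> wol g` when the magic-packet flag is missing. The function names and the ETHTOOL override variable are hypothetical, and it assumes the usual "Wake-on: <flags>" line in ethtool's output.

```sh
#!/bin/sh
# ETHTOOL is an illustrative override, not a standard variable.
ETHTOOL="${ETHTOOL:-ethtool}"

wol_mode() {
    # Reads 'ethtool <iface>' output on stdin and prints the current
    # Wake-on flags. The "Supports Wake-on:" line is deliberately skipped.
    sed -n 's/^[[:space:]]*Wake-on:[[:space:]]*//p'
}

ensure_wol_magic() {
    iface="$1"
    mode="$("$ETHTOOL" "$iface" 2>/dev/null | wol_mode)"
    case "$mode" in
        *g*) return 0 ;;                        # magic packet already enabled
        *)   "$ETHTOOL" -s "$iface" wol g ;;    # same command as above
    esac
}

# Example (as root): ensure_wol_magic eth0
```

Changing the setting needs root, and the interface name eth0 is simply the one from the log above.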

Steve Dodd (anarchetic) wrote :

Progress!

I have built ubuntu-bionic.git myself and discovered the only commit I have to revert to get things working again is this one:

http://kernel.ubuntu.com/git/ubuntu/ubuntu-bionic.git/commit/?id=41450d46ba126887b9548cfcdf99957da5418ca5

@Kai, does that help you at all? If nothing else might help classify the different types of problem we are seeing here. Happy to open a new bug if that helps unclutter / clarify things.

Heiner Kallweit (kalle1) wrote :

Apart from switching to a more up-to-date API, after this commit MSI-X will be used if available. Maybe your system has a problem with MSI-X. If you keep the commit and just make the following change, does it fix the issue?

Replace
flags = PCI_IRQ_ALL_TYPES;
with
flags = PCI_IRQ_LEGACY | PCI_IRQ_MSI;

Also it would be important to know which exact chip version you have. Can you provide a full dmesg output?
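
For readers who want to pull just the chip identification out of a long dmesg, here is a small sketch. It assumes the r8169 probe line has the form "r8169 <pci-address> <iface>: RTL... at ..., <mac>, XID <hex> IRQ <n>", as printed by the 4.15-era driver quoted in this thread; other kernel versions may format the line differently. The function name is hypothetical.

```sh
#!/bin/sh
r8169_chip_info() {
    # Reads kernel log text on stdin and prints "<chip name> <XID>" for
    # every r8169 probe line found. Lines that do not match are ignored.
    sed -n 's/.*r8169 [^ ]* [^ ]*: \(RTL[^ ]*\) at .* XID \([0-9a-fA-F]*\).*/\1 \2/p'
}

# Example: dmesg | r8169_chip_info
```

The XID value is what distinguishes chip revisions that share the same PCI ID, which is why the full log line is more useful than the lspci output alone.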

Heiner Kallweit (kalle1) wrote :

One more hint: I found a few reports where people had problems (independent of what is being discussed here) with MSI-X when VT-d is disabled in the BIOS. Could you check this as well?

Steve Dodd (anarchetic) wrote :

I'm about to dash out, but will try the above change later. Meanwhile,
relevant bits of dmesg:

[ 4.440185] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[ 4.445120] r8169 0000:01:00.0: enabling device (0000 -> 0003)
[ 4.458178] r8169 0000:01:00.0 eth0: RTL8168g/8111g at 0x (ptrval), 84:39:be:67:5d:2f, XID 0c000800 IRQ 125
[ 4.458181] r8169 0000:01:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]

lspci:

00:00.0 Host bridge: Intel Corporation Celeron N3350/Pentium N4200/Atom E3900 Series Host Bridge (rev 0b)
..
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)

The box itself is an Apollo Lake SoC fanless mini PC, if that helps..?

Steve Dodd (anarchetic) wrote :

Hmm, no "IOMMU" messages in dmesg, and the CPU is supposed to support it,
so I guess it is disabled,
however this machine really doesn't have a BIOS to speak of, in terms of
configurable settings anyway :(

Steve Dodd (anarchetic) wrote :

@kalle1

Your one liner seems to fix things - thank you! Also discovered I can work around by writing 0 to /sys/devices/pci/<blah>/msi_bus, which will at least mean I can run unpatched Ubuntu kernels until we figure this out :)
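
The sysfs workaround can be sketched as a small helper that takes a PCI address and writes 0 to that device's msi_bus attribute. This is an illustration only: the path is left elided in the comment above, so the sketch assumes the equivalent /sys/bus/pci/devices/<address>/msi_bus location. The function name and the SYSFS_ROOT override variable are hypothetical. As far as the kernel's sysfs-bus-pci documentation describes it, the setting only applies to drivers bound afterwards, so it would have to be written before r8169 loads, or the driver reloaded.

```sh
#!/bin/sh
# SYSFS_ROOT is an illustrative override so the helper can be tried
# against a scratch directory instead of the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

disable_msi_for_device() {
    # $1 = PCI address as shown by 'lspci -D', e.g. 0000:01:00.0
    attr="$SYSFS_ROOT/bus/pci/devices/$1/msi_bus"
    if [ ! -e "$attr" ]; then
        echo "no msi_bus attribute for $1" >&2
        return 1
    fi
    # Writing 0 disallows MSI and MSI-X for drivers bound to the device later.
    echo 0 > "$attr"
}

# Example (as root): disable_msi_for_device 0000:01:00.0
```

Note that this turns off plain MSI as well as MSI-X for the device, so the NIC falls back to legacy interrupts.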

Anybody still having problems and wanting to test if this is their issue is welcome to try the kernels I built:

https://www.dropbox.com/sh/wq48inm7z2ycz41/AAA9o7yflDT-y0nR9hmX5u89a?dl=0&lst=

Heiner Kallweit (kalle1) wrote :

Good that it's fixed for you. Indeed your BIOS seems to be broken with regard to MSI-X. According to the vendor part of the MAC address your mini PC is some no-name China product?

Steve Dodd (anarchetic) wrote :

Yup, afraid so, Beelink S1: https://www.amazon.co.uk/dp/B077HKCT78 - I'm
basically using it as an old-fashioned X Terminal. They promised a BIOS
update to "make it work with Linux" but I never got it to install and they
stopped answering my emails :( Happy to put it down to broken hardware so
long as I have a workaround, but is there anything else we can check to
confirm that? No noticeable problems with any other part of the system, but
I guess everything else is part of the SoC, so this is probably the only
PCIe device I'm actually using..

Steve Dodd (anarchetic) wrote :

Just double-checked with lspci -vvv and indeed the Realtek NIC is the only
MSI-X capable device. Is there any way to disable MSI-X globally but leave
MSI on? All I can find is pci=nomsi which seems a bit heavy handed..
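
The same check can be repeated on other machines with a short filter over the `lspci -vvv` output. This is a sketch, assuming lspci's usual layout: an unindented header line per device starting with its address, followed by indented "Capabilities: [..] MSI-X: Enable+/-" lines. The function name is hypothetical.

```sh
#!/bin/sh
msix_devices() {
    # Reads 'lspci -vvv' output on stdin and prints one line per device
    # that advertises an MSI-X capability, with its current enable state.
    awk '
        /^[0-9a-fA-F]/ { slot = $1 }    # device header line: remember the address
        /MSI-X: Enable/ {
            state = ($0 ~ /MSI-X: Enable\+/) ? "enabled" : "disabled"
            print slot, "MSI-X", state
        }
    '
}

# Example (as root, so the capability list is readable): lspci -vvv | msix_devices
```

Without root, lspci typically cannot read the capability list, in which case the filter prints nothing.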

Heiner Kallweit (kalle1) wrote :

Whether other systems suffer from the same MSI-X incompatibility we'll know only once I get more such bug reports. So far your report is the only one.

You said if MSI-X is enabled it doesn't even boot. Any error message or does it just silently stop?

And no, I'm not aware of any way to disable MSI-X only. The kernel treats it as a sub-feature of MSI, therefore functions like pci_dev_msi_enabled() return true if MSI or MSI-X is active for a device.

Steve Dodd (anarchetic) wrote :

It was only Kai's kernel from comment #39 that didn't boot - I've not had
any other problems on that front.

David Jordan (dmj726) wrote :

I can reproduce the issue with ethernet failing on the r8169 on resume from suspend with the current kernel. This is a regression from a previous 18.04 kernel, since the symptoms only occur after installing updates.

I can also confirm that Kai's kernel seems to fix ethernet after suspend on at least one of our products. Will do more testing, but this would be very good to get into the next bionic kernel update.

Steve Dodd (anarchetic) wrote :

@Heiner, I know a number of other modules have a "use_msi_x" option - I
guess that would be too much to ask given I'm the only one who seems to
need it so far? :)

Heiner Kallweit (kalle1) wrote :

@Steve, there's some resistance amongst kernel maintainers against additional module parameters. Each new parameter increases complexity and makes maintenance harder.
Most likely it would result in a NAK when trying to fix an issue with one broken exotic (sorry..) system in the kernel, especially if a userspace workaround is available (disabling MSI via sysfs for the network device).

Steve Dodd (anarchetic) wrote :

Can we cherry-pick
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=7c53a722459c1d6ffb0f5b2058c06ca8980b8600
for bionic?

Marco (marcorestom) wrote :

This is the only solution that worked for me: https://askubuntu.com/a/1053520/222403

Installing the latest version of the r8168 driver found at http://mirrors.edge.kernel.org/ubuntu/pool/universe/r/r8168/ (r8168-dkms_8.046.00-1_all.deb)

Cheers!

Kai-Heng Feng (kaihengfeng) wrote :

@Heiner,

This is a regression caused by my backport fix for LP: #1752772.
I pulled these commits:

r8169: fix interrupt number after adding support for MSI-X interrupts
r8169: improve interrupt handling
r8169: disable WOL per default
r8169: remove some WOL-related dead code
r8169: remove netif_napi_del in probe error path

According to Jan, commit "r8169: restore previous behavior to accept BIOS WoL settings" alone cannot fix the issue completely.

So do you think backporting r8169 from net-next is a good approach to fix this issue?

MV (mvidal) wrote :

Thanks Marco (#59)!!!

The latest r8168 package solves this problem:

Installing the latest version of the r8168 driver found at http://mirrors.edge.kernel.org/ubuntu/pool/universe/r/r8168/ (r8168-dkms_8.046.00-1_all.deb)

Steve Dodd (anarchetic) wrote :

Attached is a work-around for the in-kernel driver that is as unhacky as I can make it.

Drop it in /etc/initramfs-tools/scripts/init-top and chmod a+x it. Add 'r8169_disable_msi' to your kernel command line (usually in /etc/default/grub). Remember to run update-initramfs and update-grub as necessary.

For the moment it disables MSI on everything with the ID 0x10ec:0x8168, as there seems to be no way to get the MAC version from userspace, and certainly not before the driver is loaded. Other PCI IDs may need adding.
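
For readers who cannot see the attachment, the logic described above can be sketched roughly as follows. This is not the attached script (which is an initramfs shell script); it is a Python illustration only. It assumes that MSI is switched off per device by writing 0 to the `msi_bus` attribute under /sys/bus/pci/devices, and the function names `flag_present` and `disable_msi` are made up for this example.

```python
from pathlib import Path

# (vendor, device) pair and command-line flag taken from the comment above.
TARGET_IDS = {("0x10ec", "0x8168")}
FLAG = "r8169_disable_msi"


def flag_present(cmdline: str, flag: str = FLAG) -> bool:
    """Return True if the flag appears as a whole word on the kernel command line."""
    return flag in cmdline.split()


def disable_msi(sysfs_pci: Path, ids=TARGET_IDS) -> list:
    """Write 0 to msi_bus for every matching PCI device; return the device names touched.

    Assumption: msi_bus is the per-device sysfs switch for MSI.
    """
    touched = []
    for dev in sorted(sysfs_pci.iterdir()):
        try:
            pair = ((dev / "vendor").read_text().strip(),
                    (dev / "device").read_text().strip())
        except OSError:
            continue  # not a readable PCI device entry, skip it
        if pair in ids and (dev / "msi_bus").exists():
            (dev / "msi_bus").write_text("0\n")
            touched.append(dev.name)
    return touched
```

On a real system this would be run as root before the r8169 module loads, along the lines of `if flag_present(Path("/proc/cmdline").read_text()): disable_msi(Path("/sys/bus/pci/devices"))`. Passing the sysfs path as a parameter keeps the matching logic testable against a scratch directory.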

Still hoping we can cherry pick the in-driver workaround for bionic...?

Kai-Heng Feng (kaihengfeng) wrote :

Folks,
Please try this kernel out:
https://people.canonical.com/~khfeng/r8169/

If there's no more regression, I'll make a SRU request and the next kernel will contain the fix.

Jan Rathmann (kaiserclaudius) wrote :

Kai-Heng, the bug seems to be gone on my system with your newest kernel in #63 and I don't experience any regression so far. Thanks for your work!

Steve Dodd (anarchetic) wrote :

Yup, this kernel finally fixes things for me too (I did remember to
disable my other workaround) - thanks Kai, look forward to seeing it
SRU'd.

description: updated
Peter Smith (pdo.smith) wrote :

Kai-Heng Feng,

Success, I tried #63 and it worked.
This is great, thanks.

Seth Forshee (sforshee)
Changed in linux (Ubuntu):
status: Confirmed → Fix Committed
Changed in linux (Ubuntu Bionic):
status: New → Fix Committed
Malvina Pushkova (lady3mlnm) wrote :

I found this discussion, installed the latest r8169 driver, and at last the internet connection comes back after suspending! Before that nothing helped, including "modprobe -r r8169" and restarting NetworkManager. I'm happy! )))

Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Steve Dodd (anarchetic) wrote :

Well 4.15.0-35.38 doesn't even finish booting on my machine. Doesn't
find the root volume and the USB keyboard doesn't finish booting.
Chances are the r8169 fix is fine but I can't test.

Steve Dodd (anarchetic) wrote :

New bug report opened:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1792635

Steve Dodd (anarchetic) wrote :

Earlier problems seem to have been caused by the packages not all arriving in the archive together, and apt-get/aptitude and myself getting very confused - apologies for noise. Can now confirm that 4.15.0-35.38 seems to solve the issue; tags adjusted.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Jan Rathmann (kaiserclaudius) wrote :

For me the version in bionic-proposed also seems to work fine.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-36.39

---------------
linux (4.15.0-36.39) bionic; urgency=medium

  * CVE-2018-14633
    - iscsi target: Use hex2bin instead of a re-implementation

  * CVE-2018-17182
    - mm: get rid of vmacache_flush_all() entirely

linux (4.15.0-35.38) bionic; urgency=medium

  * linux: 4.15.0-35.38 -proposed tracker (LP: #1791719)

  * device hotplug of vfio devices can lead to deadlock in vfio_pci_release
    (LP: #1792099)
    - SAUCE: vfio -- release device lock before userspace requests

  * L1TF mitigation not effective in some CPU and RAM combinations
    (LP: #1788563)
    - x86/speculation/l1tf: Fix overflow in l1tf_pfn_limit() on 32bit
    - x86/speculation/l1tf: Fix off-by-one error when warning that system has too
      much RAM
    - x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+

  * CVE-2018-15594
    - x86/paravirt: Fix spectre-v2 mitigations for paravirt guests

  * CVE-2017-5715 (Spectre v2 s390x)
    - KVM: s390: implement CPU model only facilities
    - s390: detect etoken facility
    - KVM: s390: add etoken support for guests
    - s390/lib: use expoline for all bcr instructions
    - s390: fix br_r1_trampoline for machines without exrl
    - SAUCE: s390: use expoline thunks for all branches generated by the BPF JIT

  * Ubuntu18.04.1: cpuidle: powernv: Fix promotion from snooze if next state
    disabled (performance) (LP: #1790602)
    - cpuidle: powernv: Fix promotion from snooze if next state disabled

  * Watchdog CPU:19 Hard LOCKUP when kernel crash was triggered (LP: #1790636)
    - powerpc: hard disable irqs in smp_send_stop loop
    - powerpc: Fix deadlock with multiple calls to smp_send_stop
    - powerpc: smp_send_stop do not offline stopped CPUs
    - powerpc/powernv: Fix opal_event_shutdown() called with interrupts disabled

  * Security fix: check if IOMMU page is contained in the pinned physical page
    (LP: #1785675)
    - vfio/spapr: Use IOMMU pageshift rather than pagesize
    - KVM: PPC: Check if IOMMU page is contained in the pinned physical page

  * Missing Intel GPU pci-id's (LP: #1789924)
    - drm/i915/kbl: Add KBL GT2 sku
    - drm/i915/whl: Introducing Whiskey Lake platform
    - drm/i915/aml: Introducing Amber Lake platform
    - drm/i915/cfl: Add a new CFL PCI ID.

  * CVE-2018-15572
    - x86/speculation: Protect against userspace-userspace spectreRSB

  * Support Power Management for Thunderbolt Controller (LP: #1789358)
    - thunderbolt: Handle NULL boot ACL entries properly
    - thunderbolt: Notify userspace when boot_acl is changed
    - thunderbolt: Use 64-bit DMA mask if supported by the platform
    - thunderbolt: Do not unnecessarily call ICM get route
    - thunderbolt: No need to take tb->lock in domain suspend/complete
    - thunderbolt: Use correct ICM commands in system suspend
    - thunderbolt: Add support for runtime PM

  * random oopses on s390 systems using NVMe devices (LP: #1790480)
    - s390/pci: fix out of bounds access during irq setup

  * [Bionic] Spectre v4 mitigation (Speculative Store Bypass Disable) support
    for arm64 using SMC firmware call to set a hardware chicken bit
    (LP: #1787993) // CVE-2018...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
AceLan Kao (acelankao)
Changed in linux-oem (Ubuntu Bionic):
status: New → Fix Committed
AceLan Kao (acelankao)
Changed in linux-oem (Ubuntu):
status: New → Invalid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-oem - 4.15.0-1043.48

---------------
linux-oem (4.15.0-1043.48) bionic; urgency=medium

  [ Ubuntu: 4.15.0-52.56 ]

  * Remote denial of service (resource exhaustion) caused by TCP SACK scoreboard
    manipulation (LP: #1831638)
    - SAUCE: tcp: tcp_fragment() should apply sane memory limits
  * Remote denial of service (system crash) caused by integer overflow in TCP
    SACK handling (LP: #1831637)
    - SAUCE: tcp: limit payload size of sacked skbs

 -- Stefan Bader <email address hidden> Fri, 14 Jun 2019 10:39:16 +0200

Changed in linux-oem (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux-oem (Ubuntu):
status: Invalid → Fix Released