IP Subnet to internet routing fails 5x/day after latest 16.04 LTS updates

Bug #1695829 reported by Craig-z
This bug affects 1 person
Affects          Status    Importance   Assigned to   Milestone
linux (Ubuntu)   Expired   Medium       Unassigned

Bug Description

Since updating Ubuntu 16.04.2 LTS with all the latest patches on 6/2, routing from my subnet to the internet has been failing five or more times a day for my Windows 10 machine. Unfortunately, Windows 10 is also a moving target: it also updated on 6/2, about when this problem started. A bug report has also been filed with MS.

Why I think it's Ubuntu: when web pages on the Windows 10 machine stop loading, email stops transmitting, and internet-based CAD licensing stops working, I can get everything working like new again by typing "firehol start", which refreshes the iptables entries. So I suspect the issue might be related to either the Linux kernel or iptables.

I also see another bug dated 2-Jun-2017 for IP subnet routing, but that one is running on an ARM processor. Mine is an AMD FX-8120 running Ubuntu 16.04.2 LTS x64. The reported problem sounds similar.

I took an iptables dump in both the "operating" and "not routing" states. I can't see any difference between the two dumps, yet running "firehol start" clears the problem.
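For anyone repeating this comparison, a minimal sketch of one way to make the two dumps easier to diff is shown here. It assumes the dumps come from iptables-save, whose output carries packet/byte counters and timestamped comment lines that change between runs even when the rules do not. The function name normalize_dump and the file names in the comments are hypothetical, not something from this report.

```shell
#!/bin/sh
# normalize_dump (hypothetical helper): read an iptables-save dump on stdin,
# zero every [packets:bytes] counter and drop the "# Generated ..." /
# "# Completed ..." comment lines, so that only rule changes show up in a diff.
normalize_dump() {
  sed -e 's/\[[0-9][0-9]*:[0-9][0-9]*]/[0:0]/g' -e '/^#/d'
}

# Possible usage, as root (file names are placeholders):
#   iptables-save | normalize_dump > working.txt    # while routing works
#   iptables-save | normalize_dump > broken.txt     # while routing is failing
#   diff -u working.txt broken.txt
```

If the normalized dumps are still identical, that would suggest the rule set itself is not what changes between the two states, which matches what is reported above.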

While in the "problem state", my Ubuntu server does have internet connectivity, i.e. loading web pages in Firefox on the server/router does work. On the "routed" Windows 10 machine, what doesn't work: ping (internet), Firefox (internet), Thunderbird (internet). From the Windows 10 machine, what does work: ping (server), Firefox (server Apache web pages), Thunderbird (server IMAPS accounts). So the Windows 10 machine still has full server access over gigabit ethernet, but can't see any internet-based services, where IP routing is required to succeed. "firehol start" restores routing, but only for a brief time when the Windows machine is busy accessing the internet. Routing stayed up all night when there was no Windows 10 internet routing activity. Resuming activity in the morning caused a quick routing failure.
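Since the server itself stays online while forwarded traffic fails, it may also be worth recording kernel state that an iptables dump does not include. The sketch below is only a diagnostic aid under stated assumptions: it reads the IPv4 forwarding flag and the connection-tracking counters from /proc/sys, and nothing in this report establishes that either is involved. The function name router_state and its optional root-prefix argument are hypothetical; the conntrack files only exist when the nf_conntrack module is loaded.

```shell
#!/bin/sh
# router_state (hypothetical helper): print the forwarding flag and the
# conntrack table usage. $1 is an optional directory prefix, present only so
# the function can be exercised against a fake /proc tree.
router_state() {
  root=${1:-}
  for f in net/ipv4/ip_forward \
           net/netfilter/nf_conntrack_count \
           net/netfilter/nf_conntrack_max; do
    path="$root/proc/sys/$f"
    if [ -r "$path" ]; then
      printf '%s=%s\n' "${f##*/}" "$(cat "$path")"
    else
      # File missing or unreadable, e.g. nf_conntrack is not loaded.
      printf '%s=unavailable\n' "${f##*/}"
    fi
  done
}

# Possible usage: run once while routing works and once while it is failing,
# then compare the two outputs.
#   router_state
```

An ip_forward value of 0, or a count close to the max, in the failing state would each point to a different place to look next.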
---
ApportVersion: 2.20.1-0ubuntu2.6
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: craig 7726 F.... pulseaudio
 /dev/snd/controlC0: craig 7726 F.... pulseaudio
DistroRelease: Ubuntu 16.04
HibernationDevice: RESUME=UUID=5967ff24-0a8c-4a29-bd33-408fc996d7a5
InstallationDate: Installed on 2016-06-13 (357 days ago)
InstallationMedia: Ubuntu-Server 16.04 LTS "Xenial Xerus" - Release amd64 (20160420.3)
MachineType: To be filled by O.E.M. To be filled by O.E.M.
Package: linux (not installed)
ProcEnviron:
 LANGUAGE=en_US
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 nouveaufb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-78-generic root=UUID=5764c9d8-e512-48d9-8496-58db180684ac ro
ProcVersionSignature: Ubuntu 4.4.0-78.99-generic 4.4.62
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-78-generic N/A
 linux-backports-modules-4.4.0-78-generic N/A
 linux-firmware 1.157.10
RfKill:

Tags: xenial
Uname: Linux 4.4.0-78-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 11/24/2011
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0901
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: SABERTOOTH 990FX
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0901:bd11/24/2011:svnTobefilledbyO.E.M.:pnTobefilledbyO.E.M.:pvrTobefilledbyO.E.M.:rvnASUSTeKCOMPUTERINC.:rnSABERTOOTH990FX:rvrRev1.xx:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: To be filled by O.E.M.
dmi.product.version: To be filled by O.E.M.
dmi.sys.vendor: To be filled by O.E.M.

Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1695829/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Revision history for this message
Craig-z (craig-z) wrote :

root@pluto:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.2 LTS
Release: 16.04
Codename: xenial
root@pluto:~# uname -a
Linux pluto 4.4.0-78-generic #99-Ubuntu SMP Thu Apr 27 15:29:09 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@pluto:~# iptables -V
iptables v1.6.0

FireHOL 2.0.3

affects: ubuntu → kernel
affects: kernel → linux (Ubuntu)
Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1695829

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Craig-z (craig-z) wrote :

Executed "apport-collect 1695829" as root.
It took a long time to run, and I can't tell if it's done. The pop-up dialog is gone.
This is what I see in the terminal window:

root@pluto:~# apport-collect 1695829
The authorization page:
 (https://launchpad.net/+authorize-token?oauth_token=JCzPZ4NNjgCB1QjCwR1n&allow_permission=DESKTOP_INTEGRATION)
should be opening in your browser. Use your browser to authorize
this program to access Launchpad on your behalf.
Waiting to hear from Launchpad about your decision...
dpkg-query: no packages found matching linux
1496705135180 addons.update-checker WARN Update manifest for <email address hidden> did not contain an updates property
1496705135188 addons.update-checker WARN Update manifest for <email address hidden> did not contain an updates property
1496705135196 addons.update-checker WARN Update manifest for <email address hidden> did not contain an updates property
1496705135204 addons.update-checker WARN Update manifest for <email address hidden> did not contain an updates property
1496705135213 addons.update-checker WARN Update manifest for {972ce4c6-7e08-4474-a285-3208198ce6fd} did not contain an updates property
1496705135315 addons.xpi WARN Add-on <email address hidden> is not compatible with application version.
1496705135428 addons.xpi WARN Add-on <email address hidden> is not compatible with application version.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Craig-z (craig-z) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected xenial
description: updated
Revision history for this message
Craig-z (craig-z) wrote : CRDA.txt

apport information

Revision history for this message
Craig-z (craig-z) wrote : CurrentDmesg.txt

apport information

Revision history for this message
Craig-z (craig-z) wrote : IwConfig.txt

apport information

Revision history for this message
Craig-z (craig-z) wrote : JournalErrors.txt

apport information

Revision history for this message
Craig-z (craig-z) wrote : Lspci.txt

apport information

Revision history for this message
Craig-z (craig-z) wrote : Lsusb.txt

apport information

Revision history for this message
Craig-z (craig-z) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
Craig-z (craig-z) wrote : ProcCpuinfoMinimal.txt

apport information

Revision history for this message
Craig-z (craig-z) wrote : ProcInterrupts.txt

apport information

Revision history for this message
Craig-z (craig-z) wrote : ProcModules.txt

apport information

Revision history for this message
Craig-z (craig-z) wrote : UdevDb.txt

apport information

Revision history for this message
Craig-z (craig-z) wrote : WifiSyslog.txt

apport information

Revision history for this message
Daniel McLane (realtime-dsp) wrote :

In the CRDA output, it looks like the regulatory region is not set.

You might set it with iw reg set US (or your own country code).

Revision history for this message
Daniel McLane (realtime-dsp) wrote :

I should add that it's possible your WiFi is selecting a channel that is not legal in your region, or one that your client device cannot use.

Revision history for this message
Craig-z (craig-z) wrote :

I'm not sure where to set CRDA region. You are correct that I'm in the US.

This system is not connected by or to WiFi. It is a multi-homed CAT-6 ethernet connected system with 30Mb symmetric Internet-FIOS on eth0 (100 Base-T full duplex), and Local LAN on eth1 (Gigabit full duplex). There is a third card in the system (eth2) which currently isn't being used for anything.

The system is a full up server for VoIP, email, web services, LAN/Internet Router, SMB/CIFS, ... etc.

This is a "home" system, i.e. not a business, but it is darned inconvenient to the household when things aren't working right. ;) [i.e. You have my full cooperation] As of Summer 2016, I'm new to Ubuntu.

Revision history for this message
Daniel McLane (realtime-dsp) wrote : Re: [Bug 1695829] Re: IP Subnet to internet routing fails 5x/day after latest 16.04 LTS updates

Set the region from the command prompt using

iw reg set US

Check it with

iw reg get

I don't think this is the root problem, but it's not right. This setting tells the WiFi which channels and power levels to use. Usually it's no problem if it's not set, but just to be safe it's good to get it right. Some countries, like Japan, have some 2.4 GHz channels that are not used in the US, so if the radio were to choose one of those, your laptop might not see it. Unlikely, but I thought I'd mention it.

> On Jun 5, 2017, at 7:04 PM, Craig-z <email address hidden> wrote:
>
> I'm not sure where to set CRDA region. You are correct that I'm in the
> US.
>
> This system is not connected by or to WiFi. It is a multi-homed CAT-6
> ethernet connected system with 30Mb symmetric Internet-FIOS on eth0 (100
> Base-T full duplex), and Local LAN on eth1 (Gigabit full duplex). There
> is a third card in the system (eth2) which currently isn't being used
> for anything.
>
> The system is a full up server for VoIP, email, web services,
> LAN/Internet Router, SMB/CIFS, ... etc.
>
> This is a "home" system. i.e. not a business. But darned inconvenient
> to the household when things aren't working right. ;) [i.e. You have
> my full cooperation] As of Summer 2016, I'm new to Ubuntu.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.12 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12-rc4

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Revision history for this message
Craig-z (craig-z) wrote :

I may need a little help to "test the latest upstream kernel" before I can say "yes".

Problem: I visited https://wiki.ubuntu.com/KernelMainlineBuilds and one of the first things I read is "These kernels are not supported and are not appropriate for production use."

My server is a "production" server in the sense that if my system goes down, so does a lot of my house (phone service, internet access, design activity, etc). My estimate is about 2 days of full-time effort to reinstall and get it all configured and running again.

So, do you have any good ways to restore the system to the state it's in now, in case I try the new kernel and my production system either won't work or won't boot?

Or is there a way to clone the system I have now?

Actually, I may have just answered my own question: does Ubuntu have a "live DVD", so I can use "dd" to copy my boot SSD to another SSD for the new kernel "test" effort? Then, if everything goes belly up, I can swap the original SSD back into the system to keep going (and then copy "newer" email and logs off the defective installation back into the original production system).

Any tips on how to make this "kernel test" as risk-free and painless as possible? Being new to Ubuntu means I don't have full knowledge of the Ubuntu ecosystem of tools, like "iw reg set US" or "apport-collect 1695829".

As soon as I can reassemble my Linux workstation, currently in pieces on the floor for new fan installs, my backup RAID array will again be available for "recovery" storage.

Tips for quick/painless recovery if the new kernel fails and breaks my production system (where the testing has to happen)?
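The dd-based clone described above could look roughly like the following sketch. It assumes the copy is made from a live environment with neither disk mounted, and that the target is the same size as the source (with a larger target, the final compare would report a length mismatch even if the copy is good). The function name clone_and_verify is hypothetical, and /dev/sdX and /dev/sdY in the comments are placeholders, not the device names on this system. Getting source and target the wrong way round would overwrite the production disk.

```shell
#!/bin/sh
# clone_and_verify (hypothetical helper): copy SRC to DST block for block,
# flush the write cache, then compare the two byte for byte.
clone_and_verify() {
  src=$1
  dst=$2
  dd if="$src" of="$dst" bs=1M 2>/dev/null || return 1
  sync
  cmp "$src" "$dst"
}

# Possible usage, as root from a live session (device names are placeholders;
# check them with lsblk first):
#   clone_and_verify /dev/sdX /dev/sdY
```

With a verified clone on the second SSD, the kernel test can be run on the copy while the original stays untouched as the fallback.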

Revision history for this message
Craig-z (craig-z) wrote :

With the latest kernel updates I haven't seen this issue for a while.

Revision history for this message
Daniel McLane (realtime-dsp) wrote :

Unfortunately I cannot move to the latest kernel since Freescale does not
yet support it. I worked around the issue by using a bridge in my setup to
eliminate the subnet. Not a solution, just a workaround.
Thanks for your update, I will move to the latest kernel when I can.

On Mon, Jun 26, 2017 at 9:49 AM, Craig-z <email address hidden> wrote:

> With the latest kernel updates I haven't seen this issue for a while.

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired