Comment 60 for bug 1447664

Thank you.
I am still seeing the problem during our cloning process, although less
frequently. Before I applied the patch, every single transfer would
ALWAYS trigger the tg3 bug.

Here it seems related to problems with NAPI. AFAIK, this is an approach to
handling interrupt bursts. NICs typically work in bursts: a long time
without packets, then a very large stream of packets, then silence. This is
the common scenario.

Using interrupts to serve sporadic data is ok. But a burst of packets
triggers a burst of interrupts, which is not as efficient as just polling
the NIC (during the burst).

What NAPI does is (in a very very simplified way): it waits for the first
interrupt from the network, then switches off interrupts and polls the NIC
until there are no more network packets or the "work quota" is exhausted,
whichever happens first. Then it turns interrupts back on and the cycle
repeats. This quota (I believe the kernel calls it the "budget") is very
important to prevent the kernel from being stuck just serving packets.
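
To make the cycle above more concrete, here is a toy model in bash. It is
only a sketch and not kernel code: the function name napi_sim, its
arguments and the numbers used are made up for illustration. It assumes
that a poll round which handles fewer packets than the quota means the
queue is drained, at which point interrupts would be turned back on.

```shell
#!/usr/bin/env bash
# Toy model of a NAPI-style receive cycle (illustration only, not kernel code).
# napi_sim PENDING BUDGET
#   PENDING: packets waiting in the NIC when the first interrupt fires
#   BUDGET:  work quota, i.e. the most packets handled per poll round
napi_sim() {
    local pending=$1 budget=$2
    local rounds=0 handled=0 work

    # A zero quota would never make progress, so reject it.
    (( budget > 0 )) || { echo "budget must be > 0" >&2; return 1; }

    # First interrupt arrived: interrupts are now off and we poll.
    while :; do
        rounds=$((rounds + 1))
        work=$(( pending < budget ? pending : budget ))
        pending=$((pending - work))
        handled=$((handled + work))
        # Fewer packets than the quota: the queue is drained, so stop
        # polling (this is where interrupts would be turned back on).
        if (( work < budget )); then
            break
        fi
        # Quota exhausted: give up the CPU here, then poll again.
    done
    echo "handled=$handled rounds=$rounds"
}
```

For example, "napi_sim 10 4" handles the ten packets in three poll rounds
(4, 4, 2) instead of taking ten separate interrupts.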

My understanding is that something goes wrong during this process and the
tg3 driver gets stuck.

A colleague told me that it's related to the broadcom driver.

Please try this workaround: unload both drivers, then reload "broadcom"
and "tg3", in that order. Your network may then come back up.

sudo modprobe -r broadcom tg3
sudo modprobe broadcom
sudo modprobe tg3

Please tell us what happens when you try this. It won't solve the problem,
but perhaps it helps.
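
If it helps with reporting back, the link state can be noted before and
after the reload. This is just a sketch: the helper name link_state is
made up, the interface name eno1 is taken from your report, and the
optional second argument exists only so the function can be pointed at a
different directory for testing. It reads the standard sysfs file
/sys/class/net/<interface>/operstate.

```shell
#!/usr/bin/env bash
# link_state IFACE [SYSFS_ROOT]
# Prints the interface's operstate (e.g. "up" or "down"), or "missing"
# if the interface directory does not exist.
link_state() {
    local iface=$1
    local sysfs=${2:-/sys/class/net}   # override only for testing
    local dir=$sysfs/$iface
    if [ ! -d "$dir" ]; then
        echo "missing"
        return 1
    fi
    cat "$dir/operstate"
}
```

Run "link_state eno1" once before the modprobe commands and once after,
and include both results in your reply.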


On Sat, Jan 26, 2019, 10:39 Bob Lawrence <email address hidden> wrote:

> Confirmed that this is still an issue on 18.04.1. I have an HP 705 G1
> with the Broadcom 5762. In my case it's a Plex server. Whenever I try to
> stream something the interface goes "NO-CARRIER" and the only way to
> recover is to reboot. I've tried disabling highdma, tso and gso using
> ethtool, iommu=soft kernel parameter, and forcing every combo of
> 1gbps/100mbps & half/full duplex. Nothing seems to work around the issue.
> System: Host: Bobs-HTPC Kernel: 4.15.0-43-generic x86_64 bits: 64
> Console: tty 1 Distro: Ubuntu 18.04.1 LTS
> Machine: Device: desktop System: Hewlett-Packard product: HP EliteDesk
> 705 G1 DM serial: N/A
> Mobo: Hewlett-Packard model: 225E serial: N/A BIOS:
> Hewlett-Packard v: L06 v02.31 date: 08/31/2018
> Battery hidpp__0: charge: N/A condition: NA/NA Wh
> CPU: Quad core AMD A8-7600 Radeon R7 10 Compute Cores 4C+6G (-MCP-)
> cache: 8192 KB
> clock speeds: max: 3100 MHz 1: 3094 MHz 2: 3094 MHz 3: 3094 MHz
> 4: 3094 MHz
> Graphics: Card: Advanced Micro Devices [AMD/ATI] Kaveri [Radeon R7
> Graphics]
> Display Server: N/A drivers: ati,radeon (unloaded:
> modesetting,fbdev,vesa)
> tty size: 120x53 Advanced Data: N/A out of X
> Audio: Card-1 Advanced Micro Devices [AMD] FCH Azalia Controller
> driver: snd_hda_intel
> Card-2 Advanced Micro Devices [AMD/ATI] Kaveri HDMI/DP Audio
> Controller driver: snd_hda_intel
> Sound: Advanced Linux Sound Architecture v: k4.15.0-43-generic
> Network: Card-1: Intel Wireless 7260 driver: iwlwifi
> IF: wlp2s0 state: up mac: cc:3d:82:a7:bf:ed
> Card-2: Broadcom Limited NetXtreme BCM5762 Gigabit Ethernet
> PCIe driver: tg3
> IF: eno1 state: up speed: 100 Mbps duplex: half mac:
> ec:b1:d7:4c:2d:8e
> Drives: HDD Total Size: 9501.7GB (42.8% used)
> ID-1: /dev/sda model: ST500LM000 size: 500.1GB
> ID-2: USB /dev/sdb model: 5 size: 9001.6GB
> Partition: ID-1: / size: 458G used: 23G (6%) fs: ext4 dev: /dev/sda1
> RAID: No RAID devices: /proc/mdstat, md_mod kernel module present
> Sensors: System Temperatures: cpu: 40.8C mobo: N/A gpu: 42.0
> Fan Speeds (in rpm): cpu: N/A
> Info: Processes: 227 Uptime: 12:49 Memory: 1608.0/5943.7MB Init:
> systemd runlevel: 5
> Client: Shell (bash) inxi: 2.3.56
> --
> You received this bug notification because you are subscribed to the bug
> report.
> Title:
> 14e4:1687 broadcom tg3 network driver disconnects under high load
> Status in linux package in Ubuntu:
> Triaged
> Status in linux package in Debian:
> New
> Bug description:
> The tg3 broadcom network driver that binds to chipset 5762 goes
> offline and is unable to recover (even with the tg3 watchdog timeout) when
> network transmit is under high load. Call trace:
> When this happens, only a reboot reliably fixes it. Sometimes,
> however, taking the interface down and back up (via ifconfig)
> recovers networking. I've also tested with the latest tg3 driver
> (Dec 2014 version) and networking is still problematic. I have also
> disabled TSO, GSO, etc. with ethtool and the bug still surfaces.
> This bug may be related to the integrated firmware.
> Here is the procedure to replicate the issue because it is hard to
> replicate it under moderate network load.
> 1. Boot up a machine with a broadcom 5762 NIC (e.g. HP EliteDesk 705)
> using an Ubuntu/Kubuntu Live CD 14.04-15.04.
> 2. from another machine: start 5 sessions, repetitively copy (scp with
> public key authentication) a 70 meg file back and forth to the tg3 machine
> in each session. (not sure if this is necessary)
> 3. create a 1GB file on the tg3 machine, with something like dd
> if=/dev/urandom of=/my/test/file bs=1024 count=$((1024*1000))
> 4. from another machine: repetitively scp copy that 1GB file from the
> tg3 machine. This can be done with something like:
> while true; do
> scp -i /my/scp/private.key <email address hidden>:/my/test/file /tmp
> done
> Networking will usually go offline within about 10-30 minutes.
> WORKAROUND: Add udev rule to make the changes permanent in
> /etc/udev/rules.d/80-tg3-fix.rules :
> ACTION=="add", SUBSYSTEM=="net", ATTRS{vendor}=="0x14e4",
> ATTRS{device}=="0x1687", RUN+="/sbin/ethtool -K %k highdma off"
> ProblemType: Bug
> DistroRelease: Ubuntu 15.04
> Package: linux-image-3.19.0-15-generic 3.19.0-15.15
> ProcVersionSignature: Ubuntu 3.19.0-15.15-generic 3.19.3
> Uname: Linux 3.19.0-15-generic x86_64
> ApportVersion: 2.17.2-0ubuntu1
> Architecture: amd64
> AudioDevicesInUse:
> /dev/snd/controlC1: kubuntu 3748 F.... pulseaudio
> /dev/snd/controlC0: kubuntu 3748 F.... pulseaudio
> CasperVersion: 1.360
> Date: Thu Apr 23 11:16:24 2015
> IwConfig:
> eth0 no wireless extensions.
> lo no wireless extensions.
> LiveMediaBuild: Kubuntu 15.04 "Vivid Vervet" - Release amd64 (20150422)
> MachineType: Hewlett-Packard HP EliteDesk 705 G1 MT
> ProcEnviron:
> TERM=xterm
> PATH=(custom, no user)
> LANG=en_US.UTF-8
> SHELL=/bin/bash
> ProcFB: 0 radeondrmfb
> ProcKernelCmdLine: BOOT_IMAGE=/casper/vmlinuz.efi
> file=/cdrom/preseed/hostname.seed boot=casper maybe-ubiquity quiet splash
> ---
> PulseList:
> Error: command ['pacmd', 'list'] failed with exit code 1: Home
> directory not accessible: Permission denied
> No PulseAudio daemon running, or not running as session daemon.
> RelatedPackageVersions:
> linux-restricted-modules-3.19.0-15-generic N/A
> linux-backports-modules-3.19.0-15-generic N/A
> linux-firmware 1.143
> RfKill:
> SourcePackage: linux
> UdevLog: Error: [Errno 2] No such file or directory: '/var/log/udev'
> UpgradeStatus: No upgrade log present (probably fresh install)
> dmi.bios.date: 10/22/2014
> dmi.bios.vendor: Hewlett-Packard
> dmi.bios.version: L06 v02.15
> dmi.board.asset.tag: 2UA5041TG4
> dmi.board.name: 2215
> dmi.board.vendor: Hewlett-Packard
> dmi.chassis.asset.tag: 2UA5041TG4
> dmi.chassis.type: 6
> dmi.chassis.vendor: Hewlett-Packard
> dmi.modalias:
> dmi:bvnHewlett-Packard:bvrL06v02.15:bd10/22/2014:svnHewlett-Packard:pnHPEliteDesk705G1MT:pvr:rvnHewlett-Packard:rn2215:rvr:cvnHewlett-Packard:ct6:cvr:
> dmi.product.name: HP EliteDesk 705 G1 MT
> dmi.sys.vendor: Hewlett-Packard