Network stops working after sky2 eth0: tx timeout
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Invalid | Undecided | Unassigned |
Bug Description
Binary package hint: linux-image-
I've seen many reports of similar problems with this card, but none with exactly my kernel.
The mail server had been working reliably for about nine months, through several kernel updates.
The last update was on Sunday, to linux-source-2.6.15 (2.6.15-28.55).
The server was rebooted with the new kernel on
May 27 18:29:26 mail kernel: Inspecting /boot/System.
May 27 18:29:26 mail kernel: Loaded 23274 symbols from /boot/System.
May 27 18:29:26 mail kernel: Symbols match kernel version 2.6.15.
May 27 18:29:26 mail kernel: No module symbols loaded - kernel modules not enabled.
May 27 18:29:26 mail kernel: [42949372.960000] Linux version 2.6.15-28-server (buildd@terranova) (gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)) #1 SMP Thu May 10 10:40:
Then I transferred about 10 GB over this card without trouble (a network backup).
This morning, when traffic was presumably low (about 60 email clients), I got this in the kernel log:
May 29 12:15:18 mail kernel: [43099788.670000] NETDEV WATCHDOG: eth0: transmit timed out
May 29 12:15:18 mail kernel: [43099788.670000] sky2 eth0: tx timeout
May 29 12:15:18 mail kernel: [43099788.850000] sky2 eth0: transmit ring 365 .. 324 report=365 done=365
May 29 12:15:18 mail kernel: [43099788.850000] sky2 hardware hung? flushing
May 29 12:20:43 mail kernel: [43100113.850000] NETDEV WATCHDOG: eth0: transmit timed out
May 29 12:20:43 mail kernel: [43100113.850000] sky2 eth0: tx timeout
May 29 12:20:43 mail kernel: [43100114.030000] sky2 eth0: transmit ring 324 .. 283 report=365 done=365
May 29 12:20:43 mail kernel: [43100114.030000] sky2 status report lost?
May 29 12:21:18 mail kernel: [43100149.030000] NETDEV WATCHDOG: eth0: transmit timed out
May 29 12:21:18 mail kernel: [43100149.030000] sky2 eth0: tx timeout
May 29 12:21:18 mail kernel: [43100149.200000] sky2 eth0: transmit ring 365 .. 324 report=365 done=365
May 29 12:21:18 mail kernel: [43100149.200000] sky2 hardware hung? flushing
May 29 12:27:19 mail kernel: [43100509.200000] NETDEV WATCHDOG: eth0: transmit timed out
May 29 12:27:19 mail kernel: [43100509.200000] sky2 eth0: tx timeout
May 29 12:27:19 mail kernel: [43100509.360000] sky2 eth0: transmit ring 324 .. 283 report=365 done=365
May 29 12:27:19 mail kernel: [43100509.360000] sky2 status report lost?
May 29 12:27:54 mail kernel: [43100544.360000] NETDEV WATCHDOG: eth0: transmit timed out
May 29 12:27:54 mail kernel: [43100544.360000] sky2 eth0: tx timeout
May 29 12:27:54 mail kernel: [43100544.510000] sky2 eth0: transmit ring 365 .. 324 report=365 done=365
May 29 12:27:54 mail kernel: [43100544.510000] sky2 hardware hung? flushing
and then it stopped responding entirely. I had to reboot by switching the power off (the server was still responding
to webmail requests on another interface, but a firewall doesn't allow ssh access from there, and the server is headless).
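For anyone triaging a similar log, here is a minimal sketch of how the "transmit ring" lines above could be pulled out and compared. It is only an illustration: the names `SAMPLE_LOG`, `RING_RE`, `parse_ring_reports` and `intervals` are hypothetical, and the regular expression assumes the exact message format shown in this report, which may differ in other versions of the sky2 driver.

```python
import re

# Sample lines copied from the kernel log in this report.
SAMPLE_LOG = """\
May 29 12:15:18 mail kernel: [43099788.670000] sky2 eth0: tx timeout
May 29 12:15:18 mail kernel: [43099788.850000] sky2 eth0: transmit ring 365 .. 324 report=365 done=365
May 29 12:20:43 mail kernel: [43100114.030000] sky2 eth0: transmit ring 324 .. 283 report=365 done=365
May 29 12:21:18 mail kernel: [43100149.200000] sky2 eth0: transmit ring 365 .. 324 report=365 done=365
May 29 12:27:19 mail kernel: [43100509.360000] sky2 eth0: transmit ring 324 .. 283 report=365 done=365
May 29 12:27:54 mail kernel: [43100544.510000] sky2 eth0: transmit ring 365 .. 324 report=365 done=365
"""

# Assumed format: "[<ts>] sky2 <dev>: transmit ring A .. B report=R done=D"
RING_RE = re.compile(
    r"\[(?P<ts>\d+\.\d+)\] sky2 (?P<dev>\w+): transmit ring "
    r"(?P<start>\d+) \.\. (?P<end>\d+) "
    r"report=(?P<report>\d+) done=(?P<done>\d+)"
)


def parse_ring_reports(text):
    """Return one dict per 'transmit ring' line found in the log text."""
    reports = []
    for line in text.splitlines():
        match = RING_RE.search(line)
        if match is None:
            continue  # not a ring report (e.g. the plain 'tx timeout' line)
        reports.append({
            "ts": float(match.group("ts")),
            "dev": match.group("dev"),
            "start": int(match.group("start")),
            "end": int(match.group("end")),
            "report": int(match.group("report")),
            "done": int(match.group("done")),
        })
    return reports


def intervals(reports):
    """Seconds elapsed between consecutive ring reports."""
    return [round(b["ts"] - a["ts"], 2) for a, b in zip(reports, reports[1:])]


if __name__ == "__main__":
    found = parse_ring_reports(SAMPLE_LOG)
    for entry in found:
        print(entry)
    print("seconds between reports:", intervals(found))
```

In the log above, `report` and `done` stay at 365 in all five ring lines while the ring positions alternate between `365 .. 324` and `324 .. 283`; a script like this only makes that pattern easier to spot in a longer log.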
# lspci:
root@mail:~# lspci
0000:00:00.0 Host bridge: Intel Corporation 915G/P/
0000:00:1b.0 0403: Intel Corporation 82801FB/
0000:00:1c.0 PCI bridge: Intel Corporation 82801FB/
0000:00:1c.1 PCI bridge: Intel Corporation 82801FB/
0000:00:1d.0 USB Controller: Intel Corporation 82801FB/
0000:00:1d.1 USB Controller: Intel Corporation 82801FB/
0000:00:1d.2 USB Controller: Intel Corporation 82801FB/
0000:00:1d.3 USB Controller: Intel Corporation 82801FB/
0000:00:1d.7 USB Controller: Intel Corporation 82801FB/
0000:00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d3)
0000:00:1f.0 ISA bridge: Intel Corporation 82801FB/FR (ICH6/ICH6R) LPC Interface Bridge (rev 03)
0000:00:1f.1 IDE interface: Intel Corporation 82801FB/
0000:00:1f.2 IDE interface: Intel Corporation 82801FB/FW (ICH6/ICH6W) SATA Controller (rev 03)
0000:00:1f.3 SMBus: Intel Corporation 82801FB/
0000:01:09.0 SCSI storage controller: Adaptec AIC-7892A U160/m (rev 02)
0000:01:0a.0 Serial controller: Lava Computer mfg Inc Lava Single Serial
0000:01:0b.0 Ethernet controller: D-Link System Inc RTL8139 Ethernet (rev 10)
0000:02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 Gigabit Ethernet Controller (rev 19)
Details for the last Ethernet controller:
# lspci -vv
0000:02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 Gigabit Ethernet Controller (rev 19)
Subsystem: ASUSTeK Computer Inc. Marvell 88E8053 Gigabit Ethernet Controller (Asus)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0, Cache Line Size: 0x04 (16 bytes)
Interrupt: pin A routed to IRQ 50
Region 0: Memory at dfffc000 (64-bit, non-prefetchable) [size=16K]
Region 2: I/O ports at d800 [size=256]
Expansion ROM at dffc0000 [disabled] [size=128K]
Let me know if I can provide any more useful information.
Andrea
Is it possible to backport to this kernel the patches that Stephen Hemminger introduced in 2.6.20.2 for this unlucky card,
or do I have to compile a 2.6.20 or 2.6.21 kernel to overcome this problem?
I'd like to do the backport myself but, sorry, I can't (it is easier for me to compile 2.6.21, even though I'd prefer to keep using the stock Ubuntu kernel on the server).