Sky2 ethernet failing randomly with Marvell 88E8056 gigabit

Bug #138611 reported by DrCore on 2007-09-10
This bug affects 8 people
Affects                        Status    Importance  Assigned to  Milestone
Linux                          Invalid   Medium
linux (Ubuntu)                           Medium      Unassigned
linux-source-2.6.22 (Ubuntu)             High        Unassigned

Bug Description

Binary package hint: linux-source-2.6.22

Since Ubuntu Feisty the sky2 ethernet adapters have not been stable. Even after the bug fixes in kernel 2.6.22-11.32 (my current kernel), the failures continue. Earlier the problem appeared during big file transfers (>1 GB); as of 2.6.22-11.32 it looks more like random freezes (without traffic). Restarting the network (/etc/init.d/networking restart) does not help. As I have two sky2 adapters available, switching over to the other port brings the network back up. When that one fails too, I have to resort to a reboot.

In the dmesg log I usually find the following information after a crash:

[ 8767.471702] eth1: hw csum failure.
[ 8767.471723] [<c02813ec>] __skb_checksum_complete_head+0x5c/0x60
[ 8767.471739] [<c02813f8>] __skb_checksum_complete+0x8/0x10
[ 8767.471744] [<c02c2d30>] tcp_v4_rcv+0x5a0/0x990
[ 8767.471751] [<c02762f0>] pci_conf1_read+0xa0/0xe0
[ 8767.471766] [<c02776f9>] pci_read+0x29/0x30
[ 8767.471792] [<c02a51d2>] ip_local_deliver+0x122/0x2c0
[ 8767.471814] [<c02a4dfa>] ip_rcv+0x2ea/0x5a0
[ 8767.471829] [<f8a674aa>] packet_rcv_spkt+0x10a/0x1c0 [af_packet]
[ 8767.471837] [<c0280672>] __netdev_alloc_skb+0x22/0x50
[ 8767.471855] [<c0284677>] netif_receive_skb+0x237/0x400
[ 8767.471883] [<f89b503b>] sky2_poll+0x3bb/0xc10 [sky2]
[ 8767.471938] [<c0286978>] net_rx_action+0xc8/0x210
[ 8767.471958] [<c012d6b2>] __do_softirq+0x82/0x110
[ 8767.471976] [<c012d795>] do_softirq+0x55/0x60
[ 8767.471981] [<c012da7d>] irq_exit+0x6d/0x80
[ 8767.471984] [<c0106b20>] do_IRQ+0x40/0x70
[ 8767.472004] [<c0105223>] common_interrupt+0x23/0x30
[ 8767.472032] [<c01022b6>] mwait_idle_with_hints+0x46/0x60
[ 8767.472041] [<c01022d0>] mwait_idle+0x0/0x20
[ 8767.472048] [<c0102413>] cpu_idle+0x53/0xe0
[ 8767.472062] [<c03e3a85>] start_kernel+0x325/0x3b0
[ 8767.472069] [<c03e31f0>] unknown_bootoption+0x0/0x260
[ 8767.472095] =======================

Is there any hope this bug will be resolved, or should I just change to the proprietary sk98lin driver?

DrCore (launchpad-drsdre) wrote :

Additional information

lspci:
00:00.0 Host bridge: Intel Corporation 82925X/XE Memory Controller Hub (rev 0e)
00:01.0 PCI bridge: Intel Corporation 82925X/XE PCI Express Root Port (rev 0e)
00:1b.0 Audio device: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 1 (rev 04)
00:1c.1 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 2 (rev 04)
00:1c.2 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 3 (rev 04)
00:1d.0 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #1 (rev 04)
00:1d.1 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #2 (rev 04)
00:1d.2 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #3 (rev 04)
00:1d.3 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #4 (rev 04)
00:1d.7 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB2 EHCI Controller (rev 04)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d4)
00:1f.0 ISA bridge: Intel Corporation 82801FB/FR (ICH6/ICH6R) LPC Interface Bridge (rev 04)
00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) IDE Controller (rev 04)
00:1f.2 IDE interface: Intel Corporation 82801FR/FRW (ICH6R/ICH6RW) SATA Controller (rev 04)
00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) SMBus Controller (rev 04)
01:03.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)
01:04.0 Mass storage controller: <pci_lookup_name: buffer too small> (rev 13)
01:05.0 Mass storage controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 15)
03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 15)
05:00.0 VGA compatible controller: nVidia Corporation Unknown device 0402 (rev a1)

DrCore (launchpad-drsdre) wrote :

More additional information:

uname -a:
Linux drebuntum 2.6.22-11-generic #1 SMP Fri Sep 7 05:07:05 GMT 2007 i686 GNU/Linux

Mahmoud Kassem (mmkassem) wrote :

I have the same problem
Motherboard: Gigabyte 965P-DS3
Network: Marvell 88E8056
Also tested with the Gutsy Live CD:
$ uname -a
Linux ubuntu 2.6.22-10-generic #1 SMP Wed Aug 22 08:11:52 GMT 2007 i686 GNU/Linux
$ lspci
04:00.0 Ethernet controller: Marvell Technology Group Ltd. Unknown device 4364 (rev 12)
(It is detected correctly as 88E8056 in Feisty)

/var/log/messages is full of:
sky2 eth0: rx error, status 0x5ca0002 length 1482
and sometimes:
sky2 eth0: rx error, status 0x5ca0002 length 1514

After eth0 fails, these commands seem to make it work again with reboot:
sudo ifdown eth0; sudo ifup eth0;

Mahmoud Kassem (mmkassem) wrote :

correction:
After eth0 fails, these commands seem to make it work again **without** a reboot:
sudo ifdown eth0; sudo ifup eth0;

xtknight (xt-knight) wrote :

I'm having the same problems, without any error messages in /var/log/messages. Also on a Gigabyte GA-965P-DS3.

Mine is detected like this in Gutsy too: 04:00.0 Ethernet controller: Marvell Technology Group Ltd. Unknown device 4364 (rev 12)
I can't remember how it was reported in Feisty.

Setting to confirmed. This needs to be addressed and I know it's not just me. Should be at least medium priority, probably affects TONS of people with new motherboards.

xtknight (xt-knight) wrote :

This problem needs to be addressed.

Changed in linux-source-2.6.22:
status: New → Confirmed
Changed in linux:
status: Unknown → Confirmed

Brian Murray (brian-murray) wrote :

Kernel team bug workflow indicates that confirmed bugs should be assigned to the kernel team, so in addition to adding an importance, I am assigning the bug to the kernel team.

Changed in linux-source-2.6.22:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → High

xtknight (xt-knight) wrote :

Thanks for taking a look at it, Brian.

Here is my "sudo lspci -vvnn".

I'm going to try disabling power states and disabling PCI MSI config tonight to see if it's interrupt- or power-related. I suggest everyone also post "cat /proc/interrupts":

           CPU0       CPU1
  0:  110357473          0   IO-APIC-edge      timer
  1:     151725          0   IO-APIC-edge      i8042
  6:        106          0   IO-APIC-edge      floppy
  7:          0          0   IO-APIC-edge      parport0
  8:          0          0   IO-APIC-edge      rtc
  9:          1          0   IO-APIC-fasteoi   acpi
 16:   11184206          0   IO-APIC-fasteoi   uhci_hcd:usb3, ohci1394, nvidia
 18:   19924361          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb7, EMU10K1
 19:    2433345          0   IO-APIC-fasteoi   uhci_hcd:usb6, ohci1394, libata, libata
 20:    9065082          0   IO-APIC-fasteoi   pata_pdc2027x
 21:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 23:          0          0   IO-APIC-fasteoi   ehci_hcd:usb2, uhci_hcd:usb5
507:    2723565          0   PCI-MSI-edge      eth0
NMI:          0          0
LOC:  109387069  109387245
ERR:          0
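
For anyone comparing "cat /proc/interrupts" output across machines, here is a rough Python sketch that pulls out which interrupt type a given device is using (PCI-MSI versus IO-APIC). The function name irq_info and the trimmed sample text are only illustrative; it assumes the column layout shown above (one count column per CPU, then the type, then a comma-separated device list).

```python
def irq_info(interrupts_text, device):
    """Return (irq, interrupt_type) for `device`, or None if it is not listed."""
    lines = interrupts_text.splitlines()
    # The header row names one column per CPU (CPU0, CPU1, ...).
    ncpu = len(lines[0].split())
    for line in lines[1:]:
        fields = line.split()
        # Skip blank rows and the NMI/LOC/ERR summary rows.
        if len(fields) < 2 + ncpu or not fields[0].rstrip(":").isdigit():
            continue
        irq = fields[0].rstrip(":")
        irq_type = fields[1 + ncpu]
        # Everything after the type column is a comma-separated device list.
        devices = [d.strip() for d in " ".join(fields[2 + ncpu:]).split(",")]
        if device in devices:
            return irq, irq_type
    return None


# Sample trimmed from the output posted above.
SAMPLE = """\
           CPU0 CPU1
  0: 110357473 0 IO-APIC-edge timer
 16: 11184206 0 IO-APIC-fasteoi uhci_hcd:usb3, ohci1394, nvidia
507: 2723565 0 PCI-MSI-edge eth0
NMI: 0 0
LOC: 109387069 109387245
ERR: 0
"""

print(irq_info(SAMPLE, "eth0"))
```

If the type reported for eth0 starts with "PCI-MSI", the adapter is using MSI, which is the setting being toggled in the test described above.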

xtknight (xt-knight) wrote :

At one point I got this while pinging my router (sometimes it's just destination host unreachable):

ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available

Unfortunately at this time I didn't have debug=16 on.

xtknight (xt-knight) wrote :

I notice a marked improvement when using the 2.6.23-rc6 kernel (kernel.org vanilla, compiled using the Master Kernel Thread @ UbuntuForums).

My suggested solution is to use the sky2.c out of the vanilla kernel. The one in the Ubuntu kernel 2.6.22-11 reports "sky2 1.14". To see your version, run "sudo ethtool -i eth0".

With 2.6.23 you get sky2 1.17. I suspect that there are missing patches in the Ubuntu 2.6.22 kernel (some were backported while others were not, when all needed to be backported at once). I think it needs to be sync'd with the vanilla kernel version.
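
To make the version comparison less error-prone when collecting reports, here is a small Python sketch that parses "ethtool -i" output and compares the driver version numerically. The helper names parse_ethtool_info and version_tuple are made up for illustration, and the sample text is the kind of output posted in this thread; it assumes the plain "key: value" layout ethtool prints.

```python
def parse_ethtool_info(text):
    """Turn `ethtool -i` output into a dict of key -> value strings."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only; bus-info values contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def version_tuple(version):
    """Convert '1.14' to (1, 14) so that 1.9 sorts before 1.14."""
    return tuple(int(part) for part in version.split("."))


SAMPLE = """\
driver: sky2
version: 1.14
firmware-version: N/A
bus-info: 0000:02:00.0
"""

info = parse_ethtool_info(SAMPLE)
older = version_tuple(info["version"]) < version_tuple("1.17")
print(info["driver"], info["version"], "older than 1.17:", older)
```

Comparing tuples rather than strings matters here because a plain string comparison would rank "1.9" above "1.14".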

I am not certain that using 1.17 (from vanilla) is a solution but I think it is a good idea in general. The patches were committed to the mainstream kernel for good reason. Even if it doesn't solve my problems (which, so far, I think it has), it may solve others'. Sky2 is a fast-changing driver and all improvements need to be backported ASAP IMO, especially considering that sky2 hardly works for maybe 10% of those using it. And, sky2 supports a very popular network chipset on new computers (Marvell Yukon).

I have attached a diff between the sky2 driver from 2.6.23-rc6 and the one from the current Gutsy kernel (2.6.22-11, Ubuntu). I diff'd sky2.h and sky2.c and combined them into one diff file; I think this is all that is needed for the changes. I haven't tested compiling 2.6.22-11 with the new sky2 files yet (I have only compiled a 2.6.23-rc6 kernel), but I have a good feeling that sky2.c/sky2.h is where the problem lies. I hope to be able to test it myself very soon.

Please consider this solution and it would be nice if others could test this patch so we can get it into the final Gutsy.

You should be able to apply the patch like this. Keep Master Kernel Thread @ http://ubuntuforums.org/showthread.php?t=311158 handy. Save 2.6.22-11.32_sky2+2.6.23-rc6.patch (attached) to desktop.

Prerequisite: "sudo apt-get update && sudo apt-get upgrade". Reboot if there is a kernel update. Then follow these steps. If your kernel isn't exactly 2.6.22-11.32 then the patch should still work (unless there have been more modifications to sky2 since).

1. sudo apt-get install linux-source-2.6.22
2. cd /usr/src
3. tar xjvf linux-source-2.6.22.tar.bz2
4. Follow steps 1-3 at Master Kernel Thread (including getting in root). (Not step 4 because we have unpacked the Ubuntu linux source. Not step 5 because I have a modification.)
5. rm -rf linux && ln -s /usr/src/linux-source-2.6.22 linux && cd /usr/src/linux
6. patch -p2 < ~/Desktop/2.6.22-11.32_sky2+2.6.23-rc6.patch
7. Continue, starting from step 8 at Master Kernel Thread. You shouldn't need to go through any menu items because our current kernel is identical to the one we're compiling. For Step 10 it doesn't matter what revision you use, maybe something unique like "sky2" so it doesn't get confused with your current kernel.

I hope this helps somebody compile an Ubuntu kernel w/ a new, fixed sky2. Most importantly, please report back if it works for you. Thank you. If there are any problems I apologize, but never delete your old kernel for any reason.

xtknight (xt-knight) wrote :
Mahmoud Kassem (mmkassem) wrote :

I would like to add that this problem is no longer present in the Fedora kernel 2.6.22.5-76.fc7 (Aug 30 2007), although its sky2 driver is still version 1.14. So maybe it is not a sky2 driver problem, or maybe it is a modified 1.14. Can someone please check the difference?

The previous fedora kernel had the same problem and always generated the same rx error in the messages log.

xtknight (xt-knight) wrote :

By this point, it's pretty much confirmed that all my problems are solved by using 2.6.23-rc6. I haven't had an issue ever since I started using the new kernel.

Geoff (cheetaweb) wrote :

This still happens (i.e. sky2 dying ) on 2.6.23-0.181.rc6.git4 (from fedora/rawhide tree)

xtknight (xt-knight) wrote :

Geoff: please describe your problem more in depth. Do you have anything in dmesg? When does it die?

Also, I don't trust customized kernels. I use the official kernel from kernel.org. If your problem is easy to reproduce, could you please try 2.6.23-rc6 ("latest prepatch") from there? Thanks.

Tor Håkon Haugen (torh) wrote :

FYI: There is an ongoing discussion about the sky2 driver in the kernel mailing list, so I guess they are aware of the problem.

xtknight (xt-knight) wrote :

It seems my problem is different from Geoff's because mine is fixed by 2.6.23 and his is not.

I don't know how to help fix this problem but it is High importance and Confirmed. Is there a planned milestone for it?

Worse, I'm not sure if my problem can be fixed without causing further regressions for others.

Could others confirm if their issues are fixed by using 2.6.23, please? I see at least three or four different sky2 bugs in the wild (upload issues, maybe interrupt issues, hw checksum+crash dumps, no errors at all but no net at all)

xtknight (xt-knight) wrote :

I actually started having issues with 2.6.23-rc6 and -rc8 also. So, apparently not all of the problem is fixed.

Mahmoud Kassem (mmkassem) wrote :

I just want to confirm that Fedora 7 kernel 2.6.22.5-76.fc7 does not have this problem; I've transferred 50GiB+ of data using samba without problems. Maybe if the Ubuntu kernel team looked at it, they could find a solution.

snoopy26 (kasper-biessenhofen) wrote :

My Fedora kernel 2.6.22.9-91.fc7 still has this problem:
While continuously pinging my gateway, after a while (1-3 hours)

ping: sendmsg: No buffer space available

shows up.
rmmod sky2; modprobe sky2 ;
helps for a few minutes, but then 'No buffer space available' turns up again.

Network controller: Intel Corporation PRO/Wireless 3945ABG Network Connection (rev 02)
        Subsystem: Intel Corporation Unknown device 1001
        Flags: bus master, fast devsel, latency 0, IRQ 11
        Memory at 90100000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Capabilities: [e0] Express Legacy Endpoint IRQ 0

ethtool -i eth0
driver: sky2
version: 1.14
firmware-version: N/A
bus-info: 0000:02:00.0

Mahmoud Kassem (mmkassem) wrote :

snoopy26, your network controller is an Intel Wireless Network Adapter; this bug report is for Marvell 88E8056 Ethernet. If the bug you mentioned exists on your *Ubuntu* installation, I suggest you open a new bug report for it.

snoopy26 (kasper-biessenhofen) wrote :

Sorry, I extracted the wrong section from lspci; the correct one is:

02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8055 PCI-E Gigabit Ethernet Controller (rev 12)
        Subsystem: Fujitsu Limited. Unknown device 139a
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 2299
        Region 0: Memory at f0000000 (64-bit, non-prefetchable) [size=16K]
        Region 2: I/O ports at 2000 [size=256]
        [virtual] Expansion ROM at 90000000 [disabled] [size=128K]
        Capabilities: <access denied>

I apologize and will open a bug report on bugzilla for FC7.

Mahmoud Kassem (mmkassem) wrote :

snoopy26, are you still having problems with Fedora 7 kernel 2.6.23.1-10.fc7 (it's using sky2 driver version 1.18)? I am not seeing any network issues.

Mahmoud Kassem (mmkassem) wrote :

ubuntu kernel 2.6.22-14-server has the sky2 driver version 1.18 but still issues rx errors

Test:
2 PCs with Marvell 88E8056 Ethernet
Source: Fedora 7 2.6.23.1-10.fc7 (Sky2 1.18)
Dest: Ubuntu Server 7.10 2.6.22-14-server (Sky2 1.18)
Data transferred: 145 GB using SCP

Summary:
The network device did not go down, but one rx error was added to the /var/log/messages on the ubuntu-server
No rx errors on the F7

xtknight (xt-knight) wrote :

Well, with 2.6.23-rc8 my problems are reduced about 99%. Just to let you know. I haven't had any problems that I can definitely relate to the sky2 driver itself even though it has gone down once(?) in the past month.

Running ...
  Linux oder 2.6.23.1-21.fc7 #1 SMP Thu Nov 1 20:28:15 EDT 2007 x86_64 x86_64 i386-64 GNU/Linux
... for one week now without any problems in sky2.

My problem seems to be fixed :-))

On Mon, Nov 05, 2007 at 02:58:30PM -0000, Mahmoud Kassem wrote:
> ubuntu kernel 2.6.22-14-server has the sky2 driver version 1.18 but
> still issues rx errors

Changed in linux:
status: Confirmed → Incomplete

Hi Guys,

There hasn't been any recent activity in this report and we were wondering if this is still an issue in the latest Hardy Alpha release. The Hardy Heron Alpha series contains an updated version of the kernel, version 2.6.24. You can download and try the new Hardy Heron Alpha release from http://cdimage.ubuntu.com/releases/hardy/ . You should be able to then test the new kernel via the LiveCD. If you can, please verify if this bug still exists or not and report back your results. General information regarding the release can also be found here: http://www.ubuntu.com/testing/ . Also note that we will keep this report open against the actively developed kernel but this will be closed against 2.6.22. Thanks.

Changed in linux:
status: New → Incomplete
Changed in linux-source-2.6.22:
status: Confirmed → Won't Fix

treffer (rtreffer) wrote :

Hi Leann,

I can confirm this behaviour. In particular, scp/scp -r on my local network kills eth0 after ~500MB (in practice anywhere between 50 and 1000 MB).

It happens with Hardy-based Mythbuntu (should have the same kernel) and with zen-sources (see http://forums.gentoo.org/viewtopic-t-672773.html) (master, 2.6.24 based, and master-devel, 2.6.25_rc based). So I'd say this is a generic sky2 kernel bug... Probably worth some lkml mails....

So yes, this is still happening all the time here :( But just under heavy network load. I can run skype, rsync to the internet, apt-get or surf the web for days (literally days, the box was running for 2 days without network problems - even compiling gentoo in a chroot worked (wget, rsync, ...)).

I've tried to play around with ethtool but no real success, yet.

Hope this helps....

treffer (rtreffer) wrote :

Update....

NOAPIC (as mentioned on gentoo-wiki.com) didn't help. Patching the kernel with the original Marvell driver (available via http://www.marvell.com/drivers/search.do ) didn't work. Changing sysctl.conf (to match the sender's sysctl settings) seemed to help slightly, as did playing around with ethtool. However, I didn't get a stable connection at 100MBit. Odd. The best run was ~1GB until failure.

I'll add a PCI 100MBit card and recheck - I've added a new switch/device (wrt150n with dd-wrt), so I'll have to recheck my network's hardware.... Everything else has been stable until now :( (transferring terabytes of data)

However, I can confirm that the card seems to work like a charm at lower rates/higher latencies... Weird....

Happy hunting, I'm going to add a 100MBit card :( (this won't keep me from testing - the gbit card is onboard - asus p5k-vm for the record)

Oh, my tests worked like this:
"scp -r <user>@192.168.1.66:/usr/src ./" /usr/src on the host machine is filled with 3 compiled kernels (make-kpkg buildpackage on all kernel trees). Failure occurred at ~500MB (sometimes sooner, sometimes later). 192.168.1.66 is my laptop with
00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
connected via
<notebook/realtek chip> <-- 100MBit/Patch cable --> <cheap 100MBit switch> <-- 100MBit/Patch cable --> <wrt150n with ddwrt> <-- 100MBit/Cross --> <normal box/marvell 88E8056>

I'll recheck the network chain with a 100MBit card - but I wouldn't expect a network problem....

Hope this helps.

Russ Price (rjp-ubu) wrote :

I can reliably get my new system (XFX nForce 630i motherboard) with an integrated Marvell PCI-E Gigabit Ethernet controller to lock up HARD. All I have to do is FTP anything from another gigabit-equipped machine on my LAN, and the system instantly locks up requiring a hard reset. NFS transfers will also lock the system up. At slower speeds (i.e. pulling updates over the Internet) lockups are random. Once it locks up, only a hard reset will work.

I'm running Hardy Beta, but I had this problem in Gutsy as well. I'm also having problems with corrupted packages in {us,uk,de}.archive.ubuntu.com - so I can't even successfully download the nVidia proprietary display drivers (and thus I'm not using them). I'm not sure if that's an archive problem or if it's related to this awful sky2 driver. I'll have to find a cheap Gb NIC today and hope it's not Marvell.

I tried using "options sky2 disable_msi=1" in /etc/modprobe.d/options, but it did not fix the problem.

03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 14)
        Subsystem: Marvell Technology Group Ltd. Unknown device 00ba
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 20
        Region 0: Memory at efcfc000 (64-bit, non-prefetchable) [size=16K]
        Region 2: I/O ports at df00 [size=256]
        [virtual] Expansion ROM at efb00000 [disabled] [size=128K]
        Capabilities: [48] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] Vital Product Data
        Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000 Data: 0000
        Capabilities: [e0] Express Legacy Endpoint IRQ 0
                Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
                Device: Latency L0s unlimited, L1 unlimited
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
                Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
                Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 1
                Link: Latency L0s <256ns, L1 unlimited
                Link: ASPM Disabled RCB 128 bytes CommClk+ ExtSynch-
                Link: Speed 2.5Gb/s, Width x1

Russ Price (rjp-ubu) wrote :

I dropped an RTL8169-based PCI card into the system and disabled the Marvell controller. This solved the lockup problem completely - and it also solved my problem with corrupted updates.

Now I'm debating whether or not to return the motherboard and get something with a Realtek, Nvidia, or Intel controller on-board, or keep the motherboard and get an Intel PCIe card so I don't have to waste a PCI slot. The on-board Marvell is simply unusable.

Mubashir Cheema (cheema) wrote :

I can confirm that this bug exists in Ubuntu 8.04 beta 1 with kernel 2.6.24-12-generic. lspci reports that I have:

04:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 13)

I don't experience complete lockups; however, I do experience extreme slowdowns over a local gigabit network, even when using just interactive ssh.

Changed in linux:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → Medium
status: Incomplete → Triaged

Alan Podlesek (seqdcer) wrote :

I'm also using Ubuntu 8.04 beta with kernel 2.6.24-12-generic, but have the 88E8055 controller.

I can ping sites without a problem, but firefox just times out.

Alan Podlesek (seqdcer) wrote :

Some more info...

lspci says:

Ethernet controller: Marvell Technology Group Ltd. 88E8055 PCI-E Gigabit Ethernet Controller (rev 12)

ethtool -i eth0

driver: sky2
version: 1.20
firmware-version: N/A
bus-info: 0000:02:00.0

cat /var/log/messages

kernel: [ 164.639614] sky2 eth0: rx length error: status 0x5ea0100 length 1342
kernel: [ 164.741511] sky2 eth0: rx length error: status 0x5ea0100 length 1342
kernel: [ 164.840244] sky2 eth0: rx length error: status 0x5ea0100 length 1342
kernel: [ 165.261115] sky2 eth0: rx length error: status 0x5ea0100 length 1342
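
Since the rx errors show up in two slightly different wordings in this thread ("rx error, status ..." and "rx length error: status ..."), here is a hedged Python sketch for tallying them per interface and status code from a saved copy of /var/log/messages. The regular expression is only inferred from the lines quoted in this report, not taken from the driver source, and count_rx_errors is an illustrative name.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this report (an assumption,
# not derived from the sky2 source).
RX_ERROR = re.compile(
    r"sky2 (?P<iface>\S+): rx (?:length )?error[,:] "
    r"status (?P<status>0x[0-9a-fA-F]+) length (?P<length>\d+)"
)


def count_rx_errors(log_text):
    """Count sky2 rx errors, keyed by (interface, status code)."""
    counts = Counter()
    for match in RX_ERROR.finditer(log_text):
        counts[(match["iface"], match["status"])] += 1
    return counts


SAMPLE = """\
sky2 eth0: rx error, status 0x5ca0002 length 1482
sky2 eth0: rx error, status 0x5ca0002 length 1514
kernel: [ 164.639614] sky2 eth0: rx length error: status 0x5ea0100 length 1342
kernel: [ 164.741511] sky2 eth0: rx length error: status 0x5ea0100 length 1342
"""

counts = count_rx_errors(SAMPLE)
for (iface, status), n in sorted(counts.items()):
    print(iface, status, n)
```

Posting the tally together with the kernel and "ethtool -i" driver version would make it easier to compare reports.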

Dossy Shiobara (dossy) wrote :

+1. Seeing this problem on a new EVGA e-7100/630i board that also has the crappy Marvell NIC. Running on 8.04/Hardy release.

$ uname -a
Linux doc 2.6.24-16-generic #1 SMP Thu Apr 10 12:47:45 UTC 2008 x86_64 GNU/Linux

$ lspci | grep Marvell
03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 14)

$ ethtool -i eth0
driver: sky2
version: 1.20
firmware-version: N/A
bus-info: 0000:03:00.0

Barius (bariuspelagic) wrote :

Seeing the same issues here too. I'm on a Shuttle SG31G2.

> lspci | grep Marvell
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12)

> ethtool -i eth0
driver: sky2
version: 1.20
firmware-version: N/A
bus-info: 0000:02:00.0

There is a known issue with the 88E8056 network controller firmware related to this; you need to contact your motherboard vendor and specifically request a firmware update. I had similar issues with my 88E8053 (EC) until moving to firmware v2.2.

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Brian Long (briandlong) wrote :

I've installed the 2.6.27-2-generic kernel from the Intrepid debs and I'm still having sky2 issues. If I right-click on the nm-applet, uncheck "Enable Networking" and re-check it, my link comes back up.

I'm running the ASUS P5K Pro motherboard with the following device:
02:00.0 0200: 11ab:4364 (rev 12)
 Subsystem: 1043:81f8
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
 Latency: 0, Cache Line Size: 32 bytes
 Interrupt: pin A routed to IRQ 219
 Region 0: Memory at feafc000 (64-bit, non-prefetchable) [size=16K]
 Region 2: I/O ports at d800 [size=256]
 Expansion ROM at feac0000 [disabled] [size=128K]
 Capabilities: <access denied>

I'm going to update my motherboard BIOS to see if that helps the situation. I'll also contact ASUS to see if there is a NIC firmware update I can apply as mentioned in another comment (no update is mentioned on their website for this motherboard).

flower (otherego) wrote :

I can confirm the sky2 rx length error on my x86-64 Intrepid Ibex Alpha 6 after upgrading from 8.04.

misan (misan) wrote :

Same here. I've tested Ibex Beta with similar results: random lockups of the network device whenever traffic levels are quite high (transfers over the full-duplex 100 Mbps LAN).

Disabling and re-enabling the network cures the problem, as does removing the sky2 module and modprobing it again.

I have a Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12)
Running Ubuntu 8.10 x64, latest updates as of this morning.

My network will only link at 100Mbps to my 1Gig Switch. It used to connect at 1Gbps a week or so ago.

Wilbur, if you're using a "desktop" (small) ethernet switch, power cycle it. It took me too long to realise that if one port negotiates down to 100Mb, it'll not run at higher speed until you powercycle the switch (found on at least 3 switches from 2 vendors). This is probably more a design limitation, so you pay a premium for expected functionality.

That did the trick.

Uzzi (andreaussi-yahoo) wrote :

Marvell Technology Group Ltd. 88E8040 on 2.6.27-7-generic doesn't work!

alfredo (software-zas) wrote :

I have a Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12) running Ubuntu 8.10 x86.
Same problem as Wilbur, but Daniel's trick does not work for me.
Only 100Mbps to the 3Com switch...

alfredo (software-zas) wrote :

Found this in my dmesg:

Bridge firewalling registered
[ 29.060313] br0: Dropping NETIF_F_UFO since no NETIF_F_HW_CSUM feature.
[ 29.061997] device lan1 entered promiscuous mode
[ 29.062586] sky2 lan1: enabling interface
[ 32.008591] sky2 lan1: speed/duplex mismatch<6>sky2 lan1: Link is up at 1000 Mbps, full duplex, flow control both
[ 34.652174] br0: port 1(lan1) entering learning state
[ 34.999033] sky2 lan1: Link is down.
[ 35.652447] br0: port 1(lan1) entering disabled state
[ 36.807290] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 36.934054] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 36.934630] CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Plase use
[ 36.934632] nf_conntrack.acct=1 kernel paramater, acct=1 nf_conntrack module option or
[ 36.934634] sysctl net.netfilter.nf_conntrack_acct=1 to enable it.
[ 40.489485] sky2 lan1: Link is up at 100 Mbps, full duplex, flow control both
[ 40.489826] br0: port 1(lan1) entering learning state

lan1 is the marvell port...
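
To spot this kind of renegotiation (1000 Mbps, then down, then 100 Mbps) in a long dmesg capture, here is a rough Python sketch that extracts the sky2 link events. The names link_events and renegotiated_lower are hypothetical, and the pattern is based only on the message wording quoted above.

```python
import re

# Message wording taken from the dmesg excerpt above (an assumption about
# what other kernel versions print).
LINK_EVENT = re.compile(r"sky2 (\S+): Link is (?:up at (\d+) Mbps|down)")
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def link_events(dmesg_text):
    """Return a list of (timestamp, interface, speed_mbps or None for down)."""
    events = []
    for line in dmesg_text.splitlines():
        event = LINK_EVENT.search(line)
        stamp = TIMESTAMP.match(line)
        if not event or not stamp:
            continue
        speed = int(event.group(2)) if event.group(2) else None
        events.append((float(stamp.group(1)), event.group(1), speed))
    return events


def renegotiated_lower(events):
    """True if a link-up event reports a lower speed than the previous one."""
    speeds = [speed for _, _, speed in events if speed is not None]
    return any(later < earlier for earlier, later in zip(speeds, speeds[1:]))


SAMPLE = """\
[ 29.062586] sky2 lan1: enabling interface
[ 32.008591] sky2 lan1: speed/duplex mismatch<6>sky2 lan1: Link is up at 1000 Mbps, full duplex, flow control both
[ 34.999033] sky2 lan1: Link is down.
[ 40.489485] sky2 lan1: Link is up at 100 Mbps, full duplex, flow control both
"""

events = link_events(SAMPLE)
print(events, renegotiated_lower(events))
```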

alfredo (software-zas) wrote :

curious - i changed the switch and got 1Gbit on the link....

But before changing the switch i had booted a winxp on the machine - getting 1Gbit link - on both switches!

Don't know how to get the firmware level of the switch....it's a 3C1670500A five port, the new is an eight port 3Com.

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

Changed in linux:
status: Incomplete → Invalid

angelinux (mail-angelodelia) wrote :

Hi all,

I'm experiencing similar troubles with
Ubuntu 8.10 server, 64 bit edition - Kernel 2.6.27-11-server
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12)

It randomly crashes the entire kernel, with no message in any log, so a hardware reset is needed. Flashing, coloured characters appear on the system console, but that doesn't make me smile.

This appears to be related to the use of UDP-based protocols.

Angelinux: the PCI revision is actually the firmware version, of which you have known-buggy version 1.2. I found that moving to firmware 2.2 or newer resolved these problems.

Your motherboard vendor has to supply you the updated firmware. If you're unable to get it, let me know.
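
As a rough illustration of the point above, here is a Python sketch that reads the revision out of an lspci line and treats its two hex digits as major.minor firmware version, so "(rev 12)" becomes 1.2. That digit-splitting rule is only my reading of the comment above (it is an assumption, not something confirmed by Marvell documentation), and the function names are made up.

```python
import re

REV = re.compile(r"\(rev ([0-9a-fA-F])([0-9a-fA-F])\)")


def firmware_from_lspci(line):
    """Return (major, minor) guessed from the PCI revision, or None.

    Assumption: the two hex digits of the revision are major and minor,
    as suggested in the comment above.
    """
    match = REV.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def older_than(firmware, minimum=(2, 2)):
    """True if the firmware tuple is below the 2.2 level mentioned above."""
    return firmware < minimum


LINE = ("02:00.0 Ethernet controller: Marvell Technology Group Ltd. "
        "88E8056 PCI-E Gigabit Ethernet Controller (rev 12)")

firmware = firmware_from_lspci(LINE)
print(firmware, "needs update:", older_than(firmware))
```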

Manoj Iyer (manjo) on 2009-04-21
Changed in linux (Ubuntu):
status: Triaged → Invalid

slashdevdsp (slashdevdsp) wrote :

Hi Daniel

From your comment above - https://bugs.launchpad.net/linux/+bug/138611/comments/51
Could you please send me the updated firmware for the 88E8056 Marvell gigabit ethernet controller? This is for an Asus P6T Deluxe V2 board with dual gigabit ethernet.

I am running the latest kernel from the PPA, 2.6.31rc1, with the sky2 driver and I am still getting
"sky2 eth0: speed/duplex mismatch", and the link keeps going up and down.

slashdevdsp (slashdevdsp) wrote :

I should also point out that this is on a 64bit 9.04 with 2.6.31rc1 amd64 sky2 driver

The firmware for a number of Marvell ethernet controllers is at:
http://quora.org/hive/yukon2-firmware.tar.bz2

I accept no responsibility and you flash this entirely at your own risk. There's documentation there and you can use 'strings' on the firmware files to identify which one is suitable.
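
If the 'strings' utility is not at hand, the same idea can be sketched in Python. This is only a rough equivalent that pulls runs of printable ASCII out of a binary blob. It assumes nothing about the layout of the firmware files, and the sample bytes below are made up for illustration, not taken from a real image.

```python
import re

def printable_strings(data, min_len=4):
    """Yield runs of at least min_len printable ASCII bytes, roughly like strings(1)."""
    pattern = re.compile(rb"[ -~]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")

# Made-up bytes for demonstration only; a real run would read a file with
# open(path, "rb").read() and look for a chip or version string in the output.
sample = b"\x00\x01hello firmware\x00\xff\x10ab\x00"
print(list(printable_strings(sample)))  # ['hello firmware']
```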

Daniel J Blueman wrote:
> The firmware for a number of Marvell ethernet controllers is at:
> http://quora.org/hive/yukon2-firmware.tar.bz2
>
> I accept no responsibility and you flash this entirely at your own risk.
> There's documentation there and you can use 'strings' on the firmware
> files to identify which one is suitable.
>
>
Hi
     You see there aren't native drivers for your ethernet device, so
one thing you can do is to unplug your modem and then plug it back in
after 5 mins. This worked for me...

Tuomas (mrtuomas) wrote :

Some issues with Ubuntu 10.04 (64bit) on a Fujitsu-Siemens Lifebook S7210. For some time after bootup the network just doesn't work, apparently until the connection speed is reduced to 100Mbps.

lspci:
Ethernet controller: Marvell Technology Group Ltd. 88E8055 PCI-E Gigabit Ethernet Controller (rev 14)

ethtool -i eth0:
driver: sky2
version: 1.25
firmware-version: N/A
bus-info: 0000:04:00.0

uname -r:
2.6.32-22-generic

dmesg:
(lots of lines like this)
[ 965.596662] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
[ 966.029133] sky2 eth0: Link is down.
[ 968.782951] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
[ 969.387122] sky2 eth0: Link is down.
[ 971.949581] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
[ 972.999410] sky2 eth0: Link is down.
[ 975.662757] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
[ 976.008737] sky2 eth0: Link is down.
[ 988.121189] sky2 eth0: Link is up at 100 Mbps, full duplex, flow control both
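
To put a number on the flapping, a small Python sketch like the following can count link-down events and list the negotiated speeds. It assumes the exact "Link is up at N Mbps" / "Link is down." wording shown in the log above. The function name is hypothetical.

```python
import re

LINK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+sky2 (?P<iface>\S+): Link is "
    r"(?:up at (?P<speed>\d+) Mbps|down)"
)

def summarize_link_events(lines):
    """Count link-down events and collect the speeds seen on link-up."""
    downs = 0
    speeds = []
    for line in lines:
        match = LINK_RE.search(line)
        if match is None:
            continue  # not a sky2 link message
        if match.group("speed") is None:
            downs += 1
        else:
            speeds.append(int(match.group("speed")))
    return downs, speeds

log = """\
[ 965.596662] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
[ 966.029133] sky2 eth0: Link is down.
[ 968.782951] sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
[ 969.387122] sky2 eth0: Link is down.
[ 988.121189] sky2 eth0: Link is up at 100 Mbps, full duplex, flow control both
""".splitlines()

print(summarize_link_events(log))  # (2, [1000, 1000, 100])
```

Feeding it the output of dmesg line by line shows how often the link dropped before it settled at 100 Mbps.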

Andreas Meier (rumpumpel1) wrote :

I'm using Ubuntu and have been getting this error message for about two years, with every Linux kernel.
The problem appears with 32bit and with 64bit.
I have an Asus M2N-L mainboard which has two Marvell 88E8056 chips onboard.
The machine is used as router. eth0 is my internal interface and eth1 is the external interface
used by ppp0 to connect to the DSL-modem.
I get this error message only for eth1 !
Also I get a lot of dropped and frame errors for this interface:
eth1 ...
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:19876766 errors:614219 dropped:614219 overruns:0 frame:108483

Since Ubuntu 10.04 the situation has become worse: whereas previously only my logs got flooded,
now the interface stops working once a day.
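
For scale, the counters above can be turned into rates with a short Python sketch. It assumes the old ifconfig "name:value" layout shown in the RX line. The function name is illustrative.

```python
import re

def rx_error_rates(rx_line):
    """Return each RX error counter as a fraction of received packets."""
    counters = {key: int(value)
                for key, value in re.findall(r"(\w+):(\d+)", rx_line)}
    packets = counters.pop("packets")
    return {name: count / packets for name, count in counters.items()}

line = "RX packets:19876766 errors:614219 dropped:614219 overruns:0 frame:108483"
for name, rate in rx_error_rates(line).items():
    print(f"{name}: {rate:.2%}")
```

On these numbers roughly 3.1% of received packets are counted as errors and about 0.5% as frame errors.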

Changed in linux:
importance: Unknown → Medium
Giardo (andrigiardo) wrote :

Same problem for me...

Here's my lspci -vvnn:

02:00.0 Ethernet controller [0200]: Marvell Technology Group Ltd. 88E8071 PCI-E Gigabit Ethernet Controller [11ab:436b] (rev 16)
 Subsystem: Acer Incorporated [ALI] Device [1025:014f]
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 0, Cache Line Size: 64 bytes
 Interrupt: pin A routed to IRQ 26
 Region 0: Memory at febfc000 (64-bit, non-prefetchable) [size=16K]
 Region 2: I/O ports at e800 [size=256]
 Expansion ROM at febc0000 [disabled] [size=128K]
 Capabilities: [48] Power Management version 3
  Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
  Status: D0 PME-Enable- DSel=0 DScale=1 PME-
 Capabilities: [50] Vital Product Data <?>
 Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
  Address: 00000000fee0100c Data: 4179
 Capabilities: [c0] Express (v2) Legacy Endpoint, MSI 00
  DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
   ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
  DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
   RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
   MaxPayload 128 bytes, MaxReadReq 512 bytes
  DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
  LnkCap: Port #2, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <256ns, L1 unlimited
   ClockPM+ Suprise- LLActRep- BwNot-
  LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- Retrain- CommClk+
   ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
  LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 Capabilities: [100] Advanced Error Reporting <?>
 Capabilities: [130] Device Serial Number 90-fb-a6-ff-ff-29-45-ea
 Kernel driver in use: sky2
 Kernel modules: sky2

Dmesg is full of:

Mar 22 19:39:56 PcStream kernel: [88166.328760] sky2 eth0: Link is down.
Mar 22 19:40:00 PcStream kernel: [88169.870833] sky2 eth0: Link is up at 100 Mbps, full duplex, flow control both
Mar 22 19:40:01 PcStream kernel: [88171.027428] sky2 eth0: Link is down.
Mar 22 19:40:07 PcStream kernel: [88177.375698] sky2 eth0: Link is up at 100 Mbps, full duplex, flow control both

No way to get it working properly....

I will install a new ethernet card soon; this situation is not acceptable (I'm running a cron job every minute with "ifconfig eth0 up" -.- it cuts downtime to 4-5 seconds, but that's not a good solution).

Please solve this problem! And contact me if you need any other information! (sorry for my bad English :-D)

Forest (foresto) wrote :

In case anyone wants to try a different driver, I just packaged the latest sk98lin from Marvell. It's working as a sky2 replacement with my Yukon-2 88E8056 chip.

When launchpad finishes the build, it should appear here:

https://launchpad.net/~foresto/+archive/extradrivers


Some kernels ago, this old bug came back.

Changing the MTU doesn't change anything...

[ 150.457097] sky2 0000:02:00.0: eth0: rx error, status 0x5ea0100 length 1502
[ 151.283689] sky2 0000:02:00.0: eth0: rx error, status 0x5ea0100 length 1502
[ 151.349995] sky2 0000:02:00.0: eth0: rx error, status 0x3020020 length 770
[ 151.360369] sky2 0000:02:00.0: eth0: rx error, status 0x2490002 length 585
[ 151.514566] sky2 0000:02:00.0: eth0: rx error, status 0x33c0020 length 828
[ 151.563110] sky2 0000:02:00.0: eth0: rx error, status 0x2bd0020 length 701
[ 151.667122] sky2 0000:02:00.0: eth0: rx error, status 0x59b0020 length 1435
[ 151.758506] sky2 0000:02:00.0: eth0: rx error, status 0x5d60002 length 1494
[ 151.763368] sky2 0000:02:00.0: eth0: rx error, status 0x1780020 length 380
[ 151.768621] sky2 0000:02:00.0: eth0: rx error, status 0x5d60002 length 1494
[ 159.334735] net_ratelimit: 5 callbacks suppressed
[ 159.334742] sky2 0000:02:00.0: eth0: rx error, status 0x5620020 length 1378
[ 159.430277] sky2 0000:02:00.0: eth0: rx error, status 0x9d0020 length 157
[ 165.603936] sky2 0000:02:00.0: eth0: rx error, status 0x3d60020 length 982
[ 165.824732] sky2 0000:02:00.0: eth0: rx error, status 0x5cc0002 length 1484
[ 165.846673] sky2 0000:02:00.0: eth0: rx error, status 0x5cc0002 length 1484
[ 165.851051] sky2 0000:02:00.0: eth0: rx error, status 0x3c10020 length 961
[ 165.859700] sky2 0000:02:00.0: eth0: rx error, status 0x3c00020 length 960
[ 167.804561] sky2 0000:02:00.0: eth0: rx error, status 0x40b0020 length 1035
[ 174.900664] sky2 0000:02:00.0: eth0: rx error, status 0x5d60002 length 1494
[ 181.612397] sky2 0000:02:00.0: eth0: rx error, status 0x5d60002 length 1494
[ 181.850806] sky2 0000:02:00.0: eth0: rx error, status 0x3980020 length 920
[ 182.031611] sky2 0000:02:00.0: eth0: rx error, status 0x1720020 length 370
[ 182.757526] sky2 0000:02:00.0: eth0: rx error, status 0x5d60002 length 1494
[ 182.760969] sky2 0000:02:00.0: eth0: rx error, status 0x5d60002 length 1494
[ 182.868165] sky2 0000:02:00.0: eth0: rx error, status 0x5d60002 length 1494
[ 183.319772] sky2 0000:02:00.0: eth0: rx error, status 0x5d60002 length 1494
[ 183.323903] sky2 0000:02:00.0: eth0: rx error, status 0x5d60002 length 1494
[ 183.660931] sky2 0000:02:00.0: eth0: rx error, status 0x2f50020 length 757
[ 209.649181] net_ratelimit: 3 callbacks suppressed
[ 209.649187] sky2 0000:02:00.0: eth0: rx error, status 0x5d60002 length 1494
[ 209.652552] sky2 0000:02:00.0: eth0: rx error, status 0x2340020 length 568
[ 213.390135] sky2 0000:02:00.0: eth0: rx error, status 0x4740020 length 1144
[ 217.839201] sky2 0000:02:00.0: eth0: rx error, status 0x1dc0002 length 476
[ 225.104332] sky2 0000:02:00.0: eth0: rx error, status 0x28f0002 length 655
[ 225.462348] sky2 0000:02:00.0: eth0: rx error, status 0x5a0020 length 90
[ 225.584245] sky2 0000:02:00.0: eth0: rx error, status 0x930020 length 147
[ 225.672717] sky2 0000:02:00.0: eth0: rx error, status 0x14c0002 length 332
[ 226.436303] sky2 0000:02:00.0: eth0: rx error, status 0x3920020 length 914
[ 226.448875] sky2 0000:02:00.0: eth0: rx error, status 0x3ea0020 length 1002
[ 226.823315] sky2 0000...

uname -a
Linux uzzmaster 3.0.0-13-generic #21-Ubuntu SMP Mon Oct 17 20:18:51 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
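
The status words in the rx error lines above can be picked apart with a small Python sketch. In most of those lines the upper half of the status word equals the reported length, so the sketch prints both for comparison. The flag names are an assumption, recalled from the GMR_FS_* definitions in the driver's sky2.h, and should be checked against your own kernel source.

```python
import re

RX_ERR_RE = re.compile(r"rx error, status (0x[0-9a-fA-F]+) length (\d+)")

# Assumed flag names (recalled from the GMR_FS_* enum in sky2.h);
# verify against your kernel source before relying on them.
ASSUMED_FLAGS = {0x0002: "CRC error", 0x0020: "MII error", 0x0100: "RX OK"}

def decode_rx_error(line):
    """Return (length field in status, reported length, assumed flag names)."""
    match = RX_ERR_RE.search(line)
    if match is None:
        return None
    status = int(match.group(1), 16)
    reported = int(match.group(2))
    status_len = (status >> 16) & 0x7FFF   # upper half of the status word
    low = status & 0xFFFF                  # lower half carries the flag bits
    flags = [name for bit, name in ASSUMED_FLAGS.items() if low & bit]
    return status_len, reported, flags

print(decode_rx_error("eth0: rx error, status 0x5d60002 length 1494"))
# (1494, 1494, ['CRC error'])
print(decode_rx_error("eth0: rx error, status 0x5ea0100 length 1502"))
# (1514, 1502, ['RX OK'])
```

In this excerpt most lines carry 0x0002 or 0x0020 in the low half, while the 0x5ea0100 lines show a length field (1514) that differs from the reported length (1502).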

This is still a bug on the 4.2.0 kernel in 15.10 ...
