Intel ixgbe driver doesn't work at 10Gb speeds

Bug #1291660 reported by Jeff Lane 
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

I have a fresh install of 12.04.4 on an IBM server with an Intel 10GbE card (actually I have seen this on TWO systems, one with a PCIe 10GbE card and one with a mezzanine add-in card that has the same chipset).

12.04.4 ships with ixgbe version 3.13.10-k

Initially, when the Ethernet device comes up, it sets itself to 1Gb, not 10Gb. Running ethtool on either of the devices clearly shows supported speeds of 100, 1000 and 10000Mb/s, as well as the same advertised speeds.

Also, ethtool clearly shows the current speed as 1000.
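For anyone reproducing this, the negotiated speed can be pulled out of the ethtool output with a small filter. This is only a sketch: the function name parse_speed is made up for illustration, and it assumes the usual "Speed: 1000Mb/s" or "Speed: Unknown!" line format that ethtool prints.

```shell
#!/bin/sh
# Sketch: read `ethtool <iface>` output on stdin and print the negotiated
# speed in Mb/s, or "unknown" when ethtool reports "Unknown!".
# parse_speed is a hypothetical helper name, not part of ethtool.
parse_speed() {
    awk -F': *' '/^[ \t]*Speed:/ {
        if ($2 ~ /^[0-9]+Mb\/s/) {
            sub(/Mb\/s.*/, "", $2)   # strip the unit, keep the number
            print $2
        } else {
            print "unknown"
        }
        exit
    }'
}

# Usage (assumes the interface is named eth0, as in this report):
#   ethtool eth0 | parse_speed
```

On the systems described here this would be expected to print 1000 after boot rather than 10000.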

Now, if you force the card to a speed of 10000 like so:

sudo ethtool -s eth0 speed 10000 duplex full

ethtool will now show "Unknown!" for the speed and duplex settings, and the link drops.

The link will restore, as will the rest of the output, if you put the card back to the 1000Mb/s speed.
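The link state before and after forcing the speed can also be read from sysfs instead of ethtool. A minimal sketch, assuming the kernel exposes the operstate and speed files under /sys/class/net/<iface>; the function name link_state and its optional second argument (an alternate sysfs root, handy for testing) are made up for illustration. When the link is down, reading speed may fail or return a placeholder value, so the sketch falls back to "unknown".

```shell
#!/bin/sh
# Sketch: print operstate and speed for an interface from sysfs.
# $1 = interface name, $2 = sysfs root (defaults to /sys).
# link_state is a hypothetical helper name.
link_state() {
    iface=$1
    root=${2:-/sys}
    dir="$root/class/net/$iface"
    # Fall back to "unknown" if a file is missing or cannot be read.
    state=$(cat "$dir/operstate" 2>/dev/null) || state=unknown
    speed=$(cat "$dir/speed" 2>/dev/null) || speed=unknown
    echo "$iface: state=${state:-unknown} speed=${speed:-unknown}"
}

# Usage (assumes the interface is named eth0, as in this report):
#   link_state eth0
```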

Thus, the ixgbe driver seems to be broken.

I tried to use apport to file this bug, however apport told me that the linux and linux-image-generic packages are NOT official Ubuntu packages, and refused to create the report for me.

Thus, I gathered what I could via sosreport and some other means, which I am attaching below.

This will gate any certification that uses the Intel 10Gb NICs (seen onboard in at least a couple of server systems now).

Jeff Lane  (bladernr) wrote :

This tarball contains the results of sosreport and a few other things that demonstrate the bug above.

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1291660

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.13 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.14-rc6-trusty/

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key trusty
Jeff Lane  (bladernr) wrote :

Tried the mainline kernel, which provides ixgbe version 3.19.1-k.

Same result. The card connects at 1Gb, and forcing it to 10Gb via ethtool results in the "Unknown!" message for speed and duplex, and the link is shut down.

Tim Gardner (timg-tpi) wrote :

Jeff - do you know that 10Gbit works? Could it be cabling or switch related? Can you test an older release? ixgbe has supported 10Gbit for a while now.

Tim Gardner (timg-tpi) wrote :

Note that you've _also_ got an igb device which is 1 Gbit only.

[ 5.207985] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.0.5-k
[ 5.207986] igb: Copyright (c) 2007-2013 Intel Corporation.
[ 5.208101] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 3.13.10-k
[ 5.208102] ixgbe: Copyright (c) 1999-2013 Intel Corporation.

Jeff Lane  (bladernr) wrote :

Tim, I'm no longer on-site and do not have access to this. I've subscribed Mark Brown, who is the IBM TPM. Mark, can you see if you can work with Gary and help find out if 10Gb works at all on those systems?

Mark Brown (mstevenbrown) wrote :

Have contacted IBM for further information.

Gary Gaydos (gaydos) wrote :

See the eth*.log files attached above.

I ran ethtool, pings, and netperf on eth0 and eth1 in two different configurations. One configuration is with eth0 and eth1 directly connected to 10 Gb ixgbe adapters on another system. These are labeled ptp for point to point. I ran the same tests again using the ethernet switch that Jeff used when originally filing this bug. These are labeled switch.

The data seems to indicate that the Ethernet switch/adapter combination is not running at 10 gig. The point-to-point connections show better throughput, but not great throughput for a 10 gig adapter.

Let me know if you have something specific in mind you'd like me to test.

Gary

Gary Gaydos (gaydos) wrote :

See the above three log files, 2 with iperf and 1 with netperf.

iperf is equivalent between 3.11.0-15-generic and 3.13.6 upstream kernel when running point to point (no switch). TCP stream is as expected. UDP stream is poor in both cases.

netperf results are significantly different from the iperf results. The netperf test run is also point to point (no switch).

Let me know if you'd like me to perform additional tests.

Gary

Mark Brown (mstevenbrown) wrote : Re: [Bug 1291660] Re: Intel ixgbe driver doesn't work at 10Gb speeds

On 03/21/2014 03:56 PM, Gary Gaydos wrote:
> See the above three log files, 2 with iperf and 1 with netperf.
>
> iperf is equivalent between 3.11.0-15-generic and 3.13.6 upstream kernel
> when running point to point (no switch). TCP stream is as expected.
> UDP stream is poor in both cases.
>
> netperf results are significantly different from the iperf results. The
> netperf test run is also point to point (no switch).
>
> Let me know if you'd like me to perform additional tests.

Got these, let us digest them for a bit....

--
Mark Brown, Technical Partner Manager, Canonical
(US) 512.496.1593
<email address hidden>

Mark Brown (mstevenbrown) wrote :

Gary-

Have you/can you run the apport-collect command (Note #2 in the bug)
yet, and submit the results?

Gary Gaydos (gaydos) wrote : apport information

ApportVersion: 2.0.1-0ubuntu17.6
Architecture: amd64
DistroRelease: Ubuntu 12.04
InstallationMedia: Ubuntu-Server 12.04.4 LTS "Precise Pangolin" - Release amd64 (20140204)
MarkForUpload: True
Package: linux (not installed)
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
Tags: precise
Uname: Linux 3.13.6 x86_64
UnreportableReason: The running kernel is not an Ubuntu kernel
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

tags: added: apport-collected precise
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Gary Gaydos (gaydos) wrote :

AlsaDevices:
 total 0
 crw-rw---T 1 root audio 116, 1 Mar 25 14:14 seq
 crw-rw---T 1 root audio 116, 33 Mar 25 14:14 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.0.1-0ubuntu17.6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
CurrentDmesg: [ 22.872156] usb 2-1.1.1: USB disconnect, device number 7
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=54122c9c-f52a-4825-8271-e2d88b04a1d0
InstallationMedia: Ubuntu-Server 12.04.4 LTS "Precise Pangolin" - Release amd64 (20140204)
MachineType: IBM System x3750 M4 -[8752AC1]-
MarkForUpload: True
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=screen
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.11.0-15-generic root=UUID=9c6b7b8c-de6e-46fb-a730-5685cfbda1a7 ro crashkernel=384M-2G:64M,2G-:128M
ProcVersionSignature: Ubuntu 3.11.0-15.25~precise1-generic 3.11.10
RelatedPackageVersions:
 linux-restricted-modules-3.11.0-15-generic N/A
 linux-backports-modules-3.11.0-15-generic N/A
 linux-firmware 1.79.9
RfKill: Error: [Errno 2] No such file or directory
Tags: precise
Uname: Linux 3.11.0-15-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

dmi.bios.date: 10/31/2013
dmi.bios.vendor: IBM
dmi.bios.version: -[KOE135PUS-1.40]-
dmi.board.name: 47C9560
dmi.board.vendor: IBM
dmi.chassis.asset.tag: none
dmi.chassis.type: 23
dmi.chassis.vendor: IBM Corp.
dmi.chassis.version: none
dmi.modalias: dmi:bvnIBM:bvr-[KOE135PUS-1.40]-:bd10/31/2013:svnIBM:pnSystemx3750M4-[8752AC1]-:pvr02:rvnIBM:rn47C9560:rvr:cvnIBMCorp.:ct23:cvrnone:
dmi.product.name: System x3750 M4 -[8752AC1]-
dmi.product.version: 02
dmi.sys.vendor: IBM

Gary Gaydos (gaydos) wrote : apport information

Attached files, one per comment:

AcpiTables.txt
BootDmesg.txt
IwConfig.txt
Lspci.txt
Lsusb.txt
ProcCpuinfo.txt
ProcInterrupts.txt
ProcModules.txt
UdevDb.txt
UdevLog.txt
WifiSyslog.txt

Gary Gaydos (gaydos) wrote :

I've performed some further experiments. I reran the netperf and iperf tests while using the iptraf tool to validate the netperf and iperf results.

iptraf is within a few percent of netperf and iperf in all cases except the iperf TCP stream test, where iperf shows 9.4 Gb/s and iptraf shows 2.7 Gb/s for the same test run.

Differences in the test configurations between netperf and iperf likely account for the different test results in the cases where iptraf validates them.
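For scale, the gap between the two TCP stream figures quoted above can be expressed as a percentage of the iperf value. A minimal sketch; the function name pct_diff is made up for illustration, and taking the iperf figure as the baseline is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: percentage by which a second measurement ($2) falls short of a
# baseline ($1), rounded to one decimal place.
# pct_diff is a hypothetical helper name.
pct_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

# iperf reported 9.4 Gb/s, iptraf 2.7 Gb/s for the same run:
pct_diff 9.4 2.7   # prints 71.3
```

That is, iptraf saw roughly 71% less traffic than iperf claimed for that run, which is far outside the few-percent agreement seen in the other cases.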

Gary
