Intel NICs not properly reporting link speed in SysFS in Xenial

Bug #1757191 reported by Jeff Lane on 2018-03-20
This bug affects 1 person
Affects          Importance  Assigned to
linux (Ubuntu)   High        Unassigned
Artful           High        Unassigned
Bionic           High        Unassigned

Bug Description

This was discovered during certification testing of 16.04.4. I have now seen this behaviour at least twice.

A system under test has a 2-port Intel X550 NIC (10Gb).

Udev reports the NIC as this:
Category: NETWORK
Interface: enp94s0f0
Product: Ethernet Controller 10G X550T
Vendor: Intel Corporation
Driver: ixgbe (ver: 5.1.0-k)
Path: /devices/pci0000:5d/0000:5d:00.0/0000:5e:00.0
ID: [8086:1563]
Subsystem ID: [152d:8a13]

Ethtool shows this info (this is for the second port, which has the issue):
Settings for enp94s0f1:
 Supported ports: [ TP ]
 Supported link modes: 100baseT/Full
                         1000baseT/Full
                         10000baseT/Full
 Supported pause frame use: Symmetric
 Supports auto-negotiation: Yes
 Advertised link modes: 100baseT/Full
                         1000baseT/Full
                         10000baseT/Full
 Advertised pause frame use: Symmetric
 Advertised auto-negotiation: Yes
 Speed: 10000Mb/s
 Duplex: Full
 Port: Twisted Pair
 PHYAD: 0
 Transceiver: internal
 Auto-negotiation: on
 MDI-X: Unknown
 Supports Wake-on: umbg
 Wake-on: g
 Current message level: 0x00000007 (7)
          drv probe link
 Link detected: yes

Note that ethtool shows an active 10Gb link.
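For illustration, here is a minimal Python sketch of how a tool could derive the maximum supported speed from ethtool output like the above. The function name max_supported_speed is hypothetical, and whether the certification tool obtains its "max speed" this way is an assumption, not something stated in this report.

```python
import re


def max_supported_speed(ethtool_output):
    """Return the highest speed (Mb/s) under 'Supported link modes', or None.

    Hypothetical helper: it only looks at the 'Supported link modes' field
    and its continuation lines, and ignores the 'Advertised' section.
    """
    speeds = []
    in_section = False
    for line in ethtool_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Supported link modes:"):
            in_section = True
            stripped = stripped.split(":", 1)[1]
        elif ":" in stripped:
            # Any other "Field: value" line ends the section.
            in_section = False
        if in_section:
            # Modes look like "10000baseT/Full"; keep the numeric prefix.
            speeds += [int(m) for m in re.findall(r"(\d+)base", stripped)]
    return max(speeds, default=None)
```

Fed the ethtool output for enp94s0f1, this sketch would consider 100, 1000 and 10000 and pick the largest.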

The test tool determines the NIC speed by inspecting the sysfs data for each NIC port, in this case by reading /sys/class/net/DEVICENAME/speed.
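A rough sketch of that kind of sysfs lookup, in Python. This is not the certification tool's code: the function name read_link_speed and the sysfs_root parameter are made up for illustration, and mapping an unknown speed to None is a design choice of this sketch.

```python
import os

SPEED_UNKNOWN = -1  # the kernel reports -1 when the link speed is unknown


def read_link_speed(iface, sysfs_root="/sys/class/net"):
    """Return the link speed in Mb/s, or None if unknown or unreadable.

    sysfs_root is a parameter only so the function can be pointed at a
    fake directory tree for testing.
    """
    path = os.path.join(sysfs_root, iface, "speed")
    try:
        with open(path) as f:
            speed = int(f.read().strip())
    except (OSError, ValueError):
        # The read can fail outright, e.g. when the interface is down.
        return None
    return None if speed == SPEED_UNKNOWN else speed
```

Returning None rather than -1 keeps "speed unknown" distinct from "link is slow", which matters if the value is later compared against an expected speed.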

I've now seen this on a couple of different NICs using the ixgbe driver. The first port properly shows the connected link speed in /sys/class/net/DEVICENAME/speed, but the second port shows -1 in that file.

As a result, certification tests are failing because the tool believes that the link speed is incorrect.

The current example uses kernel 4.13.0-37.42~16.04.1.

Jeff Lane (bladernr) on 2018-03-20
summary: - Intel NICs not properly reporting link speed in SysFS
+ Intel NICs not properly reporting link speed in SysFS in Xenial
description: updated

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1757191/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Jeff Lane (bladernr) on 2018-03-20
affects: ubuntu → linux (Ubuntu)
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.16 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.16-rc6

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
Changed in linux (Ubuntu Artful):
importance: Undecided → Medium
status: New → Triaged
Changed in linux (Ubuntu):
status: New → Triaged
Jeff Lane (bladernr) wrote :

Hi Joseph, I got the tester to try 4.16 and the tests still fail for the same reason. This is the test output indicating that /sys/class/net/DEVICENAME/speed still reports -1 for the second port.

ERROR:root:Detected link speed (-1) is lower than detected max speed (10000)
ERROR:root:Check your device configuration and try again.
ERROR:root:If you want to override and test despite this under-speed link, use
ERROR:root:the --underspeed-ok option.
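The messages above suggest a plain numeric comparison between the detected speed and the detected max speed. A minimal sketch of such a check follows, assuming that is roughly what the tool does. The function name link_speed_ok is hypothetical, and the behaviour under the override (pass despite the under-speed link) is an assumption based only on the option's name.

```python
import logging


def link_speed_ok(detected, max_speed, underspeed_ok=False):
    """Return True if the test may proceed with this link speed.

    underspeed_ok stands in for the --underspeed-ok option mentioned in
    the error output; its exact semantics in the real tool are assumed.
    """
    if detected >= max_speed:
        return True
    logging.error(
        "Detected link speed (%s) is lower than detected max speed (%s)",
        detected, max_speed,
    )
    return underspeed_ok
```

With a comparison like this, a sysfs value of -1 is lower than any real speed, so an unknown speed looks exactly like an under-speed link even though ethtool reports 10000Mb/s.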

And just to verify, dmesg shows we are booted into 4.16:
[ 0.000000] Linux version 4.16.0-041600rc6-generic (kernel@gloin) (gcc version 7.2.0 (Ubuntu 7.2.0-8ubuntu3.2)) #201803182230 SMP Mon Mar 19 02:32:18 UTC 2018
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.16.0-041600rc6-generic root=UUID=fcac9914-96c0-49b0-803b-269db52c3756 ro

tags: added: kernel-bug-exists-upstream
Changed in linux (Ubuntu):
status: Triaged → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Thanks for testing. We now know this bug also exists upstream. Do you happen to know if this is a regression? Was there a prior kernel version that did not exhibit this bug?

Jeff Lane (bladernr) wrote :

It looks like the regression was introduced sometime between 4.10 and 4.13:

4.4.0-112: https://certification.canonical.com/hardware/201802-26096/submission/127640/
Driver: ixgbe (ver: 4.2.1-k)
Shows proper speed for both ports

4.10.0-42: https://certification.canonical.com/hardware/201712-26025/submission/125481/
Driver: ixgbe (ver: 4.4.0-k)
Shows proper speed for both ports

4.13.0-37: https://certification.canonical.com/hardware/201803-26156/submission/128508/
Driver: ixgbe (ver: 5.1.0-k)
Shows -1 for second port

4.16.0-041600rc6 (tester sent via email)
Driver version unknown
Shows -1 for second port

Joseph Salisbury (jsalisbury) wrote :

We can perform a bisect to identify the commit that introduced this bug. Would it be possible for the tester to test some kernels? To perform a bisect, we need to identify the last kernel version that did not have the bug and the first version that did. The first kernels to test would be:

v4.11 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11/
v4.12 Final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12/
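The idea behind a bisect can be sketched as a binary search over an ordered list of kernel versions. This is only an illustration of the principle in Python, not the kernel team's procedure; the function name first_bad and the is_bad callback (which stands for "boot this kernel and check whether the second port reports -1") are hypothetical.

```python
def first_bad(versions, is_bad):
    """Narrow down where a regression was introduced.

    versions must be ordered oldest to newest, with versions[0] known
    good and versions[-1] known bad. is_bad(version) returns True if the
    bug reproduces on that version. Returns (last_good, first_bad).
    """
    lo, hi = 0, len(versions) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # bug already present: look earlier
        else:
            lo = mid  # still good: look later
    return versions[lo], versions[hi]
```

Each test halves the remaining range, which is why only a few kernels need to be tried before moving on to individual commits.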

tags: added: performing-bisect
Changed in linux (Ubuntu Bionic):
status: Confirmed → Triaged
importance: Medium → High
Changed in linux (Ubuntu Artful):
importance: Medium → High