Upgrade i40e and i40evf driver to latest

Bug #1482304 reported by Antony Messerli
This bug affects 4 people
Affects: linux (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: none

Bug Description

The latest i40e driver in Trusty appears to be 0.3.36-k. We have been experiencing kernel panics on boot with this driver, and bringing the driver up to the latest version, 1.2.48, has alleviated some of them.

Could the driver in Trusty be upgraded to a more recent revision?

Thanks

Output of Panic (Using Intel x710)

[ 3.143088] i40e 0000:05:00.0 p1p1: NIC Link is Up
[ 3.301138] Switched to clocksource tsc
[ 6.216639] random: nonblocking pool is initialized
[ 8.934783] ------------[ cut here ]------------
[ 8.934805] WARNING: CPU: 0 PID: 0 at /build/buildd/linux-3.13.0/net/sched/sch_generic.c:264 dev_watchdog+0x276/0x280()
[ 8.934808] NETDEV WATCHDOG: p1p1 (i40e): transmit queue 0 timed out
[ 8.934828] Modules linked in: joydev hid_generic gpio_ich x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd serio_raw usbhid hid lpc_ich hpilo ioatdma dca ipmi_si shpchp wmi acpi_power_meter mac_hid lp parport psmouse i40e vxlan ip_tunnel ptp hpsa pps_core
[ 8.934874] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-57-generic #95-Ubuntu
[ 8.934877] Hardware name: HP ProLiant DL380 Gen9, BIOS P89 05/06/2015
[ 8.934878] 0000000000000009 ffff88103fc03d98 ffffffff817232f0 ffff88103fc03de0
[ 8.934891] ffff88103fc03dd0 ffffffff8106784d 0000000000000000 ffff881028d18000
[ 8.934895] ffff881023734f40 0000000000000040 0000000000000000 ffff88103fc03e30
[ 8.934904] Call Trace:
[ 8.934905] <IRQ> [<ffffffff817232f0>] dump_stack+0x45/0x56
[ 8.934921] [<ffffffff8106784d>] warn_slowpath_common+0x7d/0xa0
[ 8.934925] [<ffffffff810678bc>] warn_slowpath_fmt+0x4c/0x50
[ 8.934928] [<ffffffff816479d6>] dev_watchdog+0x276/0x280
[ 8.934935] [<ffffffff81647760>] ? dev_graft_qdisc+0x80/0x80
[ 8.934942] [<ffffffff810744e6>] call_timer_fn+0x36/0x100
[ 8.934946] [<ffffffff81647760>] ? dev_graft_qdisc+0x80/0x80
[ 8.934950] [<ffffffff8107547f>] run_timer_softirq+0x1ef/0x2f0
[ 8.934957] [<ffffffff8106ccbc>] __do_softirq+0xec/0x2c0
[ 8.934961] [<ffffffff8106d205>] irq_exit+0x105/0x110
[ 8.934971] [<ffffffff81736195>] smp_apic_timer_interrupt+0x45/0x60
[ 8.934976] [<ffffffff81734b1d>] apic_timer_interrupt+0x6d/0x80
[ 8.934977] <EOI> [<ffffffff815d55d2>] ? cpuidle_enter_state+0x52/0xc0
[ 8.934986] [<ffffffff815d56f9>] cpuidle_idle_call+0xb9/0x1f0
[ 8.934994] [<ffffffff8101d3ee>] arch_cpu_idle+0xe/0x30
[ 8.935000] [<ffffffff810bf205>] cpu_startup_entry+0xc5/0x290
[ 8.935008] [<ffffffff817114f7>] rest_init+0x77/0x80
[ 8.935016] [<ffffffff81d34f70>] start_kernel+0x438/0x443
[ 8.935021] [<ffffffff81d34941>] ? repair_env_string+0x5c/0x5c
[ 8.935023] [<ffffffff81d34120>] ? early_idt_handlers+0x120/0x120
[ 8.935027] [<ffffffff81d345ee>] x86_64_start_reservations+0x2a/0x2c
[ 8.935030] [<ffffffff81d34733>] x86_64_start_kernel+0x143/0x152
[ 8.935034] ---[ end trace fabec2b76d314b12 ]---
[ 8.935037] i40e 0000:05:00.0 p1p1: tx_timeout recovery level 0
[ 8.935071] i40e 0000:05:00.0: VSI reinit requested
[ 8.945435] i40e 0000:05:00.0 p1p1: NIC Link is Up
[ 14.856689] i40e 0000:05:00.0: Detected Tx Unit Hang
[ 14.856689] VSI <518>
[ 14.856689] Tx Queue <0>
[ 14.856689] next_to_use <1>
[ 14.856689] next_to_clean <0>
[ 14.856697] i40e 0000:05:00.0: tx_bi[next_to_clean]
[ 14.856697] time_stamp <fffee736>
[ 14.856697] jiffies <fffee971>
[ 14.856699] i40e 0000:05:00.0: tx hang detected on queue 0, resetting adapter
[ 14.856701] i40e 0000:05:00.0 p1p1: tx_timeout recovery level 1
[ 14.882582] i40e 0000:05:00.0: i40e_ptp_init: added PHC on p1p1
[ 14.899577] i40e 0000:05:00.0 p1p1: NIC Link is Up
[ 14.899623] i40e 0000:05:00.0: reset complete
[ 24.928762] i40e 0000:05:00.0 p1p1: tx_timeout recovery level 2
[ 25.876839] i40e 0000:05:00.1: i40e_ptp_init: added PHC on p1p2
[ 25.889067] i40e 0000:05:00.1: reset complete
[ 25.917938] i40e 0000:05:00.0: i40e_ptp_init: added PHC on p1p1
[ 25.934578] i40e 0000:05:00.0 p1p1: NIC Link is Up
[ 25.934621] i40e 0000:05:00.0: reset complete
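
The escalation in the log above (tx_timeout recovery level 0, then 1, then 2) can be pulled out of a saved dmesg capture with a short script. This is a minimal sketch in Python, assuming the log was saved as plain text; the name `recovery_levels` is illustrative and not part of any existing tool.

```python
import re

# Matches lines such as:
# [ 8.935037] i40e 0000:05:00.0 p1p1: tx_timeout recovery level 0
PATTERN = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+i40e\s+(?P<pci>\S+)\s+(?P<ifname>\S+):\s+"
    r"tx_timeout recovery level (?P<level>\d+)"
)

def recovery_levels(dmesg_text):
    """Return (timestamp, interface, level) for each tx_timeout recovery line."""
    events = []
    for line in dmesg_text.splitlines():
        m = PATTERN.search(line)
        if m:
            events.append(
                (float(m.group("ts")), m.group("ifname"), int(m.group("level")))
            )
    return events

# Sample lines copied from the log in this report.
sample = """\
[ 8.935037] i40e 0000:05:00.0 p1p1: tx_timeout recovery level 0
[ 14.856701] i40e 0000:05:00.0 p1p1: tx_timeout recovery level 1
[ 24.928762] i40e 0000:05:00.0 p1p1: tx_timeout recovery level 2
"""

for ts, ifname, level in recovery_levels(sample):
    print(f"{ts:10.6f}  {ifname}  level {level}")
```

A rising level on the same interface within a few seconds, as seen here, suggests that each reset did not clear the hang.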

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1482304

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: trusty
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.2 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.2-rc5-unstable/

Changed in linux (Ubuntu):
importance: Undecided → High
Antony Messerli (antonym) wrote :

Using 4.2.0-040200rc5-generic which is running 1.3.4-k, I did not experience any kernel panics during the boot up process.

tags: added: kernel-fixed-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Tim Gardner (timg-tpi) wrote :

Have you tried linux-generic-lts-vivid? The i40e driver in that kernel is at 1.2.2. We won't have a 4.2-based kernel in the archive until 15.10 is released.

Andrew (andrewx-bowers) wrote :
Antony Messerli (antonym) wrote :

So I ended up upgrading the NVM to the latest version (f4.33.31377 a1.2 n4.42 e191b):

https://downloadcenter.intel.com/download/24769/NVM-Update-Utility-for-Intel-Ethernet-Converged-Network-Adapter-XL710-X710-Series

The 0.3.36-k driver now refuses to work at all and logs these messages to dmesg (there is no random kernel panic with the latest NVM; the card just doesn't work at all):

[ 1.602924] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 0.3.36-k
[ 1.602926] i40e: Copyright (c) 2013 - 2014 Intel Corporation.
[ 1.613938] i40e 0000:05:00.0: f4.33 a1.2 n04.42 e8000191b
[ 1.613940] i40e 0000:05:00.0: init_adminq failed: -65 expecting API 01.01
[ 1.613999] i40e: probe of 0000:05:00.0 failed with error -65
[ 1.634901] i40e 0000:05:00.1: Initial pf_reset failed: -15
[ 1.634962] i40e: probe of 0000:05:00.1 failed with error -15
[ 1.646010] i40e 0000:88:00.0: f4.33 a1.2 n04.42 e8000191b
[ 1.646013] i40e 0000:88:00.0: init_adminq failed: -65 expecting API 01.01
[ 1.646079] i40e: probe of 0000:88:00.0 failed with error -65
[ 1.666925] i40e 0000:88:00.1: Initial pf_reset failed: -15
[ 1.667004] i40e: probe of 0000:88:00.1 failed with error -15
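
The failure above pairs two values: the version line reports `a1.2`, while the 0.3.36-k driver says it is expecting API 01.01. A minimal Python sketch that reads both values from such a log follows. Treating the `a` field as the admin queue API version is an assumption based on how these two lines line up, the digits are parsed as plain decimal integers, and the helper name `api_versions` is made up for illustration.

```python
import re

# Version line, e.g. "f4.33 a1.2 n04.42 e8000191b" or "f4.40.35115 a1.4 n4.53 e1dc1".
# Assumption: the a<major>.<minor> field is the API version the adapter reports.
REPORTED = re.compile(r"\bf\d+\.\d+\S*\s+a(\d+)\.(\d+)\s+n")
# Failure line, e.g. "init_adminq failed: -65 expecting API 01.01"
EXPECTED = re.compile(r"expecting API (\d+)\.(\d+)")

def api_versions(dmesg_text):
    """Return (reported, expected) as (major, minor) tuples; None if not found."""
    reported = expected = None
    for line in dmesg_text.splitlines():
        m = REPORTED.search(line)
        if m and reported is None:
            reported = (int(m.group(1)), int(m.group(2)))
        m = EXPECTED.search(line)
        if m and expected is None:
            expected = (int(m.group(1)), int(m.group(2)))
    return reported, expected

# Sample lines copied from the dmesg output in this comment.
sample = """\
[ 1.613938] i40e 0000:05:00.0: f4.33 a1.2 n04.42 e8000191b
[ 1.613940] i40e 0000:05:00.0: init_adminq failed: -65 expecting API 01.01
"""

reported, expected = api_versions(sample)
print("reported:", reported, "expected:", expected, "match:", reported == expected)
```

Only the first occurrence of each value is kept, so on a multi-port host this reports the first adapter that logged a version line.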

I then installed linux-generic-lts-vivid and that works without issue:

root@ubuntu:~# ethtool -i p1p2
driver: i40e
version: 1.2.2-k
firmware-version: f4.33 a1.2 n04.42 e8000191b
bus-info: 0000:05:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
root@ubuntu:~# uname -a
Linux ubuntu 3.19.0-26-generic #28~14.04.1-Ubuntu SMP Wed Aug 12 14:09:17 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
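
To confirm which driver a host actually loaded, the `ethtool -i` output shown above can be parsed and the version compared numerically. This is a minimal sketch in Python; the 1.2.2 threshold is simply the version reported as working in this comment, not an official minimum, and the function names are illustrative.

```python
import re

def parse_ethtool_info(text):
    """Turn the 'key: value' lines printed by `ethtool -i` into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values like bus-info keep theirs.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info

def version_tuple(version):
    """'1.2.2-k' -> (1, 2, 2); anything after the numeric prefix is ignored."""
    m = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(part) for part in m.group(0).split(".")) if m else ()

# Sample lines copied from the ethtool output in this comment.
sample = """\
driver: i40e
version: 1.2.2-k
firmware-version: f4.33 a1.2 n04.42 e8000191b
bus-info: 0000:05:00.1
"""

info = parse_ethtool_info(sample)
ok = info["driver"] == "i40e" and version_tuple(info["version"]) >= (1, 2, 2)
print(info["version"], "meets the threshold" if ok else "is older than the threshold")
```

On a live system the text could come from `subprocess.run(["ethtool", "-i", "p1p2"], capture_output=True, text=True).stdout` instead of the pasted sample.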

Antony Messerli (antonym) wrote :

Is version 1.2.2 of i40e something that could be backported to Trusty?

I had also opened https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1476393 to add the driver to the installer, but I think the built-in driver (0.3.36-k) is so old that it could cause issues when installing on that card.

Tim Gardner (timg-tpi) wrote :

Antony - issues of this sort are meant to be solved with a backported kernel, e.g., linux-generic-lts-vivid. The next Trusty LTS ISO (14.04.3) will install this kernel by default.

Andreas Schröder (andreas-schroeder-s) wrote :

 I have a Supermicro AOC-STG-i4S quad port card (ftp://ftp.supermicro.com/Networking_Drivers/CDR-NIC_1.41_for_Add-on_NIC_Cards/MANUALS/datasheet-AOC-STG-i4S.pdf) which is also based on the Intel XL710 chip.

With the standard Trusty 3.13 kernel and the included i40e module version 0.3.36-k, the card doesn't work at all. Interfaces are not created because of a firmware API mismatch.
With the LTS Vivid 3.19 kernel and the included i40e module version 1.2.2-k, the card is detected and the interfaces are created, but bonding doesn't work properly. This seems to be this bug: http://sourceforge.net/p/e1000/bugs/475/ .

I had to manually install the driver version 1.3.39.1. Bonding works now. But the driver complains about the 'NVM image' (part of firmware) being too old. So I had to update the firmware as well.

A recent driver in the backported LTS kernels would be very important for many users, as the existing older versions don't seem to be mature enough.

Tim Gardner (timg-tpi) wrote :

Watch for a 4.2 based kernel (linux-lts-wily) when 15.10 is released.

Andreas Schröder (andreas-schroeder-s) wrote :

I switched to the linux-image-generic-lts-wily (4.2.0-23-generic) kernel. Everything seems to work for the moment, but now the driver complains about the firmware being too new:

i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 1.3.4-k
i40e: Copyright (c) 2013 - 2014 Intel Corporation.
i40e 0000:41:00.0: f4.40.35115 a1.4 n4.53 e1dc1
i40e 0000:41:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
i40e 0000:41:00.0: FCoE capability is disabled
i40e 0000:41:00.0: MAC address: 00:11:22:33:44:55
i40e 0000:41:00.0: SAN MAC: 00:11:22:33:44:55
i40e 0000:41:00.0: fcoe queues = 0
i40e 0000:41:00.0: enabling bridge mode: VEPA
i40e 0000:41:00.0: i40e_ptp_init: added PHC on eth2
i40e 0000:41:00.0: PCI-Express: Speed 8.0GT/s Width x8
i40e 0000:41:00.0: Features: PF-id[0] VFs: 32 VSIs: 34 QP: 24 RX: 1BUF RSS FD_ATR FD_SB NTUPLE DCB PTP

Bjoern (bjoern-t) wrote :

FYI, this combination seems to work now:

Kernel 3.13.0-96-generic
i40e NIC Driver 1.4.25
NVM Firmware package 5.04

I have not done any extensive testing yet.

Bjoern (bjoern-t)
Changed in linux (Ubuntu):
status: Confirmed → Fix Released