i40e: setting some ethtool features and removing the module right after (modprobe -r) leads to a crash on systems with more than 127 CPUs

Bug #1626782 reported by bugproxy
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | Fix Released | High | Canonical Kernel Team |
Xenial | Fix Released | Undecided | Tim Gardner |
Yakkety | Fix Released | High | Canonical Kernel Team |

Bug Description

On systems with more than 127 online CPUs, the i40e driver allocates all of its interrupt vectors to tx/rx queues, so some features, like Flow Director SideBand (FD_SB), are kept disabled.

Changing some features like TSO/GSO via ethtool then forcibly enables FD_SB even though no vectors are available for it, which produces errors. If we unload the i40e module (using "modprobe -r") after these errors, the system crashes.
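For reference, the sequence above can be sketched as a small helper. This is only an illustration under assumptions: the report does not give the exact command lines, so the interface name "eth0" and the choice of turning TSO off are hypothetical examples, and the helper names are made up here. The script only prints the commands and does not run them.

```python
import os

# Threshold taken from the report: the problem shows up with more than 127 CPUs.
CPU_THRESHOLD = 127


def is_affected_cpu_count(online_cpus):
    """Return True if the CPU count is above the threshold named in the report."""
    return online_cpus > CPU_THRESHOLD


def repro_commands(iface, feature="tso"):
    """Build (but do not run) the command sequence described in the report.

    `iface` and `feature` are illustrative assumptions; the report only says
    "some features like TSO/GSO".
    """
    return [
        ["ethtool", "-K", iface, feature, "off"],  # change an offload feature
        ["modprobe", "-r", "i40e"],                # unload the driver right after
    ]


if __name__ == "__main__":
    # os.cpu_count() may return None; treat that as "unknown" (0).
    cpus = os.cpu_count() or 0
    print("CPUs reported:", cpus, "affected range:", is_affected_cpu_count(cpus))
    for cmd in repro_commands("eth0"):  # "eth0" is a placeholder interface name
        print(" ".join(cmd))
```

On an unpatched kernel the second command is the one expected to trigger the crash, so the printed commands should only be run on a test machine.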

There is an upstream patch that addresses exactly this issue. We would like to ask Canonical to merge it into the Ubuntu 16.04.1 kernel if possible.

The patch is: a70e407f6d ("i40e: Fix errors resulted while turning off TSO") [https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a70e407f6d0b0f63b17d468f78b666d33f264ba1]

Thanks in advance,

Guilherme

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-146656 severity-high targetmilestone-inin16041
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
affects: ubuntu → linux (Ubuntu)
Changed in linux (Ubuntu):
assignee: Taco Screen team (taco-screen-team) → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → High
status: New → Triaged
Tim Gardner (timg-tpi) wrote :
Changed in linux (Ubuntu Yakkety):
status: Triaged → Fix Released
Changed in linux (Ubuntu Xenial):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Seth Forshee (sforshee) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-10-20 09:39 EDT-------
Working fine with proposed kernel 4.4.0-44-generic.

Note that proposed kernel 4.4.0-45-generic seems to *not include* the fix: I tested with this version and was able to reproduce the issue.

Thanks,

Guilherme

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-47.68

---------------
linux (4.4.0-47.68) xenial; urgency=low

  [ Kamal Mostafa ]

  * Release Tracking Bug
    - LP: #1636941

  * Add a driver for Amazon Elastic Network Adapters (ENA) (LP: #1635721)
    - lib/bitmap.c: conversion routines to/from u32 array
    - net: ethtool: add new ETHTOOL_xLINKSETTINGS API
    - net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)
    - [config] enable CONFIG_ENA_ETHERNET=m (Amazon ENA driver)

  * unexpectedly large memory usage of mounted snaps (LP: #1636847)
    - [Config] switch squashfs to single threaded decode

 -- Kamal Mostafa <email address hidden> Wed, 26 Oct 2016 10:47:55 -0700

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-11-09 16:40 EDT-------
Perfect, thanks!

So, let's close this bug.
