i40e bug: non physical MAC outbound frames appear as copied back inbound (mirrored)

Bug #1497812 reported by JuanJo Ciarlante
This bug affects 3 people
Affects                     Status      Importance  Assigned to  Milestone
linux (Ubuntu)              Incomplete  High        Unassigned
  Vivid                     Won't Fix   High        Unassigned
linux-lts-vivid (Ubuntu)    Confirmed   High        Unassigned
  Vivid                     Won't Fix   High        Unassigned

Bug Description

Using 3.19.0-28-generic #30~14.04.1-Ubuntu with the stock i40e
driver, version 1.2.2-k, every outbound frame with a 'non physical'
MAC appears copied back at input, as if the switch were
doing frame 'mirroring' (and/or hair-pinning).

FYI the same setup, with i40e upgraded to 1.2.48 from
http://downloadmirror.intel.com/25282/eng/i40e-1.2.48.tar.gz
behaves OK. We also set up port mirroring at the switch,
directed to a different physical port for debugging,
and didn't observe these frames to be physically present.

See tcpdump -P in/out and more details at
http://paste.ubuntu.com/12511680/
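
The symptom described above can be checked mechanically against a capture. The following is only a sketch, assuming the frames have already been exported as (direction, source MAC, payload) records, for example from separate `tcpdump -P in` and `tcpdump -P out` runs. The record layout and the names `Frame` and `find_mirrored` are hypothetical and not part of any tool mentioned in this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    direction: str   # "in" or "out", as selected by tcpdump -P
    src_mac: str     # source MAC address of the frame
    payload: bytes   # frame contents, used to match copies


def find_mirrored(frames, physical_macs):
    """Return inbound frames that look like copies of earlier outbound ones.

    A frame is flagged when it arrives inbound, its source MAC is not one of
    the NIC's physical MACs, and an identical frame was sent outbound earlier.
    """
    physical = {mac.lower() for mac in physical_macs}
    sent = set()
    mirrored = []
    for frame in frames:
        key = (frame.src_mac.lower(), frame.payload)
        if frame.direction == "out":
            sent.add(key)
        elif frame.direction == "in" and key in sent and key[0] not in physical:
            mirrored.append(frame)
    return mirrored
```

On a healthy setup the returned list would be expected to stay empty for container or VM MACs; any entries are candidates for the mirrored frames reported here.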

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.19.0-28-generic 3.19.0-28.30~14.04.1
ProcVersionSignature: Ubuntu 3.19.0-28.30~14.04.1-generic 3.19.8-ckt5
Uname: Linux 3.19.0-28-generic x86_64
ApportVersion: 2.14.1-0ubuntu3.13
Architecture: amd64
Date: Mon Sep 21 02:05:28 2015
ProcEnviron:
 TERM=screen
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-lts-vivid
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
JuanJo Ciarlante (jjo) wrote :

FYI we found these issues while deploying openstack via juju/maas
over a pool of 8 nodes having 4x i40e NICs, where we also found
linux-hwe-generic-trusty (lts-utopic) to be unreliable from its old
i40e driver (0.4.10-k).

Below is a summary of our i40e findings using lts-vivid and lts-utopic
re: successful completed deploys:

#1 3.19.0-28-generic w/stock 1.2.2-k: non-phy mirrored frames (this bug)
#2 3.16.0-49-generic w/stock 0.4.10-k: unreliable deploys
#3 3.19.0-28-generic w/built 2.2.48: OK
#4 3.16.0-49-generic w/built 2.2.48: OK

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-lts-vivid (Ubuntu):
status: New → Confirmed
Revision history for this message
JuanJo Ciarlante (jjo) wrote :

ERRATA on comment #2 : OK i40e driver version is 1.2.48,
as per original report URL.

Comment #2 table is actually:

#1 3.19.0-28-generic w/stock 1.2.2-k: non-phy mirrored frames (this bug)
#2 3.16.0-49-generic w/stock 0.4.10-k: unreliable deploys
#3 3.19.0-28-generic w/built 1.2.48: OK (*)
#4 3.16.0-49-generic w/built 1.2.48: OK (*)

(*) corrected to be 1.2.48
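
Since the version strings are easy to mix up, a small helper can make comparisons explicit. This is a minimal sketch, assuming versions follow the dotted-number form seen in this report, with an optional suffix such as `-k` (which here appears only on the stock builds). The function name `parse_driver_version` is hypothetical.

```python
import re


def parse_driver_version(version):
    """Split an i40e version string such as '1.2.2-k' into numbers and suffix."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(?:-(\w+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized driver version: {version!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    return numbers, match.group(2)


# Versions quoted in this report.
for text in ("0.4.10-k", "1.2.2-k", "1.2.48"):
    print(text, parse_driver_version(text))

# Compare numerically rather than as text, so that e.g. 10 sorts after 9.
assert parse_driver_version("1.2.48")[0] > parse_driver_version("1.2.2-k")[0]
```

Comparing the numeric tuples only orders the releases; it says nothing about which release contains a given fix.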

Changed in linux-lts-vivid (Ubuntu):
importance: Undecided → High
tags: added: kernel-key
Changed in linux (Ubuntu):
importance: Undecided → High
status: New → Triaged
Revision history for this message
Jay Vosburgh (jvosburgh) wrote :

Just looking at the log, it might be this:

commit fa11cb3d16a9b9b296a2b811a49faf1356240348
Author: Anjali Singhai Jain <email address hidden>
Date: Wed May 27 12:06:14 2015 -0400

    i40e: Make sure to be in VEB mode if SRIOV is enabled at probe

    If SRIOV is enabled we need to be in VEB mode not VEPA mode at probe.
    This fixes an NPAR bug when SRIOV is enabled in the BIOS.

    Change-ID: Ibf006abafd9a0ca3698ec24848cd771cf345cbbc
    Signed-off-by: Anjali Singhai Jain <email address hidden>
    Tested-by: Jim Young <email address hidden>
    Signed-off-by: Jeff Kirsher <email address hidden>

Revision history for this message
JuanJo Ciarlante (jjo) wrote :

Confirming _not_ observing reported issue on an
equivalent setup w/ LXCs frames hitting phy interfaces
( bridged towards br0 -> bond0 -> {eth3, eth4} ):

* linux 4.2.0-12-generic #14~14.04.1-Ubuntu (from canonical-kernel-team/ppa)
* i40e version 1.3.4-k

# ethtool -i eth3
driver: i40e
version: 1.3.4-k
firmware-version: f4.33.31377 a1.2 n4.41 e1863
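
When comparing several nodes and NICs it can help to collect this output programmatically. The sketch below assumes the `key: value` layout shown above and parses it into a dictionary; the function name `parse_ethtool_info` is hypothetical, and the sample text is copied from the output above rather than captured live.

```python
def parse_ethtool_info(text):
    """Parse 'key: value' lines as printed by `ethtool -i` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            info[key.strip()] = value.strip()
    return info


sample = """\
driver: i40e
version: 1.3.4-k
firmware-version: f4.33.31377 a1.2 n4.41 e1863
"""
info = parse_ethtool_info(sample)
print(info["driver"], info["version"])
```

In practice the text would come from running `ethtool -i <interface>` on each host, which is left out here to keep the example self-contained.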

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I built a Vivid test kernel with a cherry pick of fa11cb3d. This commit also required the following two commits as prerequisites:
5161601
fc60861

Can you test this kernel and see if it resolves this bug? It can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1497812/

Thanks in advance!

Revision history for this message
JuanJo Ciarlante (jjo) wrote :

w000T! \o/ using @jsalisbury kernel from comment#7
3.19.0-30-generic #33~lp1497812 ,
I can't reproduce the failing behavior under same host + setup
- no mirrored frames or alike dmesg
- containers networking ok

Comparison between stock vivid
3.19.0-30-generic #33~14.04.1-Ubuntu and above:
- http://paste.ubuntu.com/12627042/

Changed in linux (Ubuntu):
status: Triaged → In Progress
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Vivid):
status: New → In Progress
importance: Undecided → High
Changed in linux-lts-vivid (Ubuntu Vivid):
importance: Undecided → High
status: New → Confirmed
Changed in linux (Ubuntu Vivid):
assignee: nobody → Joseph Salisbury (jsalisbury)
tags: removed: kernel-key
Revision history for this message
Mick Gregg (macgreagoir) wrote :

Using the @jsalisbury kernel from comment#7 (same site as @jjo comment#8) I've just seen the issue reproduced, but only after some hours of stability.

We have seen this similarly with i40e-1.2.48, in which cases the module had to be reloaded (or the machine rebooted) to recover container networking.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

@jjo, did you also see this bug come back using the test kernel?

Changed in linux (Ubuntu Vivid):
status: In Progress → Incomplete
Changed in linux (Ubuntu):
status: In Progress → Incomplete
Changed in linux (Ubuntu Vivid):
assignee: Joseph Salisbury (jsalisbury) → nobody
Changed in linux (Ubuntu):
assignee: Joseph Salisbury (jsalisbury) → nobody
tags: added: kernel-da-key
Revision history for this message
Andy Whitcroft (apw) wrote : Closing unsupported series nomination.

This bug was nominated against a series that is no longer supported, i.e. vivid. The bug task representing the vivid nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Vivid):
status: Incomplete → Won't Fix
Andy Whitcroft (apw)
Changed in linux-lts-vivid (Ubuntu Vivid):
status: Confirmed → Won't Fix