bond: An illegal loopback occurred on slave

Bug #1851819 reported by James Page
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone:

Bug Description

Mellanox ConnectX-5 adapter configured with LACP bonding; the adapter is in switchdev (offload) mode with VFs configured on both ports.

Kernel continually reports:

Nov 8 10:57:51 node-laveran kernel: [ 2087.912308] bond1: (slave enp3s0f0): An illegal loopback occurred on slave
Nov 8 10:57:51 node-laveran kernel: [ 2087.912308] Check the configuration to verify that all adapters are connected to 802.3ad compliant switch ports
Nov 8 10:57:51 node-laveran kernel: [ 2087.917329] bond1: (slave enp3s0f1): An illegal loopback occurred on slave
Nov 8 10:57:51 node-laveran kernel: [ 2087.917329] Check the configuration to verify that all adapters are connected to 802.3ad compliant switch ports
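
To see how often each slave triggers the message, the log lines above can be tallied per bond and slave. The following is a minimal sketch, not part of the original report: the function name `count_loopback_events` and the regular expression are illustrative assumptions based only on the message format quoted here.

```python
import re
from collections import Counter

# Matches the bonding message quoted in this report, for example:
#   bond1: (slave enp3s0f0): An illegal loopback occurred on slave
# The pattern is an assumption derived from those sample lines only.
LOOPBACK_RE = re.compile(
    r"(?P<bond>[^\s:]+): \(slave (?P<slave>[^\s)]+)\): "
    r"An illegal loopback occurred on slave"
)


def count_loopback_events(lines):
    """Return a Counter keyed by (bond, slave) for every matching log line."""
    counts = Counter()
    for line in lines:
        match = LOOPBACK_RE.search(line)
        if match:
            counts[(match.group("bond"), match.group("slave"))] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to `count_loopback_events` gives one count per bond and slave pair, which makes it easy to check whether both ports are affected equally.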

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-generic-hwe-18.04-edge 5.3.0.19.85
ProcVersionSignature: Ubuntu 5.3.0-19.20~18.04.2-generic 5.3.1
Uname: Linux 5.3.0-19-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.9
Architecture: amd64
Date: Fri Nov 8 10:54:32 2019
ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-meta-hwe-edge
UpgradeStatus: No upgrade log present (probably fresh install)
---
ProblemType: Bug
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Nov 19 16:45 seq
 crw-rw---- 1 root audio 116, 33 Nov 19 16:45 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.9
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 18.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 002 Device 002: ID 8087:8002 Intel Corp.
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 001 Device 003: ID 413c:a001 Dell Computer Corp. Hub
 Bus 001 Device 002: ID 8087:800a Intel Corp.
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Dell Inc. PowerEdge R630
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB: 0 mgag200drmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.3.0-23-generic root=UUID=2ff5e234-ee62-4bce-8266-cd9aa78c532f ro intel_iommu=on iommu=pt probe_vf=0
ProcVersionSignature: Ubuntu 5.3.0-23.25~18.04.1-generic 5.3.7
RelatedPackageVersions:
 linux-restricted-modules-5.3.0-23-generic N/A
 linux-backports-modules-5.3.0-23-generic N/A
 linux-firmware 1.173.12
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
Tags: bionic uec-images
Uname: Linux 5.3.0-23-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 11/08/2016
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.3.4
dmi.board.name: 02C2CP
dmi.board.vendor: Dell Inc.
dmi.board.version: A03
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr2.3.4:bd11/08/2016:svnDellInc.:pnPowerEdgeR630:pvr:rvnDellInc.:rn02C2CP:rvrA03:cvnDellInc.:ct23:cvr:
dmi.product.name: PowerEdge R630
dmi.product.sku: SKU=NotProvided;ModelName=PowerEdge R630
dmi.sys.vendor: Dell Inc.

James Page (james-page) wrote :
James Page (james-page)
affects: linux-meta-hwe-edge (Ubuntu) → linux (Ubuntu)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1851819

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
James Page (james-page) wrote : CRDA.txt

apport information

tags: added: apport-collected
description: updated
James Page (james-page) wrote : CurrentDmesg.txt

apport information

James Page (james-page) wrote : Lspci.txt

apport information

James Page (james-page) wrote : ProcCpuinfo.txt

apport information

James Page (james-page) wrote : ProcCpuinfoMinimal.txt

apport information

James Page (james-page) wrote : ProcInterrupts.txt

apport information

James Page (james-page) wrote : ProcModules.txt

apport information

James Page (james-page) wrote : UdevDb.txt

apport information

James Page (james-page) wrote : WifiSyslog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → New
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
James Page (james-page)
tags: added: hwoffload
James Page (james-page) wrote :

Did a bit more testing: the error message is only seen when the card is in offload mode; when it's in legacy mode (the default), the error message is not seen.

James Page (james-page) wrote :

With the patch at https://patchwork.ozlabs.org/patch/1224104/ and the secondary port link set to down, I get reliable network connections to instances with offloaded ports.

Upstream developers pointed me to a commit that might resolve this.

James Page (james-page) wrote :

See bug 1852077; however, that is marked Fix Released, so I think something else is at fault.

Mohammed Naser (mnaser) wrote :

FYI, I was being hit by this and the issue disappeared once I updated the firmware on those NICs to 14.31.1014. The problematic firmware version I had was 14.26.1040. Not only did I see those errors, but I actually saw traffic being dropped.
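
For anyone checking several hosts, the two firmware versions quoted above can be compared programmatically. This is a rough sketch under stated assumptions: the helper names are hypothetical, 14.31.1014 is only the version one commenter reported as working (not a confirmed minimum), and the version string is assumed to be the dotted value that `ethtool -i <interface>` prints in its `firmware-version` field, possibly followed by extra text.

```python
def parse_firmware_version(text):
    """Convert a dotted version such as '14.31.1014' into a tuple of ints.

    Only the first whitespace-separated token is used, so trailing text
    after the version number is ignored.
    """
    dotted = text.strip().split()[0]
    return tuple(int(part) for part in dotted.split("."))


# Versions taken from the comment above; treating the newer one as a
# threshold is an assumption based on that single report.
PROBLEMATIC = parse_firmware_version("14.26.1040")
REPORTED_GOOD = parse_firmware_version("14.31.1014")


def is_older_than_reported_good(version_text):
    """Return True if the given firmware is older than the reported-good one."""
    return parse_firmware_version(version_text) < REPORTED_GOOD
```

Comparing tuples of integers avoids the pitfalls of comparing version strings lexically, where "14.9" would sort after "14.31".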
