Problems on Fibre Channel over Ethernet on Ubuntu 22.04.4
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | New | Undecided | Kleber Sacilotto de Souza | | |
Bug Description
Since upgrading my servers to Ubuntu 22.04.4, I have been having trouble getting my FCoE setup to work correctly.
Firstly, some context. We have an HPE Synergy appliance with a bunch of Synergy 480 Gen10 servers still running Ubuntu 20.04.
Each 480 server has 4 QLogic BCM57840 NetXtreme II Ethernet Multi Function network adapters with two of them dedicated to storage communication through FCoE.
Recently, I upgraded one of them to Ubuntu 22.04 and realized that after the upgrade all the Fibre Channel LUNs had disappeared. During diagnosis I noticed that both the network (bnx2x) and FCoE (fcoe, bnx2fc) kernel drivers had loaded properly and that the FCoE VLANs were correctly detected by fcoe-utils, but there was no connection between the server and the storage, and the server could not detect any LUN.
I also noticed that fcoeadm was reporting the following status:
# fcoeadm -i
Description: BCM57840 NetXtreme II Ethernet Multi Function
Revision: 11
Manufacturer: Broadcom Inc. and subsidiaries
Serial Number: 9440C953B7F0
Driver: bnx2x Unknown
Number of Ports: 1
Symbolic Name: bnx2fc (QLogic BCM57840) v2.12.13 over ens3f2.4093-fco
OS Device Name: host2
Node Name: 0x10000a3d27200005
Port Name: 0x10000a3d27200004
Fabric Name: 0x0
Speed: 20 Gbit
Supported Speed: 1 Gbit, 10 Gbit
FC-ID (Port ID): 0xffffffff
State: Offline
Description: BCM57840 NetXtreme II Ethernet Multi Function
Revision: 11
Manufacturer: Broadcom Inc. and subsidiaries
Serial Number: 9440C953B7F0
Driver: bnx2x Unknown
Number of Ports: 1
Symbolic Name: bnx2fc (QLogic BCM57840) v2.12.13 over ens3f3.4094-fco
OS Device Name: host3
Node Name: 0x10000a3d27200007
Port Name: 0x10000a3d27200006
Fabric Name: 0x0
Speed: 20 Gbit
Supported Speed: 1 Gbit, 10 Gbit
FC-ID (Port ID): 0xffffffff
State: Offline
Although most of the information displayed was correct, the port state was always Offline (which explains the lack of storage connectivity) and the FC-ID reported was 0xffffffff for both interfaces.
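For anyone who wants to check several hosts for the same symptom, here is a minimal Python sketch that scans `fcoeadm -i` style text for ports that are Offline or still have the 0xffffffff FC-ID. It assumes each port block starts with a `Description:` line, as in the output pasted above; the function names are my own, not part of fcoe-utils.

```python
UNASSIGNED_FC_ID = "0xffffffff"  # value reported for both ports in this bug

# Excerpt of the output quoted above (only the fields the check needs).
SAMPLE = """\
Description: BCM57840 NetXtreme II Ethernet Multi Function
OS Device Name: host2
FC-ID (Port ID): 0xffffffff
State: Offline
Description: BCM57840 NetXtreme II Ethernet Multi Function
OS Device Name: host3
FC-ID (Port ID): 0xffffffff
State: Offline
"""


def parse_fcoeadm(text):
    """Split `fcoeadm -i` style text into one dict per port block."""
    ports, current = [], {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        # Assumption: a "Description" line opens each new block.
        if key == "Description" and current:
            ports.append(current)
            current = {}
        current[key] = value
    if current:
        ports.append(current)
    return ports


def failed_ports(ports):
    """Return OS device names of ports that are Offline or lack an FC-ID."""
    failed = []
    for port in ports:
        offline = port.get("State") == "Offline"
        no_fc_id = port.get("FC-ID (Port ID)", "").lower() == UNASSIGNED_FC_ID
        if offline or no_fc_id:
            failed.append(port.get("OS Device Name", "unknown"))
    return failed


print(failed_ports(parse_fcoeadm(SAMPLE)))
```

On a live host the text could instead come from `subprocess.run(["fcoeadm", "-i"], capture_output=True, text=True).stdout`, run as root.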
Supposing that something had gone wrong during the upgrade process, I decided to do a fresh install of the server and, to my surprise, the FCoE ports went online and all LUNs were detected correctly.
Unfortunately, the ports went offline again after a dist-upgrade.
Trying to understand what was happening, I performed another fresh install of Ubuntu 22.04, this time collecting kernel and package versions before and after the dist-upgrade.
What I discovered was that all related packages (fcoe-utils, lldpad) had the same versions before and after; the only exception was the kernel itself. Before the dist-upgrade the kernel in use was 5.15.0-25-generic, and after it the kernel was 5.15.0-100-generic.
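The before/after comparison can be reproduced with something like the following sketch. It assumes two snapshots taken with `dpkg-query -W`, which prints one `name<TAB>version` pair per line. The version strings below are placeholders, not the real versions from my server; only the package names and kernel ABI numbers come from this report.

```python
def parse_snapshot(text):
    """Parse `dpkg-query -W` style lines (name, whitespace, version) into a dict."""
    versions = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:  # skip blank lines and entries without a version
            versions[fields[0]] = fields[1]
    return versions


def diff_snapshots(before, after):
    """Return {package: (old_version, new_version)} for every difference."""
    changes = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


# Placeholder snapshots: version strings are hypothetical.
BEFORE = (
    "fcoe-utils\tplaceholder-a\n"
    "lldpad\tplaceholder-b\n"
    "linux-image-5.15.0-25-generic\tplaceholder-c\n"
)
AFTER = BEFORE + "linux-image-5.15.0-100-generic\tplaceholder-d\n"

for name, (old, new) in diff_snapshots(
    parse_snapshot(BEFORE), parse_snapshot(AFTER)
).items():
    print(f"{name}: {old} -> {new}")
```

With these placeholder snapshots, only the newly installed kernel image shows up as a difference, which mirrors what I saw on the real system.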
Suspecting a kernel problem, I installed and booted different kernel versions between 5.15.0-25 and 5.15.0-100 and found that the problem was probably introduced with kernel 5.15.0-94, since it is the first version showing the described behavior.
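The search over kernel versions amounts to a bisection. A rough sketch of the idea follows, assuming that once the regression appears every later kernel also shows it. The candidate list holds only the ABI numbers mentioned in this report, and `shows_regression` is a stand-in for the manual step of booting a kernel and checking whether the FCoE ports stay Offline.

```python
def first_bad(candidates, is_bad):
    """Binary search for the first candidate where is_bad() is true.

    Assumes candidates are sorted and that every candidate after the
    first bad one is also bad. Returns None if no candidate is bad.
    """
    lo, hi = 0, len(candidates)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid          # first bad one is at mid or earlier
        else:
            lo = mid + 1      # first bad one is after mid
    return candidates[lo] if lo < len(candidates) else None


# ABI numbers of 5.15.0-N kernels named in this report (not a full list).
CANDIDATES = [25, 91, 94, 100]


def shows_regression(abi):
    # Stand-in for "boot 5.15.0-<abi> and check the port state";
    # the threshold encodes the result reported here.
    return abi >= 94


print(first_bad(CANDIDATES, shows_regression))
```

Each call to the predicate costs one reboot in practice, so halving the range at every step keeps the number of test boots small.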
Looking at the changelog of the linux-modules package, I noticed some changes to fcoe and other SCSI modules right after the 5.15.0-91 version.
Could those changes be the cause of the behavior observed on my server, or is it just a coincidence? Is there a fix for it?
Thanks in advance.
Anderson
ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-
ProcVersionSign
Uname: Linux 5.15.0-94-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Mar 14 11:03 seq
crw-rw---- 1 root audio 116, 33 Mar 14 11:03 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu82.5
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CasperMD5CheckR
Date: Thu Mar 14 14:57:14 2024
HibernationDevice: RESUME=
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 002: ID 0bda:0329 Realtek Semiconductor Corp. USB3.0-CRW
Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 002 Device 002: ID 0424:2660 Microchip Technology, Inc. (formerly SMSC) Hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Lsusb-t:
/: Bus 03.Port 1: Dev 1, Class=root_hub, Driver=
|__ Port 4: Dev 2, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=
|__ Port 3: Dev 2, If 0, Class=Hub, Driver=hub/2p, 480M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/8p, 480M
MachineType: HPE Synergy 480 Gen10
PciMultimedia:
ProcFB: 0 mgag200drmfb
ProcKernelCmdLine: BOOT_IMAGE=
RelatedPackageV
linux-
linux-
linux-firmware 20220329.
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: Upgraded to jammy on 2024-03-12 (2 days ago)
dmi.bios.date: 10/26/2020
dmi.bios.release: 2.40
dmi.bios.vendor: HPE
dmi.bios.version: I42
dmi.board.name: Synergy 480 Gen10 Compute Module
dmi.board.vendor: HPE
dmi.board.version: 10 A0
dmi.chassis.type: 28
dmi.chassis.vendor: HPE
dmi.ec.
dmi.modalias: dmi:bvnHPE:
dmi.product.family: Synergy
dmi.product.name: Synergy 480 Gen10
dmi.product.sku: 871940-B21
dmi.sys.vendor: HPE
Changed in linux (Ubuntu): | |
assignee: | nobody → Kleber Sacilotto de Souza (kleber-souza) |
Hello Anderson,
Thank you for reporting the issue. We have a new kernel for 22.04 (version 5.15.0-102.112) that is due to be released in the next few days. Could you please test again with this kernel and report the results? You can either enable -proposed on your system and update the kernel, or wait until it gets promoted to -updates, at which point a simple dist-upgrade will update the kernel.