mlxbf-gige: Vitesse PHY stuck in a bad state during reboot test

Bug #2064163 reported by Asmaa Mnebhi
This bug affects 1 person
Affects                     Status         Importance  Assigned to  Milestone
linux-bluefield (Ubuntu)    In Progress    Undecided   Unassigned
  Focal                     Fix Released   Undecided   Unassigned
  Jammy                     Fix Committed  Undecided   Unassigned

Bug Description

SRU Justification:

[Impact]

During the QA reboot test, the BF3 Vitesse PHY gets stuck in a bad state, so the OOB port is never provisioned with an IP address. The only way to recover is a power cycle.
We found a software workaround that avoids getting into this state in the first place: disable the OOB port in the driver's shutdown function.

[Fix]

* Prevent the PHY from entering this bad state by disabling the OOB port
  during shutdown.

[Test Case]

* Run the reboot test (at least 2000 reboots): run 'reboot' from Linux.
* Check that the oob_net0 interface is up and that an IP address is assigned.
* Note: if the OOB port doesn't get an IP, try reloading the driver (rmmod/modprobe). If that solves the issue, it is a different bug. In the bug at stake, nothing recovers the OOB IP except a power cycle.
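
The per-reboot check above could be scripted roughly as follows. This is only a sketch, not the QA team's actual tooling: it assumes the JSON output of `ip -j addr show dev oob_net0` (iproute2), treats a global-scope IPv4 address as "the IP is assigned", and the function names `oob_has_ip` and `classify` are hypothetical.

```python
import json


def oob_has_ip(ip_json: str, ifname: str = "oob_net0") -> bool:
    """Return True if `ifname` reports operstate UP and has a global IPv4 address.

    `ip_json` is expected to be the output of `ip -j addr show dev <ifname>`.
    Requiring scope "global" is an assumption made for this sketch.
    """
    for link in json.loads(ip_json):
        if link.get("ifname") != ifname:
            continue
        if link.get("operstate") != "UP":
            return False
        return any(
            addr.get("family") == "inet" and addr.get("scope") == "global"
            for addr in link.get("addr_info", [])
        )
    return False  # interface not present in the output


def classify(ip_after_reboot: bool, ip_after_driver_reload: bool) -> str:
    """Map the two observations from the test case to an outcome."""
    if ip_after_reboot:
        return "pass"
    if ip_after_driver_reload:
        # rmmod/modprobe recovered the port: not the bug described here.
        return "different bug"
    # Nothing short of a power cycle helps: matches this bug.
    return "PHY stuck"
```

On the target, the JSON string would come from running `ip -j addr show dev oob_net0` after each reboot, and again after an rmmod/modprobe cycle if the first check fails.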

[Regression Potential]

* Make sure the Redfish DHCP is still working during the reboot test.
* Make sure the OOB port gets an IP address.

Asmaa Mnebhi (asmaam)
description: updated
Asmaa Mnebhi (asmaam) wrote :

This software workaround didn't work, so the HW team will have to review it.

Changed in linux-bluefield (Ubuntu):
status: New → Invalid
Changed in linux-bluefield (Ubuntu Jammy):
status: New → Invalid
Asmaa Mnebhi (asmaam) wrote :

Reopening this bug since we found a software workaround for it.

Changed in linux-bluefield (Ubuntu):
status: Invalid → In Progress
Changed in linux-bluefield (Ubuntu Jammy):
status: Invalid → In Progress
description: updated
Changed in linux-bluefield (Ubuntu Jammy):
status: In Progress → Fix Committed
Changed in linux-bluefield (Ubuntu Focal):
status: New → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-bluefield/5.4.0-1086.93 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal-linux-bluefield' to 'verification-done-focal-linux-bluefield'. If the problem still exists, change the tag 'verification-needed-focal-linux-bluefield' to 'verification-failed-focal-linux-bluefield'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-focal-linux-bluefield-v2 verification-needed-focal-linux-bluefield
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-bluefield/5.15.0-1044.46 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-bluefield' to 'verification-done-jammy-linux-bluefield'. If the problem still exists, change the tag 'verification-needed-jammy-linux-bluefield' to 'verification-failed-jammy-linux-bluefield'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-bluefield-v2 verification-needed-jammy-linux-bluefield
Tien Do (tienmdo)
tags: added: verification-done-jammy-linux-bluefield
tags: removed: kernel-spammed-jammy-linux-bluefield-v2 verification-needed-focal-linux-bluefield verification-needed-jammy-linux-bluefield
tags: added: kernel-spammed-jammy-linux-bluefield-v2 verification-needed-jammy-linux-bluefield
removed: verification-done-jammy-linux-bluefield
Bartlomiej Zolnierkiewicz (bzolnier) wrote :

Re-adding the verification-done-jammy-linux-bluefield tag (ubuntu-kernel-bot removed it after the kernel-spammed-jammy-linux-bluefield-v2 tag was removed by mistake).

tags: added: verification-done-jammy-linux-bluefield
removed: verification-needed-jammy-linux-bluefield
Tien Do (tienmdo)
tags: added: verification-done-focal-linux-bluefield
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-bluefield - 5.4.0-1086.93

---------------
linux-bluefield (5.4.0-1086.93) focal; urgency=medium

  * focal/linux-bluefield: 5.4.0-1086.93 -proposed tracker (LP: #2063770)

  * mlxbf-gige: Vitesse PHY stuck in a bad state during reboot test
    (LP: #2064163)
    - SAUCE: mlxbf-gige: OOB PHY stuck in a bad state during reboot test

  * mlxbf-gige: autonegotiation fails to complete on BF2 (LP: #2062384)
    - SAUCE: mlxbf-gige: autonegotiation fails to complete on BF2

  [ Ubuntu: 5.4.0-186.206 ]

  * focal/linux: 5.4.0-186.206 -proposed tracker (LP: #2063812)
  * Mount CIFS fails with Permission denied (LP: #2061986)
    - cifs: fix ntlmssp auth when there is no key exchange
  * USB stick can't be detected (LP: #2040948)
    - usb: Disable USB3 LPM at shutdown
  * CVE-2024-26733
    - net: dev: Convert sa_data to flexible array in struct sockaddr
    - arp: Prevent overflow in arp_req_get().
    - stddef: Introduce DECLARE_FLEX_ARRAY() helper
  * CVE-2024-26712
    - powerpc/kasan: Fix addr error caused by page alignment
  * CVE-2023-52530
    - wifi: mac80211: fix potential key use-after-free
  * CVE-2021-47063
    - drm: bridge/panel: Cleanup connector on bridge detach
  * [Ubuntu 22.04.4/linux-image-6.5.0-26-generic] Kernel output "UBSAN: array-
    index-out-of-bounds in /build/linux-hwe-6.5-34pCLi/linux-
    hwe-6.5-6.5.0/drivers/net/hyperv/netvsc.c:1445:41" multiple times,
    especially during boot. (LP: #2058477)
    - hv: hyperv.h: Replace one-element array with flexible-array member
  * CVE-2024-26614
    - tcp: make sure init the accept_queue's spinlocks once
    - ipv6: init the accept_queue's spinlocks in inet6_create
  * Focal update: v5.4.271 upstream stable release (LP: #2060216)
    - netlink: Fix kernel-infoleak-after-free in __skb_datagram_iter
    - net: ip_tunnel: prevent perpetual headroom growth
    - tun: Fix xdp_rxq_info's queue_index when detaching
    - ipv6: fix potential "struct net" leak in inet6_rtm_getaddr()
    - lan78xx: enable auto speed configuration for LAN7850 if no EEPROM is
      detected
    - net: usb: dm9601: fix wrong return value in dm9601_mdio_read
    - Bluetooth: Avoid potential use-after-free in hci_error_reset
    - Bluetooth: hci_event: Fix handling of HCI_EV_IO_CAPA_REQUEST
    - Bluetooth: Enforce validation on max value of connection interval
    - netfilter: nf_tables: allow NFPROTO_INET in nft_(match/target)_validate()
    - rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back
    - efi/capsule-loader: fix incorrect allocation size
    - power: supply: bq27xxx-i2c: Do not free non existing IRQ
    - ALSA: Drop leftover snd-rtctimer stuff from Makefile
    - afs: Fix endless loop in directory parsing
    - gtp: fix use-after-free and null-ptr-deref in gtp_newlink()
    - wifi: nl80211: reject iftype change with mesh ID change
    - btrfs: dev-replace: properly validate device names
    - dmaengine: fsl-qdma: fix SoC may hang on 16 byte unaligned read
    - dmaengine: fsl-qdma: init irq after reg initialization
    - mmc: core: Fix eMMC initialization with 1-bit bus connection
    - x86/cpu/intel: Detect TME keyid bits before setting MTRR mas...

Changed in linux-bluefield (Ubuntu Focal):
status: Fix Committed → Fix Released