[hns-1126]net: hns3: revert to old channel when setting new channel num fail

Bug #1853983 reported by Fred Kimmy on 2019-11-26
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
kunpeng920
Undecided
Unassigned
Ubuntu-18.04
Undecided
Unassigned
Ubuntu-18.04-hwe
Undecided
Unassigned
Ubuntu-19.04
Undecided
Ike Panhc
Ubuntu-19.10
Undecided
Ike Panhc
Ubuntu-20.04
Undecided
Unassigned
Upstream-kernel
Undecided
Unassigned
linux (Ubuntu)
Status tracked in Focal
Disco
Undecided
Ike Panhc
Eoan
Undecided
Ike Panhc
Focal
Undecided
Unassigned

Bug Description

[Impact]
With specific ethtool commands, ethernet does not work

[Test Case]
run VF on VM with 3G memory and load VF driver
ethtool -G eth0 tx 30000 rx 30000
ethtool -G eth0 tx 24 rx 24;ethtool -L eth0 combined 1
ethtool -G eth0 tx 32760 rx 32760
ethtool -L eth0 combined 32
ethernet shall still be functional after these commands

[Fix]
3a5a5f06d4d2 net: hns3: revert to old channel when setting new channel num fail

[Regression Risk]
Patch restricted to hns3 driver.

"[Bug Description]
After setting new channel num, it needs free old ring memory and
allocate new ring memory. If there is no enough memory and allocate
new ring memory fail, the ring may initialize fail. To make sure
the network interface can work normally, driver should revert the
channel to the old configuration.

[Steps to Reproduce]
1.run VF on VM with 3G memory
2.load VF driver
3.ethtool -G eth0 tx 30000 rx 30000
4.ethtool -G eth0 tx 24 rx 24;ethtool -L eth0 combined 1
5.ethtool -G eth0 tx 32760 rx 32760
6.ethtool -L eth0 combined 32
7.ping

[Actual Results]
netdevice can not work

[Expected Results]
ping ok

[Reproducibility]
Inevitably

[Additional information]
Hardware: D06
Firmware: NA
Kernel: NA

[Resolution]
Revert the channel to the old configuration when allocate new ring memory fail."

Ike Panhc (ikepanhc) wrote :

Is this patch the fix?

commit 3a5a5f06d4d2ddb257cb71e9c52106f50abad8fc
Author: Peng Li <email address hidden>
Date: Wed Sep 11 10:40:34 2019 +0800

    net: hns3: revert to old channel when setting new channel num fail

    After setting new channel num, it needs free old ring memory and
    allocate new ring memory. If there is no enough memory and allocate
    new ring memory fail, the ring may initialize fail. To make sure
    the network interface can work normally, driver should revert the
    channel to the old configuration.

    Signed-off-by: Peng Li <email address hidden>
    Signed-off-by: Huazhong Tan <email address hidden>
    Signed-off-by: David S. Miller <email address hidden>

Changed in kunpeng920:
status: New → Incomplete
Fred Kimmy (kongzizaixian) wrote :

it is ok.

Ike Panhc (ikepanhc) on 2019-12-10
Changed in kunpeng920:
status: Incomplete → New
Ike Panhc (ikepanhc) wrote :

Set to won't fix for 4.15 since it can not be clean cherry-picked

Ike Panhc (ikepanhc) on 2020-01-01
Changed in kunpeng920:
status: New → In Progress
Ike Panhc (ikepanhc) on 2020-01-02
description: updated
Ike Panhc (ikepanhc) on 2020-01-02
Changed in linux (Ubuntu Focal):
status: New → Fix Released
Changed in linux (Ubuntu Eoan):
assignee: nobody → Ike Panhc (ikepanhc)
status: New → In Progress
Changed in linux (Ubuntu Disco):
assignee: nobody → Ike Panhc (ikepanhc)
status: New → In Progress
Ike Panhc (ikepanhc) on 2020-01-06
description: updated
description: updated
Changed in linux (Ubuntu Disco):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Eoan):
status: In Progress → Fix Committed
Ike Panhc (ikepanhc) on 2020-01-08
Changed in kunpeng920:
status: In Progress → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-disco' to 'verification-done-disco'. If the problem still exists, change the tag 'verification-needed-disco' to 'verification-failed-disco'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-disco
Ike Panhc (ikepanhc) wrote :

Thanks. Ubuntu-5.0.0-40.44 works for me.

tags: added: verification-done-disco
removed: verification-needed-disco
Launchpad Janitor (janitor) wrote :
Download full text (22.6 KiB)

This bug was fixed in the package linux - 5.0.0-40.44

---------------
linux (5.0.0-40.44) disco; urgency=medium

  * disco/linux: 5.0.0-40.44 -proposed tracker (LP: #1859724)

  * use-after-free in i915_ppgtt_close (LP: #1859522) // CVE-2020-7053
    - SAUCE: drm/i915: Fix use-after-free when destroying GEM context

  * CVE-2019-14615
    - drm/i915/gen9: Clear residual context state on context switch

  * System hang with kernel traces while entering reboot process on a Disco
    ARM64 moonshot node (LP: #1859582)
    - Revert "RDMA/cm: Fix memory leak in cm_add/remove_one"

linux (5.0.0-39.43) disco; urgency=medium

  * disco/linux: 5.0.0-39.43 -proposed tracker (LP: #1858547)

  * [Regression] usb usb2-port2: Cannot enable. Maybe the USB cable is bad?
    (LP: #1856608)
    - SAUCE: Revert "usb: handle warm-reset port requests on hub resume"

  * PAN is broken for execute-only user mappings on ARMv8 (LP: #1858815)
    - arm64: Revert support for execute-only user mappings

  * Fix unusable USB hub on Dell TB16 after S3 (LP: #1855312)
    - SAUCE: USB: core: Make port power cycle a seperate helper function
    - SAUCE: USB: core: Attempt power cycle port when it's in eSS.Disabled state

  * [sas-1126]scsi: hisi_sas: Fix out of bound at debug_I_T_nexus_reset()
    (LP: #1853992)
    - scsi: hisi_sas: Fix out of bound at debug_I_T_nexus_reset()

  * [sas-1126]scsi: hisi_sas: Assign NCQ tag for all NCQ commands (LP: #1853995)
    - scsi: hisi_sas: Assign NCQ tag for all NCQ commands

  * [sas-1126]scsi: hisi_sas: Fix the conflict between device gone and host
    reset (LP: #1853997)
    - scsi: hisi_sas: Fix the conflict between device gone and host reset

  * scsi: hisi_sas: Check sas_port before using it (LP: #1855952)
    - scsi: hisi_sas: Check sas_port before using it

  * CVE-2019-18885
    - btrfs: refactor btrfs_find_device() take fs_devices as argument
    - btrfs: merge btrfs_find_device and find_device

  * Integrate Intel SGX driver into linux-azure (LP: #1844245)
    - [Packaging] Add systemd service to load intel_sgx

  * [SRU][B/OEM-B/OEM-OSP1/D/E/F] Add LG I2C touchscreen multitouch support
    (LP: #1857541)
    - SAUCE: HID: multitouch: Add LG MELF0410 I2C touchscreen support

  * cifs: DFS Caching feature causing problems traversing multi-tier DFS setups
    (LP: #1854887)
    - cifs: Fix retrieval of DFS referrals in cifs_mount()

  * qede driver causes 100% CPU load (LP: #1855409)
    - qede: Handle infinite driver spinning for Tx timestamp.

  * [roce-1126]RDMA/hns: bugfix for slab-out-of-bounds when loading hip08 driver
    (LP: #1853989)
    - RDMA/hns: Bugfix for slab-out-of-bounds when unloading hip08 driver
    - RDMA/hns: bugfix for slab-out-of-bounds when loading hip08 driver

  * [roce-1126]RDMA/hns: Fixs hw access invalid dma memory error (LP: #1853990)
    - RDMA/hns: Fixs hw access invalid dma memory error

  * [hns-1126]net: hns3: revert to old channel when setting new channel num fail
    (LP: #1853983)
    - net: hns3: revert to old channel when setting new channel num fail

  * [hns-1126]net: hns3: fix port setting handle for fibre port
    (LP: #1853984)
    - net: hns3: fix port setting handle for fibre...

Changed in linux (Ubuntu Disco):
status: Fix Committed → Fix Released

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-eoan' to 'verification-done-eoan'. If the problem still exists, change the tag 'verification-needed-eoan' to 'verification-failed-eoan'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-eoan
Ike Panhc (ikepanhc) wrote :

linux 5.3.0-40.32 works fine for me. Thanks.

tags: added: verification-done-eoan
removed: verification-needed-eoan
Launchpad Janitor (janitor) wrote :
Download full text (78.1 KiB)

This bug was fixed in the package linux - 5.3.0-40.32

---------------
linux (5.3.0-40.32) eoan; urgency=medium

  * eoan/linux: 5.3.0-40.32 -proposed tracker (LP: #1861214)

  * No sof soundcard for 'ASoC: CODEC DAI intel-hdmi-hifi1 not registered' after
    modprobe sof (LP: #1860248)
    - ASoC: SOF: Intel: fix HDA codec driver probe with multiple controllers

  * ocfs2-tools is causing kernel panics in Ubuntu Focal (Ubuntu-5.4.0-9.12)
    (LP: #1852122)
    - ocfs2: fix the crash due to call ocfs2_get_dlm_debug once less

  * QAT drivers for C3XXX and C62X not included as modules (LP: #1845959)
    - [Config] CRYPTO_DEV_QAT_C3XXX=m, CRYPTO_DEV_QAT_C62X=m and
      CRYPTO_DEV_QAT_DH895xCC=m

  * Eoan update: upstream stable patchset 2020-01-24 (LP: #1860816)
    - scsi: lpfc: Fix discovery failures when target device connectivity bounces
    - scsi: mpt3sas: Fix clear pending bit in ioctl status
    - scsi: lpfc: Fix locking on mailbox command completion
    - Input: atmel_mxt_ts - disable IRQ across suspend
    - f2fs: fix to update time in lazytime mode
    - iommu: rockchip: Free domain on .domain_free
    - iommu/tegra-smmu: Fix page tables in > 4 GiB memory
    - dmaengine: xilinx_dma: Clear desc_pendingcount in xilinx_dma_reset
    - scsi: target: compare full CHAP_A Algorithm strings
    - scsi: lpfc: Fix SLI3 hba in loop mode not discovering devices
    - scsi: csiostor: Don't enable IRQs too early
    - scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()
    - powerpc/pseries: Mark accumulate_stolen_time() as notrace
    - powerpc/pseries: Don't fail hash page table insert for bolted mapping
    - powerpc/tools: Don't quote $objdump in scripts
    - dma-debug: add a schedule point in debug_dma_dump_mappings()
    - leds: lm3692x: Handle failure to probe the regulator
    - clocksource/drivers/asm9260: Add a check for of_clk_get
    - clocksource/drivers/timer-of: Use unique device name instead of timer
    - powerpc/security/book3s64: Report L1TF status in sysfs
    - powerpc/book3s64/hash: Add cond_resched to avoid soft lockup warning
    - ext4: update direct I/O read lock pattern for IOCB_NOWAIT
    - ext4: iomap that extends beyond EOF should be marked dirty
    - jbd2: Fix statistics for the number of logged blocks
    - scsi: tracing: Fix handling of TRANSFER LENGTH == 0 for READ(6) and WRITE(6)
    - scsi: lpfc: Fix duplicate unreg_rpi error in port offline flow
    - f2fs: fix to update dir's i_pino during cross_rename
    - clk: qcom: Allow constant ratio freq tables for rcg
    - clk: clk-gpio: propagate rate change to parent
    - irqchip/irq-bcm7038-l1: Enable parent IRQ if necessary
    - irqchip: ingenic: Error out if IRQ domain creation failed
    - fs/quota: handle overflows of sysctl fs.quota.* and report as unsigned long
    - scsi: lpfc: fix: Coverity: lpfc_cmpl_els_rsp(): Null pointer dereferences
    - PCI: rpaphp: Fix up pointer to first drc-info entry
    - scsi: ufs: fix potential bug which ends in system hang
    - powerpc/pseries/cmm: Implement release() function for sysfs device
    - PCI: rpaphp: Don't rely on firmware feature to imply drc-info support
    - PCI: rpaphp: Annotate and corr...

Changed in linux (Ubuntu Eoan):
status: Fix Committed → Fix Released
Andrew Cloke (andrew-cloke) wrote :

Updating kunpeng920 19.10 series to match linux eoan series.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers