qede driver causes 100% CPU load

Bug #1855409 reported by Przemyslaw Hausman
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Undecided
Guilherme G. Piccoli
Xenial
Invalid
Undecided
Guilherme G. Piccoli
Bionic
Fix Released
Medium
Guilherme G. Piccoli
Disco
Fix Released
Medium
Guilherme G. Piccoli
Eoan
Fix Released
Undecided
Guilherme G. Piccoli
Focal
Fix Released
Undecided
Guilherme G. Piccoli

Bug Description

[Impact]

* The PTP feature in qede driver is implemented in a way that if the NIC firmware takes some time to perform the timestamping then the PTP worker function will reschedule itself indefinitely until the value read from a device register is meaningful. With that behavior, if an userspace tool requests a bad configured TX/RX filter (or if NIC firmware has any other issue in timestamping), the function qede_ptp_task() will reschedule itself forever and cause an unbound resource consumption. This manifests as a kworker thread consuming 100% of CPU.

* The dmesg log will show a message like this: "qede_ptp_tx_ts:533(eno3)]Timestamping in progress"

Also, by using perf user can observe a stack like the following:
- 44.76% 0.00% kworker/16:5 [kernel.kallsyms]
     ret_from_fork
   - kthread
      - 44.74% worker_thread
         - 44.57% process_one_work
            - 42.67% qede_ptp_task
               - 38.86% qed_ptp_hw_read_tx_ts
                    qed_rd
               - 3.03% queue_work_on
                  - 2.06% __queue_work
                     - 0.68% get_work_pool
                        - 0.61% radix_tree_lookup
                             __radix_tree_lookup
              0.50% set_work_pool_and_clear_pending

* The patch proposed in this SRU request refactors the PTP worked in qede by adding a time limit, after which the task doesn't reschedule itself anymore, failing the timestamp procedure: 9adebac37e7d ("qede: Handle infinite driver spinning for Tx timestamp.") http://git.kernel.org/linus/9adebac37e7d

Besides fixing the issue, it also adds an ethtool statistics for accounting the PTP errors.

[Test case]

By using chrony in Bionic, the following steps will reproduce the issue:

a) Install chrony on Bionic in a system with working NIC managed by qede;
b) Edit chrony configuration and add: "hwtimestamp *" to the top of its conf file;
c) Restart chrony service

Check dmesg for the "[...]Timestamping in progress" message and the overall CPU workload using a tool like "top" to observe a kthread consuming 100% of CPU.

[Regression potential]

The patch scope is restricted to qede PTP handler, and is upstream for more than 7 months. If there's any possibility of regressions, the worst would be an issue affecting the packet timestamping, not messing with the regular xmit path of the driver.

Revision history for this message
Przemyslaw Hausman (phausman) wrote :
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1855409

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic
Changed in linux (Ubuntu Xenial):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in linux (Ubuntu Disco):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in linux (Ubuntu Eoan):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in linux (Ubuntu Focal):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
status: Incomplete → New
Changed in linux (Ubuntu Disco):
status: New → Confirmed
Changed in linux (Ubuntu Bionic):
status: New → Confirmed
Changed in linux (Ubuntu Xenial):
status: New → Invalid
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Thanks for the report @phausman! I've checked mainline and there's a commit that matches your report, we should try this one: 9adebac37e7d ("qede: Handle infinite driver spinning for Tx timestamp.") [0].

This is present in Eoan and Focal, but not in Bionic/Disco. After we confirm this is the fix we can proceed with the SRU - in case this patch doesn't help to alleviate the problem, then Eoan/Focal will be targeted too and we'll need to think in a proper solution.

Notice Xenial GA kernel (v4.4) does not have PTP support in qede driver, so this issue doesn't apply to Xenial (and once it's fixed in Bionic it'll reach organically Xenial-HWE kernel).

I'll work in building a 4.15 kernel for Bionic with the aforementioned fix, in order you can test.
Cheers,

Guilherme

[0] http://git.kernel.org/linus/9adebac37e7d

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1855409

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Eoan):
status: New → Incomplete
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

I've built a kernel for Bionic with the commit 9adebac37e7d ("qede: Handle infinite driver spinning for Tx timestamp.") in order to validate if such patch fixes or at least alleviate the symptom.
It's built on the following PPA: https://launchpad.net/~gpiccoli/+archive/ubuntu/test1855409

@phausman or anybody affected, I appreciate if you can give it a try, I don't have the HW currently for the experiment.

Thanks,

Guilherme

Changed in linux (Ubuntu Focal):
status: Incomplete → Fix Released
Changed in linux (Ubuntu Eoan):
status: Incomplete → Fix Released
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

The kernel commit 9adebac37e7d ("qede: Handle infinite driver spinning for Tx timestamp.") is present in kernels 5.3 and subsequent ones, and after some preliminary results from @phausman, seems the commit properly fixes the issue.

Hence, we will start the SRU process for Bionic/Disco.
Cheers,

Guilherme

Revision history for this message
Przemyslaw Hausman (phausman) wrote :

@gpiccoli, I can now confirm that I've been running the system with the test kernel from your PPA for a day and the issue did not appear anymore. Looks like the upstream commit fixes the problem.

Thanks a lot for figuring it out and for providing the test kernel!

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Przemyslaw, thank you for the great report and debug! I'll proceed with the SRU proposal.
Cheers,

Guilherme

tags: added: disco sts
description: updated
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :
Stefan Bader (smb)
Changed in linux (Ubuntu Bionic):
importance: Undecided → Medium
Changed in linux (Ubuntu Disco):
importance: Undecided → Medium
Changed in linux (Ubuntu Bionic):
status: Confirmed → Fix Committed
Changed in linux (Ubuntu Disco):
status: Confirmed → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-disco' to 'verification-done-disco'. If the problem still exists, change the tag 'verification-needed-disco' to 'verification-failed-disco'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-disco
Revision history for this message
Khaled El Mously (kmously) wrote :

I reached out to phausman on #canonical-support

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Unfortunately seems we don't have hardware to fully verify the patch. But given it's a minor fix, restricted to a single driver and upstream for a while (also, it's on E/F already), I consider that safe to ship without full verification.

Also, I've download the Disco kernel source, and validated the fix against the upstream patch - all fine. Hence, I'm marking Disco as verified here.
Cheers,

Guilherme

tags: added: verification-done-disco
removed: verification-needed-disco
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (22.6 KiB)

This bug was fixed in the package linux - 5.0.0-40.44

---------------
linux (5.0.0-40.44) disco; urgency=medium

  * disco/linux: 5.0.0-40.44 -proposed tracker (LP: #1859724)

  * use-after-free in i915_ppgtt_close (LP: #1859522) // CVE-2020-7053
    - SAUCE: drm/i915: Fix use-after-free when destroying GEM context

  * CVE-2019-14615
    - drm/i915/gen9: Clear residual context state on context switch

  * System hang with kernel traces while entering reboot process on a Disco
    ARM64 moonshot node (LP: #1859582)
    - Revert "RDMA/cm: Fix memory leak in cm_add/remove_one"

linux (5.0.0-39.43) disco; urgency=medium

  * disco/linux: 5.0.0-39.43 -proposed tracker (LP: #1858547)

  * [Regression] usb usb2-port2: Cannot enable. Maybe the USB cable is bad?
    (LP: #1856608)
    - SAUCE: Revert "usb: handle warm-reset port requests on hub resume"

  * PAN is broken for execute-only user mappings on ARMv8 (LP: #1858815)
    - arm64: Revert support for execute-only user mappings

  * Fix unusable USB hub on Dell TB16 after S3 (LP: #1855312)
    - SAUCE: USB: core: Make port power cycle a seperate helper function
    - SAUCE: USB: core: Attempt power cycle port when it's in eSS.Disabled state

  * [sas-1126]scsi: hisi_sas: Fix out of bound at debug_I_T_nexus_reset()
    (LP: #1853992)
    - scsi: hisi_sas: Fix out of bound at debug_I_T_nexus_reset()

  * [sas-1126]scsi: hisi_sas: Assign NCQ tag for all NCQ commands (LP: #1853995)
    - scsi: hisi_sas: Assign NCQ tag for all NCQ commands

  * [sas-1126]scsi: hisi_sas: Fix the conflict between device gone and host
    reset (LP: #1853997)
    - scsi: hisi_sas: Fix the conflict between device gone and host reset

  * scsi: hisi_sas: Check sas_port before using it (LP: #1855952)
    - scsi: hisi_sas: Check sas_port before using it

  * CVE-2019-18885
    - btrfs: refactor btrfs_find_device() take fs_devices as argument
    - btrfs: merge btrfs_find_device and find_device

  * Integrate Intel SGX driver into linux-azure (LP: #1844245)
    - [Packaging] Add systemd service to load intel_sgx

  * [SRU][B/OEM-B/OEM-OSP1/D/E/F] Add LG I2C touchscreen multitouch support
    (LP: #1857541)
    - SAUCE: HID: multitouch: Add LG MELF0410 I2C touchscreen support

  * cifs: DFS Caching feature causing problems traversing multi-tier DFS setups
    (LP: #1854887)
    - cifs: Fix retrieval of DFS referrals in cifs_mount()

  * qede driver causes 100% CPU load (LP: #1855409)
    - qede: Handle infinite driver spinning for Tx timestamp.

  * [roce-1126]RDMA/hns: bugfix for slab-out-of-bounds when loading hip08 driver
    (LP: #1853989)
    - RDMA/hns: Bugfix for slab-out-of-bounds when unloading hip08 driver
    - RDMA/hns: bugfix for slab-out-of-bounds when loading hip08 driver

  * [roce-1126]RDMA/hns: Fixs hw access invalid dma memory error (LP: #1853990)
    - RDMA/hns: Fixs hw access invalid dma memory error

  * [hns-1126]net: hns3: revert to old channel when setting new channel num fail
    (LP: #1853983)
    - net: hns3: revert to old channel when setting new channel num fail

  * [hns-1126]net: hns3: fix port setting handle for fibre port
    (LP: #1853984)
    - net: hns3: fix port setting handle for fibre...

Changed in linux (Ubuntu Disco):
status: Fix Committed → Fix Released
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Revision history for this message
Khaled El Mously (kmously) wrote :

Thanks @gpiccoli
Doing the same for bionic, based on the same logic as comment #12

tags: added: verification-done-bionic
removed: verification-needed-bionic
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Thank you Khaled! I've just checked the source from bionic-proposed (4.15.0-88) and the patch is there, as expected - I've verified that comparing with the upstream patch, it's all correct.

Given we don't have the HW and the bug reporter seems to be unable to verify it (and the patch is self-contained/straightforward), we can consider it ready to go!

Cheers,

Guilherme

Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (79.8 KiB)

This bug was fixed in the package linux - 4.15.0-88.88

---------------
linux (4.15.0-88.88) bionic; urgency=medium

  * bionic/linux: 4.15.0-88.88 -proposed tracker (LP: #1862824)

  * Segmentation fault (kernel oops) with memory-hotplug in
    ubuntu_kernel_selftests on Bionic kernel (LP: #1862312)
    - Revert "mm/memory_hotplug: fix online/offline_pages called w.o.
      mem_hotplug_lock"
    - mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock

linux (4.15.0-87.87) bionic; urgency=medium

  * bionic/linux: 4.15.0-87.87 -proposed tracker (LP: #1861165)

  * Bionic update: upstream stable patchset 2020-01-22 (LP: #1860602)
    - scsi: lpfc: Fix discovery failures when target device connectivity bounces
    - scsi: mpt3sas: Fix clear pending bit in ioctl status
    - scsi: lpfc: Fix locking on mailbox command completion
    - Input: atmel_mxt_ts - disable IRQ across suspend
    - iommu/tegra-smmu: Fix page tables in > 4 GiB memory
    - scsi: target: compare full CHAP_A Algorithm strings
    - scsi: lpfc: Fix SLI3 hba in loop mode not discovering devices
    - scsi: csiostor: Don't enable IRQs too early
    - powerpc/pseries: Mark accumulate_stolen_time() as notrace
    - powerpc/pseries: Don't fail hash page table insert for bolted mapping
    - powerpc/tools: Don't quote $objdump in scripts
    - dma-debug: add a schedule point in debug_dma_dump_mappings()
    - clocksource/drivers/asm9260: Add a check for of_clk_get
    - powerpc/security/book3s64: Report L1TF status in sysfs
    - powerpc/book3s64/hash: Add cond_resched to avoid soft lockup warning
    - ext4: update direct I/O read lock pattern for IOCB_NOWAIT
    - jbd2: Fix statistics for the number of logged blocks
    - scsi: tracing: Fix handling of TRANSFER LENGTH == 0 for READ(6) and WRITE(6)
    - scsi: lpfc: Fix duplicate unreg_rpi error in port offline flow
    - f2fs: fix to update dir's i_pino during cross_rename
    - clk: qcom: Allow constant ratio freq tables for rcg
    - irqchip/irq-bcm7038-l1: Enable parent IRQ if necessary
    - irqchip: ingenic: Error out if IRQ domain creation failed
    - fs/quota: handle overflows of sysctl fs.quota.* and report as unsigned long
    - scsi: lpfc: fix: Coverity: lpfc_cmpl_els_rsp(): Null pointer dereferences
    - scsi: ufs: fix potential bug which ends in system hang
    - powerpc/pseries/cmm: Implement release() function for sysfs device
    - powerpc/security: Fix wrong message when RFI Flush is disable
    - scsi: atari_scsi: sun3_scsi: Set sg_tablesize to 1 instead of SG_NONE
    - clk: pxa: fix one of the pxa RTC clocks
    - bcache: at least try to shrink 1 node in bch_mca_scan()
    - HID: logitech-hidpp: Silence intermittent get_battery_capacity errors
    - libnvdimm/btt: fix variable 'rc' set but not used
    - HID: Improve Windows Precision Touchpad detection.
    - scsi: pm80xx: Fix for SATA device discovery
    - scsi: ufs: Fix error handing during hibern8 enter
    - scsi: scsi_debug: num_tgts must be >= 0
    - scsi: NCR5380: Add disconnect_mask module parameter
    - scsi: iscsi: Don't send data to unbound connection
    - scsi: target: iscsi: Wait for all commands to finish before freeing a
...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.