Activity log for bug #1855409

Date Who What changed Old value New value Message
2019-12-06 08:02:19 Przemyslaw Hausman bug added bug
2019-12-06 08:08:47 Przemyslaw Hausman attachment added perf-report.txt https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1855409/+attachment/5310185/+files/perf-report.txt
2019-12-06 08:30:07 Ubuntu Kernel Bot linux (Ubuntu): status New Incomplete
2019-12-06 08:30:09 Ubuntu Kernel Bot tags bionic
2019-12-09 18:36:21 Guilherme G. Piccoli nominated for series Ubuntu Disco
2019-12-09 18:36:21 Guilherme G. Piccoli bug task added linux (Ubuntu Disco)
2019-12-09 18:36:21 Guilherme G. Piccoli nominated for series Ubuntu Focal
2019-12-09 18:36:21 Guilherme G. Piccoli bug task added linux (Ubuntu Focal)
2019-12-09 18:36:21 Guilherme G. Piccoli nominated for series Ubuntu Bionic
2019-12-09 18:36:21 Guilherme G. Piccoli bug task added linux (Ubuntu Bionic)
2019-12-09 18:36:21 Guilherme G. Piccoli nominated for series Ubuntu Xenial
2019-12-09 18:36:21 Guilherme G. Piccoli bug task added linux (Ubuntu Xenial)
2019-12-09 18:36:21 Guilherme G. Piccoli nominated for series Ubuntu Eoan
2019-12-09 18:36:21 Guilherme G. Piccoli bug task added linux (Ubuntu Eoan)
2019-12-09 18:36:33 Guilherme G. Piccoli linux (Ubuntu Xenial): assignee Guilherme G. Piccoli (gpiccoli)
2019-12-09 18:36:36 Guilherme G. Piccoli linux (Ubuntu Bionic): assignee Guilherme G. Piccoli (gpiccoli)
2019-12-09 18:36:39 Guilherme G. Piccoli linux (Ubuntu Disco): assignee Guilherme G. Piccoli (gpiccoli)
2019-12-09 18:36:40 Guilherme G. Piccoli linux (Ubuntu Eoan): assignee Guilherme G. Piccoli (gpiccoli)
2019-12-09 18:36:43 Guilherme G. Piccoli linux (Ubuntu Focal): assignee Guilherme G. Piccoli (gpiccoli)
2019-12-09 18:36:59 Guilherme G. Piccoli linux (Ubuntu Focal): status Incomplete New
2019-12-09 18:37:05 Guilherme G. Piccoli linux (Ubuntu Disco): status New Confirmed
2019-12-09 18:37:09 Guilherme G. Piccoli linux (Ubuntu Bionic): status New Confirmed
2019-12-09 18:37:24 Guilherme G. Piccoli linux (Ubuntu Xenial): status New Invalid
2019-12-09 19:00:09 Ubuntu Kernel Bot linux (Ubuntu): status New Incomplete
2019-12-09 19:00:13 Ubuntu Kernel Bot linux (Ubuntu Eoan): status New Incomplete
2019-12-17 17:38:33 Guilherme G. Piccoli linux (Ubuntu Focal): status Incomplete Fix Released
2019-12-17 17:38:36 Guilherme G. Piccoli linux (Ubuntu Eoan): status Incomplete Fix Released
2019-12-18 14:41:51 Guilherme G. Piccoli tags bionic bionic disco sts
2019-12-18 18:47:16 Guilherme G. Piccoli description

This bug is similar to #1832082 (bnx2x driver causes 100% CPU load) but applies to the qede driver instead of bnx2x.

The symptoms are the same: with chrony installed and configured with "hwtimestamp *", I observe 100% CPU load on 2 CPU cores. Running perf report shows that the kernel is busy executing the qede_ptp_task function in the qede driver.

A workaround is to disable "hwtimestamp *" in the chrony configuration.

---

$ modinfo qede
filename: /lib/modules/4.15.0-72-generic/kernel/drivers/net/ethernet/qlogic/qede/qede.ko
version: 8.10.10.21
license: GPL
description: QLogic FastLinQ 4xxxx Ethernet Driver
srcversion: D5EC89D815FC81B973EE9F0
alias: pci:v00001077d00008090sv*sd*bc*sc*i*
alias: pci:v00001077d00008070sv*sd*bc*sc*i*
alias: pci:v00001077d00001664sv*sd*bc*sc*i*
alias: pci:v00001077d00001656sv*sd*bc*sc*i*
alias: pci:v00001077d00001654sv*sd*bc*sc*i*
alias: pci:v00001077d00001644sv*sd*bc*sc*i*
alias: pci:v00001077d00001636sv*sd*bc*sc*i*
alias: pci:v00001077d00001666sv*sd*bc*sc*i*
alias: pci:v00001077d00001634sv*sd*bc*sc*i*
depends: ptp,qed
retpoline: Y
intree: Y
name: qede
vermagic: 4.15.0-72-generic SMP mod_unload
signat: PKCS#7
signer:
sig_key:
sig_hashalgo: md4
parm: debug: Default debug msglevel (uint)

$ uname -a
Linux dcn1-clm-inf-1 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

$ lspci | grep -i ether
19:00.0 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)
19:00.1 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)
19:00.2 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)
19:00.3 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)

# perf report snippet:
Children Self Command Shared Object
- 44.76% 0.00% kworker/16:5 [kernel.kallsyms]
ret_from_fork
- kthread
- 44.74% worker_thread
- 44.57% process_one_work
- 42.67% qede_ptp_task
- 38.86% qed_ptp_hw_read_tx_ts
qed_rd
- 3.03% queue_work_on
- 2.06% __queue_work
- 0.68% get_work_pool
- 0.61% radix_tree_lookup
__radix_tree_lookup
0.50% set_work_pool_and_clear_pending

[Impact]

* The PTP feature in the qede driver is implemented in such a way that, if the NIC firmware takes some time to perform the timestamping, the PTP worker function reschedules itself indefinitely until the value read from a device register is meaningful. With that behavior, if a userspace tool requests a badly configured TX/RX filter (or if the NIC firmware has any other timestamping issue), the function qede_ptp_task() reschedules itself forever and causes unbounded resource consumption. This manifests as a kworker thread consuming 100% of a CPU.

* The dmesg log will show a message like this:
"qede_ptp_tx_ts:533(eno3)]Timestamping in progress"

Also, by using perf, the user can observe a stack like the following:
- 44.76% 0.00% kworker/16:5 [kernel.kallsyms]
ret_from_fork
- kthread
- 44.74% worker_thread
- 44.57% process_one_work
- 42.67% qede_ptp_task
- 38.86% qed_ptp_hw_read_tx_ts
qed_rd
- 3.03% queue_work_on
- 2.06% __queue_work
- 0.68% get_work_pool
- 0.61% radix_tree_lookup
__radix_tree_lookup
0.50% set_work_pool_and_clear_pending

* The patch proposed in this SRU request refactors the PTP worker in qede by adding a time limit, after which the task no longer reschedules itself and the timestamp procedure fails:
9adebac37e7d ("qede: Handle infinite driver spinning for Tx timestamp.")
http://git.kernel.org/linus/9adebac37e7d

Besides fixing the issue, it also adds an ethtool statistic for accounting the PTP errors.

[Test case]

By using chrony in Bionic, the following steps will reproduce the issue:
a) Install chrony on Bionic in a system with a working NIC managed by qede;
b) Edit the chrony configuration and add "hwtimestamp *" to the top of its conf file;
c) Restart the chrony service.

Check dmesg for the "[...]Timestamping in progress" message, and check the overall CPU load using a tool like "top" to observe a kthread consuming 100% of a CPU.

[Regression potential]

The patch scope is restricted to the qede PTP handler, and the patch has been upstream for more than 7 months. If there is any possibility of regression, the worst case would be an issue affecting packet timestamping, without touching the regular xmit path of the driver.
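The behavior change described under [Impact] can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver code: the class name, the `step`/`run` helpers, the 2-second limit and the `errors` counter are all hypothetical stand-ins for the worker, its time limit and the new error statistic mentioned above.

```python
import time

# Hypothetical time limit; the value used by the actual patch is not given in this report.
TX_TS_TIMEOUT_S = 2.0


class TxTimestampTask:
    """Toy model of a self-rescheduling worker that gives up after a deadline."""

    def __init__(self, read_ts, clock=time.monotonic, timeout_s=TX_TS_TIMEOUT_S):
        self.read_ts = read_ts          # callable returning a timestamp or None
        self.clock = clock              # injectable clock, so tests need not sleep
        self.timeout_s = timeout_s
        self.started_at = clock()       # moment the timestamp request was made
        self.errors = 0                 # stands in for the PTP error statistic

    def step(self):
        """Run one pass of the worker and report what it would do next."""
        if self.read_ts() is not None:
            return "done"               # a valid timestamp was read
        if self.clock() - self.started_at > self.timeout_s:
            self.errors += 1            # account the failure and stop retrying
            return "timeout"
        return "retry"                  # the unpatched worker always ends up here


def run(task, max_steps=1_000_000):
    """Call step() until the task stops asking to be rescheduled."""
    for _ in range(max_steps):
        state = task.step()
        if state != "retry":
            return state
    return "retry"
```

Without the deadline check, a read function that never returns a value would keep the loop in `run` busy until `max_steps` is exhausted, which is the model's analogue of the kworker spinning at 100% CPU.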
2020-01-07 12:59:52 Stefan Bader linux (Ubuntu Bionic): importance Undecided Medium
2020-01-07 12:59:59 Stefan Bader linux (Ubuntu Disco): importance Undecided Medium
2020-01-07 13:07:35 Kleber Sacilotto de Souza linux (Ubuntu Bionic): status Confirmed Fix Committed
2020-01-07 13:07:37 Kleber Sacilotto de Souza linux (Ubuntu Disco): status Confirmed Fix Committed
2020-01-10 18:03:10 Ubuntu Kernel Bot tags bionic disco sts bionic disco sts verification-needed-disco
2020-01-24 13:06:38 Guilherme G. Piccoli tags bionic disco sts verification-needed-disco bionic disco sts verification-done-disco
2020-01-27 13:21:23 Launchpad Janitor linux (Ubuntu Disco): status Fix Committed Fix Released
2020-01-27 13:21:23 Launchpad Janitor cve linked 2019-14615
2020-01-27 13:21:23 Launchpad Janitor cve linked 2019-18885
2020-01-27 13:21:23 Launchpad Janitor cve linked 2019-19050
2020-01-27 13:21:23 Launchpad Janitor cve linked 2019-19077
2020-01-27 13:21:23 Launchpad Janitor cve linked 2019-19078
2020-01-27 13:21:23 Launchpad Janitor cve linked 2019-19082
2020-01-27 13:21:23 Launchpad Janitor cve linked 2019-19332
2020-01-27 13:21:23 Launchpad Janitor cve linked 2020-7053
2020-02-03 23:12:21 Ubuntu Kernel Bot tags bionic disco sts verification-done-disco bionic disco sts verification-done-disco verification-needed-bionic
2020-02-13 03:17:08 Khaled El Mously tags bionic disco sts verification-done-disco verification-needed-bionic bionic disco sts verification-done-bionic verification-done-disco
2020-02-17 10:36:02 Launchpad Janitor linux (Ubuntu Bionic): status Fix Committed Fix Released
2020-02-17 10:36:02 Launchpad Janitor cve linked 2019-20096
2020-02-17 10:36:02 Launchpad Janitor cve linked 2019-5108