bnx2x: fatal hardware error/reboot/tx timeout with LLDP enabled

Bug #1840789 reported by Mauricio Faria de Oliveira on 2019-08-20
Affects: linux (Ubuntu), status tracked in Eoan

Series   Importance  Assigned to
Xenial   Critical    Mauricio Faria de Oliveira
Bionic   Critical    Mauricio Faria de Oliveira
Disco    Critical    Mauricio Faria de Oliveira
Eoan     Critical    Mauricio Faria de Oliveira

Bug Description

[Impact]

 * The bnx2x driver may cause hardware faults (leading to
   panic/reboot) and other failures, such as transmit timeouts,
   since commit 3968d38917eb ("bnx2x: Fix Multi-Cos.") was
   introduced.

 * This issue was observed by a user shortly after
   starting docker & kubelet, with these adapters:
   - Broadcom NetXtreme II BCM57800 [14e4:168a] from Dell [1028:1f5c]
   - Broadcom NetXtreme II BCM57840 [14e4:16a1] from Dell [1028:1f79]

 * If options to ignore hardware faults are used
   (erst_disable=1 hest_disable=1 ghes.disable=1)
   the system doesn't panic/reboot; instead, it
   goes on to hit timeouts on adapter stats, then
   transmit timeouts, and prints some adapter
   firmware dumps, leaving the network interface
   non-functional.

 * The issue only happens when LLDP is enabled
   on the network switches, and the crashdump
   shows the bnx2x driver stuck waiting for the
   firmware to complete the stop traffic command
   in LLDP handling. The workaround used is to
   disable LLDP in the network switches/ports.

 * Analysis of the driver and firmware dumps
   didn't help significantly towards finding
   the root cause.

 * Upstream/mainline recently reverted the patch,
   due to similar problem reports, while the root
   cause/proper fix is being investigated.

[Test Case]

 * No reproducible test case found outside
   the user's systems/cluster, where it is
   enough to start docker & kubelet & wait.

 * The user verified test kernels for Xenial
   and Bionic: the problem does not happen
   with them. Disco was build-tested only.

[Regression Potential]

 * Users who make significant use of the non-default
   traffic class (tc) / class of service (cos) might
   see performance changes (if any at all) in such
   applications; however, that is unclear for now.

 * This is a recent revert upstream (v5.3-rc'ish),
   so there is a chance things might change in this area.

 * Nonetheless, the patch is authored by the driver
   vendor, and made its way into stable kernels
   (e.g., v5.2.8 which made Eoan/19.10 recently).

Tags: sts
Changed in linux (Ubuntu):
status: New → In Progress
assignee: nobody → Mauricio Faria de Oliveira (mfo)

This fix is already present in Eoan and Unstable:

~/git/ubuntu-eoan$ git log --oneline origin/master-next -- drivers/net/ethernet/broadcom/bnx2x/ | head | grep cos
1c41d7b7cf60 bnx2x: Disable multi-cos feature.

~/git/ubuntu-eoan$ git describe --contains 1c41d7b7cf60
Ubuntu-5.2.0-12.13~51

~/git/ubuntu-unstable$ git log --oneline origin/master -- drivers/net/ethernet/broadcom/bnx2x/ | head | grep cos
d1f0b5dce8fd bnx2x: Disable multi-cos feature.
~/git/ubuntu-unstable$ git describe --contains d1f0b5dce8fd
Ubuntu-5.3.0-4.5~313^2~91

description: updated
Changed in linux (Ubuntu Eoan):
status: In Progress → Fix Released
Changed in linux (Ubuntu Disco):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu Xenial):
status: New → In Progress
Changed in linux (Ubuntu Disco):
assignee: nobody → Mauricio Faria de Oliveira (mfo)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Mauricio Faria de Oliveira (mfo)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Mauricio Faria de Oliveira (mfo)
description: updated

For documentation purposes, in a recent Xenial/4.4 kernel,
this kernel error log is seen (with options to ignore the
hardware error/fault that panics/reboots the system).

[ 113.658876] bnx2x: [bnx2x_stats_comp:205(eno1)]timeout waiting for stats finished
[ 123.648066] bnx2x: [bnx2x_state_wait:310(eno1)]timeout waiting for state 6
[ 123.730345] bnx2x: [bnx2x_dcbx_stop_hw_tx:443(eno1)]Unable to hold traffic for HW configuration
[ 123.834443] bnx2x: [bnx2x_dcbx_stop_hw_tx:444(eno1)]driver assert
[ 123.907439] bnx2x: [bnx2x_panic_dump:919(eno1)]begin crash dump -----------------
...
[ 123.907662] bnx2x 0000:19:00.0 eno1: bc 7.14.11
[ 123.907666] begin fw dump (mark 0x3c65c8)
[ 123.908033] end of fw dump
[ 123.908048] bnx2x: [bnx2x_mc_assert:751(eno1)]Chip Revision: everest3, FW Version: 7_12_30
[ 123.908049] bnx2x: [bnx2x_panic_dump:1182(eno1)]end crash dump -----------------
[ 128.701944] bnx2x: [bnx2x_func_state_change:6306(eno1)]timeout waiting for previous ramrod completion
[ 128.701946] bnx2x: [bnx2x_dcbx_resume_hw_tx:469(eno1)]Unable to resume traffic after HW configuration
[ 128.701946] bnx2x: [bnx2x_dcbx_resume_hw_tx:470(eno1)]driver assert
[ 128.701948] bnx2x: [bnx2x_panic_dump:919(eno1)]begin crash dump -----------------
...
[ 128.702170] bnx2x 0000:19:00.0 eno1: bc 7.14.11
[ 128.702173] begin fw dump (mark 0x3c65c8)
[ 128.702542] end of fw dump
[ 128.702557] bnx2x: [bnx2x_mc_assert:751(eno1)]Chip Revision: everest3, FW Version: 7_12_30
[ 128.702558] bnx2x: [bnx2x_panic_dump:1182(eno1)]end crash dump -----------------
[ 128.702565] bnx2x: [bnx2x_sp_rtnl_task:10229(eno1)]Indicating link is down due to Tx-timeout
[ 130.704628] bnx2x: [bnx2x_clean_tx_queue:1204(eno1)]timeout waiting for queue[0]: txdata->tx_pkt_prod(4) != txdata->tx_pkt_cons(3)
[ 132.706968] bnx2x: [bnx2x_clean_tx_queue:1204(eno1)]timeout waiting for queue[8]: txdata->tx_pkt_prod(445) != txdata->tx_pkt_cons(443)
[ 134.710090] bnx2x: [bnx2x_clean_tx_queue:1204(eno1)]timeout waiting for queue[16]: txdata->tx_pkt_prod(29) != txdata->tx_pkt_cons(25)
...
[ 202.648543] bnx2x: [bnx2x_clean_tx_queue:1204(eno1)]timeout waiting for queue[7]: txdata->tx_pkt_prod(25) != txdata->tx_pkt_cons(24)
[ 204.792441] bnx2x: [bnx2x_clean_tx_queue:1204(eno1)]timeout waiting for queue[23]: txdata->tx_pkt_prod(51) != txdata->tx_pkt_cons(46)
[ 204.940151] bnx2x: [bnx2x_del_all_macs:8499(eno1)]Failed to delete MACs: -5
[ 205.023453] bnx2x: [bnx2x_chip_cleanup:9319(eno1)]Failed to schedule DEL commands for UC MACs list: -5
[ 206.351810] bnx2x: [bnx2x_func_stop:9078(eno1)]FUNC_STOP ramrod failed. Running a dry transaction
[ 206.778590] bnx2x: [bnx2x_issue_dmae_with_comp:550(eno1)]DMAE timeout!
[ 206.856735] bnx2x: [bnx2x_write_dmae:598(eno1)]DMAE returned failure -1
[ 207.134674] bnx2x: [bnx2x_issue_dmae_with_comp:550(eno1)]DMAE timeout!
[ 207.212785] bnx2x: [bnx2x_write_dmae:598(eno1)]DMAE returned failure -1
[ 207.490725] bnx2x: [bnx2x_issue_dmae_with_comp:550(eno1)]DMAE timeout!
...

A somewhat similar error log is seen on a recent 5.2 kernel without
the fix (again with options to ignore hardware errors/faults).

Aug 19 17:15:15 HOSTNAME kernel: Uhhuh. NMI received for unknown reason 21 on CPU 0.
Aug 19 17:15:15 HOSTNAME kernel: perf interrupt took too long (3222 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
Aug 19 17:15:15 HOSTNAME kernel: TCP: request_sock_TCP: Possible SYN flooding on port 9300. Sending cookies. Check SNMP counters.
Aug 19 17:15:15 HOSTNAME kernel: Do you have a strange power saving mode enabled?
Aug 19 17:15:15 HOSTNAME kernel: Dazed and confused, but trying to continue
...
Aug 19 17:15:21 HOSTNAME kernel: NETDEV WATCHDOG: eno1 (bnx2x): transmit queue 0 timed out
...
Aug 19 17:15:21 HOSTNAME kernel: bnx2x: [bnx2x_sp_rtnl_task:10229(eno1)]Indicating link is down due to Tx-timeout
Aug 19 17:15:21 HOSTNAME kernel: bond0: link status down for interface eno1, disabling it in 200 ms
Aug 19 17:15:21 HOSTNAME kernel: bnx2x: [bnx2x_stats_comp:205(eno1)]timeout waiting for stats finished
Aug 19 17:15:21 HOSTNAME kernel: bnx2x: [bnx2x_stats_comp:205(eno1)]timeout waiting for stats finished
Aug 19 17:15:23 HOSTNAME kernel: bnx2x: [bnx2x_clean_tx_queue:1204(eno1)]timeout waiting for queue[0]: txdata->tx_pkt_prod(4) != txdata->tx_pkt_cons(2)
Aug 19 17:15:25 HOSTNAME kernel: bnx2x: [bnx2x_clean_tx_queue:1204(eno1)]timeout waiting for queue[8]: txdata->tx_pkt_prod(1) != txdata->tx_pkt_cons(0)
...
Aug 19 17:17:14 HOSTNAME kernel: bnx2x: [bnx2x_state_wait:310(eno1)]timeout waiting for state 0
Aug 19 17:17:14 HOSTNAME kernel: bnx2x: [bnx2x_del_all_macs:8499(eno1)]Failed to delete MACs: -16
Aug 19 17:17:14 HOSTNAME kernel: bnx2x: [bnx2x_chip_cleanup:9319(eno1)]Failed to schedule DEL commands for UC MACs list: -16
Aug 19 17:17:24 HOSTNAME kernel: bnx2x: [bnx2x_state_wait:310(eno1)]timeout waiting for state 9
Aug 19 17:17:34 HOSTNAME kernel: bnx2x: [bnx2x_state_wait:310(eno1)]timeout waiting for state 2
Aug 19 17:17:34 HOSTNAME kernel: bnx2x: [bnx2x_func_stop:9078(eno1)]FUNC_STOP ramrod failed. Running a dry transaction
Aug 19 17:17:35 HOSTNAME kernel: bnx2x: [bnx2x_issue_dmae_with_comp:550(eno1)]DMAE timeout!
Aug 19 17:17:35 HOSTNAME kernel: bnx2x: [bnx2x_write_dmae:598(eno1)]DMAE returned failure -1
Aug 19 17:17:35 HOSTNAME kernel: bnx2x: [bnx2x_issue_dmae_with_comp:550(eno1)]DMAE timeout!
...
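
For triage, the bnx2x messages in both logs above share the shape
'bnx2x: [function:line(interface)]message'. The sketch below is only
an illustration (the regular expression and the tally_functions()
helper are hypothetical names, not part of any existing tool); it
counts messages per driver function, which makes it easier to see
which code path (e.g., the DCBX stop/resume traffic path) first
reports the failure.

```python
import re
from collections import Counter

# Assumed message shape, inferred from the logs in this report:
#   bnx2x: [<function>:<source line>(<interface>)]<message>
BNX2X_LINE = re.compile(
    r"bnx2x: \[(?P<func>\w+):(?P<line>\d+)\((?P<iface>[^)]+)\)\](?P<msg>.*)"
)


def tally_functions(log_text):
    """Count bnx2x messages per reporting driver function."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BNX2X_LINE.search(line)
        if match:
            counts[match.group("func")] += 1
    return counts


# A few lines copied from the two logs above (dmesg and syslog prefixes).
sample = """\
[ 113.658876] bnx2x: [bnx2x_stats_comp:205(eno1)]timeout waiting for stats finished
[ 123.648066] bnx2x: [bnx2x_state_wait:310(eno1)]timeout waiting for state 6
[ 123.730345] bnx2x: [bnx2x_dcbx_stop_hw_tx:443(eno1)]Unable to hold traffic for HW configuration
[ 123.834443] bnx2x: [bnx2x_dcbx_stop_hw_tx:444(eno1)]driver assert
Aug 19 17:15:21 HOSTNAME kernel: bnx2x: [bnx2x_stats_comp:205(eno1)]timeout waiting for stats finished
Aug 19 17:15:21 HOSTNAME kernel: bond0: link status down for interface eno1, disabling it in 200 ms
"""

counts = tally_functions(sample)
for func, n in counts.most_common():
    print(func, n)
```

Lines that do not come from the driver (such as the bond0 message)
are skipped, so the same helper can be fed a full dmesg or syslog
capture.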


Analysis of an older crashdump confirmed that the bnx2x driver
state was in the traffic class setup / stop hardware step of
the LLDP path.

PID: 3936 TASK: ffff883fdc9b1c00 CPU: 11 COMMAND: "kworker/11:0"
 #0 [ffff883fec593ce0] __schedule at ffffffff81850bae
 #1 [ffff883fec593d30] schedule at ffffffff818510f5
 #2 [ffff883fec593d48] schedule_preempt_disabled at ffffffff8185139e
 #3 [ffff883fec593d58] __mutex_lock_slowpath at ffffffff81852fd9
 #4 [ffff883fec593db0] mutex_lock at ffffffff8185306f
 #5 [ffff883fec593dc8] rtnl_lock at ffffffff81756e15
 #6 [ffff883fec593dd8] bnx2x_sp_rtnl_task at ffffffffc025d8c4 [bnx2x]
 #7 [ffff883fec593e20] process_one_work at ffffffff8109e68b
 #8 [ffff883fec593e60] worker_thread at ffffffff8109e9fb
 #9 [ffff883fec593ec0] kthread at ffffffff810a4dc7
#10 [ffff883fec593f50] ret_from_fork at ffffffff81855735

Check this stack frame:

 #6 [ffff883fec593dd8] bnx2x_sp_rtnl_task at ffffffffc025d8c4 [bnx2x]

It is 9 x 8-byte/64-bit values long, since the next frame starts at:

 #7 [ffff883fec593e20]

ffff883fec593e20 - ffff883fec593dd8 = 0x48 bytes = 72 bytes = 9 x 8 bytes.
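
The same frame-size arithmetic can be double-checked with a short
Python sketch. The two addresses are the frame starts printed in the
backtrace above; the variable names are just illustrative.

```python
# Frame start addresses as printed by the backtrace above.
frame6_start = 0xffff883fec593dd8  # 6: bnx2x_sp_rtnl_task
frame7_start = 0xffff883fec593e20  # 7: process_one_work

WORD_SIZE = 8  # bytes per stack slot on x86_64

# Frame 6 spans from its own start up to the start of frame 7.
frame_bytes = frame7_start - frame6_start
frame_words = frame_bytes // WORD_SIZE

print(hex(frame_bytes), frame_bytes, frame_words)  # 0x48 72 9
```

The resulting word count (9) is the count passed to 'rd' to dump the
whole frame.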

crash> rd ffff883fec593dd8 9
ffff883fec593dd8: ffffffffc025d8c4 ffff883feaa0a178 ..%.....x...?...
ffff883fec593de8: 6199482b89f76272 ffff883fe9571080 rb..+H.a..W.?...
ffff883fec593df8: ffff883ffdf56b40 ffff883ffdf5b400 @k..?.......?...
ffff883fec593e08: 00000000000002c0 ffff881fe93f0dd8 ..........?.....
ffff883fec593e18: ffff883fec593e58 X>Y.?...

The top of the stack has the RIP/next-instruction contents,
which matches what's in the stack frame line.

ffffffffc025d8c4

Looking at the disassembly, it's right after the 'callq rtnl_lock', as expected.

 static void bnx2x_sp_rtnl_task(struct work_struct *work)
 {

  rdi = work

0xffffffffc025d890 <bnx2x_sp_rtnl_task>: nopl 0x0(%rax,%rax,1) [FTRACE NOP]
0xffffffffc025d895 <bnx2x_sp_rtnl_task+5>: push %rbp
0xffffffffc025d896 <bnx2x_sp_rtnl_task+6>: mov %rsp,%rbp
0xffffffffc025d899 <bnx2x_sp_rtnl_task+9>: push %r15
0xffffffffc025d89b <bnx2x_sp_rtnl_task+11>: push %r14
0xffffffffc025d89d <bnx2x_sp_rtnl_task+13>: push %r13
0xffffffffc025d89f <bnx2x_sp_rtnl_task+15>: push %r12

0xffffffffc025d8a1 <bnx2x_sp_rtnl_task+17>: lea -0x598(%rdi),%r12

^
  struct bnx2x *bp = container_of(work, struct bnx2x, sp_rtnl_task.work);

  r12 = bp

0xffffffffc025d8a8 <bnx2x_sp_rtnl_task+24>: push %rbx
0xffffffffc025d8a9 <bnx2x_sp_rtnl_task+25>: mov %rdi,%rbx

  rbx = rdi = work

<from the future.. stackframe from mutex_lock().. >

work = rbx = 0xffff881fe93f0dd8

 crash> struct work_struct ffff881fe93f0dd8
 struct work_struct {
   data = {
     counter = 704
   },
   entry = {
     next = 0xffff881fe93f0de0,
     prev = 0xffff881fe93f0de0
   },
   func = 0xffffffffc025d890 <bnx2x_sp_rtnl_task>
 }

bp = 0xffff881fe93f0840 (offset in asm above)

 crash> eval 0xffff881fe93f0dd8 - 0x598
 hexadecimal: ffff881fe93f0840
     decimal: 18446612269371426880 (-131804338124736)
       octal: 1777774201775117604100
      binary: 1111111111111111100010000001111111101001001111110000100001000000
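
The 'eval' above is the container_of() step done by hand: the address
of the embedded work_struct minus the member offset gives the address
of the enclosing struct bnx2x. A minimal Python sketch of the same
computation follows; the helper name is illustrative, and the 0x598
offset is taken from the 'lea -0x598(%rdi),%r12' instruction in this
particular build, so it should not be assumed for other kernels.

```python
MASK64 = (1 << 64) - 1  # keep results in the 64-bit address space


def container_of(member_addr, member_offset):
    """Address of the enclosing struct, given a member's address and offset."""
    return (member_addr - member_offset) & MASK64


work = 0xffff881fe93f0dd8            # struct work_struct *, from rbx
SP_RTNL_TASK_WORK_OFFSET = 0x598     # from the disassembly; build-specific

bp = container_of(work, SP_RTNL_TASK_WORK_OFFSET)
print(hex(bp))  # 0xffff881fe93f0840
```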

 crash> struct bnx2x ffff881fe93f0840
 struct bnx2x {
   fp = 0xffff881fe95c4000,
...


[X/B][PATCH] bnx2x: Disable multi-cos feature.
https://lists.ubuntu.com/archives/kernel-team/2019-August/103282.html

[D][PATCH] bnx2x: Disable multi-cos feature.
https://lists.ubuntu.com/archives/kernel-team/2019-August/103283.html

tags: added: sts
Changed in linux (Ubuntu Xenial):
importance: Undecided → High
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
Changed in linux (Ubuntu Xenial):
importance: High → Critical
Changed in linux (Ubuntu Bionic):
importance: High → Critical
Changed in linux (Ubuntu Disco):
importance: Undecided → Critical
Changed in linux (Ubuntu Eoan):
importance: Undecided → Critical

Marking status on B/X/D as Incomplete.

(email below sent to kernel-team mailing list
as replies to both patch series above).

Please hold / don't apply this patch for now.

The reporter hit an apparently unrelated Oops in 3 of 40 nodes,
and it hasn't yet been possible to determine whether this patch
is at all related or at fault, because timing/deployment matters
prevent a methodical approach of reverting to an original kernel.

Since the patch is recent even in the mainline kernel, holding
it for a while seemed the most prudent action for the LTSes,
and thus the patch that would be required on Disco too is
dropped as well.

We'll be following up on this as possible on the reporter's end.

Changed in linux (Ubuntu Xenial):
status: In Progress → Incomplete
Changed in linux (Ubuntu Bionic):
status: In Progress → Incomplete
Changed in linux (Ubuntu Disco):
status: In Progress → Incomplete