Comment 36 for bug 1853638

Nivedita Singhvi (niveditasinghvi) wrote :

We are closing this LP bug for now because we cannot reproduce the
problem in-house, and we cannot get access to a live environment
that reproduces it at this time.

Here is what we know:

- Some tests appear to perform differently when the NIC is
  configured in active-backup bonding mode, depending on whether
  the active interface is the primary port or the secondary port,
  i.e.:

Primary port: enp94s0f0 // when this is the active, works fine
Secondary port: enp94s0f1d1 // when this is the active, more drops
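
To tell which of the two cases above a machine is in, the active port can be read from the bonding driver's procfs status file. This is a minimal sketch: the bond name bond0 is an assumption (the bond device is not named in this report), and active_slave is a hypothetical helper name.

```shell
# Print the currently active slave of a bond from its procfs status file.
# Usage: active_slave [status_file]
# The default path assumes the bond device is called bond0 (an assumption).
active_slave() {
    awk -F': ' '/^Currently Active Slave:/ { print $2 }' \
        "${1:-/proc/net/bonding/bond0}"
}

# Example (run as root on the affected host; bond0 is assumed):
#   active_slave
#   ip link set dev bond0 type bond active_slave enp94s0f1d1   # force the secondary port
#   active_slave
```

Forcing each port in turn and re-running the same test on one machine would separate a per-port effect from a per-machine effect.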

- Switch info: 2 x Fortigate 1024D switches, each machine is connected
  to both

- NIC info:

root@u072:~# lspci | grep BCM57416
01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller (rev 01)

# ethtool -i enp1s0f0np0
driver: bnxt_en
version: 1.10.0
firmware-version: 214.0.253.1/pkg 21.40.25.31

- Our attempt at a reproducer (the issue was initially reported in a
  production env via graphical monitoring):

mtr --no-dns --report --report-cycles 60 --udp -s 1428 $DEST
good system = ~ 0% drops
bad systems = ~ 8% drops
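
To compare runs across machines or across active ports, the loss figure for the final hop can be pulled out of the mtr report. This is a rough sketch that assumes the usual --report layout, where each hop line starts with something like "1.|--" followed by the host and then the Loss% column; final_hop_loss is a hypothetical helper name.

```shell
# Read an mtr --report on stdin and print the Loss% of the last hop,
# without the trailing percent sign.
# Assumes hop lines look like: "  2.|-- 10.0.0.2   8.3%   60   0.4 ..."
final_hop_loss() {
    awk '$1 ~ /^[0-9]+\.\|--$/ { loss = $3 }
         END { sub(/%$/, "", loss); print loss }'
}

# Example:
#   mtr --no-dns --report --report-cycles 60 --udp -s 1428 "$DEST" | final_hop_loss
```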

We are not seeing drops in the NIC stats or in the kernel UDP
counters, so it is not clear where the packets are being dropped.
They may be dropped silently somewhere (?), or the mtr result may
be a red herring caused by the test itself, in which case what is
seen in production is something else.
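
For anyone retrying this, one way to check the kernel UDP counters is to snapshot them from /proc/net/snmp before and after an mtr run and compare. This is only a sketch of that check, not necessarily how the counters were inspected here; udp_counters is a hypothetical helper name.

```shell
# Print the kernel UDP counters as name=value lines.
# Usage: udp_counters [snmp_file]   (defaults to /proc/net/snmp)
# The file holds two "Udp:" lines: the first lists the counter names,
# the second lists the matching values.
udp_counters() {
    awk '$1 == "Udp:" {
        if (!seen) { for (i = 2; i <= NF; i++) name[i] = $i; seen = 1 }
        else       { for (i = 2; i <= NF; i++) print name[i] "=" $i }
    }' "${1:-/proc/net/snmp}"
}

# Example: snapshot, run the test, snapshot again, then diff.
#   udp_counters > before.txt
#   mtr --no-dns --report --report-cycles 60 --udp -s 1428 "$DEST"
#   udp_counters > after.txt
#   diff before.txt after.txt
# NIC-level counters can be compared the same way with: ethtool -S <interface>
```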

If someone can reproduce this, or something similar, or if we manage
to, we will re-open this bug or file a new one.