Comment 10 for bug 1814095

Nivedita Singhvi (niveditasinghvi) wrote :

I am not sure we can provoke the issue deterministically.
At the very least, to ensure no other regression was
introduced, I would run it under heavy network load.

The environment in which the issue was seen had
network load, contention for CPUs, and several other
issues occurring.

The basic environment is:

1. For any 25Gb NIC/chipset that requires the 4.4 bnxt_en_bpo
   driver, set its 2 ports/interfaces up in bonding mode
   as follows:

bond-lacp-rate fast
bond-master bond0
bond-miimon 100
bond-mode 802.3ad
bond-xmit-hash-policy layer3+4
mtu 9000

2. Run any heavy TCP network load test over the systems
   (e.g. iperf, netperf, file transfer, etc.)

3. In theory, a lower number of tx ring descriptors should
   make it more likely to hit this (not successfully proven
   by testing here), so you can lower it and see if that
   helps:

   # ethtool -G eno49 tx 128   # for example
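
The three steps above could be strung together roughly as in the
sketch below. It is only an illustration, not a confirmed reproducer:
the peer address, stream count, and test duration are assumptions,
eno49 and bond0 are taken from the example above, and the script only
prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical repro helper. PEER, the stream count and the duration are
# assumptions; IFACE and bond0 come from the example in this comment.
IFACE="${IFACE:-eno49}"        # one slave interface of bond0
PEER="${PEER:-192.0.2.10}"     # placeholder address of an iperf server
TX_RING="${TX_RING:-128}"      # reduced number of tx ring descriptors
DRY_RUN="${DRY_RUN:-1}"        # 1 = print commands only, 0 = execute them

run() {
    # Always show the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

repro_steps() {
    run ethtool -g "$IFACE"                 # show current ring sizes
    run ethtool -G "$IFACE" tx "$TX_RING"   # shrink the tx ring
    run cat /proc/net/bonding/bond0         # check the 802.3ad bond state
    run iperf -c "$PEER" -P 8 -t 600        # 8 parallel TCP streams, 10 minutes
}

repro_steps
```

Run it as root with DRY_RUN=0 once the printed commands look right
for the system under test, and repeat the ethtool -G step for the
second slave interface if desired.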

I am not sure if that helps, Scott. I'll try to come
up with more specific steps, but I cannot guarantee you
will see the issue.