veth.sh from ubuntu_kselftests_net failed on J-5.15 (with xdp attached - gro flag)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
ubuntu-kernel-tests | New | Undecided | Unassigned |
Bug Description
Issue found with Jammy 5.15.0-111.121 in sru-20240429.
The reproduction rate is 100% across different architectures on the OpenStack cloud.
Test log:
ubuntu@
default - gro flag ok
- peer gro flag ok
- tso flag ok
- peer tso flag ok
- aggregation ok
- aggregation with TSO off ok
with gro on - gro flag ok
- peer gro flag ok
- tso flag ok
- peer tso flag ok
- aggregation with TSO off ok
default channels ok
with gro enabled on link down - gro flag ok
- peer gro flag ok
- tso flag ok
- peer tso flag ok
- aggregation with TSO off ok
setting tx channels ok
bad setting: combined channels ok
setting invalid channels nr ok
bad setting: XDP with RX nr less than TX ok
bad setting: reducing RX nr below peer TX with XDP set ok
with xdp attached - gro flag fail - expected on found off
- peer gro flag ok
- tso flag ok
- peer tso flag ok
- aggregation fail - got 10 packets, expected 1
- after dev off, flag fail - expected on found off
- peer flag ok
- after gro on xdp off, gro flag ok
- peer gro flag ok
- tso flag ok
- peer tso flag ok
decreasing tx channels with device down ok
- aggregation ok
increasing tx channels with device down ok
aggregation again with default and TSO off ok
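To make the failing checks easier to spot, here is a minimal sketch that filters a log like the one above for failures. It assumes every failing line has the form `<check> fail - <reason>`, which matches the lines in this log but is not taken from veth.sh itself. The names `FAIL_RE` and `failed_checks` are hypothetical, and `LOG` holds only a few lines copied from the log.

```python
import re

# A few lines copied from the test log above (not the full log).
LOG = """\
with xdp attached - gro flag fail - expected on found off
- peer gro flag ok
- tso flag ok
- peer tso flag ok
- aggregation fail - got 10 packets, expected 1
- after dev off, flag fail - expected on found off
- peer flag ok
"""

# Assumed line shape: a check name followed by "fail - <reason>".
FAIL_RE = re.compile(r"^(?P<check>.*?)\s+fail\s+-\s+(?P<reason>.*)$")


def failed_checks(log_text):
    """Return (check, reason) pairs for every line reporting a failure."""
    failures = []
    for line in log_text.splitlines():
        match = FAIL_RE.match(line.strip())
        if match:
            # Drop the leading "- " that marks a sub-check in the log.
            check = match.group("check").lstrip("- ").strip()
            failures.append((check, match.group("reason")))
    return failures


for check, reason in failed_checks(LOG):
    print(f"{check}: {reason}")
```

All three failures fall in the "with xdp attached" group: the GRO flag on the device, the aggregation check, and the flag after the device is brought down.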
This failure is different from our known issue with this test (LP: #1949569, with gro on / aggregation with TSO off), and we did not see this failure on the OpenStack cloud in previous cycles.
I have also verified the following combinations:
* 105 kernel + 106 source code - GOOD
* 106 kernel + 106 source code - GOOD
* 111 kernel + 106 source code - BAD
* 111 kernel + 111 source code - BAD
* 106 kernel + 111 source code - GOOD
Because the result follows the kernel version rather than the source code version, this appears to be a regression in the kernel.
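The reasoning behind that conclusion can be checked mechanically. The sketch below takes the five combinations listed above and tests whether each factor, taken alone, maps consistently to a single result. The version labels are copied as written in the list. Treating "source code" as the second factor is an assumption about what the list means, and `determines_result` is a hypothetical helper.

```python
# (kernel, source code, result) rows copied from the list above.
RESULTS = [
    ("105", "106", "GOOD"),
    ("106", "106", "GOOD"),
    ("111", "106", "BAD"),
    ("111", "111", "BAD"),
    ("106", "111", "GOOD"),
]


def determines_result(rows, column):
    """True if every value in `column` always maps to the same result."""
    seen = {}
    for row in rows:
        key, result = row[column], row[2]
        # setdefault stores the first result seen for this key and
        # returns the stored one on later rows, so a mismatch means
        # the factor alone cannot explain the outcome.
        if seen.setdefault(key, result) != result:
            return False
    return True


print("kernel alone explains the result:", determines_result(RESULTS, 0))
print("source code alone explains the result:", determines_result(RESULTS, 1))
```

The 106 source code appears with both GOOD and BAD results, while each kernel version always gives the same result. This is consistent with a change in the kernel rather than in the test.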
This happens in noble too.
Commit "net: veth: do not manipulate GRO when using XDP" introduces the problem. Reverting it makes the test pass again.
I haven't seen anything upstream related to this. I'll first work out what the commit does and then raise it with upstream to get it fixed.