veth.sh from ubuntu_kselftests_net failed on J-5.15 (with xdp attached - gro flag)

Bug #2065369 reported by Po-Hsu Lin
Affects: ubuntu-kernel-tests
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Issue found with Jammy 5.15.0-111.121 in sru-20240429

The reproduction rate is 100% across different architectures on the OpenStack cloud.

Test log:
ubuntu@kt-j-l-gen-5-15-bc2r4d20-u-kselftests-net-amd64:~/autotest/client/tmp/ubuntu_kselftests_net/src/linux/tools/testing/selftests/net$ sudo ./veth.sh
default - gro flag ok
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation ok
        - aggregation with TSO off ok
with gro on - gro flag ok
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation with TSO off ok
default channels ok
with gro enabled on link down - gro flag ok
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation with TSO off ok
setting tx channels ok
bad setting: combined channels ok
setting invalid channels nr ok
bad setting: XDP with RX nr less than TX ok
bad setting: reducing RX nr below peer TX with XDP set ok
with xdp attached - gro flag fail - expected on found off
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
        - aggregation fail - got 10 packets, expected 1
        - after dev off, flag fail - expected on found off
        - peer flag ok
        - after gro on xdp off, gro flag ok
        - peer gro flag ok
        - tso flag ok
        - peer tso flag ok
decreasing tx channels with device down ok
        - aggregation ok
increasing tx channels with device down ok
aggregation again with default and TSO off ok

This failure is different from our known issue with this test (LP: #1949569, with gro on / aggregation with TSO off), and we did not see this failure on the OpenStack cloud in previous cycles.

I have also verified the following combinations of running kernel and test source code:
* 105 kernel + 106 source code - GOOD
* 106 kernel + 106 source code - GOOD
* 111 kernel + 106 source code - BAD
* 111 kernel + 111 source code - BAD
* 106 kernel + 111 source code - GOOD

Since the result follows the kernel version rather than the test source version, this looks like a regression in the kernel to me.

Po-Hsu Lin (cypressyew)
description: updated
Roxana Nicolescu (roxanan) wrote :

This happens in noble too.
Commit "net: veth: do not manipulate GRO when using XDP" introduces the problem. Reverting it makes the test pass again.
I haven't seen anything upstream related to this. I'll first work out what the commit does and then talk to upstream about a fix.

tags: added: 6.8 noble
Roxana Nicolescu (roxanan) wrote :

I think the test should be updated. GRO is no longer enabled by default when XDP is attached, so I believe it must be requested explicitly from userspace.

Roxana Nicolescu (roxanan) wrote :

Adding
ip netns exec $NS_DST ethtool -K veth$DST gro on

just before the test makes it pass. I'll send this upstream tomorrow morning.
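
For reference, a minimal sketch of how the workaround above could be wrapped and checked by hand. The helper names gro_state and enable_and_check_gro are hypothetical and do not come from veth.sh; the sketch assumes that `ethtool -k` prints a "generic-receive-offload:" line, and that the namespace and device already exist.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not taken from veth.sh.

# Print the state ("on" or "off") of generic-receive-offload, reading
# `ethtool -k` output from stdin. Only the first matching line is used.
gro_state() {
	awk '$1 == "generic-receive-offload:" { print $2; exit }'
}

# Request GRO explicitly on a device inside a network namespace, then
# verify that the flag really reads "on". Needs root, ip(8) and ethtool(8).
#   $1: namespace name
#   $2: device name
enable_and_check_gro() {
	ns=$1
	dev=$2
	ip netns exec "$ns" ethtool -K "$dev" gro on || return 1
	state=$(ip netns exec "$ns" ethtool -k "$dev" | gro_state)
	[ "$state" = "on" ]
}
```

With the variables used by the test, this would be called as `enable_and_check_gro "$NS_DST" "veth$DST"` and returns non-zero if the flag is still off afterwards.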
