[Jammy, mlx5, ConnectX-7] add CX7 support for software steering

Bug #1966194 reported by Amir Tzin
This bug affects 1 person
Affects           Status         Importance   Assigned to   Milestone
linux (Ubuntu)    Fix Released   Undecided    Unassigned
  Jammy           Fix Released   Medium       Unassigned

Bug Description

[Impact]
Add support for software steering on cx7

[Test Case]
Configure software steering on a CX7 setup.
Run the ASAP tests.
(reference https://github.com/Mellanox/ovs-tests)
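
The test case does not spell out how software steering is configured. A minimal sketch of one way to do it, assuming the mlx5 `flow_steering_mode` devlink parameter that appears in the devlink output quoted in this report. The function name `set_steering_mode` and the `DEVLINK` override variable are hypothetical helpers for illustration, and the PCI address is a placeholder:

```shell
# Hypothetical helper: switch an mlx5 device between software-managed (smfs)
# and device-managed (dmfs) flow steering through devlink.
# DEVLINK may be overridden (for example DEVLINK=echo) for a dry run.
set_steering_mode() {
    dev="$1"    # devlink handle, e.g. pci/0000:08:00.0 (placeholder)
    mode="$2"   # smfs or dmfs

    case "$mode" in
        smfs|dmfs) ;;
        *) echo "unsupported steering mode: $mode" >&2; return 1 ;;
    esac

    ${DEVLINK:-devlink} dev param set "$dev" \
        name flow_steering_mode value "$mode" cmode runtime
}
```

Running `DEVLINK=echo set_steering_mode pci/0000:08:00.0 smfs` prints the devlink arguments instead of applying them, which is a convenient way to review the command before running it as root.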

[Regression Potential]
TBD

[Other Info]

Feature patchset:
All patches apply cleanly on Jammy master-next except these two,
which have minor conflicts due to context differences.
net/mlx5: Introduce software defined steering capabilities
net/mlx5: DR, Add support for matching on

#fixes
624bf42c2e39 net/mlx5: DR, Fix querying eswitch manager vport for ECPF
0aec12d97b20 net/mlx5: DR, Fix slab-out-of-bounds in mlx5_cmd_dr_create_fte
9091b821aaa4 net/mlx5: DR, Handle eswitch manager and uplink vports separately

#CX7 SMFS support
6862c787c7e8 net/mlx5: DR, Add support for ConnectX-7 steering
638a07f1090e net/mlx5: DR, Refactor ste_ctx handling for STE v0/1
75a3926ca6a4 net/mlx5: DR, Rename action modify fields to reflect naming in HW spec
bdc3ab5795a6 net/mlx5: DR, Fix handling of different actions on the same STE in STEv1
11659ef8d28e net/mlx5: DR, Remove unneeded comments
5c422bfad2fb net/mlx5: DR, Add support for matching on Internet Header Length (IHL)

#dependencies:
60dc0ef674ec net/mlx5: VLAN push on RX, pop on TX
8348b71ccd92 net/mlx5: Introduce software defined steering capabilities

#dependencies:
#SW STEERING DEBUG DUMP
aa36c94853b2 net/mlx5: Set SMFS as a default steering mode if device supports it
4ff725e1d4ad net/mlx5: DR, Ignore modify TTL if device doesn't support it
cc2295cd54e4 net/mlx5: DR, Improve steering for empty or RX/TX-only matchers
f59464e257bd net/mlx5: DR, Add support for matching on geneve_tlv_option_0_exist field
09753babaf46 net/mlx5: DR, Support matching on tunnel headers 0 and 1
8c2b4fee9c4b net/mlx5: DR, Add misc5 to match_param structs
0f2a6c3b9219 net/mlx5: Add misc5 flow table match parameters
b54128275ef8 net/mlx5: DR, Warn on failure to destroy objects due to refcount
e3a0f40b2f90 net/mlx5: DR, Add support for UPLINK destination type
9222f0b27da2 net/mlx5: DR, Add support for dumping steering info
7766c9b922fe net/mlx5: DR, Add missing reserved fields to dr_match_param
89cdba3224f0 net/mlx5: DR, Add check for flex parser ID value
08fac109f7bb net/mlx5: DR, Rename list field in matcher struct to list_node
32e9bd585307 net/mlx5: DR, Remove unused struct member in matcher
c3fb0e280b4c net/mlx5: DR, Fix lower case macro prefix "mlx5_" to "MLX5_"
84dfac39c61f net/mlx5: DR, Fix error flow in creating matcher

58a606dba708 net/mlx5: Introduce new uplink destination type

455832d49666 net/mlx5: DR, Fix check for unsupported fields in match param
9091b821aaa4 net/mlx5: DR, Handle eswitch manager and uplink vports separately

#SW STEERING SF
515ce2ffa621 net/mlx5: DR, init_next_match only if needed
5dde00a73048 net/mlx5: DR, Fix typo 'offeset' to 'offset'
1ffd498901c1 net/mlx5: DR, Increase supported num of actions to 32
11a45def2e19 net/mlx5: DR, Add support for SF vports
c0e90fc2ccaa net/mlx5: DR, Support csum recalculation flow table on SFs
ee1887fb7cdd net/mlx5: DR, Align error messages for failure to obtain vport caps
dd4acb2a0954 net/mlx5: DR, Add missing query for vport 0
7ae8ac9a5820 net/mlx5: DR, Replace local WIRE_PORT macro with the existing MLX5_VPORT_UPLINK
f9f93bd55ca6 net/mlx5: DR, Fix vport number data type to u16
c228dce26222 net/mlx5: DR, Fix code indentation in dr_ste_v1

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1966194

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Jeff Lane  (bladernr)
tags: added: soa
Revision history for this message
dann frazier (dannf) wrote :

@Amir: After preparing a backport of these changes to jammy, I've a few comments/questions:

I dropped the following patches because they are listed as dependencies, but I didn't have any problem cherry-picking/building without them:
 c228dce262225 net/mlx5: DR, Fix code indentation in dr_ste_v1
  net/mlx5: VLAN push on RX, pop on TX

Also, I see that 9/10 of Yevgeny's patches here are included:
  https://www.spinics.net/lists/netdev/msg770514.html
But one is missing. Can you clarify why the following patch is omitted?
 98576013bf283 net/mlx5: DR, Add missing string for action type SAMPLER

Finally, apologies, but since we last spoke I learned that we added a new release milestone with 21.10 that I was unaware of. There is now a kernel feature freeze milestone:
  https://wiki.ubuntu.com/KernelFeatureFreeze
And for 22.04, this milestone occurred last week. I'm looking to see if there's an exception process for this.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Amir Tzin (amirtz) wrote :

Hi Dan,
98576013bf283 net/mlx5: DR, Add missing string for action type SAMPLER
was omitted by mistake, sorry for that.

Revision history for this message
dann frazier (dannf) wrote :

Amir, I've prepared a test build with that additional commit added. Can you run it through your test suite and report back? The kernel team will take that testing into account to determine whether or not a kernel feature freeze exception can be granted.

sudo apt-add-repository ppa:dannf/lp1966194
sudo apt install linux-modules-extra-5.15.0-23.23-generic

Revision history for this message
dann frazier (dannf) wrote :

I've refreshed the above ppa to include the guys for bug 1967754, can you verify this build?

sudo apt-add-repository ppa:dannf/lp1966194
sudo apt install linux-modules-extra-5.15.0-25.25-generic

Revision history for this message
Amir Tzin (amirtz) wrote :

Hi,
I ran some tests to verify the feature. SMFS was enabled on the CX7 setup:

# devlink dev param show pci/0000:08:00.0 name flow_steering_mode
pci/0000:08:00.0:
  name flow_steering_mode type driver-specific
    values:
      cmode runtime value smfs
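
The same check can be scripted. A minimal sketch, assuming the `devlink dev param show` output format quoted above; the function name `steering_mode_from_output` is hypothetical:

```shell
# Hypothetical helper: read `devlink dev param show ... name flow_steering_mode`
# output on stdin and print the runtime value (for example "smfs").
steering_mode_from_output() {
    awk '$1 == "cmode" && $2 == "runtime" && $3 == "value" { print $4; exit }'
}
```

Typical use would be `devlink dev param show pci/0000:08:00.0 name flow_steering_mode | steering_mode_from_output`, comparing the result against `smfs` before starting a test run.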

However, some of the ASAP tests failed (see below). It would be great to have a test kernel without the feature
and without the degradation from bug 1967754, to make sure those failures are not caused by this ticket's patch set (testing with CX6-Dx).

test-ct-tcp.sh TEST PASSED
test-eswitch-add-in-mode1-del-in-mode2.sh TEST PASSED
test-eswitch-netdev-tx.sh TEST PASSED
test-eswitch-reload-modules-different-state.sh TEST PASSED
test-eswitch-set-vf-vlan.sh TEST PASSED
test-mod-depends.sh TEST PASSED
test-ovs-ct-scapy-udp-nat-dnat.sh TEST PASSED
test-ovs-ct-vxlan.sh TEST PASSED
test-ovs-sf-tcp.sh TEST PASSED
test-tc-groups-multi-fgs.sh TEST PASSED
test-tc-groups-overlapping.sh TEST PASSED
test-tc-hairpin-rules.sh TEST PASSED
test-tc-icmp-4-channels.sh TEST PASSED
test-tc-insert-rules.sh FAILED
test-tc-insert-rules-geneve.sh TEST PASSED
test-tc-insert-rules-goto.sh TEST PASSED
test-tc-insert-rules-goto2.sh TEST PASSED
test-tc-insert-rules-mirror.sh TEST PASSED
test-tc-insert-rules-pedit.sh TEST PASSED
test-tc-merged-esw-vf-vf.sh TEST PASSED
test-tc-vf-mirror.sh TEST PASSED
test-tc-vxlan-decap-inner-match-drop.sh TEST PASSED
test-vf-lag.sh TEST PASSED
test-vf-vf-ping.sh TEST PASSED
test-vxlan-neigh-update.sh FAILED
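
A list like the one above is easier to compare between kernels once it is tallied. A minimal sketch, assuming the results are saved one per line in the format shown (`<script> TEST PASSED` or `<script> FAILED`); the function name `summarize_results` is hypothetical and not part of ovs-tests:

```shell
# Hypothetical helper: read a results list on stdin, print pass/fail counts,
# then list the failing scripts, one per line.
summarize_results() {
    awk '
        $NF == "FAILED" { failed++; names = names "  " $1 "\n"; next }
        $NF == "PASSED" { passed++ }
        END {
            printf "passed=%d failed=%d\n", passed, failed
            printf "%s", names
        }
    '
}
```

With the 25 results listed in this comment saved to a file, `summarize_results < results.txt` would count 23 passes and 2 failures (test-tc-insert-rules.sh and test-vxlan-neigh-update.sh).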

Revision history for this message
dann frazier (dannf) wrote :

I've now updated the PPA w/ a kernel that reverts the patch that caused bug 1967754. Same package name, just version 5.15.0-25.25.25+lp1966194.2 now.

Revision history for this message
Amir Tzin (amirtz) wrote :

I applied the same test plan on 5.15.0-25-generic, with and without the feature patch set, on CX6-Dx.
In both cases the test kernel had the problematic patch from bug https://bugs.launchpad.net/bugs/1967754 reverted.
I got exactly the same results as in comment #6, so the patch set does not introduce a degradation.

Revision history for this message
dann frazier (dannf) wrote :

Thanks Amir. If the test results are the same w/ and w/o these patches, then I assume those tests are regression tests. Are you able to also confirm that the new functionality (CX7 software steering support) is working as expected?

Revision history for this message
Amir Tzin (amirtz) wrote :

Hi Dan,

Those tests are ASAP tests, which use steering as their infrastructure. When configured to run with software steering, this test set therefore verifies the functionality.

dann frazier (dannf)
Changed in linux (Ubuntu):
status: Confirmed → Fix Committed
Stefan Bader (smb)
Changed in linux (Ubuntu Jammy):
importance: Undecided → Medium
status: New → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.15.0-28.29 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-jammy
Revision history for this message
Amir Tzin (amirtz) wrote :

Hi,

We tested the feature with 5.15.0-28 from proposed.
The feature is working and SMFS mode is enabled for ConnectX-7.
No degradation was detected when comparing 5.15.0-28 from proposed with 5.15.0-28 compiled without the feature patch set.

tags: added: verification-done-jammy
removed: verification-needed-jammy
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-hwe-5.15/5.15.0-32.33~20.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Revision history for this message
dann frazier (dannf) wrote :

@kernel when we've already tested a kernel in its GA release, do we also now need to retest the HWE backport?

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.15.0-35.36

---------------
linux (5.15.0-35.36) jammy; urgency=medium

  * CVE-2022-21499
    - SAUCE: debug: Lock down kgdb

linux (5.15.0-34.35) jammy; urgency=medium

  * jammy/linux: 5.15.0-34.35 -proposed tracker (LP: #1974322)

  * AMD APU s2idle is broken after the ASIC reset fix (LP: #1972134)
    - drm/amdgpu: unify BO evicting method in amdgpu_ttm
    - drm/amdgpu: explicitly check for s0ix when evicting resources

  * amd_gpio AMDI0030:00: Failed to translate GPIO pin 0x0000 to IRQ, err -517
    (LP: #1971597)
    - gpio: Request interrupts after IRQ is initialized

  * config CONFIG_HISI_PMU for kunpeng920 (LP: #1956086)
    - [Config] CONFIG_HISI_PMU=m

  * Mute/mic LEDs no function on EliteBook G9 platfroms (LP: #1970552)
    - ALSA: hda/realtek: Enable mute/micmute LEDs support for HP Laptops

  * network-manager/1.36.4-2ubuntu1 ADT test failure with linux/5.15.0-28.29
    (LP: #1971418)
    - Revert "rfkill: make new event layout opt-in"

  * PCIE LnkCtl ASPM not enabled under VMD mode for Alder Lake platforms
    (LP: #1942160)
    - SAUCE: vmd: fixup bridge ASPM by driver name instead

  * Mute/mic LEDs no function on HP EliteBook 845/865 G9 (LP: #1970178)
    - ALSA: hda/realtek: Enable mute/micmute LEDs and limit mic boost on EliteBook
      845/865 G9

  * Enable headset mic on Lenovo P360 (LP: #1967069)
    - ALSA: hda/realtek: Enable headset mic on Lenovo P360

  * WCN6856 BT keep in OFF state after coldboot system (LP: #1967067)
    - Bluetooth: btusb: Improve stability for QCA devices

  * Screen sometimes can't update [Failed to post KMS update: CRTC property
    (GAMMA_LUT) not found] (LP: #1967274)
    - drm/i915/xelpd: Enable Pipe color support for D13 platform
    - drm/i915: Use unlocked register accesses for LUT loads
    - drm/i915/xelpd: Enable Pipe Degamma
    - drm/i915/xelpd: Add Pipe Color Lut caps to platform config

  * Jammy update: v5.15.35 upstream stable release (LP: #1969857)
    - drm/amd/display: Add pstate verification and recovery for DCN31
    - drm/amd/display: Fix p-state allow debug index on dcn31
    - hamradio: defer 6pack kfree after unregister_netdev
    - hamradio: remove needs_free_netdev to avoid UAF
    - cpuidle: PSCI: Move the `has_lpi` check to the beginning of the function
    - ACPI: processor idle: Check for architectural support for LPI
    - ACPI: processor: idle: fix lockup regression on 32-bit ThinkPad T40
    - btrfs: remove unused parameter nr_pages in add_ra_bio_pages()
    - btrfs: remove no longer used counter when reading data page
    - btrfs: remove unused variable in btrfs_{start,write}_dirty_block_groups()
    - soc: qcom: aoss: Expose send for generic usecase
    - dt-bindings: net: qcom,ipa: add optional qcom,qmp property
    - net: ipa: request IPA register values be retained
    - btrfs: release correct delalloc amount in direct IO write path
    - ALSA: core: Add snd_card_free_on_error() helper
    - ALSA: sis7019: Fix the missing error handling
    - ALSA: ali5451: Fix the missing snd_card_free() call at probe error
    - ALSA: als300: Fix the missing snd_card_free() call at probe error
    - ALSA: als4000: Fix ...

Changed in linux (Ubuntu Jammy):
status: Fix Committed → Fix Released
dann frazier (dannf)
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released