[22.04 FEAT] SMC-R v2 Support

Bug #1929035 reported by bugproxy
14
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu on IBM z Systems
Fix Released
High
Skipper Bug Screeners
linux (Ubuntu)
Fix Released
Undecided
Canonical Kernel Team

Bug Description

Feature Description:
SMC-R is currently limited to traffic within a single IP subnet only. SMC-R v2 will lift this restriction.

Business Case:
Allow SMC to be applicable in more diverse customer environments, prereq for container support

Feature will be included within kernel >=5.14.

Backport info will be provided, once available

bugproxy (bugproxy)
tags: added: architecture-s39064 bugnameltc-192590 severity-high targetmilestone-inin2110
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
importance: Undecided → High
Changed in linux (Ubuntu):
status: New → Incomplete
Changed in ubuntu-z-systems:
status: New → Incomplete
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2021-08-04 11:23 EDT-------
Feature will not make it in time for Impish/21.10, therefore moving to U22.04

tags: added: targetmilestone-inin2204
removed: targetmilestone-inin2110
Frank Heimes (fheimes)
summary: - [21.10 FEAT] SMC-R v2 Support
+ [22.04 FEAT] SMC-R v2 Support
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2022-03-15 18:28 EDT-------
Accepted in kernel 5.16:

# ed990df29f5b ("net/smc: save stack space and allocate smc_init_info")
# 42042dbbc2eb ("net/smc: prepare for SMC-Rv2 connection")
# e5c4744cfb59 ("net/smc: add SMC-Rv2 connection establishment")
# e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
# 8ade200c269f ("net/smc: add v2 format of CLC decline message")
# 24fb68111d45 ("net/smc: retrieve v2 gid from IB device")
# 8799e310fb3f ("net/smc: add v2 support to the work request layer")
# b4ba4652b3f8 ("net/smc: extend LLC layer for SMC-Rv2")
# b0539f5eddc2 ("net/smc: add netlink support for SMC-Rv2")
# 29397e34c76b ("net/smc: stop links when their GID is removed")

Revision history for this message
Frank Heimes (fheimes) wrote :
Download full text (5.5 KiB)

Thx for the list of commits.

Looks like LP#1929060 "[22.04 FEAT] smc: Add User-defined EID (Enterprise ID) Support - kernel" is a prereq - if not worked based on that, I run into EID issues already starting with 42042dbbc2eb.

Applying the patches based on LP#1929060 I get further down, but still run into conflicts, this time with 8799e310fb3f ("net/smc: add v2 support to the work request layer").

$ git cherry-pick -s -e -x 8799e310fb3f
Auto-merging net/smc/smc_wr.h
Auto-merging net/smc/smc_wr.c
CONFLICT (content): Merge conflict in net/smc/smc_wr.c
Auto-merging net/smc/smc_ib.c
Auto-merging net/smc/smc_core.h
Auto-merging net/smc/smc_core.c
error: could not apply 8799e310fb3f... net/smc: add v2 support to the work request layer
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'
fheimes@T570:~/ubuntu-jammy-master-next/jammy-lp1929035$ git status
On branch master-next
Your branch is ahead of 'origin/master-next' by 9 commits.
  (use "git push" to publish your local commits)

You are currently cherry-picking commit 8799e310fb3f.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
 modified: net/smc/smc_core.c
 modified: net/smc/smc_core.h
 modified: net/smc/smc_ib.c
 modified: net/smc/smc_wr.h

Unmerged paths:
  (use "git add <file>..." to mark resolution)
 both modified: net/smc/smc_wr.c
$

$ git diff
diff --cc net/smc/smc_wr.c
index 59ca1a2d5c65,22d7324969cd..000000000000
--- a/net/smc/smc_wr.c
+++ b/net/smc/smc_wr.c
@@@ -96,20 -101,47 +96,59 @@@ static inline void smc_wr_tx_process_cq
        }

        pnd_snd_idx = smc_wr_tx_find_pending_index(link, wc->wr_id);
- if (pnd_snd_idx == link->wr_tx_cnt)
- return;
- link->wr_tx_pends[pnd_snd_idx].wc_status = wc->status;
- if (link->wr_tx_pends[pnd_snd_idx].compl_requested)
- complete(&link->wr_tx_compl[pnd_snd_idx]);
- memcpy(&pnd_snd, &link->wr_tx_pends[pnd_snd_idx], sizeof(pnd_snd));
- /* clear the full struct smc_wr_tx_pend including .priv */
- memset(&link->wr_tx_pends[pnd_snd_idx], 0,
- sizeof(link->wr_tx_pends[pnd_snd_idx]));
- memset(&link->wr_tx_bufs[pnd_snd_idx], 0,
- sizeof(link->wr_tx_bufs[pnd_snd_idx]));
- if (!test_and_clear_bit(pnd_snd_idx, link->wr_tx_mask))
- return;
+ if (pnd_snd_idx == link->wr_tx_cnt) {
+ if (link->lgr->smc_version != SMC_V2 ||
+ link->wr_tx_v2_pend->wr_id != wc->wr_id)
+ return;
+ link->wr_tx_v2_pend->wc_status = wc->status;
+ memcpy(&pnd_snd, link->wr_tx_v2_pend, sizeof(pnd_snd));
+ /* clear the full struct smc_wr_tx_pend including .priv */
+ memset(link->wr_tx_v2_pend, 0,
+ sizeof(*link->wr_tx_v2_pend));
+ memset(link->lgr->wr_tx_buf_v2, 0,
+ sizeof(*link->lgr->wr_tx_buf_v2));
+ } else {
+ ...

Read more...

Revision history for this message
bugproxy (bugproxy) wrote :
Download full text (6.0 KiB)

------- Comment From <email address hidden> 2022-03-16 13:19 EDT-------
(In reply to comment #9)
> Thx for the list of commits.
>
> Looks like LP#1929060 "[22.04 FEAT] smc: Add User-defined EID (Enterprise
> ID) Support - kernel" is a prereq - if not worked based on that, I run into
> EID issues already starting with 42042dbbc2eb.
>
> Applying the patches based on LP#1929060 I get further down, but still run
> into conflicts, this time with 8799e310fb3f ("net/smc: add v2 support to the
> work request layer").
>
> $ git cherry-pick -s -e -x 8799e310fb3f
> Auto-merging net/smc/smc_wr.h
> Auto-merging net/smc/smc_wr.c
> CONFLICT (content): Merge conflict in net/smc/smc_wr.c
> Auto-merging net/smc/smc_ib.c
> Auto-merging net/smc/smc_core.h
> Auto-merging net/smc/smc_core.c
> error: could not apply 8799e310fb3f... net/smc: add v2 support to the work
> request layer
> hint: after resolving the conflicts, mark the corrected paths
> hint: with 'git add <paths>' or 'git rm <paths>'
> hint: and commit the result with 'git commit'
> fheimes@T570:~/ubuntu-jammy-master-next/jammy-lp1929035$ git status
> On branch master-next
> Your branch is ahead of 'origin/master-next' by 9 commits.
> (use "git push" to publish your local commits)
>
> You are currently cherry-picking commit 8799e310fb3f.
> (fix conflicts and run "git cherry-pick --continue")
> (use "git cherry-pick --skip" to skip this patch)
> (use "git cherry-pick --abort" to cancel the cherry-pick operation)
>
> Changes to be committed:
> modified: net/smc/smc_core.c
> modified: net/smc/smc_core.h
> modified: net/smc/smc_ib.c
> modified: net/smc/smc_wr.h
>
> Unmerged paths:
> (use "git add <file>..." to mark resolution)
> both modified: net/smc/smc_wr.c
> $
>
> $ git diff
> diff --cc net/smc/smc_wr.c
> index 59ca1a2d5c65,22d7324969cd..000000000000
> --- a/net/smc/smc_wr.c
> +++ b/net/smc/smc_wr.c
> @@@ -96,20 -101,47 +96,59 @@@ static inline void smc_wr_tx_process_cq
> }
>
> pnd_snd_idx = smc_wr_tx_find_pending_index(link, wc->wr_id);
> - if (pnd_snd_idx == link->wr_tx_cnt)
> - return;
> - link->wr_tx_pends[pnd_snd_idx].wc_status = wc->status;
> - if (link->wr_tx_pends[pnd_snd_idx].compl_requested)
> - complete(&link->wr_tx_compl[pnd_snd_idx]);
> - memcpy(&pnd_snd, &link->wr_tx_pends[pnd_snd_idx], sizeof(pnd_snd));
> - /* clear the full struct smc_wr_tx_pend including .priv */
> - memset(&link->wr_tx_pends[pnd_snd_idx], 0,
> - sizeof(link->wr_tx_pends[pnd_snd_idx]));
> - memset(&link->wr_tx_bufs[pnd_snd_idx], 0,
> - sizeof(link->wr_tx_bufs[pnd_snd_idx]));
> - if (!test_and_clear_bit(pnd_snd_idx, link->wr_tx_mask))
> - return;
> + if (pnd_snd_idx == link->wr_tx_cnt) {
> + if (link->lgr->smc_version != SMC_V2 ||
> + link->wr_tx_v2_pend->wr_id != wc->wr_id)
> + return;
> + link->wr_tx_v2_pend->wc_status = wc->status;
> + memcpy(&pnd_snd, link->wr_tx_v2_pend, sizeof(pnd_snd));
> + /* clear the full struct smc_wr_tx_pend including .priv */
> + memset(link->wr_tx_v2_pend, 0,
...

Read more...

Revision history for this message
Frank Heimes (fheimes) wrote :

(repeating my comment from LP#1929060)
Well, if commits are provided here, we expect that they make up the functionality described by the title and bug description, and can be taken as they are (cherry-picked) - and in case not, a backport would be needed.

We cannot just pick all the commits up to a certain date - it's (very) unlikely that this will be acceptable by our kernel team (who reviews all submissions in detail).

What's required (for a patch here, or for an sru) is a minimal set of patches that make up a logical functional unit (like here at the title: "SMC-R v2 Support").
The code also needs to be upstream accepted (that's the case here) and in case of s390x code, it needs to be signed-off by you (actually by IBM, at the commit level - as usual), pre-tested - there must be confidence about the patch set - and the willingness to help maintaining it.

And for the reason of maintenance it's not desired to smash multiple functions addressed in different LP bugs again into one big submission.

(Please notice that we discuss adding code to an upcoming LTS release that will need to be supported by up to 10 years.)

Revision history for this message
Frank Heimes (fheimes) wrote :

Hi, so (like mentioned before) IBM usually shares with us a (self-contained and minimal) list of commits (either hashes or - if needed - backports) that constitute a certain function/feature ('logical unit') and that apply cleanly to jammy (22.04) master-next.

Bu anywayt I had a look at the list (that you've referenced) and went down to 2021-10-28 (because the 2021-10-16 are the commits that are target of this ticket anyway) and checked whether these have already landed in jammy master-next or not -- and the majority is already in.
But the following 7 are not:
45c3ff7a9ac1 "net/smc: Clean up local struct sock variables"
af1877b6cad1 "net/smc: Print function name in smcr_link_down tracepoint"
a3a0e81b6fd5 "net/smc: Introduce tracepoint for smcr link down"
aff3083f10bf "net/smc: Introduce tracepoints for tx and rx msg"
482626086820 "net/smc: Introduce tracepoint for fallback"
4c1e34c0dbff "vsock: Enable y2038 safe timeval for timeout"
0daa55d033b0 "octeontx2-af: cn10k: debugfs for dumping LMTST map table"
(the last two make up - with add. two - the "Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net" -- I'm for example not convinced if these are really needed, esp. 0daa55d033b0).

It would be important to know whether these are really needed or can be safely ignored.

At the end a list of commits is needed that can be applied (ideally cherry-picked to) to the current jammy master-next.
(git clone git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy --branch master-next --single-branch jammy-master-next)

And if starting to pick the three EID commits from LP#1929060 - "smc: Add User-defined EID Support" (which seem to be self-contained) and proceed further with the 10 commits mentioned here (LP#1929035 - "SMC-R v2 Support") [so going bottom up] I face a conflict at 8799e310fb3f "net/smc: add v2 support to the work request layer".
That means even if some of the above 7 are needed, they will not help on solving the conflict with 8799e310fb3f, since the 7 above are all newer.
Hence I assume there is one or even more commits needed (even from before 2021-10-16 ?) that help solving that conflict (...or a backport of 8799e310fb3f).

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2022-03-17 18:49 EDT-------
(In reply to comment #13)
> Hi, so (like mentioned before) IBM usually shares with us a (self-contained
> and minimal) list of commits (either hashes or - if needed - backports) that
> constitute a certain function/feature ('logical unit') and that apply
> cleanly to jammy (22.04) master-next.
>
> Bu anywayt I had a look at the list (that you've referenced) and went down
> to 2021-10-28 (because the 2021-10-16 are the commits that are target of
> this ticket anyway) and checked whether these have already landed in jammy
> master-next or not -- and the majority is already in.
> But the following 7 are not:
> 45c3ff7a9ac1 "net/smc: Clean up local struct sock variables"
> af1877b6cad1 "net/smc: Print function name in smcr_link_down tracepoint"
> a3a0e81b6fd5 "net/smc: Introduce tracepoint for smcr link down"
> aff3083f10bf "net/smc: Introduce tracepoints for tx and rx msg"
> 482626086820 "net/smc: Introduce tracepoint for fallback"
> 4c1e34c0dbff "vsock: Enable y2038 safe timeval for timeout"
> 0daa55d033b0 "octeontx2-af: cn10k: debugfs for dumping LMTST map table"
> (the last two make up - with add. two - the "Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net" -- I'm for example
> not convinced if these are really needed, esp. 0daa55d033b0).
>
> It would be important to know whether these are really needed or can be
> safely ignored.
>
These patches are not necessary for now. As I mentioned, these patches don't come from us. Though we have already tested generally, they still can trigger errors.

> At the end a list of commits is needed that can be applied (ideally
> cherry-picked to) to the current jammy master-next.
> (git clone
> git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy
> --branch master-next --single-branch jammy-master-next)
>
> And if starting to pick the three EID commits from LP#1929060 - "smc: Add
> User-defined EID Support" (which seem to be self-contained) and proceed
> further with the 10 commits mentioned here (LP#1929035 - "SMC-R v2 Support")
> [so going bottom up] I face a conflict at 8799e310fb3f "net/smc: add v2
> support to the work request layer".
> That means even if some of the above 7 are needed, they will not help on
> solving the conflict with 8799e310fb3f, since the 7 above are all newer.
> Hence I assume there is one or even more commits needed (even from before
> 2021-10-16 ?) that help solving that conflict (...or a backport of
> 8799e310fb3f).

By using "git blame", I found the problem patch :
61005756c824 ("net/smc: fix kernel panic caused by race of smc_sock")
which is from Alibaba from 2021-12-28

Again, all of the patches till the Date 2021-10-16 should be taken, i.e. the last necessary patch is:
29397e34c76b ("net/smc: stop links when their GID is removed")
You can take the patches after that date certainly, but we can not make sure there is no risk to make something broken.

Revision history for this message
Frank Heimes (fheimes) wrote :
Revision history for this message
Frank Heimes (fheimes) wrote :

Pull request submitted to kernel team's mailing list:
https://lists.ubuntu.com/archives/kernel-team/2022-March/thread.html#128771
changing status to 'In Progress'.

information type: Private → Public
Changed in linux (Ubuntu):
status: Incomplete → In Progress
Changed in ubuntu-z-systems:
status: Incomplete → In Progress
Changed in linux (Ubuntu):
assignee: Skipper Bug Screeners (skipper-screen-team) → Canonical Kernel Team (canonical-kernel-team)
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-hwe-5.15/5.15.0-25.25~20.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Revision history for this message
Frank Heimes (fheimes) wrote :

This feature was request for 22.04, hence the hwe kernel is more a fall out.
Hence updating tag to unblock the process.

tags: added: verification-done-focal
removed: verification-needed-focal
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.15.0-25.25

---------------
linux (5.15.0-25.25) jammy; urgency=medium

  * jammy/linux: 5.15.0-25.25 -proposed tracker (LP: #1967146)

  * Miscellaneous Ubuntu changes
    - SAUCE: Revert "scsi: core: Reallocate device's budget map on queue depth
      change"

 -- Paolo Pisati <email address hidden> Wed, 30 Mar 2022 17:28:11 +0200

Changed in linux (Ubuntu):
status: In Progress → Fix Released
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.