Bionic: support for Solarflare X2542 network adapter (sfc driver)

Bug #1836635 reported by Mauricio Faria de Oliveira on 2019-07-15
Affects                    Series  Importance  Assigned to
debian-installer (Ubuntu)  (status tracked in Eoan)
                           Bionic  Medium      Mauricio Faria de Oliveira
                           Cosmic  Undecided   Unassigned
                           Disco   Undecided   Unassigned
                           Eoan    Undecided   Unassigned
linux (Ubuntu)             (status tracked in Eoan)
                           Bionic  Undecided   Mauricio Faria de Oliveira
                           Cosmic  Undecided   Unassigned
                           Disco   Undecided   Unassigned
                           Eoan    Undecided   Unassigned

Bug Description

[Impact]

 * Support for Solarflare X2542 network adapter
   (Medford2 / SFC9250) in the Bionic sfc driver.

 * This network adapter is present on recent hardware,
   at least HP 2019 and Dell PowerEdge R740xd systems.

 * On deployments with recent hardware that would rather use
   the Bionic LTS / GA supported kernel and cannot move to
   HWE kernels, this adapter is not functional at all.

[Test Case]

 * The X2542 adapter has been exercised with iperf3 and nc
   between 2 hosts at 25G link speed with MTUs 1400/1500/9000,
   in both directions, for 1 week.

   Its performance is on par with the Cosmic 4.18 kernel
   (which contains all these patches) and the out-of-tree
   driver from the vendor.

 * A 7000 series adapter (an older, previously supported model,
   used for regression testing) has been exercised with iperf and
   netperf (TCP_STREAM, UDP_STREAM, TCP_RR, UDP_RR, and TCP_CRR) on
   a single host, with client and server on different adapter ports
   isolated in network namespaces so that traffic goes through the
   network switch, at 10G link speed with MTUs 1500/9000, for 1 weekend.

   No regressions observed between the original and test kernels.
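
   For documentation purposes, the single-host setup above can be
   sketched roughly as follows. This is only an illustration, not the
   script that was actually used: the interface names, namespace names
   and IP addresses are hypothetical, and the function only builds the
   command lines (it does not run them, since they require root).

```python
import shlex


def netns_plan(server_if, client_if,
               server_ip="192.168.100.1", client_ip="192.168.100.2",
               mtu=1500):
    """Build (but do not run) the commands that put two adapter ports
    into separate network namespaces and run netperf between them.

    All names and addresses are illustrative assumptions."""
    plan = []
    for ns, dev, ip in (("ns-server", server_if, server_ip),
                        ("ns-client", client_if, client_ip)):
        plan += [
            ["ip", "netns", "add", ns],
            # Moving the port into its own namespace keeps the kernel from
            # short-circuiting the traffic locally, so it crosses the switch.
            ["ip", "link", "set", "dev", dev, "netns", ns],
            ["ip", "-n", ns, "addr", "add", f"{ip}/24", "dev", dev],
            ["ip", "-n", ns, "link", "set", "dev", dev, "mtu", str(mtu), "up"],
        ]
    # netserver listens in the server namespace; netperf runs in the client one.
    plan.append(["ip", "netns", "exec", "ns-server", "netserver"])
    for test in ("TCP_STREAM", "UDP_STREAM", "TCP_RR", "UDP_RR", "TCP_CRR"):
        plan.append(["ip", "netns", "exec", "ns-client",
                     "netperf", "-H", server_ip, "-t", test])
    return [shlex.join(cmd) for cmd in plan]


if __name__ == "__main__":
    # Hypothetical port names; print the plan for review before running it.
    for line in netns_plan("enp1s0f0", "enp1s0f1", mtu=9000):
        print(line)
```

   Printing the plan first makes it easy to review the commands and
   repeat the same run for each MTU on the original and test kernels.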

[Regression Potential]

 * The patchset touches much of the sfc driver, so the potential
   for regression definitely exists. For that reason it received
   careful consideration and testing:

 * It has been tested on another adapter, which uses the old code,
   and no regressions have been found so far (see 7000 series above).

 * The patchset consists exclusively of cherry-picks; not a single
   commit needed backporting.

 * The patchset essentially moves the Bionic driver forward along the
   upstream history shown by 'git log --oneline -- drivers/net/ethernet/sfc/':

   - since commit d4a7a8893d4c ("sfc: pass valid pointers from efx_enqueue_unwind")
   - until commit 7f61e6c6279b ("sfc: support FEC configuration through ethtool")
   - except for 2 commits (not needed / unrelated)
     - commit 42356d9a137b ("sfc: support RSS spreading of ethtool ntuple filters")
     - commit 9baeb5eb1f83 ("sfc: falcon: remove duplicated bit-wise or of LOOPBACK_SGMII")
   - plus 2 more recent commits (fixes)
     - commit 458bd99e4974 ("sfc: remove ctpio_dmabuf_start from stats")
     - commit 0c235113b3c4 ("sfc: stop the TX queue before pushing new buffers")
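
   The selection above can be sketched as a small script. This is an
   illustration under assumptions, not the procedure actually used: it
   assumes the "since" commit is already in Bionic and is therefore
   excluded from the range (consistent with it not appearing in the
   resulting changelog), and the helper names are hypothetical.

```python
SFC_PATH = "drivers/net/ethernet/sfc/"
SINCE = "d4a7a8893d4c"   # assumed to be in Bionic already, so excluded
UNTIL = "7f61e6c6279b"   # last commit of the range, included
SKIPPED = {"42356d9a137b", "9baeb5eb1f83"}   # not needed / unrelated
EXTRA = ["458bd99e4974", "0c235113b3c4"]     # more recent fixes


def git_log_command(since=SINCE, until=UNTIL, path=SFC_PATH):
    """Return the git command listing the range oldest-first."""
    return ["git", "log", "--oneline", "--reverse", "--abbrev=12",
            f"{since}..{until}", "--", path]


def select_commits(oneline_output):
    """Filter 'git log --oneline' output down to the commits to pick.

    Drops the skipped commits and appends the two later fixes."""
    picked = []
    for line in oneline_output.splitlines():
        if not line.strip():
            continue
        sha = line.split()[0]
        if sha[:12] not in SKIPPED:
            picked.append(sha)
    return picked + EXTRA


if __name__ == "__main__":
    # Print the command to run inside an upstream kernel tree.
    print(" ".join(git_log_command()))
```

   The output of that git command, run in an upstream tree, can be fed to
   select_commits() to obtain the cherry-pick order, oldest first.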

Changed in linux (Ubuntu Bionic):
status: New → In Progress
assignee: nobody → Mauricio Faria de Oliveira (mfo)
Changed in linux (Ubuntu Cosmic):
status: New → Invalid
Changed in linux (Ubuntu Disco):
status: New → Invalid
Changed in linux (Ubuntu Eoan):
status: New → Invalid
description: updated

Regression test results/log/script,
for documentation purposes.

Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic

Regression testing done on an older/previously supported adapter, SFC 7000 series.

The netperf suite of TCP/UDP STREAM and RR, and TCP_CRR, ran for ~2 days,
with results in the same ballpark as the original and test kernels.

Now waiting for test results with the new/requested adapter
before marking verification done/successful.

Summary: test name, mtu sizes, original/test/proposed kernel results.

TCP_CRR
 1500/1500
  ORIG 4550-4560
  TEST 4550-4580
  PROP 5260-5316
 9000/9000
  ORIG 4557
  TEST 4570
  PROP 5260-5300

TCP_RR
 1500/1500
  ORIG 32531
  TEST ~31k,32k
  PROP 32180-34277
 9000/9000
  ORIG 31620
  TEST 27k-30k-36k
  PROP 27k-33k-34k

TCP_STREAM
 1500/1500
  ORIG 9406
  TEST 9403
  PROP 9405
 9000/9000
  ORIG 9883
  TEST 9887
  PROP 9887

UDP_RR
 1500/1500
  ORIG ~36k/~37k
  TEST ~36k/~37k
  PROP ~36k
 9000/9000
  ORIG ~35k/~37k
  TEST 33k-37k
  PROP ~35.8k/~36.6k

UDP_STREAM
 1500/1500
  ORIG 8.6k/8.9k
  TEST 8.9k
  PROP 8.6k/8.7k
 9000/9000
  ORIG 8.7k
  TEST 8.7k/8.8k
  PROP 8.7k
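
To put "same ballpark" into numbers, the relative change between the original and proposed kernels can be computed from the summary above. A minimal sketch, assuming the midpoint of each reported range is a fair representative value (the report does not state units or how the ranges were obtained); only a few rows are transcribed here.

```python
def midpoint(values):
    """Midpoint of a reported range, or the value itself if single."""
    return (min(values) + max(values)) / 2


def pct_change(orig, new):
    """Relative change from orig to new, in percent."""
    return (new - orig) / orig * 100.0


# (test, MTU) -> values transcribed from the summary above.
results = {
    ("TCP_CRR", 1500): {"ORIG": (4550, 4560), "PROP": (5260, 5316)},
    ("TCP_STREAM", 1500): {"ORIG": (9406,), "PROP": (9405,)},
    ("TCP_STREAM", 9000): {"ORIG": (9883,), "PROP": (9887,)},
}


def summarize(table):
    """Map each (test, MTU) to the ORIG -> PROP change in percent."""
    return {key: pct_change(midpoint(row["ORIG"]), midpoint(row["PROP"]))
            for key, row in table.items()}


if __name__ == "__main__":
    for (test, mtu), delta in summarize(results).items():
        print(f"{test} mtu {mtu}: {delta:+.2f}%")
```

With these rows, TCP_STREAM is essentially unchanged, while TCP_CRR on the proposed kernel comes out higher than on the original, so neither indicates a regression.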

Verification done on the new adapter, Solarflare X2542.

The tester reports that iperf3 tests ran solid for 24h-72h.

tags: added: verification-done-bionic
removed: verification-needed-bionic

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-58.64

---------------
linux (4.15.0-58.64) bionic; urgency=medium

  * unable to handle kernel NULL pointer dereference at 000000000000002c (IP:
    iget5_locked+0x9e/0x1f0) (LP: #1838982)
    - Revert "ovl: set I_CREATING on inode being created"
    - Revert "new primitive: discard_new_inode()"

linux (4.15.0-57.63) bionic; urgency=medium

  * CVE-2019-1125
    - x86/cpufeatures: Carve out CQM features retrieval
    - x86/cpufeatures: Combine word 11 and 12 into a new scattered features word
    - x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
    - x86/speculation: Enable Spectre v1 swapgs mitigations
    - x86/entry/64: Use JMP instead of JMPQ
    - x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS

  * Packaging resync (LP: #1786013)
    - update dkms package versions

linux (4.15.0-56.62) bionic; urgency=medium

  * bionic/linux: 4.15.0-56.62 -proposed tracker (LP: #1837626)

  * Packaging resync (LP: #1786013)
    - [Packaging] resync git-ubuntu-log
    - [Packaging] update helper scripts

  * CVE-2019-2101
    - media: uvcvideo: Fix 'type' check leading to overflow

  * hibmc-drm Causes Unreadable Display for Huawei amd64 Servers (LP: #1762940)
    - [Config] Set CONFIG_DRM_HISI_HIBMC to arm64 only
    - SAUCE: Make CONFIG_DRM_HISI_HIBMC depend on ARM64

  * Bionic: support for Solarflare X2542 network adapter (sfc driver)
    (LP: #1836635)
    - sfc: make mem_bar a function rather than a constant
    - sfc: support VI strides other than 8k
    - sfc: add Medford2 (SFC9250) PCI Device IDs
    - sfc: improve PTP error reporting
    - sfc: update EF10 register definitions
    - sfc: populate the timer reload field
    - sfc: update MCDI protocol headers
    - sfc: support variable number of MAC stats
    - sfc: expose FEC stats on Medford2
    - sfc: expose CTPIO stats on NICs that support them
    - sfc: basic MCDI mapping of 25/50/100G link speeds
    - sfc: support the ethtool ksettings API properly so that 25/50/100G works
    - sfc: add bits for 25/50/100G supported/advertised speeds
    - sfc: remove tx and MCDI handling from NAPI budget consideration
    - sfc: handle TX timestamps in the normal data path
    - sfc: add function to determine which TX timestamping method to use
    - sfc: use main datapath for HW timestamps if available
    - sfc: only enable TX timestamping if the adapter is licensed for it
    - sfc: MAC TX timestamp handling on the 8000 series
    - sfc: on 8000 series use TX queues for TX timestamps
    - sfc: only advertise TX timestamping if we have the license for it
    - sfc: simplify RX datapath timestamping
    - sfc: support separate PTP and general timestamping
    - sfc: support second + quarter ns time format for receive datapath
    - sfc: support Medford2 frequency adjustment format
    - sfc: add suffix to large constant in ptp
    - sfc: mark some unexported symbols as static
    - sfc: update MCDI protocol headers
    - sfc: support FEC configuration through ethtool
    - sfc: remove ctpio_dmabuf_start from stats
    - sfc: stop the TX queue before pushing new buffers

  * [18.04 FEAT] zKVM: Add hardwar...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in debian-installer (Ubuntu Eoan):
status: New → Invalid
Changed in debian-installer (Ubuntu Disco):
status: New → Invalid
Changed in debian-installer (Ubuntu Cosmic):
status: New → Invalid
Changed in debian-installer (Ubuntu Bionic):
assignee: nobody → Mauricio Faria de Oliveira (mfo)
importance: Undecided → Medium
status: New → In Progress

Attaching a debian-installer debdiff to update the kernel version
so that the netboot images include this driver update on Bionic.

The kernel version chosen is 4.15.0-62, settled on after checking for
bugs in the versions following 4.15.0-58 (which released this fix).

This builds correctly on all architectures in the PPA [1].

I have tested the installer with regular/lvm disk partition layouts
on amd64/i386/arm64/ppc64el/s390x in virtual machines with QEMU and
on an amd64 bare-metal machine.

The netboot installer boots the 4.15.0-62 kernel and finishes correctly,
and the installed system with the 4.15.0-64 kernel (from bionic-updates)
boots correctly.

The copy of the installer log on the installed system
(/var/log/installer/syslog) shows no unexpected kernel messages.

[1] https://launchpad.net/~mfo/+archive/ubuntu/lp1836635-di

tags: added: sts
Eric Desrochers (slashd) wrote :

Sponsored in Bionic.

Hello Mauricio, or anyone else affected,

Accepted debian-installer into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/debian-installer/20101020ubuntu543.11 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in debian-installer (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
removed: verification-done-bionic

Verification successful with the netboot image from bionic-proposed:

  http://archive.ubuntu.com/ubuntu/dists/bionic-proposed/main/installer-$ARCH/20101020ubuntu543.11/

Same testing as in comment #8:

I have tested the installer with regular/lvm disk partition layouts
on amd64/i386/arm64/ppc64el/s390x in virtual machines with QEMU and
on an amd64 bare-metal machine.

The netboot installer boots the 4.15.0-62 kernel and finishes correctly,
and the installed system with the 4.15.0-64 kernel (from bionic-updates)
boots correctly.

The copy of the installer log on the installed system
(/var/log/installer/syslog) shows no unexpected kernel messages.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package debian-installer - 20101020ubuntu543.11

---------------
debian-installer (20101020ubuntu543.11) bionic; urgency=medium

  * Move master kernels to 4.15.0-62. (LP: #1836635)

 -- Mauricio Faria de Oliveira <email address hidden> Mon, 23 Sep 2019 16:07:51 -0300

Changed in debian-installer (Ubuntu Bionic):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for debian-installer has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
