[18.04][config] regression: nvme and nvme_core couldn't be built as modules starting 4.15-rc2

Bug #1759893 reported by Yurii Shestakov on 2018-03-29
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
High
Seth Forshee
Bionic
High
Seth Forshee

Bug Description

A regression was introduced into the NVMe-related kernel configuration by commit 32c662c58a9b9 in 4.15-rc2, which was later pulled into ubuntu-bionic.git.

As a result, the "nvme" and "nvme_core" drivers are built into the kernel and can no longer be built as modules. This makes the NVMe-oF target and initiator modules (nvmet, nvme-rdma) installed by Mellanox OFED incompatible with the inbox "nvme" driver.
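
One way to confirm the symptom on an installed kernel is to inspect its build configuration. The following is a minimal sketch, not part of any existing tool: the helper name nvme_build_modes is hypothetical, and it assumes the config file lives at /boot/config-<kernel release>, as is usual on Ubuntu. It only looks at the three symbols discussed in this report.

```python
import re

# Symbols discussed in this report.
SYMBOLS = ("CONFIG_NVM", "CONFIG_BLK_DEV_NVME", "CONFIG_NVME_CORE")


def nvme_build_modes(config_text, symbols=SYMBOLS):
    """Return {symbol: 'y' | 'm' | 'n'} parsed from kernel .config text.

    A symbol that is absent, or present only as '# CONFIG_X is not set',
    is reported as 'n'.
    """
    modes = {name: "n" for name in symbols}
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=([ym])$", line.strip())
        if match and match.group(1) in modes:
            modes[match.group(1)] = match.group(2)
    return modes


if __name__ == "__main__":
    import platform

    # Assumption: the config is installed next to the kernel image.
    path = "/boot/config-" + platform.release()
    try:
        with open(path) as handle:
            text = handle.read()
    except OSError as error:
        print("cannot read", path, "-", error)
    else:
        for name, mode in nvme_build_modes(text).items():
            print(name, "=", mode)
```

On an affected kernel one would expect CONFIG_BLK_DEV_NVME and CONFIG_NVME_CORE to be reported as y rather than m.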

Root cause analysis.

The drivers/lightnvm/Kconfig file, which holds the kernel configuration for Open-Channel SSDs (lightnvm), contains:

menuconfig NVM
        bool "Open-Channel SSD target support"
        depends on BLOCK && HAS_DMA && PCI
        select BLK_DEV_NVME
        help
          Say Y here to get to enable Open-channel SSDs.
...

This means that BLK_DEV_NVME is forced to "y" whenever NVM (CONFIG_NVM) is enabled.
NVM is a two-state (bool) symbol: it can only be "y" or "n" and cannot be built as a module.
Enabling it therefore forces BLK_DEV_NVME=y and NVME_CORE=y.
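
The effect can be illustrated with a small model of the tristate ordering n < m < y. This is only a sketch of the "lower limit" rule for select, not the real Kconfig solver; the function name effective_value is made up for illustration, and the example assumes that BLK_DEV_NVME in turn selects NVME_CORE, which would explain why both symbols end up as "y".

```python
# Tristate values ordered n < m < y, as in Kconfig.
ORDER = {"n": 0, "m": 1, "y": 2}


def effective_value(requested, selector_values):
    """Apply the 'select' lower limit to a tristate symbol.

    requested: value the user asked for ('n', 'm' or 'y').
    selector_values: current values of every symbol that selects this one.
    The result is the largest of the requested value and all selectors.
    """
    return max([requested, *selector_values], key=ORDER.__getitem__)


# NVM is a bool, so when enabled its value can only be 'y'.
nvm = "y"
blk_dev_nvme = effective_value("m", [nvm])        # forced up to 'y'
# Assumption: BLK_DEV_NVME selects NVME_CORE.
nvme_core = effective_value("m", [blk_dev_nvme])  # forced up to 'y'
print(blk_dev_nvme, nvme_core)
```

With nvm set to "n" instead, both calls return the requested "m", which is the behaviour before the commit in question.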

$ git blame drivers/lightnvm/Kconfig

32c662c58a9b9 (Rakesh Pandit 2017-10-13 14:45:55 +0200 7) depends on BLOCK && HAS_DMA && PCI
32c662c58a9b9 (Rakesh Pandit 2017-10-13 14:45:55 +0200 8) select BLK_DEV_NVME

commit 32c662c58a9b9d0c99e713a14ca323a9a91c73a0
Author: Rakesh Pandit <email address hidden>
Date: Fri Oct 13 14:45:55 2017 +0200

    lightnvm: include NVM Express driver if OCSSD is selected for build

    Because NVM needs BLK_DEV_NVME, select it automatically if we mark NVM
    in config file before building kernel. Also append PCI to depends as
    select doesn't automatically add dependencies.

    Signed-off-by: Rakesh Pandit <email address hidden>
    Signed-off-by: Matias Bjørling <email address hidden>
    Signed-off-by: Jens Axboe <email address hidden>

 drivers/lightnvm/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

$ git diff 32c662c58a9b9^1..32c662c58a9b9
diff --git a/drivers/lightnvm/Kconfig b/drivers/lightnvm/Kconfig
index ead61a93cb4e..2a953efec4e1 100644
--- a/drivers/lightnvm/Kconfig
+++ b/drivers/lightnvm/Kconfig
@@ -4,7 +4,8 @@

 menuconfig NVM
        bool "Open-Channel SSD target support"
-       depends on BLOCK && HAS_DMA
+       depends on BLOCK && HAS_DMA && PCI
+       select BLK_DEV_NVME
        help
          Say Y here to get to enable Open-channel SSDs.

The proposed fix is as follows:

diff --git a/drivers/lightnvm/Kconfig b/drivers/lightnvm/Kconfig
index 2a953efec4e1..9969236314d7 100644
--- a/drivers/lightnvm/Kconfig
+++ b/drivers/lightnvm/Kconfig
@@ -4,8 +4,7 @@

 menuconfig NVM
        bool "Open-Channel SSD target support"
-       depends on BLOCK && HAS_DMA && PCI
-       select BLK_DEV_NVME
+       depends on BLOCK && HAS_DMA && PCI && BLK_DEV_NVME
        help
          Say Y here to get to enable Open-channel SSDs.

Regards, Yurii Shestakov
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Mar 29 17:05 seq
 crw-rw---- 1 root audio 116, 33 Mar 29 17:05 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu1
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 18.04
HibernationDevice: RESUME=/dev/mapper/vg2-swap_1
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: QEMU Standard PC (Q35 + ICH9, 2009)
Package: linux (not installed)
PciMultimedia:

ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-10-generic root=/dev/mapper/vg2-root ro net.ifnames=0 biosdevname=0 quiet nofb nomodeset console=ttyS0,115200n8
ProcVersionSignature: Ubuntu 4.15.0-10.11-generic 4.15.3
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-10-generic N/A
 linux-backports-modules-4.15.0-10-generic N/A
 linux-firmware 1.173
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
Tags: bionic
Uname: Linux 4.15.0-10-generic x86_64
UnreportableReason: The report belongs to a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: False
dmi.bios.date: 01/01/2011
dmi.bios.vendor: Seabios
dmi.bios.version: 0.5.1
dmi.chassis.type: 1
dmi.chassis.vendor: Bochs
dmi.modalias: dmi:bvnSeabios:bvr0.5.1:bd01/01/2011:svnQEMU:pnStandardPC(Q35+ICH9,2009):pvrpc-q35-2.0:cvnBochs:ct1:cvr:
dmi.product.name: Standard PC (Q35 + ICH9, 2009)
dmi.product.version: pc-q35-2.0
dmi.sys.vendor: QEMU

Yurii Shestakov (yuriis) wrote :

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1759893

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Yurii Shestakov (yuriis) on 2018-03-29
summary: - regression: nvme and nvme_core couldn't be build as modules starting
- 4.15-rc2
+ [18.04][config] regression: nvme and nvme_core couldn't be build as
+ modules starting 4.15-rc2
summary: - [18.04][config] regression: nvme and nvme_core couldn't be build as
+ [18.04][config] regression: nvme and nvme_core couldn't be built as
modules starting 4.15-rc2

apport information

tags: added: apport-collected bionic
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu Bionic):
importance: Medium → High
status: Confirmed → Triaged
tags: added: kernel-da-key
tags: added: patch
Yurii Shestakov (yuriis) wrote :

From the Kconfig language documentation: https://www.kernel.org/doc/Documentation/kbuild/kconfig-language.txt

- reverse dependencies: "select" <symbol> ["if" <expr>]
  While normal dependencies reduce the upper limit of a symbol (see
  below), reverse dependencies can be used to force a lower limit of
  another symbol. The value of the current menu symbol is used as the
  minimal value <symbol> can be set to. If <symbol> is selected multiple
  times, the limit is set to the largest selection.
  Reverse dependencies can only be used with boolean or tristate
  symbols.
  Note:
 *select should be used with care.* select will force
 a symbol to a value without visiting the dependencies.
 By abusing select you are able to select a symbol FOO even
 if FOO depends on BAR that is not set.
 In general use select only for non-visible symbols
 (no prompts anywhere) and for symbols with no dependencies.
 That will limit the usefulness but on the other hand avoid
 the illegal configurations all over.
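
The hazard described in the note can be made concrete with a toy checker. This is a hypothetical sketch with a deliberately simplified data model: each symbol is a plain boolean with a list of symbols it depends on and a list it selects (no expressions, no tristate values). FOO and BAR are the placeholder names from the quoted text; BAZ is an invented selector.

```python
def unmet_dependencies(values, depends, selects):
    """List (selector, selected, missing_dep) triples.

    values:  {symbol: bool} current configuration.
    depends: {symbol: [symbols it depends on]}.
    selects: {symbol: [symbols it selects]}.
    A triple is reported when an enabled selector forces on a symbol
    whose own dependency is disabled.
    """
    problems = []
    for selector, targets in selects.items():
        if not values.get(selector, False):
            continue
        for target in targets:
            for dep in depends.get(target, []):
                if not values.get(dep, False):
                    problems.append((selector, target, dep))
    return problems


# BAZ selects FOO, FOO depends on BAR, but BAR is not set.
values = {"BAZ": True, "FOO": True, "BAR": False}
depends = {"FOO": ["BAR"]}
selects = {"BAZ": ["FOO"]}
print(unmet_dependencies(values, depends, selects))
```

This is why commit 32c662c58a9b9 also had to add PCI to the "depends on" line of NVM: select does not pull in the dependencies of the selected symbol.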

Seth Forshee (sforshee) wrote :

Have you reported this upstream?

It does look to me that this dependency is working the wrong way, that CONFIG_BLK_DEV_NVME depends on CONFIG_NVM and not the other way around. The driver enabled by CONFIG_BLK_DEV_NVME has no entry points other than the struct pci_driver callbacks, but it does call into the code enabled by CONFIG_NVM.

I'm not convinced that yours is the correct fix though. I suspect that BLK_DEV_NVME should have a "depends on NVM" or "select NVM" instead. What I will do though is revert 32c662c58a9b9d0c99e713a14ca323a9a91c73a0 and change BLK_DEV_NVME back to =m. You should probably follow up upstream to get a fix in place there.

Changed in linux (Ubuntu Bionic):
assignee: nobody → Seth Forshee (sforshee)
status: Triaged → In Progress
Yurii Shestakov (yuriis) wrote :

Hi Seth,

No, I've not reported this issue upstream yet. I guess somebody from our company who works on NVMe-oF related code in the kernel should do this.

To be honest, I'm not sure that my fix is technically correct and takes all dependencies into account, so I'm fine with reverting 32c662c58a9b9d0c99e713a14ca323a9a91c73a0.

Thank you.

Seth Forshee (sforshee) on 2018-04-05
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-19.20

---------------
linux (4.15.0-19.20) bionic; urgency=medium

  * linux: 4.15.0-19.20 -proposed tracker (LP: #1766021)

  * Kernel 4.15.0-15 breaks Dell PowerEdge 12th Gen servers (LP: #1765232)
    - Revert "blk-mq: simplify queue mapping & schedule with each possisble CPU"
    - Revert "genirq/affinity: assign vectors to all possible CPUs"

linux (4.15.0-18.19) bionic; urgency=medium

  * linux: 4.15.0-18.19 -proposed tracker (LP: #1765490)

  * [regression] Ubuntu 18.04:[4.15.0-17-generic #18] KVM Guest Kernel:
    meltdown: rfi/fallback displacement flush not enabled bydefault (kvm)
    (LP: #1765429)
    - powerpc/pseries: Fix clearing of security feature flags

  * signing: only install a signed kernel (LP: #1764794)
    - [Packaging] update to Debian like control scripts
    - [Packaging] switch to triggers for postinst.d postrm.d handling
    - [Packaging] signing -- switch to raw-signing tarballs
    - [Packaging] signing -- switch to linux-image as signed when available
    - [Config] signing -- enable Opal signing for ppc64el
    - [Packaging] printenv -- add signing options

  * [18.04 FEAT] Sign POWER host/NV kernels (LP: #1696154)
    - [Packaging] signing -- add support for signing Opal kernel binaries

  * Please cherrypick s390 unwind fix (LP: #1765083)
    - s390/compat: fix setup_frame32

  * Ubuntu 18.04 installer does not detect any IPR based HDD/RAID array [S822L]
    [ipr] (LP: #1751813)
    - d-i: move ipr to storage-core-modules on ppc64el

  * drivers/gpu/drm/bridge/adv7511/adv7511.ko missing (LP: #1764816)
    - SAUCE: (no-up) rename the adv7511 drm driver to adv7511_drm

  * Miscellaneous Ubuntu changes
    - [Packaging] Add linux-oem to rebuild test blacklist.

linux (4.15.0-17.18) bionic; urgency=medium

  * linux: 4.15.0-17.18 -proposed tracker (LP: #1764498)

  * Eventual OOM with profile reloads (LP: #1750594)
    - SAUCE: apparmor: fix memory leak when duplicate profile load

linux (4.15.0-16.17) bionic; urgency=medium

  * linux: 4.15.0-16.17 -proposed tracker (LP: #1763785)

  * [18.04] [bug] CFL-S(CNP)/CNL GPIO testing failed (LP: #1757346)
    - [Config]: Set CONFIG_PINCTRL_CANNONLAKE=y

  * [Ubuntu 18.04] USB Type-C test failed on GLK (LP: #1758797)
    - SAUCE: usb: typec: ucsi: Increase command completion timeout value

  * Fix trying to "push" an already active pool VP (LP: #1763386)
    - SAUCE: powerpc/xive: Fix trying to "push" an already active pool VP

  * hisi_sas: Revert and replace SAUCE patches w/ upstream (LP: #1762824)
    - Revert "UBUNTU: SAUCE: scsi: hisi_sas: export device table of v3 hw to
      userspace"
    - Revert "UBUNTU: SAUCE: scsi: hisi_sas: config for hip08 ES"
    - scsi: hisi_sas: modify some register config for hip08
    - scsi: hisi_sas: add v3 hw MODULE_DEVICE_TABLE()

  * Realtek card reader - RTS5243 [VEN_10EC&DEV_5260] (LP: #1737673)
    - misc: rtsx: Move Realtek Card Reader Driver to misc
    - updateconfigs for Realtek Card Reader Driver
    - misc: rtsx: Add support for RTS5260
    - misc: rtsx: Fix symbol clashes

  * Mellanox [mlx5] [bionic] UBSAN: Undefined behaviour in
    ./include/linux/net_dim.h (LP: #1...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Andy Whitcroft (apw) on 2019-02-14
tags: added: kernel-fixup-verification-needed-bionic
removed: verification-needed-bionic
Brad Figg (brad-figg) on 2019-02-14
tags: added: verification-needed-bionic
Andy Whitcroft (apw) wrote :

This bug was erroneously marked for verification in bionic; verification is not required and verification-needed-bionic is being removed.

tags: removed: verification-needed-bionic
tags: added: verification-done-bionic