[SRU] Unable to modify/create-namespace over NVDIMM-N

Bug #1811660 reported by Sujith Pandel
This bug affects 3 people
Affects          Status      Importance  Assigned to  Milestone
linux (Ubuntu)   Incomplete  Undecided   Unassigned

Bug Description

[Impact]
Users are unable to modify or create namespaces over NVDIMM-N

[Test Case]

Steps to Reproduce:
1. Set up a Dell EMC 14G R740xd server with NVDIMM-N; update the BIOS and NVDIMM firmware to the latest versions available to customers.
2. Install and boot to Ubuntu 18.04
3. Notice that no pmems are enumerated.
4. As root, run "ndctl create-namespace" -> fails with "device or resource busy".

Actual results:
No pmem devices are enumerated, and namespaces cannot be created.
If pmem devices are already present, their namespaces cannot be modified.

Expected results:
Namespace creation and modification should succeed.
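
Step 3 can also be checked without ndctl by looking for pmem block device nodes directly. The following is a minimal Python sketch, assuming the devices appear under /dev with names starting with "pmem" (as in the pmem0 and pmem1 names reported later in this thread). The function name list_pmem_devices and its dev_dir parameter are illustrative only and are not part of ndctl or any other tool.

```python
import glob
import os


def list_pmem_devices(dev_dir="/dev"):
    """Return the sorted names of entries under dev_dir that start with 'pmem'."""
    # Assumption: pmem block devices show up as <dev_dir>/pmem* nodes.
    paths = glob.glob(os.path.join(dev_dir, "pmem*"))
    return sorted(os.path.basename(p) for p in paths)


if __name__ == "__main__":
    devices = list_pmem_devices()
    if devices:
        print("pmem devices:", ", ".join(devices))
    else:
        print("no pmem devices enumerated")
```

An empty result on a system with NVDIMM-N installed would correspond to the "no pmems are enumerated" symptom described above.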

[Regression Potential]

The regression risk is low. The issue affects Dell systems; Intel systems are not impacted.

[Other Info]
Additional Info:
* Seen with upstream linux-4.20.2 also.

* Upstream bug report -
https://github.com/pmem/ndctl/issues/78

* Upstream kernel patch:
https://lists.01.org/pipermail/linux-nvdimm/2019-January/019435.html

Please incorporate it into the bionic and xenial kernels.

information type: Public → Private
information type: Private → Public
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1811660

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Michael Reed (mreed8855) wrote : Re: Unable to modify/create-namespace over NVDIMM-N

Can you add the firmware versions for the BIOS and for the NVDIMM? Can you also include the "uname -a" output?

tags: added: bionic tpp xenial
Sujith Pandel (sujithpandel) wrote :

It can be seen across all 14G Dell EMC servers carrying NVDIMM-N.

My current setup is a T640 with BIOS 1.6.13 and NVDIMM firmware version 9324.

Sujith Pandel (sujithpandel) wrote :

The issue is seen across all Ubuntu 18.04 kernels.

Sujith Pandel (sujithpandel) wrote :
Jeff Lane  (bladernr) wrote :

Out of curiosity, are you using ipmctl and libsafec from Juston Li's PPAs to configure the NVDIMMs?

https://launchpad.net/~jhli/+archive/ubuntu/ipmctl
https://launchpad.net/~jhli/+archive/ubuntu/libsafec

Jerry Clement (jerry-clement) wrote :

After booting into the OS, the user is unable to create namespaces by running "ndctl create-namespace". Even with flags such as -f or -e, the operation fails.
I am not aware of ipmctl being approved as a management tool for NVDIMM-N (yet?).

Sujith Pandel (sujithpandel) wrote :
Michael Reed (mreed8855)
tags: added: verification-needed-bionic verification-needed-xenial
summary: - Unable to modify/create-namespace over NVDIMM-N
+ [SRU] Unable to modify/create-namespace over NVDIMM-N
Michael Reed (mreed8855)
description: updated
Khaled El Mously (kmously) wrote :

I have built a bionic test kernel with those fixes (+2 others needed for backporting) for testing.

The kernel can be found at:

https://kernel.ubuntu.com/~kmously/1811660/kernel-kmously-96d0cc8-l7Et/

@Sujith Pandel, can you please confirm whether this kernel fixes the problem?

Thanks.

Sujith Pandel (sujithpandel) wrote :

Another issue has popped up; it looks like a few more nvdimm patches need backporting:

# ndctl create-namespace -e "namespace0.0" -m fsdax -f -vvv
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
libndctl: ndctl_dimm_enable: nmem0: failed to enable
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels
libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels
libndctl: sizeof_namespace_index: nmem0: label are...
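
Because the output above repeats the same two messages many times, it can help to condense it before comparing runs. The following is a minimal Python sketch, assuming the message format shown in this comment; the function name summarize_label_errors is hypothetical and is not part of ndctl or libndctl.

```python
import re
from collections import Counter

# Matches lines such as:
# libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels
LABEL_ERR = re.compile(
    r"(\w+): label area \((\d+)\) too small to host \((\d+) byte\) labels"
)


def summarize_label_errors(log_text):
    """Count 'label area too small' messages per (dimm, area_size, label_size)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LABEL_ERR.search(line)
        if match:
            dimm, area, label = match.groups()
            counts[(dimm, int(area), int(label))] += 1
    # Lines that do not match (for example the "failed to enable" line) are ignored.
    return dict(counts)


if __name__ == "__main__":
    sample = (
        "libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (256 byte) labels\n"
        "libndctl: ndctl_dimm_enable: nmem0: failed to enable\n"
        "libndctl: sizeof_namespace_index: nmem0: label area (1024) too small to host (128 byte) labels\n"
    )
    for (dimm, area, label), n in summarize_label_errors(sample).items():
        print(f"{dimm}: label area {area} too small for {label}-byte labels ({n}x)")
```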

Andy Whitcroft (apw) wrote :

This bug was erroneously marked for verification in bionic; verification is not required and verification-needed-bionic is being removed.

tags: added: kernel-fixup-verification-needed-bionic verification-done-bionic
removed: verification-needed-bionic
Jeff Lane  (bladernr) wrote :

Has this been tested with Disco (19.04) or Cosmic (18.10)? IOW does this issue with ndctl only exist in Bionic, or does it carry forward?

Sujith Pandel (sujithpandel) wrote :

This is a kernel issue and we have seen this with upstream kernels and verified the fixes on upstream kernels.
I expect the bionic 4.15 kernel and later kernels to show this defect.

Although I have not tried applying only the kernel fix on bionic, the test kernel shared above shows that the original defect is resolved; however, another one has appeared.
This indicates that a few more patches are required to get namespace modification fully working.

Michael Reed (mreed8855) wrote :

Sujith,

What is the other issue, and have you identified the other patches required to get namespace modification fully working?

torel (torehl) wrote :

I am seeing this on bionic with the 4.15.0-60-generic x86_64 kernel.

Any movement on this issue?

root@srl-mds2:~# uname -ar
Linux srl-mds2 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

root@srl-mds2:~# ndctl list
[
  {
    "dev":"namespace1.0",
    "mode":"raw",
    "size":206158430208,
    "sector_size":512,
    "blockdev":"pmem1",
    "numa_node":1
  },
  {
    "dev":"namespace0.0",
    "mode":"raw",
    "size":206158430208,
    "sector_size":512,
    "blockdev":"pmem0",
    "numa_node":0
  }
]

root@srl-mds2:~# ndctl create-namespace -e "namespace0.0" -m fsdax -f -vvv
enable_labels:945: region0: failed to initialize labels
namespace_reconfig:977: region0: no idle namespace seed
failed to reconfigure namespace: No such device
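
The "ndctl list" output above is JSON, so namespaces that are still in raw mode can be picked out programmatically. The following is a minimal Python sketch, assuming the field names shown in that output (dev, mode, size, blockdev); the function name raw_namespaces and the SAMPLE constant are illustrative only.

```python
import json

# Sample copied from the "ndctl list" output in this comment.
SAMPLE = """[
  {"dev":"namespace1.0","mode":"raw","size":206158430208,
   "sector_size":512,"blockdev":"pmem1","numa_node":1},
  {"dev":"namespace0.0","mode":"raw","size":206158430208,
   "sector_size":512,"blockdev":"pmem0","numa_node":0}
]"""


def raw_namespaces(ndctl_list_json):
    """Return (dev, blockdev, size_gib) for every namespace whose mode is 'raw'."""
    data = json.loads(ndctl_list_json)
    # Defensive assumption: accept a single JSON object as well as an array.
    if isinstance(data, dict):
        data = [data]
    return [
        (ns["dev"], ns.get("blockdev"), ns["size"] / 2**30)
        for ns in data
        if ns.get("mode") == "raw"
    ]


if __name__ == "__main__":
    for dev, blockdev, size_gib in raw_namespaces(SAMPLE):
        print(f"{dev} ({blockdev}): raw, {size_gib:.0f} GiB")
```

For the sample, both namespaces are reported as raw, each 192 GiB (206158430208 bytes divided by 2^30), which matches the state the failed reconfiguration to fsdax leaves behind.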

torel (torehl) wrote :

Tested the latest ndctl, version 66, with the Ubuntu 4.15.0-60-generic kernel.

Still the same issue. Do I need a kernel fix?

root@srl-mds2:~# module load ndctl/66
root@srl-mds2:~# which ndctl
/cm/shared/apps/ndctl/66/bin/ndctl

root@srl-mds2:~# ldd /cm/shared/apps/ndctl/66/bin/ndctl
 linux-vdso.so.1 (0x00007ffff7ffa000)
 libndctl.so.6 => /cm/shared/apps/ndctl/66/usr/lib/libndctl.so.6 (0x00007ffff7985000)
 libdaxctl.so.1 => /cm/shared/apps/ndctl/66/usr/lib/libdaxctl.so.1 (0x00007ffff777d000)
 libuuid.so.1 => /lib/x86_64-linux-gnu/libuuid.so.1 (0x00007ffff7576000)
 libjson-c.so.3 => /lib/x86_64-linux-gnu/libjson-c.so.3 (0x00007ffff736b000)
 libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007ffff7167000)
 libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff6d76000)
 libudev.so.1 => /lib/x86_64-linux-gnu/libudev.so.1 (0x00007ffff6b58000)
 libkmod.so.2 => /lib/x86_64-linux-gnu/libkmod.so.2 (0x00007ffff6941000)
 /lib64/ld-linux-x86-64.so.2 (0x00007ffff7dd5000)
 librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007ffff6739000)
 libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007ffff651a000)

root@srl-mds2:~# ndctl create-namespace --mode fsdax --map dev -e namespace0.0 -f -vvv
enable_labels:1029: region0: failed to initialize labels
namespace_reconfig:1061: region0: no idle namespace seed
failed to reconfigure namespace: No such device

torel (torehl) wrote :

Khaled, could you please build a 4.15.0-60-generic test kernel with the fix?

torel (torehl) wrote :

Would it be possible to provide an nfit kmod with the fix?

Michael Reed (mreed8855) wrote :

I think this is fixed based on the status of bug https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1811785. I will wait to close this bug until I can verify the fix in focal.
