Ubuntu16.04: NVMe 4K+T10 DIF/DIX format returns I/O error on dd with split op

Bug #1689946 reported by bugproxy on 2017-05-10
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
The Ubuntu-power-systems project
High
Unassigned
linux (Ubuntu)
Medium
Joseph Salisbury
Xenial
Medium
Joseph Salisbury
Yakkety
Medium
Joseph Salisbury
Zesty
Medium
Joseph Salisbury
Artful
Medium
Joseph Salisbury

Bug Description

==== State: Open by: mdate on 19 March 2017 12:33:34 ====

On a Bolt adapter in a system with Ubuntu 16.04, I've formatted the Bolt for T10 and am using it to do a dd with a 2M block size.

Here are the commands:
nvme format /dev/nvme0n1 --lbaf=1 --pil=0 --ms=0 --pi=2

dd if=/dev/urandom of=/dev/nvme0n1 bs=2M oflag=direct count=1

I get an error on the dd.
 root@x1623bp1:~# dd if=/dev/urandom of=/dev/nvme0n1 bs=2M oflag=direct count=1
 dd: error writing '/dev/nvme0n1': Input/output error
 1+0 records in
 0+0 records out
 0 bytes copied, 0.0525061 s, 0.0 kB/s

dmesg shows:
        [589997.985151] blk_update_request: I/O error, dev nvme0n1, sector 0

Default Comment by Bridge

tags: added: architecture-ppc64le bugnameltc-152710 severity-high targetmilestone-inin16043
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)

------- Comment From <email address hidden> 2017-05-11 03:33 EDT-------

Changed in linux (Ubuntu):
status: New → In Progress
Changed in linux (Ubuntu Xenial):
status: New → In Progress
Changed in linux (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu Xenial):
importance: Undecided → Medium
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I built a 16.04 test kernel with a pick of commit f36ea50ca004. The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1689946/

Can you test this kernel and see if it resolves this bug?

Thanks in advance!

Frank Heimes (fheimes) on 2017-05-15
Changed in ubuntu-power-systems:
status: New → In Progress
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-06-12 16:23 EDT-------
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/block/blk-mq.c?id=f36ea50ca0043e7b1204feaf1d2ba6bd68c08d36

Above is upstream id.

I have tested your kernel and works fine.

Joseph Salisbury (jsalisbury) wrote :

I'd like to submit an SRU request to have commit f36ea50ca004 included in Xenial. However, I wanted to check an see if it is also needed in the newer Ubuntu releases as well.

Commit f36ea50ca004 was included in mainline as of v4.12-rc1. It changes the location of blk_queue_split in blk_mq_make_request, which was originally added by commit 54efd50bfd87 in v4.3-rc1. That would suggested that commit f36ea50ca004 is also needed in Yakkety, Zesty and Artful. Can you confirm this bug also exists in those releases and require commit f36ea50ca004?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-06-16 09:35 EDT-------
This CMVC defect is being cancelled by the CDE Bridge because the corresponding CQ Defect [SW383941] was transferred out of the bridge domain.
Here are the additional details:
New Subsystem = ppc_triage
New Release = unspecified
New Component = ubuntu_linux
New OwnerInfo = Chavez, Luciano (<email address hidden>)
To continue tracking this issue, please follow CQ defect [SW383941].

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-07-13 10:48 EDT-------
Yes. Commit f36ea50ca004 should be included in the kernels newer than v4.3-rc1.

Thanks,
Wendy

Changed in linux (Ubuntu Zesty):
status: New → In Progress
Changed in linux (Ubuntu Yakkety):
status: New → In Progress
importance: Undecided → Medium
Changed in linux (Ubuntu Zesty):
importance: Undecided → Medium
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Yakkety):
assignee: nobody → Joseph Salisbury (jsalisbury)
Seth Forshee (sforshee) on 2017-07-14
Changed in linux (Ubuntu Artful):
status: In Progress → Fix Committed
Manoj Iyer (manjo) on 2017-07-19
Changed in ubuntu-power-systems:
importance: Undecided → Medium
importance: Medium → High

This bug was nominated against a series that is no longer supported, ie yakkety. The bug task representing the yakkety nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Yakkety):
status: In Progress → Won't Fix
Manoj Iyer (manjo) on 2017-07-31
tags: added: triage-g
Launchpad Janitor (janitor) wrote :
Download full text (24.9 KiB)

This bug was fixed in the package linux - 4.11.0-13.19

---------------
linux (4.11.0-13.19) artful; urgency=low

  * CVE-2017-7533
    - dentry name snapshots

linux (4.11.0-12.18) artful; urgency=low

  * linux: 4.11.0-12.18 -proposed tracker (LP: #1707635)
    - no change rebuild to pick up the new binutils.

  * Adt tests of src:linux time out often on armhf lxc containers (LP: #1705495)
    - [Packaging] tests -- reduce rebuild test to one flavour
    - [Packaging] tests -- reduce rebuild test to one flavour -- use filter

  * [ARM64] config EDAC_GHES=y depends on EDAC_MM_EDAC=y (LP: #1706141)
    - [Config] set EDAC_MM_EDAC=y for ARM64

  * [Hyper-V] hv_netvsc: Exclude non-TCP port numbers from vRSS hashing
    (LP: #1690174)
    - hv_netvsc: Exclude non-TCP port numbers from vRSS hashing

  * ath10k doesn't report full RSSI information (LP: #1706531)
    - ath10k: add per chain RSSI reporting

  * ideapad_laptop don't support v310-14isk (LP: #1705378)
    - platform/x86: ideapad-laptop: Add several models to no_hw_rfkill

  * Ubuntu 16.04.3: Qemu fails on P9 (LP: #1686019)
    - KVM: PPC: Pass kvm* to kvmppc_find_table()
    - KVM: PPC: Use preregistered memory API to access TCE list
    - KVM: PPC: VFIO: Add in-kernel acceleration for VFIO
    - powerpc/powernv/iommu: Add real mode version of iommu_table_ops::exchange()
    - powerpc/iommu/vfio_spapr_tce: Cleanup iommu_table disposal
    - powerpc/vfio_spapr_tce: Add reference counting to iommu_table
    - powerpc/mmu: Add real mode support for IOMMU preregistered memory
    - KVM: PPC: Reserve KVM_CAP_SPAPR_TCE_VFIO capability number
    - KVM: PPC: Book3S HV: Add radix checks in real-mode hypercall handlers

  * hns: ethtool selftest crashes system (LP: #1705712)
    - net/hns:bugfix of ethtool -t phy self_test

  * ThunderX: soft lockup on 4.8+ kernels when running qemu-efi with vhost=on
    (LP: #1673564)
    - KVM: arm/arm64: vgic-v3: Use PREbits to infer the number of ICH_APxRn_EL2
      registers
    - KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction
    - arm64: Add a facility to turn an ESR syndrome into a sysreg encoding
    - KVM: arm/arm64: vgic-v3: Add accessors for the ICH_APxRn_EL2 registers
    - KVM: arm64: Make kvm_condition_valid32() accessible from EL2
    - KVM: arm64: vgic-v3: Add hook to handle guest GICv3 sysreg accesses at EL2
    - KVM: arm64: vgic-v3: Add ICV_BPR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IGRPEN1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IAR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_EOIR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_AP1Rn_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_HPPIR1_EL1 handler
    - KVM: arm64: vgic-v3: Enable trapping of Group-1 system registers
    - KVM: arm64: Enable GICv3 Group-1 sysreg trapping via command-line
    - KVM: arm64: vgic-v3: Add ICV_BPR0_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IGNREN0_EL1 handler
    - KVM: arm64: vgic-v3: Add misc Group-0 handlers
    - KVM: arm64: vgic-v3: Enable trapping of Group-0 system registers
    - KVM: arm64: Enable GICv3 Group-0 sysreg trapping via command-line
    - arm64: Add MIDR values for Cavium cn83XX SoCs
    - arm64: Add wor...

Changed in linux (Ubuntu Artful):
status: Fix Committed → Fix Released
Stefan Bader (smb) on 2017-08-11
Changed in linux (Ubuntu Zesty):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
tags: added: verification-needed-zesty

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-zesty' to 'verification-done-zesty'. If the problem still exists, change the tag 'verification-needed-zesty' to 'verification-failed-zesty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

------- Comment From <email address hidden> 2017-08-16 23:08 EDT-------
Split op operations do not fail with HTX when testing with proposed fix.

tags: added: verification-done-xenial verification-done-zesty
removed: verification-needed-xenial verification-needed-zesty
Launchpad Janitor (janitor) wrote :
Download full text (16.2 KiB)

This bug was fixed in the package linux - 4.4.0-93.116

---------------
linux (4.4.0-93.116) xenial; urgency=low

  * linux: 4.4.0-93.116 -proposed tracker (LP: #1709296)

  * Creating conntrack entry failure with kernel 4.4.0-89 (LP: #1709032)
    - Revert "Revert "netfilter: synproxy: fix conntrackd interaction""
    - netfilter: nf_ct_ext: fix possible panic after nf_ct_extend_unregister

  * CVE-2017-1000112
    - Revert "udp: consistently apply ufo or fragmentation"
    - udp: consistently apply ufo or fragmentation

  * CVE-2017-1000111
    - Revert "net-packet: fix race in packet_set_ring on PACKET_RESERVE"
    - packet: fix tp_reserve race in packet_set_ring

  * kernel BUG at [tty_ldisc_reinit] mm/slub.c! (LP: #1709126)
    - tty: Simplify tty_set_ldisc() exit handling
    - tty: Reset c_line from driver's init_termios
    - tty: Handle NULL tty->ldisc
    - tty: Move tty_ldisc_kill()
    - tty: Use 'disc' for line discipline index name
    - tty: Refactor tty_ldisc_reinit() for reuse
    - tty: Destroy ldisc instance on hangup

  * atheros bt failed after S3 (LP: #1706833)
    - SAUCE: Bluetooth: Make request workqueue freezable

  * The Precision Touchpad(PTP) button sends incorrect event code (LP: #1708372)
    - HID: multitouch: handle external buttons for Precision Touchpads

  * Set CONFIG_SATA_HIGHBANK=y on armhf (LP: #1703430)
    - [Config] CONFIG_SATA_HIGHBANK=y

  * xfs slab objects (memory) leak when xfs shutdown is called (LP: #1706132)
    - xfs: fix xfs_log_ticket leak in xfs_end_io() after fs shutdown

  * Adt tests of src:linux time out often on armhf lxc containers (LP: #1705495)
    - [Packaging] tests -- reduce rebuild test to one flavour

  * CVE-2017-7495
    - ext4: fix data exposure after a crash

  * ubuntu/rsi driver downlink wifi throughput drops to 5-6 Mbps when BT
    keyboard is connected (LP: #1706991)
    - SAUCE: Redpine: enable power save by default for coex mode
    - SAUCE: Redpine: uapsd configuration changes

  * [Hyper-V] hv_netvsc: Exclude non-TCP port numbers from vRSS hashing
    (LP: #1690174)
    - hv_netvsc: Exclude non-TCP port numbers from vRSS hashing

  * ath10k doesn't report full RSSI information (LP: #1706531)
    - ath10k: add per chain RSSI reporting

  * ideapad_laptop don't support v310-14isk (LP: #1705378)
    - platform/x86: ideapad-laptop: Add several models to no_hw_rfkill

  * [8087:0a2b] Failed to load bluetooth firmware(might affect some other Intel
    bt devices) (LP: #1705633)
    - Bluetooth: btintel: Create common Intel Version Read function
    - Bluetooth: Use switch statement for Intel hardware variants
    - Bluetooth: Replace constant hw_variant from Intel Bluetooth firmware
      filename
    - Bluetooth: hci_intel: Fix firmware file name to use hw_variant
    - Bluetooth: btintel: Add MODULE_FIRMWARE entries for iBT 3.5 controllers

  * xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2
    comp_code 13 (LP: #1667750)
    - xhci: Bad Ethernet performance plugged in ASM1042A host

  * OpenPower: Some multipaths temporarily have only a single path
    (LP: #1696445)
    - scsi: ses: don't get power status of SES device slot on probe

  ...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :
Download full text (8.5 KiB)

This bug was fixed in the package linux - 4.10.0-33.37

---------------
linux (4.10.0-33.37) zesty; urgency=low

  * linux: 4.10.0-33.37 -proposed tracker (LP: #1709303)

  * CVE-2017-1000112
    - Revert "udp: consistently apply ufo or fragmentation"
    - udp: consistently apply ufo or fragmentation

  * CVE-2017-1000111
    - Revert "net-packet: fix race in packet_set_ring on PACKET_RESERVE"
    - packet: fix tp_reserve race in packet_set_ring

  * ThunderX: soft lockup on 4.8+ kernels when running qemu-efi with vhost=on
    (LP: #1673564)
    - irqchip/gic-v3: Add missing system register definitions
    - arm64: KVM: Do not use stack-protector to compile EL2 code
    - KVM: arm/arm64: vgic-v3: Use PREbits to infer the number of ICH_APxRn_EL2
      registers
    - KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction
    - arm64: Add a facility to turn an ESR syndrome into a sysreg encoding
    - KVM: arm/arm64: vgic-v3: Add accessors for the ICH_APxRn_EL2 registers
    - KVM: arm64: Make kvm_condition_valid32() accessible from EL2
    - KVM: arm64: vgic-v3: Add hook to handle guest GICv3 sysreg accesses at EL2
    - KVM: arm64: vgic-v3: Add ICV_BPR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IGRPEN1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IAR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_EOIR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_AP1Rn_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_HPPIR1_EL1 handler
    - KVM: arm64: vgic-v3: Enable trapping of Group-1 system registers
    - KVM: arm64: Enable GICv3 Group-1 sysreg trapping via command-line
    - KVM: arm64: vgic-v3: Add ICV_BPR0_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IGNREN0_EL1 handler
    - KVM: arm64: vgic-v3: Add misc Group-0 handlers
    - KVM: arm64: vgic-v3: Enable trapping of Group-0 system registers
    - KVM: arm64: Enable GICv3 Group-0 sysreg trapping via command-line
    - arm64: Add MIDR values for Cavium cn83XX SoCs
    - [Config] CONFIG_CAVIUM_ERRATUM_30115=y
    - arm64: Add workaround for Cavium Thunder erratum 30115
    - KVM: arm64: vgic-v3: Add ICV_DIR_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_RPR_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_CTLR_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_PMR_EL1 handler
    - KVM: arm64: Enable GICv3 common sysreg trapping via command-line
    - KVM: arm64: vgic-v3: Log which GICv3 system registers are trapped
    - arm64: KVM: Make unexpected reads from WO registers inject an undef
    - KVM: arm64: Log an error if trapping a read-from-write-only GICv3 access
    - KVM: arm64: Log an error if trapping a write-to-read-only GICv3 access

  * ibmvscsis: Do not send aborted task response (LP: #1689365)
    - target: Fix unknown fabric callback queue-full errors
    - ibmvscsis: Do not send aborted task response
    - ibmvscsis: Clear left-over abort_cmd pointers
    - ibmvscsis: Fix the incorrect req_lim_delta

  * hisi_sas performance improvements (LP: #1708734)
    - scsi: hisi_sas: define hisi_sas_device.device_id as int
    - scsi: hisi_sas: optimise the usage of hisi_hba.lock
    - scsi: hisi_sas: relocate sata_done_v2_hw()
    - scsi: hisi_sas: optimise DMA slot memory

  * hisi_sas...

Read more...

Changed in linux (Ubuntu Zesty):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: In Progress → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-06 10:26 EDT-------
Hi jsalisbury

We tried to installation Ubuntu17.04 over nvme + T10 enable but it failed during installation.
By our understanding, the patch in this bug is dropped into 4.10.0-33 kernel but installation kernel is 4.10.0-19. How can we drop the patch into installation kernel? Any suggestion?

Thanks for your help!

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-18 11:03 EDT-------
Nvme device is one of load source now. We need to support installation with T10 enable over nvme device. So we would like to drop this patch into installer kernel for Ubuntu installation.

Let me know if you have any question.

Thanks,
Wendy

Manoj Iyer (manjo) wrote :
Manoj Iyer (manjo) wrote :

Wendy,

At this time you can use the installer in -proposed, once the kernel in -proposed gets promoted to -updates (typically within ~5 weeks) you can use the installer from -updates. Here is the link to the installer in Xenial-updates:

http://ports.ubuntu.com/dists/xenial-updates/main/installer-ppc64el/current/images/hwe-netboot/

Currently installer build in -updates is from 7/28 and -proposed is from 9/14

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-09-18 14:13 EDT-------
Hi Manjo,

Thanks for your help!
I have installed Ubuntu16.04 successfully on nvme + T10 enable from -proposed.

/dev/nvme2n1 S3RVNA0J300254 MZPLL1T6HEHP 1 4.10 GB / 4.10 GB 4 KiB + 8 B MN11MN11

root@ltc-briggs6:~# df -k
Filesystem 1K-blocks Used Available Use% Mounted on
udev 63491520 0 63491520 0% /dev
tmpfs 13381056 21504 13359552 1% /run
/dev/nvme2n1p2 3650612 1255888 2189568 37% /
tmpfs 66905216 0 66905216 0% /dev/shm
tmpfs 5120 0 5120 0% /run/lock
tmpfs 66905216 0 66905216 0% /sys/fs/cgroup
tmpfs 13381056 0 13381056 0% /run/user/1000

Thanks,
Wendy

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers