md raid0/linear doesn't show error state if an array member is removed and allows successful writes

Bug #1847773 reported by Guilherme G. Piccoli
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
High
Guilherme G. Piccoli
Bionic
Fix Released
High
Guilherme G. Piccoli
Disco
Fix Released
High
Guilherme G. Piccoli
Eoan
Fix Released
High
Guilherme G. Piccoli

Bug Description

[Impact]

* Currently, mounted raid0/md-linear arrays have no indication/warning when one or more members are removed or suffer from some non-recoverable error condition.

* Given that, arrays keep mounted, and regular written data to it goes through page cache and appear as successful written to the devices, despite writeback threads can't write to it. For users, it can potentially cause data corruption, given that even "sync" command will return success despite the data is not written to the disk. Kernel messages will show I/O errors though.

* The patch proposed in this SRU addresses this issue in 2 levels; first, it fast-fails written I/Os to the raid0/md-linear array devices with one or more failed members. Also, it introduces the "broken" state, which is analog to "clean" but indicates that array is not in a good/correct state. A message showed in dmesg helps to clarify when such array gets a member removed/failed.

* The commit proposed here, available on Linus tree as 62f7b1989c02 ("md raid0/linear: Mark array as 'broken' and fail BIOs if a member is gone") [http://git.kernel.org/linus/62f7b1989c02], was pretty discussed upstream and received a good amount of reviews/analysis by both the current md maintainer as well as an old maintainer.

* One important note here is that this patch requires a counter-part in mdadm tool to be fully functional, which was SRUed in LP: #1847924.
It works fine without this counter-part, but in case of broken arrays, the
"mdadm --detail" command won't show broken, and instead will show "clean, FAILED".

* We ask hereby an exception from kernel team to have this backported to kernel 4.15 *only in Bionic* and not in Xenial. The reason is that mdadm code changed too much and we didn't want to introduce a potential regression in the Xenial version from that tool, so we only backported the mdadm counter-part of this patch to Bionic, Disco and Eoan - hence, we'd like to have a match in the kernel backported versions.

[Test case]

* To test this patch, create a raid0 or linear md array on Linux using mdadm, like in: "mdadm --create md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1";

* Format the array using a filesystem of your choice (for example ext4) and mount the array;

* Remove one member of the array, for example using sysfs interface (for nvme: echo 1 > /sys/block/nvme0nX/device/device/remove, for scsi: echo 1 > /sys/block/sdX/device/delete);

* Without this patch, the array partition can be written with success, and "mdadm --detail" will show clean state.

[Regression potential]

* There's not much potential regression here; we failed written I/Os to bad arrays and show message/status according to it, showing the array broken status. We believe the most common "issue" that could be reported from this patch is if an userspace tool rely on success of I/O writes or in the "clean" state of an array - after this patch it can potentially have a different behavior in case of a broken array.

summary: - md raid0/linear don't show error state if an array member is removed
+ md raid0/linear don't show error state if an array member is removed and
+ allows successful writes
summary: - md raid0/linear don't show error state if an array member is removed and
- allows successful writes
+ md raid0/linear doesn't show error state if an array member is removed
+ and allows successful writes
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

There was a first attempt to mimic the behavior of NVMe/SCSI devices removed while holding a mounted filesystem: <email address hidden>/T/#u

It was quite complex, relying on force an unmount operation in this filesystem, stop writeback threads and remove the md block device. It got not many reviews, and most of them not favorable for it, hence we proposed a more simpler approach, hereby SRUEd.

The RFC cover-letter aforementioned details the issue to a large extent, so it may be interesting read for the interested parties in this issue.

Thanks,

Guilherme

description: updated
Changed in linux (Ubuntu Bionic):
status: New → Confirmed
Changed in linux (Ubuntu Disco):
status: New → Confirmed
importance: Undecided → High
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in linux (Ubuntu Disco):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

SRU sent to kernel-team mailing-list: lists.ubuntu.com/archives/kernel-team/2019-October/104691.html

Changed in linux (Ubuntu Bionic):
status: Confirmed → Fix Committed
Changed in linux (Ubuntu Eoan):
status: Confirmed → Fix Committed
Changed in linux (Ubuntu Disco):
status: Confirmed → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-disco' to 'verification-done-disco'. If the problem still exists, change the tag 'verification-needed-disco' to 'verification-failed-disco'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-disco
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-eoan' to 'verification-done-eoan'. If the problem still exists, change the tag 'verification-needed-eoan' to 'verification-failed-eoan'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-eoan
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

I have verified this bug in all the required releases:
* Bionic, with kernel 4.15.0-68-generic;
* Disco, with kernel 5.0.0-33-generic;
* Eoan, with kernel 5.3.0-21-generic.

I've used the test procedure in the description, and used upstream mdadm for that.
All working fine, as expected. I could see the 'broken' state after removing an array member and the FS failed fast for write I/Os.

Also, worth mentioning I've tested Xenial-HWE kernel in -proposed (also 4.15.0-68) and as expected, it doesn't contain this modification, working exactly as before the change.
Thanks,

Guilherme

tags: added: verification-done-bionic verification-done-disco verification-done-eoan
removed: verification-needed-bionic verification-needed-disco verification-needed-eoan
Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (linux-gcp-5.3/5.3.0-1008.9~18.04.1)

All autopkgtests for the newly accepted linux-gcp-5.3 (5.3.0-1008.9~18.04.1) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

linux-gcp-5.3/unknown (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#linux-gcp-5.3

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (53.1 KiB)

This bug was fixed in the package linux - 5.3.0-22.24

---------------
linux (5.3.0-22.24) eoan; urgency=medium

  * [REGRESSION] md/raid0: cannot assemble multi-zone RAID0 with default_layout
    setting (LP: #1849682)
    - Revert "md/raid0: avoid RAID0 data corruption due to layout confusion."

  * refcount underflow and type confusion in shiftfs (LP: #1850867) // CVE-2019-15793
    - SAUCE: shiftfs: Correct id translation for lower fs operations
    - SAUCE: shiftfs: prevent type confusion
    - SAUCE: shiftfs: Fix refcount underflow in btrfs ioctl handling

  * CVE-2018-12207
    - kvm: x86, powerpc: do not allow clearing largepages debugfs entry
    - SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
      active
    - SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
    - SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
    - SAUCE: kvm: Add helper function for creating VM worker threads
    - SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
    - SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
    - SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
    - x86/msr: Add the IA32_TSX_CTRL MSR
    - x86/cpu: Add a helper function x86_read_arch_cap_msr()
    - x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
    - x86/speculation/taa: Add mitigation for TSX Async Abort
    - x86/speculation/taa: Add sysfs reporting for TSX Async Abort
    - kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
    - x86/tsx: Add "auto" option to the tsx= cmdline parameter
    - x86/speculation/taa: Add documentation for TSX Async Abort
    - x86/tsx: Add config options to set tsx=on|off|auto
    - [Config] Disable TSX by default when possible

  * CVE-2019-0154
    - SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
    - SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
    - SAUCE: drm/i915: Rename gen7 cmdparser tables
    - SAUCE: drm/i915: Disable Secure Batches for gen6+
    - SAUCE: drm/i915: Remove Master tables from cmdparser
    - SAUCE: drm/i915: Add support for mandatory cmdparsing
    - SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
    - SAUCE: drm/i915: Allow parsing of unsized batches
    - SAUCE: drm/i915: Add gen9 BCS cmdparsing
    - SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
    - SAUCE: drm/i915/cmdparser: Add support for backward jumps
    - SAUCE: drm/i915/cmdparser: Ignore Length operands during command matching

linux (5.3.0-21.22) eoan; urgency=medium

  * eoan/linux: 5.3.0-21.22 -proposed tracker (LP: #1850486)

  * Fix signing of staging modules in eoan (LP: #1850234)
    - [Packaging] Leave unsigned modules unsigned after adding .gnu_debuglink

linux (5.3.0-20.21) eoan; urgency=medium

  * eoan/linux: 5.3.0-20.21 -proposed tracker (LP: #1849064)

  * eoan: alsa/sof: Enable SOF_HDA link and codec (LP: #1848490)
    - [Config] Enable SOF_HDA link and codec

  * Eoan update: 5.3.7 upstream stable release (LP: #1848750)
    - panic: ensure preemption is disabled during panic()
    - [Config] updateconfigs for USB_RIO500
    - USB: rio500: Remove Rio 500 kernel driver
   ...

Changed in linux (Ubuntu Eoan):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (38.6 KiB)

This bug was fixed in the package linux - 5.0.0-35.38

---------------
linux (5.0.0-35.38) disco; urgency=medium

  * [REGRESSION] md/raid0: cannot assemble multi-zone RAID0 with default_layout
    setting (LP: #1849682)
    - SAUCE: Fix revert "md/raid0: avoid RAID0 data corruption due to layout
      confusion."

  * refcount underflow and type confusion in shiftfs (LP: #1850867) // CVE-2019-15793
    - SAUCE: shiftfs: Correct id translation for lower fs operations
    - SAUCE: shiftfs: prevent type confusion
    - SAUCE: shiftfs: Fix refcount underflow in btrfs ioctl handling

  * CVE-2018-12207
    - kvm: Convert kvm_lock to a mutex
    - kvm: x86: Do not release the page inside mmu_set_spte()
    - KVM: x86: make FNAME(fetch) and __direct_map more similar
    - KVM: x86: remove now unneeded hugepage gfn adjustment
    - KVM: x86: change kvm_mmu_page_get_gfn BUG_ON to WARN_ON
    - KVM: x86: add tracepoints around __direct_map and FNAME(fetch)
    - kvm: x86, powerpc: do not allow clearing largepages debugfs entry
    - SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
      active
    - SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
    - SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
    - SAUCE: kvm: Add helper function for creating VM worker threads
    - SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
    - SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
    - SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
    - KVM: x86: use Intel speculation bugs and features as derived in generic x86
      code
    - x86/msr: Add the IA32_TSX_CTRL MSR
    - x86/cpu: Add a helper function x86_read_arch_cap_msr()
    - x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
    - x86/speculation/taa: Add mitigation for TSX Async Abort
    - x86/speculation/taa: Add sysfs reporting for TSX Async Abort
    - kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
    - x86/tsx: Add "auto" option to the tsx= cmdline parameter
    - x86/speculation/taa: Add documentation for TSX Async Abort
    - x86/tsx: Add config options to set tsx=on|off|auto
    - SAUCE: x86/speculation/taa: Call tsx_init()
    - [Config] Disable TSX by default when possible

  * CVE-2019-0154
    - SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
    - SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
    - SAUCE: drm/i915: Rename gen7 cmdparser tables
    - SAUCE: drm/i915: Disable Secure Batches for gen6+
    - SAUCE: drm/i915: Remove Master tables from cmdparser
    - SAUCE: drm/i915: Add support for mandatory cmdparsing
    - SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
    - SAUCE: drm/i915: Allow parsing of unsized batches
    - SAUCE: drm/i915: Add gen9 BCS cmdparsing
    - SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
    - SAUCE: drm/i915/cmdparser: Add support for backward jumps
    - SAUCE: drm/i915/cmdparser: Ignore Length operands during command matching

linux (5.0.0-34.36) disco; urgency=medium

  * disco/linux: <version to be filled> -proposed tracker (LP: #1850574)

  * [REGRESSION] md/raid0: cannot as...

Changed in linux (Ubuntu Disco):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (24.0 KiB)

This bug was fixed in the package linux - 4.15.0-69.78

---------------
linux (4.15.0-69.78) bionic; urgency=medium

  * KVM NULL pointer deref (LP: #1851205)
    - KVM: nVMX: handle page fault in vmread fix

  * CVE-2018-12207
    - KVM: MMU: drop vcpu param in gpte_access
    - kvm: Convert kvm_lock to a mutex
    - kvm: x86: Do not release the page inside mmu_set_spte()
    - KVM: x86: make FNAME(fetch) and __direct_map more similar
    - KVM: x86: remove now unneeded hugepage gfn adjustment
    - KVM: x86: change kvm_mmu_page_get_gfn BUG_ON to WARN_ON
    - KVM: x86: add tracepoints around __direct_map and FNAME(fetch)
    - kvm: x86, powerpc: do not allow clearing largepages debugfs entry
    - SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is
      active
    - SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
    - SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
    - SAUCE: kvm: Add helper function for creating VM worker threads
    - SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
    - SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
    - SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

  * CVE-2019-11135
    - KVM: x86: use Intel speculation bugs and features as derived in generic x86
      code
    - x86/msr: Add the IA32_TSX_CTRL MSR
    - x86/cpu: Add a helper function x86_read_arch_cap_msr()
    - x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
    - x86/speculation/taa: Add mitigation for TSX Async Abort
    - x86/speculation/taa: Add sysfs reporting for TSX Async Abort
    - kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
    - x86/tsx: Add "auto" option to the tsx= cmdline parameter
    - x86/speculation/taa: Add documentation for TSX Async Abort
    - x86/tsx: Add config options to set tsx=on|off|auto
    - SAUCE: x86/speculation/taa: Call tsx_init()
    - SAUCE: x86/cpu: Include cpu header from bugs.c
    - [Config] Disable TSX by default when possible

  * CVE-2019-0154
    - SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
    - SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

  * CVE-2019-0155
    - drm/i915/gtt: Add read only pages to gen8_pte_encode
    - drm/i915/gtt: Read-only pages for insert_entries on bdw+
    - drm/i915/gtt: Disable read-only support under GVT
    - drm/i915: Prevent writing into a read-only object via a GGTT mmap
    - drm/i915/cmdparser: Check reg_table_count before derefencing.
    - drm/i915/cmdparser: Do not check past the cmd length.
    - drm/i915: Silence smatch for cmdparser
    - drm/i915: Move engine->needs_cmd_parser to engine->flags
    - SAUCE: drm/i915: Rename gen7 cmdparser tables
    - SAUCE: drm/i915: Disable Secure Batches for gen6+
    - SAUCE: drm/i915: Remove Master tables from cmdparser
    - SAUCE: drm/i915: Add support for mandatory cmdparsing
    - SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
    - SAUCE: drm/i915: Allow parsing of unsized batches
    - SAUCE: drm/i915: Add gen9 BCS cmdparsing
    - SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
    - SAUCE: drm/i915/cmdparser: Add support for backward jumps
    - SAUCE: drm/i915/cmdpar...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (33.2 KiB)

This bug was fixed in the package linux - 5.3.0-24.26

---------------
linux (5.3.0-24.26) eoan; urgency=medium

  * eoan/linux: 5.3.0-24.26 -proposed tracker (LP: #1852232)

  * Eoan update: 5.3.9 upstream stable release (LP: #1851550)
    - io_uring: fix up O_NONBLOCK handling for sockets
    - dm snapshot: introduce account_start_copy() and account_end_copy()
    - dm snapshot: rework COW throttling to fix deadlock
    - Btrfs: fix inode cache block reserve leak on failure to allocate data space
    - btrfs: qgroup: Always free PREALLOC META reserve in
      btrfs_delalloc_release_extents()
    - iio: adc: meson_saradc: Fix memory allocation order
    - iio: fix center temperature of bmc150-accel-core
    - libsubcmd: Make _FORTIFY_SOURCE defines dependent on the feature
    - perf tests: Avoid raising SEGV using an obvious NULL dereference
    - perf map: Fix overlapped map handling
    - perf script brstackinsn: Fix recovery from LBR/binary mismatch
    - perf jevents: Fix period for Intel fixed counters
    - perf tools: Propagate get_cpuid() error
    - perf annotate: Propagate perf_env__arch() error
    - perf annotate: Fix the signedness of failure returns
    - perf annotate: Propagate the symbol__annotate() error return
    - perf annotate: Fix arch specific ->init() failure errors
    - perf annotate: Return appropriate error code for allocation failures
    - perf annotate: Don't return -1 for error when doing BPF disassembly
    - staging: rtl8188eu: fix null dereference when kzalloc fails
    - RDMA/siw: Fix serialization issue in write_space()
    - RDMA/hfi1: Prevent memory leak in sdma_init
    - RDMA/iw_cxgb4: fix SRQ access from dump_qp()
    - RDMA/iwcm: Fix a lock inversion issue
    - HID: hyperv: Use in-place iterator API in the channel callback
    - kselftest: exclude failed TARGETS from runlist
    - selftests/kselftest/runner.sh: Add 45 second timeout per test
    - nfs: Fix nfsi->nrequests count error on nfs_inode_remove_request
    - arm64: cpufeature: Effectively expose FRINT capability to userspace
    - arm64: Fix incorrect irqflag restore for priority masking for compat
    - arm64: ftrace: Ensure synchronisation in PLT setup for Neoverse-N1 #1542419
    - tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'
    - tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()'
    - serial/sifive: select SERIAL_EARLYCON
    - tty: n_hdlc: fix build on SPARC
    - misc: fastrpc: prevent memory leak in fastrpc_dma_buf_attach
    - RDMA/core: Fix an error handling path in 'res_get_common_doit()'
    - RDMA/cm: Fix memory leak in cm_add/remove_one
    - RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path
    - RDMA/mlx5: Do not allow rereg of a ODP MR
    - RDMA/mlx5: Order num_pending_prefetch properly with synchronize_srcu
    - RDMA/mlx5: Add missing synchronize_srcu() for MW cases
    - gpio: max77620: Use correct unit for debounce times
    - fs: cifs: mute -Wunused-const-variable message
    - arm64: vdso32: Fix broken compat vDSO build warnings
    - arm64: vdso32: Detect binutils support for dmb ishld
    - serial: mctrl_gpio: Check for NULL pointer
    - serial: 8250_...

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.