Mounting LVM snapshots with xfs can hit kernel BUG in nvme driver

Bug #1869229 reported by Heitor Alves de Siqueira on 2020-03-26
This bug affects 2 people
Affects          Importance   Assigned to
linux (Ubuntu)   Undecided    Unassigned
Xenial           Undecided    Heitor Alves de Siqueira

Bug Description

[Impact]
When mounting LVM snapshots using xfs, it's possible to hit a BUG_ON() in the nvme driver.

Upstream commit 729204ef49ec ("block: relax check on sg gap") introduced a way to merge bios if they are physically contiguous. This can lead to issues if one rq starts with a non-aligned buffer, as the merged segment can then end on an unaligned virtual boundary. On some AWS instances, such a request can be generated by attempting to mount an LVM snapshot that uses xfs. The request then triggers a BUG_ON() in nvme_setup_prps(), which checks that dma_len is aligned to the page size.

[Fix]
Upstream commit 5a8d75a1b8c9 ("block: fix bio_will_gap() for first bvec with offset") prevents requests that begin with an unaligned buffer from being merged.
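
As a rough illustration of the alignment problem and of the idea behind the fix, the following toy model may help. It is not the kernel code from either commit: the 4096-byte page size, the 512-byte offset, the two-page length, and the helper names is_aligned and may_merge are all assumptions chosen for the example.

```shell
#!/bin/bash
# Toy model only: this is NOT the kernel code from either commit.
PAGE_SIZE=4096   # assumed page size / virtual boundary

# Succeeds when the value is a multiple of PAGE_SIZE.
is_aligned() {
    (( ($1 & (PAGE_SIZE - 1)) == 0 ))
}

# Hypothetical merge policy mirroring the idea of the fix: refuse to
# merge when the request's first buffer starts at an unaligned offset.
may_merge() {
    local first_offset=$1
    is_aligned "$first_offset"
}

# Hypothetical request: the first buffer starts 512 bytes into a page,
# and merging physically contiguous bios yields two pages of data.
first_offset=512
merged_len=$(( 2 * PAGE_SIZE ))
segment_end=$(( first_offset + merged_len ))

if is_aligned "$segment_end"; then
    echo "segment end $segment_end is page-aligned"
else
    echo "segment end $segment_end is NOT page-aligned"
fi

if may_merge "$first_offset"; then
    echo "merge allowed"
else
    echo "merge refused: first buffer offset $first_offset is unaligned"
fi
```

In this model, a request whose first buffer starts at offset 0 would still be eligible for merging, which matches the report's statement that the patch only targets misaligned segments.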

[Test Case]
This has been verified on AWS with c5d.large instances:

1) Prepare the LVM device + snapshot
$ sudo vgcreate vg0 /dev/nvme1n1
$ sudo lvcreate -L5G -n data0 vg0
$ sudo mkfs.xfs /dev/vg0/data0
$ sudo mount /dev/vg0/data0 /mnt
$ sudo touch /mnt/test
$ sudo touch /mnt/test2
$ sudo ls /mnt
$ sudo umount /mnt
$ sudo lvcreate -l100%FREE -s /dev/vg0/data0 -n data0_snap

2) Attempting to mount the previously created snapshot results in the Oops:
$ sudo mount /dev/vg0/data0_snap /mnt
Segmentation fault (core dumped)
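
To confirm that the segmentation fault in step 2 comes from the nvme BUG_ON rather than from something else, the kernel log can be searched for the "kernel BUG at" line. The helper below is a minimal sketch: the function name log_has_nvme_bug is hypothetical, and the pattern is based only on the single trace attached to this report, so it may need adjusting for other kernel builds.

```shell
#!/bin/bash
# Succeeds if the log text on stdin contains a kernel BUG raised from the
# nvme PCI driver source file, as in the trace attached to this report.
# The function name and pattern are assumptions made for illustration.
log_has_nvme_bug() {
    grep -Eq 'kernel BUG at .*drivers/nvme/host/pci\.c:[0-9]+!'
}

# Example usage on the test instance (reading the kernel log may need root):
#   if sudo dmesg | log_has_nvme_bug; then echo "reproduced"; else echo "clean"; fi
```

The function reads from stdin, so it can also be run against a saved copy of the log.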

[Regression Potential]
The fix prevents some bios from being merged, so it can have a performance impact in certain scenarios. The patch only targets misaligned segments, so the impact in the general case should be small.
The commit is also present in mainline kernels since 4.13, and hasn't been changed significantly, so potential for other regressions should be low.

Changed in linux (Ubuntu):
assignee: Heitor Alves de Siqueira (halves) → nobody
status: New → Fix Released
Changed in linux (Ubuntu Xenial):
status: New → Confirmed
assignee: nobody → Heitor Alves de Siqueira (halves)
description: updated

For reference, the kernel spew of the BUG_ON:

[ 78.354129] kernel BUG at /home/ubuntu/xenial-aws/drivers/nvme/host/pci.c:619!
[ 78.357297] invalid opcode: 0000 [#1] SMP
[ 78.359613] Modules linked in: dm_snapshot dm_bufio xfs ppdev serio_raw parport_pc 8250_fintek parport i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ena
[ 78.387878] CPU: 0 PID: 1687 Comm: mount Not tainted 4.4.0-1105-aws #116
[ 78.390837] Hardware name: Amazon EC2 c5d.large/, BIOS 1.0 10/16/2017
[ 78.393692] task: ffff8800bb155400 ti: ffff8800b93bc000 task.ti: ffff8800b93bc000
[ 78.396973] RIP: 0010:[<ffffffff815dbd06>] [<ffffffff815dbd06>] nvme_queue_rq+0x8c6/0xa60
[ 78.400787] RSP: 0018:ffff8800b93bf7c8 EFLAGS: 00010286
[ 78.403151] RAX: 0000000000000078 RBX: 0000000000001000 RCX: 0000000000001000
[ 78.406276] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000000
[ 78.409390] RBP: ffff8800b93bf8a8 R08: ffff8800b916c700 R09: 0000000000001000
[ 78.412518] R10: 000000000001ec00 R11: ffff8800b8e30000 R12: 00000000fffffc00
[ 78.417056] R13: 0000000000000010 R14: 000000000000fc00 R15: 0000000035fd5000
[ 78.421581] FS: 00007f30fe043840(0000) GS:ffff880130a00000(0000) knlGS:0000000000000000
[ 78.427884] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 78.431827] CR2: 00007f57d4057889 CR3: 0000000035974000 CR4: 0000000000360670
[ 78.436322] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 78.440821] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 78.445316] Stack:
[ 78.447706] ffff880036009480 ffff880036009700 ffff8800b7782800 0000000000000ff8
[ 78.454583] ffff8800b8e30420 ffff8800360a9400 ffff88000001fc00 ffff8800b7697b00
[ 78.461462] ffff880100001000 ffff8800b8e30000 ffff88003604c000 00000001ffc00400
[ 78.468332] Call Trace:
[ 78.470921] [<ffffffff813e6617>] blk_mq_make_request+0x407/0x550
[ 78.475001] [<ffffffff813d8f14>] generic_make_request+0x114/0x2d0
[ 78.479110] [<ffffffff813d0371>] ? bvec_alloc+0x91/0x100
[ 78.482936] [<ffffffff813d9146>] submit_bio+0x76/0x160
[ 78.486680] [<ffffffffc0347a14>] _xfs_buf_ioapply+0x2e4/0x4a0 [xfs]
[ 78.490866] [<ffffffff810b22e0>] ? wake_up_q+0x70/0x70
[ 78.494601] [<ffffffffc0349c94>] ? xfs_bwrite+0x24/0x60 [xfs]
[ 78.498583] [<ffffffffc034975d>] xfs_buf_submit_wait+0x5d/0x230 [xfs]
[ 78.502861] [<ffffffffc0349c94>] xfs_bwrite+0x24/0x60 [xfs]
[ 78.506785] [<ffffffffc037108f>] xlog_bwrite+0x7f/0x100 [xfs]
[ 78.510787] [<ffffffffc0371f34>] xlog_write_log_records+0x1a4/0x230 [xfs]
[ 78.515192] [<ffffffffc0372077>] xlog_clear_stale_blocks+0xb7/0x1b0 [xfs]
[ 78.519596] [<ffffffffc037198f>] ? xlog_bread+0x3f/0x50 [xfs]
[ 78.523588] [<ffffffffc03765eb>] xlog_find_tail+0x2db/0x3b0 [xfs]
[ 78.527705] [<ffffffffc03766ed>] xlog_recover+0x2d/0x160 [xfs]
[ 78.531720] [<ffffffff...

Changed in linux (Ubuntu Xenial):
status: Confirmed → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Matthew Ruffell (mruffell) wrote :

I created a new c5d.large instance on AWS, and installed 4.4.0-177-generic from -updates. I ran through the reproduction steps, and the issue reproduces.

I then created a new c5d.large instance, enabled -proposed and installed 4.4.0-178-generic. I again ran through the reproduction steps, and this time the final mount succeeds: there is no segmentation fault and dmesg is clean of any errors.

I am happy to mark this as verified.
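
A quick way to tell which side of the fix a Xenial instance is on is to compare the running kernel release against 4.4.0-178, the version this report lists as fixed. The sketch below is an assumption-laden helper, not an official check: the names version_ge and check_release are hypothetical, it relies on GNU sort -V, and it only handles 4.4.0-*-generic releases, because other flavours (such as the 4.4.0-1105-aws kernel in the trace) use a different numbering scheme.

```shell
#!/bin/bash
FIXED_ABI="4.4.0-178"   # fixed generic kernel version per this report

# Succeeds when version $1 sorts at or after version $2 (GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Classify a `uname -r` style string. Only 4.4.0-*-generic is handled;
# anything else is reported as "unknown" rather than guessed at.
check_release() {
    local release=$1
    case "$release" in
        4.4.0-*-generic) ;;
        *) echo "unknown"; return 0 ;;
    esac
    if version_ge "${release%-generic}" "$FIXED_ABI"; then
        echo "fixed"
    else
        echo "affected"
    fi
}

check_release "$(uname -r)"
```

A result of "affected" only means the kernel predates the fix; whether the BUG is actually reachable still depends on the storage setup described in the test case.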

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-178.208

---------------
linux (4.4.0-178.208) xenial; urgency=medium

  * xenial/linux: 4.4.0-178.208 -proposed tracker (LP: #1870660)

  * CVE-2019-19768
    - blktrace: Protect q->blk_trace with RCU
    - blktrace: fix dereference after null check

  * Multiple Kexec in AWS Nitro instances fail (LP: #1869948)
    - net: ena: Add PCI shutdown handler to allow safe kexec

  * Insert test_bpf module will report 4 failures for ubuntu_bpf_jit on X s390x
    (LP: #1768452)
    - test_bpf: flag tests that cannot be jited on s390

  * Mounting LVM snapshots with xfs can hit kernel BUG in nvme driver
    (LP: #1869229)
    - block: fix bio_will_gap() for first bvec with offset

  * Xenial update: 4.4.217 upstream stable release (LP: #1868629)
    - NFS: Remove superfluous kmap in nfs_readdir_xdr_to_array
    - r8152: check disconnect status after long sleep
    - net: nfc: fix bounds checking bugs on "pipe"
    - bnxt_en: reinitialize IRQs when MTU is modified
    - fib: add missing attribute validation for tun_id
    - nl802154: add missing attribute validation
    - nl802154: add missing attribute validation for dev_type
    - team: add missing attribute validation for port ifindex
    - team: add missing attribute validation for array index
    - nfc: add missing attribute validation for SE API
    - nfc: add missing attribute validation for vendor subcommand
    - ipvlan: add cond_resched_rcu() while processing muticast backlog
    - ipvlan: do not add hardware address of master to its unicast filter list
    - ipvlan: egress mcast packets are not exceptional
    - ipvlan: do not use cond_resched_rcu() in ipvlan_process_multicast()
    - ipvlan: don't deref eth hdr before checking it's set
    - macvlan: add cond_resched() during multicast processing
    - net: fec: validate the new settings in fec_enet_set_coalesce()
    - slip: make slhc_compress() more robust against malicious packets
    - bonding/alb: make sure arp header is pulled before accessing it
    - net: fq: add missing attribute validation for orphan mask
    - iommu/vt-d: quirk_ioat_snb_local_iommu: replace WARN_TAINT with pr_warn +
      add_taint
    - drm/amd/display: remove duplicated assignment to grph_obj_type
    - gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache
    - KVM: x86: clear stale x86_emulate_ctxt->intercept value
    - ARC: define __ALIGN_STR and __ALIGN symbols for ARC
    - efi: Fix a race and a buffer overflow while reading efivars via sysfs
    - iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint
    - iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page
    - nl80211: add missing attribute validation for critical protocol indication
    - nl80211: add missing attribute validation for channel switch
    - netfilter: cthelper: add missing attribute validation for cthelper
    - iommu/vt-d: Fix the wrong printing in RHSA parsing
    - iommu/vt-d: Ignore devices with out-of-spec domain number
    - ipv6: restrict IPV6_ADDRFORM operation
    - efi: Add a sanity check to efivar_store_raw()
    - batman-adv: Fix invalid read while copying bat_iv.bcast_own
    - batman-adv: Only p...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released