Performance degradation when copying from LVM snapshot backed by NVMe disk

Bug #1833319 reported by Matthew Ruffell on 2019-06-18

Affects         Status        Importance  Assigned to      Milestone
linux (Ubuntu)  Incomplete    Undecided   Unassigned
Xenial          Fix Released  Undecided   Matthew Ruffell

Bug Description

BugLink: https://bugs.launchpad.net/bugs/1833319

[Impact]
When copying files from a mounted LVM snapshot that resides on NVMe storage devices, the rate at which sectors are read from the disk degrades massively.

The kernel is not merging sector requests; it issues many small sector requests to the NVMe storage controller instead of one larger request.

Experiments have shown a 14x-25x performance degradation in reads: copies that used to take seconds now take minutes, and copies that took thirty minutes now take many hours.

The following was found with btrace, running alongside cat (see Testing):

A = IO remapped to different device
Q = IO handled by request queue
G = Get request
U = Unplug request
I = IO inserted onto request queue
D = IO issued to driver
C = IO completion

When reading from the LVM snapshot, we see:

259,0 1 113 0.001117160 1606 A R 837872 + 8 <- (252,0) 835824
259,0 1 114 0.001117276 1606 Q R 837872 + 8 [cat]
259,0 1 115 0.001117451 1606 G R 837872 + 8 [cat]
259,0 1 116 0.001117979 1606 A R 837880 + 8 <- (252,0) 835832
259,0 1 117 0.001118119 1606 Q R 837880 + 8 [cat]
259,0 1 118 0.001118285 1606 G R 837880 + 8 [cat]
259,0 1 122 0.001121613 1606 I RS 837640 + 8 [cat]
259,0 1 123 0.001121687 1606 I RS 837648 + 8 [cat]
259,0 1 124 0.001121758 1606 I RS 837656 + 8 [cat]
...
259,0 1 154 0.001126118 377 D RS 837648 + 8 [kworker/1:1H]
259,0 1 155 0.001126445 377 D RS 837656 + 8 [kworker/1:1H]
259,0 1 156 0.001126871 377 D RS 837664 + 8 [kworker/1:1H]
...
259,0 1 183 0.001848512 0 C RS 837632 + 8 [0]

What is happening here is that each 8-sector read request is placed onto the IO request queue, inserted individually into the driver request queue, and then fetched by the driver one at a time.

Compare this behaviour with reading data from an LVM snapshot on 4.6+ mainline or the Ubuntu 4.15 HWE kernel:

M = IO back merged with request on queue

259,0 0 194 0.000532515 1897 A R 7358960 + 8 <- (253,0) 7356912
259,0 0 195 0.000532634 1897 Q R 7358960 + 8 [cat]
259,0 0 196 0.000532810 1897 M R 7358960 + 8 [cat]
259,0 0 197 0.000533864 1897 A R 7358968 + 8 <- (253,0) 7356920
259,0 0 198 0.000533991 1897 Q R 7358968 + 8 [cat]
259,0 0 199 0.000534177 1897 M R 7358968 + 8 [cat]
259,0 0 200 0.000534474 1897 UT N [cat] 1
259,0 0 201 0.000534586 1897 I R 7358464 + 512 [cat]
259,0 0 202 0.000537055 1897 D R 7358464 + 512 [cat]
259,0 0 203 0.002242539 0 C R 7358464 + 512 [0]

This shows an 8-sector read being added to the request queue and then [M]erged backward with other requests on the queue until the merged request totals 512 sectors. The 512-sector read is then placed onto the IO queue, where it is fetched by the device driver, and completes.
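
One way to quantify the difference between the two traces is to count the requests issued to the driver (D events) and their average size. The following is a minimal sketch, not part of the original investigation. It assumes the btrace/blkparse text layout shown above (action in field 6, sector count in field 10), and the function name summarize_dispatch is hypothetical.

```shell
# Summarize requests issued to the driver ("D" events) from btrace/blkparse
# text output read on stdin. Assumes the line layout shown in the traces:
#   dev cpu seq time pid action rwbs sector + nsectors [process]
summarize_dispatch() {
    awk '$6 == "D" && $9 == "+" { n++; sectors += $10 }
         END {
             if (n > 0)
                 printf "issued=%d sectors=%d avg=%.1f\n", n, sectors, sectors / n
             else
                 print "issued=0 sectors=0 avg=0.0"
         }'
}
```

With a trace saved to a file (trace.txt is a placeholder name), run `summarize_dispatch < trace.txt`. On the unmerged trace the average should sit near 8 sectors per issued request; on the merged trace it should approach 512.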

[Fix]

The problem is that the NVMe driver in the 4.4 xenial kernel does not merge 8-sector requests.

Merging is controlled per device by this sysfs entry:
/sys/block/nvme1n1/queue/nomerges

On 4.4 xenial, reading from this yields 2 (QUEUE_FLAG_NOMERGES is set, so merging is disabled).
On 4.6+ and the 4.15 HWE kernel, reading from this yields 0, which allows merging.

Setting this to 0 on the 4.4 kernel with:

# echo "0" > /sys/block/nvme1n1/queue/nomerges

and testing again, we find performance is restored and the problem is fixed.
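
On a system with several NVMe devices, the same workaround can be applied to each of them. The sketch below is an illustration only: the function names are hypothetical, and the optional directory argument (defaulting to /sys/block) exists solely so the functions can be tried against a scratch directory. In the kernel's queue sysfs documentation, 0 allows all merges, 1 allows only simple one-hit merges, and 2 disables merging.

```shell
# Print the nomerges setting of every NVMe namespace block device.
# $1 is the sysfs block directory (defaults to /sys/block).
show_nvme_nomerges() {
    root="${1:-/sys/block}"
    for f in "$root"/nvme*n*/queue/nomerges; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        printf '%s %s\n' "$f" "$(cat "$f")"
    done
}

# Set nomerges to 0 (allow merging) on every NVMe namespace.
# Needs root on a real system; the setting is lost on reboot.
enable_nvme_merges() {
    root="${1:-/sys/block}"
    for f in "$root"/nvme*n*/queue/nomerges; do
        [ -w "$f" ] || continue
        echo 0 > "$f"
    done
}
```

Because sysfs values are reset at boot, this is only a temporary workaround until a kernel containing the fix is installed.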

Running btrace again, we see the 8-sector reads being back merged into a single 512-sector read, which is issued in one go.

The problem was fixed in 4.5 upstream with the below commit:

commit ef2d4615c59efb312e531a5e949970f37ca1c841
Author: Keith Busch <email address hidden>
Date: Thu Feb 11 13:05:40 2016 -0700
Subject: NVMe: Allow request merges

This commit removes the QUEUE_FLAG_NOMERGES flag from being set during driver init, allowing requests to be backmerged. This also has a direct effect of defaulting /sys/block/nvme1n1/queue/nomerges to 0.

Please cherry-pick ef2d4615c59efb312e531a5e949970f37ca1c841 to all xenial 4.4
kernels.

[Testcase]

You can replicate the problem on a system with an NVMe disk. I recommend using c5.large AWS EC2 instances with a secondary gp2 EBS disk of 200 GB or larger.

Steps (with NVMe disk being /dev/nvme1n1):
  1. sudo pvcreate /dev/nvme1n1
  2. sudo vgcreate secvol /dev/nvme1n1
  3. sudo lvcreate --name seclv -l 80%FREE secvol
  4. sudo mkfs.ext4 /dev/secvol/seclv
  5. sudo mount /dev/mapper/secvol-seclv /mnt
  6. for i in `seq 1 20`; do sudo dd if=/dev/zero of=/mnt/dummy$i bs=512M count=1; done
  7. sudo lvcreate --snapshot /dev/secvol/seclv --name tmp_backup1 --extents '90%FREE'
  8. NEWMOUNT=$(mktemp -t -d mount.backup_XXX)
  9. sudo mount -v -o ro /dev/secvol/tmp_backup1 $NEWMOUNT

To replicate, simply read one of those 512 MB files:
10. time cat $NEWMOUNT/dummy1 1> /dev/null

On a stock xenial kernel, expect to see the following:

4.4.0-151-generic #178-Ubuntu

$ time cat /tmp/mount.backup_TYD/dummy1 1> /dev/null

real 0m42.693s
user 0m0.008s
sys 0m0.388s
$ cat /sys/block/nvme1n1/queue/nomerges
2

On a patched xenial kernel, performance is restored:

4.4.0-151-generic #178+hf228435v20190618b1-Ubuntu

$ time cat /tmp/mount.backup_aId/dummy1 1> /dev/null

real 0m1.773s
user 0m0.008s
sys 0m0.184s
$ cat /sys/block/nvme1n1/queue/nomerges
0
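
To put the two timings in perspective, they can be converted to read throughput and a speedup factor. This is a small arithmetic sketch using the wall-clock times above and the 512 MiB file size from step 6; the helper names are hypothetical.

```shell
# Throughput in MiB/s for reading $1 MiB in $2 seconds.
throughput_mib_s() {
    awk -v size="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", size / secs }'
}

# Speedup factor between a slow ($1) and a fast ($2) wall-clock time.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

throughput_mib_s 512 42.693   # stock kernel:   prints 12.0
throughput_mib_s 512 1.773    # patched kernel: prints 288.8
speedup 42.693 1.773          # prints 24.1
```

The roughly 24x difference falls within the 14x-25x degradation reported in the Impact section.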

[Regression Potential]

Cherry picking "NVMe: Allow request merges" changes the default request policy for NVMe drives, which may give some cause for concern in terms of both stability and performance for other workloads.

Regarding stability, this flag was originally set when the NVMe driver was
bio based, before the driver had been converted to blk-mq and separated out from /block. You can read a mailing list thread about it here:

https://lists.infradead.org/pipermail/linux-nvme/2016-February/003946.html

Together with the commit "MD: make bio mergeable", there is no reason not to allow requests to be merged in the new NVMe driver.

Regarding performance for other workloads, I refer to the commit in which QUEUE_FLAG_NOMERGES (nomerges == 2) was introduced:
commit: 488991e28e55b4fbca8067edf0259f69d1a6f92c
subject: block: Added in stricter no merge semantics for block I/O

nomerges  Throughput    %System      Improvement (tput / %sys)
--------  ------------  -----------  -------------------------
0         12.45 MB/sec  0.669365609
1         12.50 MB/sec  0.641519199  0.40% / 2.71%
2         12.52 MB/sec  0.639849750  0.56% / 2.96%

It shows a 0.56% throughput increase with merging disabled (nomerges = 2) over merging allowed (nomerges = 0) for random IO workloads.

Comparing this with the 14x-25x performance degradation for reads where requests are not able to be merged, it is clear that changing the default to 0 will not impact any other workloads by any significant margin.

The commit is present in Linux 4.5 mainline, can be cleanly cherry picked, and is still present in the kernel to this day. After reviewing the NVMe driver, I believe there will be no regressions.

If you are interested in testing, I have prepared two PPAs with ef2d4615c59efb312e531a5e949970f37ca1c841 applied:

linux-image-generic: https://launchpad.net/~mruffell/+archive/ubuntu/sf228435-test-generic
linux-image-aws: https://launchpad.net/~mruffell/+archive/ubuntu/sf228435-test

Matthew Ruffell (mruffell) on 2019-06-18

tags:	added: sts

Matthew Ruffell (mruffell) on 2019-06-18

description:	updated
description:	updated
Changed in linux (Ubuntu Xenial):
assignee:	nobody → Matthew Ruffell (mruffell)
status:	New → In Progress

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote on 2019-06-19: Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1833319

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status:	New → Incomplete
tags:	added: xenial

Matthew Ruffell (mruffell) on 2019-06-19

description:	updated

Khaled El Mously (kmously) on 2019-06-27

Changed in linux (Ubuntu Xenial):
status:	In Progress → Fix Committed

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote on 2019-07-03:

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags:	added: verification-needed-xenial

Matthew Ruffell (mruffell) wrote on 2019-07-04:

I enabled -proposed and installed 4.4.0-155.182, and went through the test case on a c5.large instance on AWS. Note that I used the -generic kernel, since -aws doesn't seem to be ready yet.

The problem is solved and performance is the same as non-snapshot mounted disks.

We can see that merging has been enabled by looking at the flag:

$ cat /sys/block/nvme1n1/queue/nomerges
0

The problem is fixed. Changing tag to verified.

tags:	added: verification-done-xenial
	removed: verification-needed-xenial

Matthew Ruffell (mruffell) wrote on 2019-07-07:

I enabled -proposed and installed 4.4.0-1088-aws #99-Ubuntu, and again went through the test case on a c5.large instance on AWS.

The problem is solved: performance is restored and matches that of a non-snapshot mounted disk.

Again, we can see merging has been enabled with:

$ cat /sys/block/nvme1n1/queue/nomerges
0

The problem is fixed. Happy with verification status of done.

Launchpad Janitor (janitor) wrote on 2019-07-24:

This bug was fixed in the package linux - 4.4.0-157.185

---------------
linux (4.4.0-157.185) xenial; urgency=medium

* linux: 4.4.0-157.185 -proposed tracker (LP: #1837476)

* systemd 229-4ubuntu21.22 ADT test failure with linux 4.4.0-156.183 (storage)
    (LP: #1837235)
    - Revert "block/bio: Do not zero user pages"
    - Revert "block: Clear kernel memory before copying to user"
    - Revert "bio_copy_from_iter(): get rid of copying iov_iter"

linux (4.4.0-156.183) xenial; urgency=medium

* linux: 4.4.0-156.183 -proposed tracker (LP: #1836880)

* BCM43602 802.11ac Wireless regression - PCI ID 14e4:43ba (LP: #1836801)
    - brcmfmac: add eth_type_trans back for PCIe full dongle

linux (4.4.0-155.182) xenial; urgency=medium

* linux: 4.4.0-155.182 -proposed tracker (LP: #1834918)

* Geneve tunnels don't work when ipv6 is disabled (LP: #1794232)
    - geneve: correctly handle ipv6.disable module parameter

* Kernel modules generated incorrectly when system is localized to a non-
    English language (LP: #1828084)
    - scripts: override locale from environment when running recordmcount.pl

* Handle overflow in proc_get_long of sysctl (LP: #1833935)
    - sysctl: handle overflow in proc_get_long

* Xenial update: 4.4.181 upstream stable release (LP: #1832661)
    - x86/speculation/mds: Revert CPU buffer clear on double fault exit
    - x86/speculation/mds: Improve CPU buffer clear documentation
    - ARM: exynos: Fix a leaked reference by adding missing of_node_put
    - crypto: vmx - fix copy-paste error in CTR mode
    - crypto: crct10dif-generic - fix use via crypto_shash_digest()
    - crypto: x86/crct10dif-pcl - fix use via crypto_shash_digest()
    - ALSA: usb-audio: Fix a memory leak bug
    - ALSA: hda/hdmi - Consider eld_valid when reporting jack event
    - ALSA: hda/realtek - EAPD turn on later
    - ASoC: max98090: Fix restore of DAPM Muxes
    - ASoC: RT5677-SPI: Disable 16Bit SPI Transfers
    - mm/mincore.c: make mincore() more conservative
    - ocfs2: fix ocfs2 read inode data panic in ocfs2_iget
    - mfd: da9063: Fix OTP control register names to match datasheets for
      DA9063/63L
    - tty/vt: fix write/write race in ioctl(KDSKBSENT) handler
    - ext4: actually request zeroing of inode table after grow
    - ext4: fix ext4_show_options for file systems w/o journal
    - Btrfs: do not start a transaction at iterate_extent_inodes()
    - bcache: fix a race between cache register and cacheset unregister
    - bcache: never set KEY_PTRS of journal key to 0 in journal_reclaim()
    - ipmi:ssif: compare block number correctly for multi-part return messages
    - crypto: gcm - Fix error return code in crypto_gcm_create_common()
    - crypto: gcm - fix incompatibility between "gcm" and "gcm_base"
    - crypto: chacha20poly1305 - set cra_name correctly
    - crypto: salsa20 - don't access already-freed walk.iv
    - crypto: arm/aes-neonbs - don't access already-freed walk.iv
    - writeback: synchronize sync(2) against cgroup writeback membership switches
    - fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going
      into workqueue when umount
    - ALSA: hda/realtek - Fix for Lenovo B50-70 inverted internal microphone bug
    - KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes
    - net: avoid weird emergency message
    - net/mlx4_core: Change the error print to info print
    - ppp: deflate: Fix possible crash in deflate_init
    - tipc: switch order of device registration to fix a crash
    - tipc: fix modprobe tipc failed after switch order of device registration
    - stm class: Fix channel free in stm output free path
    - md: add mddev->pers to avoid potential NULL pointer dereference
    - intel_th: msu: Fix single mode with IOMMU
    - of: fix clang -Wunsequenced for be32_to_cpu()
    - cifs: fix strcat buffer overflow and reduce raciness in
      smb21_set_oplock_level()
    - media: ov6650: Fix sensor possibly not detected on probe
    - NFS4: Fix v4.0 client state corruption when mount
    - clk: tegra: Fix PLLM programming on Tegra124+ when PMC overrides divider
    - fuse: fix writepages on 32bit
    - fuse: honor RLIMIT_FSIZE in fuse_file_fallocate
    - iommu/tegra-smmu: Fix invalid ASID bits on Tegra30/114
    - ceph: flush dirty inodes before proceeding with remount
    - tracing: Fix partial reading of trace event's id file
    - memory: tegra: Fix integer overflow on tick value calculation
    - perf intel-pt: Fix instructions sampling rate
    - perf intel-pt: Fix improved sample timestamp
    - perf intel-pt: Fix sample timestamp wrt non-taken branches
    - fbdev: sm712fb: fix brightness control on reboot, don't set SR30
    - fbdev: sm712fb: fix VRAM detection, don't set SR70/71/74/75
    - fbdev: sm712fb: fix white screen of death on reboot, don't set CR3B-CR3F
    - fbdev: sm712fb: fix boot screen glitch when sm712fb replaces VGA
    - fbdev: sm712fb: fix crashes during framebuffer writes by correctly mapping
      VRAM
    - fbdev: sm712fb: fix support for 1024x768-16 mode
    - fbdev: sm712fb: use 1024x768 by default on non-MIPS, fix garbled display
    - fbdev: sm712fb: fix crashes and garbled display during DPMS modesetting
    - PCI: Mark Atheros AR9462 to avoid bus reset
    - dm delay: fix a crash when invalid device is specified
    - xfrm: policy: Fix out-of-bound array accesses in __xfrm_policy_unlink
    - xfrm6_tunnel: Fix potential panic when unloading xfrm6_tunnel module
    - vti4: ipip tunnel deregistration fixes.
    - xfrm4: Fix uninitialized memory read in _decode_session4
    - KVM: arm/arm64: Ensure vcpu target is unset on reset failure
    - power: supply: sysfs: prevent endless uevent loop with
      CONFIG_POWER_SUPPLY_DEBUG
    - ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour
    - perf bench numa: Add define for RUSAGE_THREAD if not present
    - Revert "Don't jump to compute_result state from check_result state"
    - md/raid: raid5 preserve the writeback action after the parity check
    - btrfs: Honour FITRIM range constraints during free space trim
    - fbdev: sm712fb: fix memory frequency by avoiding a switch/case fallthrough
    - ext4: do not delete unlinked inode from orphan list on failed truncate
    - KVM: x86: fix return value for reserved EFER
    - bio: fix improper use of smp_mb__before_atomic()
    - Revert "scsi: sd: Keep disk read-only when re-reading partition"
    - crypto: vmx - CTR: always increment IV as quadword
    - gfs2: Fix sign extension bug in gfs2_update_stats
    - Btrfs: fix race between ranged fsync and writeback of adjacent ranges
    - btrfs: sysfs: don't leak memory when failing add fsid
    - fbdev: fix divide error in fb_var_to_videomode
    - hugetlb: use same fault hash key for shared and private mappings
    - fbdev: fix WARNING in __alloc_pages_nodemask bug
    - media: cpia2: Fix use-after-free in cpia2_exit
    - media: vivid: use vfree() instead of kfree() for dev->bitmap_cap
    - ssb: Fix possible NULL pointer dereference in ssb_host_pcmcia_exit
    - at76c50x-usb: Don't register led_trigger if usb_register_driver failed
    - perf tools: No need to include bitops.h in util.h
    - gfs2: Fix lru_count going negative
    - cxgb4: Fix error path in cxgb4_init_module
    - mmc: core: Verify SD bus width
    - powerpc/boot: Fix missing check of lseek() return value
    - ASoC: imx: fix fiq dependencies
    - spi: pxa2xx: fix SCR (divisor) calculation
    - brcm80211: potential NULL dereference in
      brcmf_cfg80211_vndr_cmds_dcmd_handler()
    - rtc: 88pm860x: prevent use-after-free on device remove
    - w1: fix the resume command API
    - dmaengine: pl330: _stop: clear interrupt status
    - mac80211/cfg80211: update bss channel on channel switch
    - ASoC: fsl_sai: Update is_slave_mode with correct value
    - mwifiex: prevent an array overflow
    - net: cw1200: fix a NULL pointer dereference
    - bcache: return error immediately in bch_journal_replay()
    - bcache: fix failure in journal relplay
    - bcache: add failure check to run_cache_set() for journal replay
    - bcache: avoid clang -Wunintialized warning
    - x86/build: Move _etext to actual end of .text
    - smpboot: Place the __percpu annotation correctly
    - x86/mm: Remove in_nmi() warning from 64-bit implementation of
      vmalloc_fault()
    - mm/uaccess: Use 'unsigned long' to placate UBSAN warnings on older GCC
      versions
    - HID: logitech-hidpp: use RAP instead of FAP to get the protocol version
    - pinctrl: pistachio: fix leaked of_node references
    - dmaengine: at_xdmac: remove BUG_ON macro in tasklet
    - media: coda: clear error return value before picture run
    - media: ov6650: Move v4l2_clk_get() to ov6650_video_probe() helper
    - media: au0828: stop video streaming only when last user stops
    - media: ov2659: make S_FMT succeed even if requested format doesn't match
    - audit: fix a memory leak bug
    - media: au0828: Fix NULL pointer dereference in au0828_analog_stream_enable()
    - media: pvrusb2: Prevent a buffer overflow
    - powerpc/numa: improve control of topology updates
    - sched/core: Check quota and period overflow at usec to nsec conversion
    - sched/core: Handle overflow in cpu_shares_write_u64
    - USB: core: Don't unbind interfaces following device reset failure
    - x86/irq/64: Limit IST stack overflow check to #DB stack
    - i40e: don't allow changes to HW VLAN stripping on active port VLANs
    - RDMA/cxgb4: Fix null pointer dereference on alloc_skb failure
    - hwmon: (vt1211) Use request_muxed_region for Super-IO accesses
    - hwmon: (smsc47m1) Use request_muxed_region for Super-IO accesses
    - hwmon: (smsc47b397) Use request_muxed_region for Super-IO accesses
    - hwmon: (pc87427) Use request_muxed_region for Super-IO accesses
    - hwmon: (f71805f) Use request_muxed_region for Super-IO accesses
    - scsi: libsas: Do discovery on empty PHY to update PHY info
    - mmc_spi: add a status check for spi_sync_locked
    - mmc: sdhci-of-esdhc: add erratum eSDHC5 support
    - mmc: sdhci-of-esdhc: add erratum eSDHC-A001 and A-008358 support
    - PM / core: Propagate dev->power.wakeup_path when no callbacks
    - extcon: arizona: Disable mic detect if running when driver is removed
    - s390: cio: fix cio_irb declaration
    - cpufreq: ppc_cbe: fix possible object reference leak
    - cpufreq/pasemi: fix possible object reference leak
    - cpufreq: pmac32: fix possible object reference leak
    - x86/build: Keep local relocations with ld.lld
    - iio: ad_sigma_delta: Properly handle SPI bus locking vs CS assertion
    - iio: hmc5843: fix potential NULL pointer dereferences
    - iio: common: ssp_sensors: Initialize calculated_time in
      ssp_common_process_data
    - rtlwifi: fix a potential NULL pointer dereference
    - brcmfmac: fix missing checks for kmemdup
    - b43: shut up clang -Wuninitialized variable warning
    - brcmfmac: convert dev_init_lock mutex to completion
    - brcmfmac: fix race during disconnect when USB completion is in progress
    - scsi: ufs: Fix regulator load and icc-level configuration
    - scsi: ufs: Avoid configuring regulator with undefined voltage range
    - arm64: cpu_ops: fix a leaked reference by adding missing of_node_put
    - x86/ia32: Fix ia32_restore_sigcontext() AC leak
    - chardev: add additional check for minor range overlap
    - HID: core: move Usage Page concatenation to Main item
    - ASoC: eukrea-tlv320: fix a leaked reference by adding missing of_node_put
    - ASoC: fsl_utils: fix a leaked reference by adding missing of_node_put
    - cxgb3/l2t: Fix undefined behaviour
    - spi: tegra114: reset controller on probe
    - media: wl128x: prevent two potential buffer overflows
    - virtio_console: initialize vtermno value for ports
    - tty: ipwireless: fix missing checks for ioremap
    - rcutorture: Fix cleanup path for invalid torture_type strings
    - usb: core: Add PM runtime calls to usb_hcd_platform_shutdown
    - scsi: qla4xxx: avoid freeing unallocated dma memory
    - media: m88ds3103: serialize reset messages in m88ds3103_set_frontend
    - media: go7007: avoid clang frame overflow warning with KASAN
    - media: saa7146: avoid high stack usage with clang
    - scsi: lpfc: Fix SLI3 commands being issued on SLI4 devices
    - spi : spi-topcliff-pch: Fix to handle empty DMA buffers
    - spi: rspi: Fix sequencer reset during initialization
    - spi: Fix zero length xfer bug
    - ASoC: davinci-mcasp: Fix clang warning without CONFIG_PM
    - ipv6: Consider sk_bound_dev_if when binding a raw socket to an address
    - llc: fix skb leak in llc_build_and_send_ui_pkt()
    - net-gro: fix use-after-free read in napi_gro_frags()
    - net: stmmac: fix reset gpio free missing
    - usbnet: fix kernel crash after disconnect
    - tipc: Avoid copying bytes beyond the supplied data
    - bnxt_en: Fix aggregation buffer leak under OOM condition.
    - net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value
    - crypto: vmx - ghash: do nosimd fallback manually
    - xen/pciback: Don't disable PCI_COMMAND on PCI device reset.
    - Revert "tipc: fix modprobe tipc failed after switch order of device
      registration"
    - tipc: fix modprobe tipc failed after switch order of device registration -v2
    - sparc64: Fix regression in non-hypervisor TLB flush xcall
    - include/linux/bitops.h: sanitize rotate primitives
    - xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()
    - usb: xhci: avoid null pointer deref when bos field is NULL
    - USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor
    - USB: sisusbvga: fix oops in error path of sisusb_probe
    - USB: Add LPM quirk for Surface Dock GigE adapter
    - USB: rio500: refuse more than one device at a time
    - USB: rio500: fix memory leak in close after disconnect
    - media: usb: siano: Fix general protection fault in smsusb
    - media: usb: siano: Fix false-positive "uninitialized variable" warning
    - media: smsusb: better handle optional alignment
    - scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove
    - scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs)
    - Btrfs: fix race updating log root item during fsync
    - ALSA: hda/realtek - Set default power save node to 0
    - drm/nouveau/i2c: Disable i2c bus access after ->fini()
    - tty: serial: msm_serial: Fix XON/XOFF
    - tty: max310x: Fix external crystal register setup
    - memcg: make it work on sparse non-0-node systems
    - kernel/signal.c: trace_signal_deliver when signal_group_exit
    - CIFS: cifs_read_allocate_pages: don't iterate through whole page array on
      ENOMEM
    - binder: Replace "%p" with "%pK" for stable
    - binder: replace "%p" with "%pK"
    - brcmfmac: Add length checks on firmware events
    - brcmfmac: screening firmware event packet
    - brcmfmac: revise handling events in receive path
    - brcmfmac: fix incorrect event channel deduction
    - brcmfmac: add length checks in scheduled scan result handler
    - brcmfmac: add subtype check for event handling in data path
    - userfaultfd: don't pin the user memory in userfaultfd_file_create()
    - Revert "x86/build: Move _etext to actual end of .text"
    - net: cdc_ncm: GetNtbFormat endian fix
    - usb: gadget: fix request length error for isoc transfer
    - media: uvcvideo: Fix uvc_alloc_entity() allocation alignment
    - ethtool: fix potential userspace buffer overflow
    - neighbor: Call __ipv4_neigh_lookup_noref in neigh_xmit
    - net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query
    - net: rds: fix memory leak in rds_ib_flush_mr_pool
    - pktgen: do not sleep with the thread lock held.
    - rcu: locking and unlocking need to always be at least barriers
    - parisc: Use implicit space register selection for loading the coherence
      index of I/O pdirs
    - fuse: fallocate: fix return with locked inode
    - MIPS: pistachio: Build uImage.gz by default
    - genwqe: Prevent an integer overflow in the ioctl
    - drm/gma500/cdv: Check vbt config bits when detecting lvds panels
    - fs: stream_open - opener for stream-like files so that read and write can
      run simultaneously without deadlock
    - fuse: Add FOPEN_STREAM to use stream_open()
    - ipv4: Define __ipv4_neigh_lookup_noref when CONFIG_INET is disabled
    - ethtool: check the return value of get_regs_len
    - Linux 4.4.181

* CVE-2019-2054
    - arm/ptrace: run seccomp after ptrace

* CVE-2018-12126 // CVE-2018-12127 // CVE-2018-12130
    - x86/speculation: Remove redundant arch_smt_update() invocation

* Revert x86/vdso linker changes from #1830890 as this causes glibc
    2.29-0ubuntu3 FTBFS on eoan (LP: #1834315)
    - Revert "x86/vdso: Pass --eh-frame-hdr to the linker"
    - Revert "x86: vdso: Use $LD instead of $CC to link"

* CONFIG_LOG_BUF_SHIFT set to 14 is too low on arm64 (LP: #1824864)
    - [Config] CONFIG_LOG_BUF_SHIFT=18 on all 64bit arches

* CVE-2019-11833
    - ext4: zero out the unused memory region in the extent tree block

* idle-page oopses when accessing page frames that are out of range
    (LP: #1833410)
    - mm/page_idle.c: fix oops because end_pfn is larger than max_pfn

* Performance degradation when copying from LVM snapshot backed by NVMe disk
    (LP: #1833319)
    - NVMe: Allow request merges

* Bluetooth regressions with Xenial kernel 4.4.0-152.179 (LP: #1833698)
    - Revert "Bluetooth: Align minimum encryption key size for LE and BR/EDR
      connections"

* 4.4.0-145-generic Kernel Panic  ip6_expire_frag_queue (LP: #1824687)
    - SAUCE: ipv6: frags: fix skb extraction in ip6_expire_frag_queue()

* [Xenial] Customer can not SSH to Linux VM due to "VSC State Unhealthy"
    (LP: #1826416)
    - vmbus: fix missing signaling in hv_signal_on_read()

* Xenial update: 4.4.180 upstream stable release (LP: #1830176)
    - kbuild: simplify ld-option implementation
    - KVM: fail KVM_SET_VCPU_EVENTS with invalid exception number
    - cifs: do not attempt cifs operation on smb2+ rename error
    - MIPS: scall64-o32: Fix indirect syscall number load
    - trace: Fix preempt_enable_no_resched() abuse
    - sched/numa: Fix a possible divide-by-zero
    - ceph: ensure d_name stability in ceph_dentry_hash()
    - ceph: fix ci->i_head_snapc leak
    - nfsd: Don't release the callback slot unless it was actually held
    - sunrpc: don't mark uninitialised items as VALID.
    - USB: Add new USB LPM helpers
    - USB: Consolidate LPM checks to avoid enabling LPM twice
    - powerpc/xmon: Add RFI flush related fields to paca dump
    - powerpc/64s: Improve RFI L1-D cache flush fallback
    - powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
    - Revert "UBUNTU: SAUCE: powerpc/64s: Add support for a store forwarding
      barrier at kernel entry/exit"
    - powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit
    - powerpc/64s: Add barrier_nospec
    - powerpc/64s: Add support for ori barrier_nospec patching
    - powerpc/64s: Patch barrier_nospec in modules
    - powerpc/64s: Enable barrier_nospec based on firmware settings
    - powerpc/64: Use barrier_nospec in syscall entry
    - powerpc: Use barrier_nospec in copy_from_user()
    - powerpc/64s: Enhance the information in cpu_show_spectre_v1()
    - powerpc64s: Show ori31 availability in spectre_v1 sysfs file not v2
    - powerpc/64: Disable the speculation barrier from the command line
    - powerpc/64: Make stf barrier PPC_BOOK3S_64 specific.
    - powerpc/64: Add CONFIG_PPC_BARRIER_NOSPEC
    - powerpc/64: Call setup_barrier_nospec() from setup_arch()
    - powerpc/64: Make meltdown reporting Book3S 64 specific
    - powerpc/fsl: Add barrier_nospec implementation for NXP PowerPC Book3E
    - powerpc/asm: Add a patch_site macro & helpers for patching instructions
    - powerpc/64s: Add new security feature flags for count cache flush
    - powerpc/64s: Add support for software count cache flush
    - powerpc/pseries: Query hypervisor for count cache flush settings
    - powerpc/powernv: Query firmware for count cache flush settings
    - powerpc: Avoid code patching freed init sections
    - powerpc/fsl: Add infrastructure to fixup branch predictor flush
    - powerpc/fsl: Add macro to flush the branch predictor
    - powerpc/fsl: Fix spectre_v2 mitigations reporting
    - powerpc/fsl: Add nospectre_v2 command line argument
    - powerpc/fsl: Flush the branch predictor at each kernel entry (64bit)
    - powerpc/fsl: Update Spectre v2 reporting
    - powerpc/security: Fix spectre_v2 reporting
    - powerpc/fsl: Fix the flush of branch predictor.
    - tipc: handle the err returned from cmd header function
    - slip: make slhc_free() silently accept an error pointer
    - intel_th: gth: Fix an off-by-one in output unassigning
    - fs/proc/proc_sysctl.c: Fix a NULL pointer dereference
    - NFS: Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family.
    - netfilter: ebtables: CONFIG_COMPAT: drop a bogus WARN_ON
    - tipc: check bearer name with right length in tipc_nl_compat_bearer_enable
    - tipc: check link name with right length in tipc_nl_compat_link_set
    - bpf: reject wrong sized filters earlier
    - Revert "block/loop: Use global lock for ioctl() operation."
    - ipv4: add sanity checks in ipv4_link_failure()
    - team: fix possible recursive locking when add slaves
    - net: stmmac: move stmmac_check_ether_addr() to driver probe
    - ipv4: set the tcp_min_rtt_wlen range from 0 to one day
    - powerpc/fsl: Enable runtime patching if nospectre_v2 boot arg is used
    - powerpc/fsl: Flush branch predictor when entering KVM
    - powerpc/fsl: Emulate SPRN_BUCSR register
    - powerpc/fsl: Flush the branch predictor at each kernel entry (32 bit)
    - powerpc/fsl: Sanitize the syscall table for NXP PowerPC 32 bit platforms
    - powerpc/fsl: Fixed warning: orphan section `__btb_flush_fixup'
    - powerpc/fsl: Add FSL_PPC_BOOK3E as supported arch for nospectre_v2 boot arg
    - Documentation: Add nospectre_v1 parameter
    - usbnet: ipheth: prevent TX queue timeouts when device not ready
    - usbnet: ipheth: fix potential null pointer dereference in ipheth_carrier_set
    - qlcnic: Avoid potential NULL pointer dereference
    - netfilter: bridge: set skb transport_header before entering
      NF_INET_PRE_ROUTING
    - sc16is7xx: missing unregister/delete driver on error in sc16is7xx_init()
    - usb: gadget: net2280: Fix overrun of OUT messages
    - usb: gadget: net2280: Fix net2280_dequeue()
    - usb: gadget: net2272: Fix net2272_dequeue()
    - ARM: dts: pfla02: increase phy reset duration
    - net: ks8851: Dequeue RX packets explicitly
    - net: ks8851: Reassert reset pin if chip ID check fails
    - net: ks8851: Delay requesting IRQ until opened
    - net: ks8851: Set initial carrier state to down
    - net: xilinx: fix possible object reference leak
    - net: ibm: fix possible object reference leak
    - net: ethernet: ti: fix possible object reference leak
    - scsi: qla4xxx: fix a potential NULL pointer dereference
    - usb: u132-hcd: fix resource leak
    - ceph: fix use-after-free on symlink traversal
    - scsi: zfcp: reduce flood of fcrscn1 trace records on multi-element RSCN
    - libata: fix using DMA buffers on stack
    - kconfig/[mn]conf: handle backspace (^H) key
    - ALSA: line6: use dynamic buffers
    - ipv4: ip_do_fragment: Preserve skb_iif during fragmentation
    - ipv6/flowlabel: wait rcu grace period before put_pid()
    - ipv6: invert flowlabel sharing check in process and user mode
    - bnxt_en: Improve multicast address setup logic.
    - packet: validate msg_namelen in send directly
    - USB: yurex: Fix protection fault after device removal
    - USB: w1 ds2490: Fix bug caused by improper use of altsetting array
    - USB: core: Fix unterminated string returned by usb_string()
    - USB: core: Fix bug caused by duplicate interface PM usage counter
    - HID: debug: fix race condition with between rdesc_show() and device removal
    - rtc: sh: Fix invalid alarm warning for non-enabled alarm
    - bonding: show full hw address in sysfs for slave entries
    - jffs2: fix use-after-free on symlink traversal
    - debugfs: fix use-after-free on symlink traversal
    - rtc: da9063: set uie_unsupported when relevant
    - vfio/pci: use correct format characters
    - scsi: storvsc: Fix calculation of sub-channel count
    - net: hns: Use NAPI_POLL_WEIGHT for hns driver
    - net: hns: Fix WARNING when remove HNS driver with SMMU enabled
    - hugetlbfs: fix memory leak for resv_map
    - xsysace: Fix error handling in ace_setup
    - ARM: orion: don't use using 64-bit DMA masks
    - ARM: iop: don't use using 64-bit DMA masks
    - usb: usbip: fix isoc packet num validation in get_pipe
    - staging: iio: adt7316: allow adt751x to use internal vref for all dacs
    - staging: iio: adt7316: fix the dac read calculation
    - staging: iio: adt7316: fix the dac write calculation
    - Input: snvs_pwrkey - initialize necessary driver data before enabling IRQ
    - selinux: never allow relabeling on context mounts
    - x86/mce: Improve error message when kernel cannot recover, p2
    - media: v4l2: i2c: ov7670: Fix PLL bypass register values
    - scsi: libsas: fix a race condition when smp task timeout
    - ASoC:soc-pcm:fix a codec fixup issue in TDM case
    - ASoC: cs4270: Set auto-increment bit for register writes
    - ASoC: tlv320aic32x4: Fix Common Pins
    - perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS
    - scsi: csiostor: fix missing data copy in csio_scsi_err_handler()
    - iommu/amd: Set exclusion range correctly
    - genirq: Prevent use-after-free and work list corruption
    - usb: dwc3: Fix default lpm_nyet_threshold value
    - scsi: qla2xxx: Fix incorrect region-size setting in optrom SYSFS routines
    - Bluetooth: hidp: fix buffer overflow
    - Bluetooth: Align minimum encryption key size for LE and BR/EDR connections
    - UAS: fix alignment of scatter/gather segments
    - ipv6: fix a potential deadlock in do_ipv6_setsockopt()
    - ASoC: Intel: avoid Oops if DMA setup fails
    - timer/debug: Change /proc/timer_stats from 0644 to 0600
    - netfilter: compat: initialize all fields in xt_init
    - platform/x86: sony-laptop: Fix unintentional fall-through
    - iio: adc: xilinx: fix potential use-after-free on remove
    - HID: input: add mapping for Expose/Overview key
    - HID: input: add mapping for keyboard Brightness Up/Down/Toggle keys
    - libnvdimm/btt: Fix a kmemdup failure check
    - s390/dasd: Fix capacity calculation for large volumes
    - s390/3270: fix lockdep false positive on view->lock
    - KVM: x86: avoid misreporting level-triggered irqs as edge-triggered in
      tracing
    - tools lib traceevent: Fix missing equality check for strcmp
    - init: initialize jump labels before command line option parsing
    - ipvs: do not schedule icmp errors from tunnels
    - s390: ctcm: fix ctcm_new_device error return code
    - gpu: ipu-v3: dp: fix CSC handling
    - cw1200: fix missing unlock on error in cw1200_hw_scan()
    - Don't jump to compute_result state from check_result state
    - x86/microcode/intel: Add a helper which gives the microcode revision
    - x86: stop exporting msr-index.h to userland
    - x86/microcode/intel: Check microcode revision before updating sibling
      threads
    - x86/MCE: Save microcode revision in machine check records
    - x86/bugs: Add AMD's variant of SSB_NO
    - x86/bugs: Add AMD's SPEC_CTRL MSR usage
    - x86/bugs: Switch the selection of mitigation from CPU vendor to CPU features
    - x86/bugs: Fix the AMD SSBD usage of the SPEC_CTRL MSR
    - x86/microcode: Make sure boot_cpu_data.microcode is up-to-date
    - x86/microcode: Update the new microcode revision unconditionally
    - x86/mm: Use WRITE_ONCE() when setting PTEs
    - x86/speculation: Apply IBPB more strictly to avoid cross-process data leak
    - x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
    - x86/speculation: Propagate information about RSB filling mitigation to sysfs
    - x86/speculation: Update the TIF_SSBD comment
    - x86/speculation: Clean up spectre_v2_parse_cmdline()
    - x86/speculation: Move STIPB/IBPB string conditionals out of
      cpu_show_common()
    - x86/speculation: Disable STIBP when enhanced IBRS is in use
    - x86/speculation: Rename SSBD update functions
    - x86/speculation: Reorganize speculation control MSRs update
    - x86/Kconfig: Select SCHED_SMT if SMP enabled
    - x86/speculation: Mark string arrays const correctly
    - x86/speculataion: Mark command line parser data __initdata
    - x86/speculation: Add command line control for indirect branch speculation
    - x86/speculation: Prepare for per task indirect branch speculation control
    - x86/process: Consolidate and simplify switch_to_xtra() code
    - x86/speculation: Avoid __switch_to_xtra() calls
    - x86/speculation: Prepare for conditional IBPB in switch_mm()
    - x86/speculation: Split out TIF update
    - x86/speculation: Prepare arch_smt_update() for PRCTL mode
    - x86/speculation: Prevent stale SPEC_CTRL msr content
    - x86/speculation: Add prctl() control for indirect branch speculation
    - x86/speculation: Enable prctl mode for spectre_v2_user
    - x86/speculation: Add seccomp Spectre v2 user space protection mode
    - x86/speculation: Provide IBPB always command line options
    - x86/cpu/bugs: Use __initconst for 'const' init data
    - USB: serial: use variable for status
    - USB: serial: fix unthrottle races
    - bridge: Fix error path for kobject_init_and_add()
    - net: ucc_geth - fix Oops when changing number of buffers in the ring
    - packet: Fix error path in packet_init
    - vlan: disable SIOCSHWTSTAMP in container
    - vrf: sit mtu should not be updated when vrf netdev is the link
    - ipv4: Fix raw socket lookup for local traffic
    - bonding: fix arp_validate toggling in active-backup mode
    - drivers/virt/fsl_hypervisor.c: dereferencing error pointers in ioctl
    - drivers/virt/fsl_hypervisor.c: prevent integer overflow in ioctl
    - powerpc/booke64: set RI in default MSR
    - powerpc/lib: fix book3s/32 boot failure due to code patching
    - Linux 4.4.180
    - SAUCE: Clarify IBRS/IBPB runtime state change messages
    - SAUCE: x86/speculation: Move STIBP hunks
    - SAUCE: powerpc/speculation: Support 'mitigations=' cmdline option
    - SAUCE: x86/speculation: Update 'mitigations=' documentation
    - SAUCE: Show 'pti' instead of 'kaiser' in /proc/cpuinfo
    - SAUCE: perf/bench: Drop definition of BIT in numa.c
    - SAUCE: x86/speculation: Fix SSB command line documentation

* CVE-2018-12126 // CVE-2018-12127 // CVE-2018-12130 // CVE-2019-11091
    - SAUCE: Synchronize MDS mitigations with upstream
    - Documentation: Correct the possible MDS sysfs values
    - x86/speculation/mds: Fix documentation typo

* CVE-2019-11091
    - x86/mds: Add MDSUM variant to the MDS documentation

-- Stefan Bader <stefan.bader@canonical.com>  Tue, 23 Jul 2019 10:55:25 +0200

Changed in linux (Ubuntu Xenial):
status:	Fix Committed → Fix Released
