NVMe driver regression for non-smp/1-cpu systems

Bug #1651602 reported by Chris Gregan
This bug affects 1 person
Affects          Status         Importance   Assigned to     Milestone
linux (Ubuntu)   Fix Released   Critical     Unassigned
Xenial           Fix Released   Critical     Dan Streetman

Bug Description

MAAS Version 2.1.1+bzr5544-0ubuntu1 (16.10.1)
Deploying Xenial Nodes

1) Deploy MAAS 2.1.1 on Yakkety
2) Associate Juju 2.1 beta3
3) Juju deploy Kubernetes Core

Nodes begin to deploy but fail

Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'block-meta', 'custom']
Exit code: 3
Reason: -
Stdout: b"no disk with serial 'CVMD434500BN400AGN' found\n"

Related bugs:
 * bug 1647485: NVMe symlinks broken by devices with spaces in model or serial strings
 * bug 1642903: introduce disk/by-id (model_serial) symlinks for NVMe drives

Chris Gregan (cgregan) wrote :
Joshua Powers (powersj) wrote :

Saw a similar failure in Curtin's vmtests [1]. Here is output from the Xenial boot log:

[ 1.370713] nvme nvme0: Failed to get enough MSI/MSIX interrupts
[ 1.371798] nvme 0000:00:07.0: Removing after probe failure
[ 1.380426] FDC 0 is a S82078B
[ 1.398396] nvme nvme1: Failed to get enough MSI/MSIX interrupts
[ 1.399396] nvme 0000:00:08.0: Removing after probe failure

Looks like a kernel regression at this point. This failure was on Linux version 4.4.0-57-generic. The last test to pass was on Linux version 4.4.0-53-generic. From [2] it looks like there was an attempt to fix this by allowing the kernel to fall back to legacy interrupts in the event that MSI-X and even MSI interrupts fail to be allocated.

[1] https://jenkins.ubuntu.com/server/job/curtin-vmtest/649/artifact/output/XenialTestNvme/logs/
[2] http://lists.infradead.org/pipermail/linux-nvme/2016-May/004653.html
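
The probe failures in the boot log above can be picked out mechanically. The following is a minimal sketch, assuming the log is available as plain text in the format quoted in this comment; the function name `find_nvme_probe_failures` is hypothetical and is not part of curtin or the vmtest harness.

```python
import re

# Matches kernel log lines such as:
#   [ 1.370713] nvme nvme0: Failed to get enough MSI/MSIX interrupts
IRQ_FAILURE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+nvme\s+(?P<ctrl>nvme\d+):\s+"
    r"Failed to get enough MSI/MSIX interrupts"
)


def find_nvme_probe_failures(log_text):
    """Return (timestamp, controller) for each interrupt allocation failure."""
    failures = []
    for line in log_text.splitlines():
        match = IRQ_FAILURE.search(line)
        if match:
            failures.append((float(match.group("ts")), match.group("ctrl")))
    return failures


# Sample copied from the Xenial boot log quoted above.
SAMPLE_LOG = """\
[ 1.370713] nvme nvme0: Failed to get enough MSI/MSIX interrupts
[ 1.371798] nvme 0000:00:07.0: Removing after probe failure
[ 1.380426] FDC 0 is a S82078B
[ 1.398396] nvme nvme1: Failed to get enough MSI/MSIX interrupts
[ 1.399396] nvme 0000:00:08.0: Removing after probe failure
"""

if __name__ == "__main__":
    for ts, ctrl in find_nvme_probe_failures(SAMPLE_LOG):
        print(f"{ctrl}: interrupt allocation failed at {ts:.6f}s")
```

Run against a full boot log, an empty result would suggest the controllers probed without this particular error; it says nothing about other failure modes.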

Scott Moser (smoser)
Changed in linux (Ubuntu Xenial):
status: New → Confirmed
importance: Undecided → High
Scott Moser (smoser) wrote :
Scott Moser (smoser) wrote :
Scott Moser (smoser) wrote :
Chris Gregan (cgregan)
tags: added: cdo-qa-blocker
Scott Moser (smoser) wrote :
tags: added: apport-collected ec2-images xenial
Scott Moser (smoser) wrote : apport information

AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Dec 22 13:44 seq
 crw-rw---- 1 root audio 116, 33 Dec 22 13:44 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.4
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
DistroRelease: Ubuntu 16.04
Ec2AMI: ami-0000051e
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: m1.small
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory
Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: OpenStack Foundation OpenStack Nova
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-57-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0
ProcVersionSignature: Ubuntu 4.4.0-57.78-generic 4.4.35
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-57-generic N/A
 linux-backports-modules-4.4.0-57-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial ec2-images xenial ec2-images xenial ec2-images
Uname: Linux 4.4.0-57-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy lxd netdev plugdev sudo video
_MarkForUpload: True
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: Ubuntu-1.8.2-1ubuntu1
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-i440fx-xenial
dmi.modalias: dmi:bvnSeaBIOS:bvrUbuntu-1.8.2-1ubuntu1:bd04/01/2014:svnOpenStackFoundation:pnOpenStackNova:pvr13.1.2:cvnQEMU:ct1:cvrpc-i440fx-xenial:
dmi.product.name: OpenStack Nova
dmi.product.version: 13.1.2
dmi.sys.vendor: OpenStack Foundation

Scott Moser (smoser) wrote : CurrentDmesg.txt

apport information

Scott Moser (smoser) wrote : JournalErrors.txt

apport information

Scott Moser (smoser) wrote : Lspci.txt

apport information

Scott Moser (smoser) wrote : ProcCpuinfo.txt

apport information

Scott Moser (smoser) wrote : ProcInterrupts.txt

apport information

Scott Moser (smoser) wrote : ProcModules.txt

apport information

Scott Moser (smoser) wrote : UdevDb.txt

apport information

Scott Moser (smoser) wrote : WifiSyslog.txt

apport information

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1651602

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Yakkety):
status: New → Incomplete
Scott Moser (smoser) wrote : Re: [2.1.1] Yakkety - MAAS has nvme0n1 set as boot disk, curtin fails

This script shows the failure, downloading a xenial cloud-image
that has vmlinuz-4.4.0-53-generic (release-20161214) and one that has
vmlinuz-4.4.0-57-generic (release-20161221).

Other differences are
changed: ['linux-image-virtual', 'apport', 'systemd', 'linux-image-4.4.0-57-generic', 'cloud-init', 'python3-apport', 'libpam-systemd:amd64', 'initramfs-tools-bin', 'linux-headers-virtual', 'isc-dhcp-client', 'grub-legacy-ec2', 'isc-dhcp-common', 'linux-virtual', 'open-iscsi', 'cloud-initramfs-dyn-netconf', 'initramfs-tools', 'initramfs-tools-core', 'libsystemd0:amd64', 'libudev1:amd64', 'udev', 'systemd-sysv',
'cloud-initramfs-copymods', 'overlayroot', 'python3-problem-report', 'ifupdown', 'linux-headers-generic']

I'll check if just the kernel upgrade changes things now. I also suspect udev.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Scott Moser (smoser) wrote : Re: [2.1.1] MAAS has nvme0n1 set as boot disk, curtin fails

OK, so I installed vmlinuz-4.4.0-57-generic into the good image (apt-get install linux-virtual) and the resulting kernel/initrd still show the problem.

So this is squarely a kernel regression.

summary: - [2.1.1] Yakkety - MAAS has nvme0n1 set as boot disk, curtin fails
+ [2.1.1] MAAS has nvme0n1 set as boot disk, curtin fails
tags: added: regression-release
Scott Moser (smoser) wrote :

Just to be clear, the failure I'm seeing is that in qemu, if you try to boot with root=LABEL=cloudimg-rootfs it will no longer work.

I believe that this is another symptom of both the failure in vmtest and the failure that Chris saw.

The reproducer above is just easier to run.

Scott Moser (smoser) wrote :

I've just now tested yakkety, and it seems like both 4.8.0-30-generic and 4.8.0-32-generic are working fine, so I do not believe this to affect yakkety.

no longer affects: linux (Ubuntu Yakkety)
Changed in linux (Ubuntu):
status: Confirmed → Invalid
Scott Moser (smoser) wrote :

Seems only present in xenial, 4.4.0-57-generic.

Changed in linux (Ubuntu Xenial):
importance: High → Critical
Scott Moser (smoser) wrote :

Recreate as shown above, or:

qemu-system-x86_64 -enable-kvm \
    -drive file=disk1.qcow,if=none,format=qcow2,id=nvme0 \
    -device nvme,drive=nvme0,serial=nvme-0 \
    -snapshot -nographic -echr 0x05 -m 512 \
    -kernel kernel -initrd initrd \
    -append "root=LABEL=cloudimg-rootfs console=ttyS0"

Scott Moser (smoser) wrote :

Dan Streetman suggested that this should only be a problem with a single cpu system.
I did verify that the failure shown above goes away if you add '-smp cpus=2' to the qemu command line.
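
To compare the single-CPU and SMP cases side by side, the qemu invocation from the recreate steps can be generated with the CPU count as a parameter. This is only a sketch: `build_qemu_cmd` and its arguments are hypothetical names, and the options are copied from the command line quoted earlier in this bug.

```python
import shlex


def build_qemu_cmd(disk, kernel, initrd, cpus=None):
    """Build the reproducer's qemu argv; cpus=None keeps qemu's default of 1 CPU."""
    cmd = [
        "qemu-system-x86_64", "-enable-kvm",
        "-drive", f"file={disk},if=none,format=qcow2,id=nvme0",
        "-device", "nvme,drive=nvme0,serial=nvme-0",
        "-snapshot", "-nographic", "-echr", "0x05", "-m", "512",
        "-kernel", kernel, "-initrd", initrd,
        "-append", "root=LABEL=cloudimg-rootfs console=ttyS0",
    ]
    if cpus is not None:
        # '-smp cpus=2' is the option that made the failure go away.
        cmd += ["-smp", f"cpus={cpus}"]
    return cmd


if __name__ == "__main__":
    # Print both variants so they can be pasted into a shell.
    print(shlex.join(build_qemu_cmd("disk1.qcow", "kernel", "initrd")))
    print(shlex.join(build_qemu_cmd("disk1.qcow", "kernel", "initrd", cpus=2)))
```

Returning an argv list rather than a string keeps the `-append` value as a single argument if the list is later passed to `subprocess.run`.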

Chris Gregan (cgregan) wrote :
Dan Streetman (ddstreet)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Dan Streetman (ddstreet)
Chris Gregan (cgregan) wrote :
Dan Streetman (ddstreet) wrote :

I tested on my nvme system with the boot param 'maxcpus=0' (i.e. UP mode): the boot fails with kernel 4.4.0-57 because the nvme drive fails enumeration, with the error from comment 2. With the 4.4.0-57 kernel including my nvme patch, and using boot param 'maxcpus=0', the boot succeeds. The patched kernel is at this ppa:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1651602

and the patch is here:
https://lists.ubuntu.com/archives/kernel-team/2016-December/081637.html

the patch fixes a bug introduced by my previous commit 96fce9e4025b ("NVMe: only setup MSIX once")
which is only currently included in Xenial, so this patch only needs to be applied to Xenial.
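
For readers unfamiliar with why a single CPU matters here, the following toy model may help. It is purely illustrative and is NOT the driver's code: it assumes (this is an assumption, not something stated in this bug) that the driver asks for at most one interrupt vector per CPU, and it models the regression as a minimum of two vectors versus the fixed minimum of one. All names are hypothetical.

```python
def allocate_vectors(min_vecs, max_vecs, available):
    """Toy range allocation: grant between min_vecs and max_vecs vectors.

    Returns the number granted, or None when the minimum cannot be met.
    """
    if max_vecs < min_vecs:
        return None  # the requested range itself is unsatisfiable
    granted = min(max_vecs, available)
    return granted if granted >= min_vecs else None


def probe(cpus, min_vecs, available=32):
    """Model a probe that needs at least min_vecs interrupt vectors."""
    # ASSUMPTION: at most one vector is requested per CPU.
    return allocate_vectors(min_vecs, max_vecs=cpus, available=available)


if __name__ == "__main__":
    for cpus in (1, 2):
        for min_vecs in (2, 1):
            outcome = probe(cpus, min_vecs)
            status = "fails" if outcome is None else f"gets {outcome} vector(s)"
            print(f"cpus={cpus} min_vecs={min_vecs}: probe {status}")
```

Under these assumptions only the combination of one CPU and a minimum of two vectors fails, which matches the symptom that the failure disappears with '-smp cpus=2'.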

Dan Streetman (ddstreet) wrote :

Chris, as your specific problem seems different than the 1-cpu NVMe bug that the rest of this bug describes, and my patch fixes, can you open a new bug please.

Changed in maas:
status: New → Invalid
Dan Streetman (ddstreet) wrote :

I built a test kernel with this fix applied to the 4.4.0-57 kernel, available here:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1651602

Chris Gregan (cgregan) wrote : Re: [Bug 1651602] Re: [2.1.1] MAAS has nvme0n1 set as boot disk, curtin fails

Dan,
What should be the focus of the new bug? Split off the fact that Denial
cannot be deployed by MAAS? Or is it still related to nvme, just
differently?

On Dec 23, 2016 19:05, "Dan Streetman" <
<email address hidden>> wrote:

> Chris, as your specific problem seems different than the 1-cpu NVMe bug
> that the rest of this bug describes, and my patch fixes, can you open a
> new bug please.
>
> ** Changed in: maas
> Status: New => Invalid

Dan Streetman (ddstreet) wrote : Re: [2.1.1] MAAS has nvme0n1 set as boot disk, curtin fails

> What should be the focus of the new bug?

I don't know why it doesn't work on your system, as it does work on mine, so I can't tell you what to put in the new bug. Since your system is smp with 2 or more cpus, it doesn't appear to be the same as this bug (since this bug was taken over after you reported it, and all the comments and info are about the 1-cpu nvme regression that comment 26 addresses). You could try deploying without the nvme drive configured in maas, verify it isn't available with the stock kernel, then load my ppa kernel from comment 28 and reboot, and see if it appears. If it makes no difference, you're seeing a totally different problem and should open a new bug.

> Split off the fact that Denial
> cannot be deployed by MAAS?

I don't know what "Denial" is.

> Or is it still related to nvme, just
> differently?

no idea. That's why you should open a new bug and put debug info there.

Chris Gregan (cgregan) wrote :
Chris Gregan (cgregan) wrote :

Dan,
Bug above, but since you say it is only happening to me, have other tests been run using MAAS 2.1.2?

Luis Henriques (henrix)
Changed in linux (Ubuntu Xenial):
status: Confirmed → Fix Committed
Mike Pontillo (mpontillo) wrote :

I discussed this with cgregan on IRC and I think we came to the conclusion that the MAAS/curtin bug is simply that the two kernels (commissioning vs. ephemeral deployment) gather different (or missing) unique identifiers for each drive.

To validate that, I would run the following on each kernel on the problematic systems:

find /dev/disk -type l | xargs ls -1l | awk '{ print $9, $10, $11 }' | sort -k2

This will tell us which unique identifiers each kernel found to identify each disk, and sort by the endpoint block device, to make it easier to identify what might be missing for each drive.
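
Roughly the same listing can be produced from Python, which may be easier to diff between the two kernels. A minimal sketch, assuming a standard /dev/disk layout; `links_by_target` is a hypothetical helper name and is not part of MAAS or curtin.

```python
import os
from collections import defaultdict


def links_by_target(root="/dev/disk"):
    """Map each resolved block device to the sorted symlinks that point at it."""
    result = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Resolve e.g. /dev/disk/by-id/<name> to /dev/nvme0n1.
                result[os.path.realpath(path)].append(path)
    return {target: sorted(links) for target, links in sorted(result.items())}


if __name__ == "__main__":
    for target, links in links_by_target().items():
        print(target)
        for link in links:
            print(f"    {link}")
```

Grouping by the resolved device makes it visible at a glance when a drive has no by-id link carrying its serial number.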

Mike Pontillo (mpontillo) wrote :

After further troubleshooting with cgregan, we've narrowed this down.

We ran the following script on the node that was having trouble:

https://gist.github.com/pontillo/0b92a7da2fba43fb5dce705be2dcf38b

Unlike all the other devices MAAS works with, the Intel NVMe device reports a serial number that cannot be found anywhere in /dev/disk/by-id/*. When curtin is supplied a serial number, it uses a heuristic to find the device as follows:

http://bazaar.launchpad.net/~curtin-dev/curtin/trunk/view/435/curtin/commands/block_meta.py#L270

http://bazaar.launchpad.net/~curtin-dev/curtin/trunk/view/435/curtin/block/__init__.py#L601

So arguably, this is a bug in how the Intel NVMe serial number is exposed; the way it populates /dev/disk/* leaves much to be desired.

This is *arguably* a bug in curtin (and maybe MAAS, since we knowingly use the serial number even though `udevadm` can tell us that the serial cannot be found anywhere in /dev/disk/by-id/*), in that we could do a better job dealing with devices backed by not-so-robust kernel drivers. But I think we shouldn't encourage bad behavior on the part of driver writers, so I'm on the fence about whether or not we should fix it.

But mostly, I would argue that this is a bug in the Intel NVMe driver. The way they expose the device to userland is non-standard and arguably broken. When we ran `udevadm info -q all -n nvme0n1` on the device, we got the following pseudo-output:

nvme0n1:
P: /devices/pci0000:00/0000:00:xx.0/0000:xx:00.0/nvme/nvme0/nvme0n1
N: nvme0n1
S: SSDxxxxxxxxxx_CVMDxxxxxxxxxxxxxx
S: disk/by-id/nvme-INTEL
E: DEVLINKS=/dev/disk/by-id/nvme-INTEL /dev/SSDxxxxxxxxxx_CVMDxxxxxxxxxxxxxx
E: DEVNAME=/dev/nvme0n1
E: DEVPATH=/devices/pci0000:00/0000:00:xx.0/0000:xx:00.0/nvme/nvme0/nvme0n1
E: DEVTYPE=disk
E: ID_SERIAL=INTEL SSDxxxxxxxxxx_CVMDxxxxxxxxxxxxxx
E: ID_SERIAL_SHORT=CVMDxxxxxxxxxxxxxx
E: MAJOR=259
E: MINOR=0
E: SUBSYSTEM=block
E: TAGS=:systemd:
E: USEC_INITIALIZED=xxxxxxx

You can see by the lines that start with "S:" and the "DEVLINKS=" line that the way this device is exposed is very non-standard. One would expect /dev/disk/by-id/* to contain a DEVLINK containing the serial number. Instead they expose a 'nvme-INTEL' link, which is (IMHO) a critical bug, because anyone expecting the things in /dev/disk/by-id/* to be unique will be in for a big surprise when they add a second NVMe device to a machine.
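
The check described above can be sketched in a few lines. This is an illustration only, not curtin's or MAAS's actual lookup: `parse_udevadm_info` and `serial_in_by_id` are hypothetical names, and the sample text is the masked pseudo-output from this comment.

```python
def parse_udevadm_info(text):
    """Split `udevadm info -q all` output into symlink names and properties."""
    symlinks, props = [], {}
    for line in text.splitlines():
        prefix, _, value = line.partition(": ")
        if prefix == "S":
            symlinks.append(value)
        elif prefix == "E" and "=" in value:
            key, _, val = value.partition("=")
            props[key] = val
    return symlinks, props


def serial_in_by_id(text):
    """True if ID_SERIAL_SHORT appears in any disk/by-id symlink name."""
    symlinks, props = parse_udevadm_info(text)
    serial = props.get("ID_SERIAL_SHORT")
    if not serial:
        return False
    return any(s.startswith("disk/by-id/") and serial in s for s in symlinks)


# Masked pseudo-output quoted above (only the relevant lines).
SAMPLE = """\
N: nvme0n1
S: SSDxxxxxxxxxx_CVMDxxxxxxxxxxxxxx
S: disk/by-id/nvme-INTEL
E: DEVNAME=/dev/nvme0n1
E: ID_SERIAL=INTEL SSDxxxxxxxxxx_CVMDxxxxxxxxxxxxxx
E: ID_SERIAL_SHORT=CVMDxxxxxxxxxxxxxx
"""

if __name__ == "__main__":
    print("serial present in /dev/disk/by-id:", serial_in_by_id(SAMPLE))
```

For the sample, the serial shows up only in the stray top-level link, never under disk/by-id, so the check reports that the serial is missing there.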

Changed in linux (Ubuntu):
status: Invalid → New
Changed in linux (Ubuntu Xenial):
status: Fix Committed → New
Mike Pontillo (mpontillo) wrote :

Marking this bug 'New' for the kernel, since this has to do with the Intel NVMe device links, not the related SMP issue that came up.

Changed in maas:
status: Invalid → Won't Fix
summary: - [2.1.1] MAAS has nvme0n1 set as boot disk, curtin fails
+ Intel NVMe driver does not expose consistent links in /dev/disk/by-id
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1651602

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Xenial):
status: New → Incomplete
Dan Streetman (ddstreet) wrote : Re: Intel NVMe driver does not expose consistent links in /dev/disk/by-id

> But mostly, I would argue that this is a bug in the Intel NVMe driver.

You're commenting in the wrong bug. This bug is already being addressed. Please go over to bug 1653797

Changed in curtin:
status: New → Invalid
Changed in linux (Ubuntu Xenial):
status: Incomplete → Confirmed
status: Confirmed → Fix Committed
Changed in linux (Ubuntu):
status: Incomplete → Invalid
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Dan Streetman (ddstreet)
summary: - Intel NVMe driver does not expose consistent links in /dev/disk/by-id
+ NVMe driver regression for non-smp/1-cpu systems
Scott Moser (smoser)
description: updated
description: updated
Scott Moser (smoser)
Changed in maas:
status: Won't Fix → Invalid
Scott Moser (smoser) wrote :

To verify this is fixed, I have done the following.
For good measure, I've also successfully booted the same configuration
with the additional parameter '-smp cpus=2'. That verifies that there is no
obvious regression on SMP systems.

$ img_url="http://cloud-images.ubuntu.com/daily/server/xenial/current/xenial-server-cloudimg-amd64-disk1.img"
$ wget "${img_url}" -O disk.img

# patch the image to have a root passwd and disable cloud-init for simplicity.
$ sudo mount-image-callback disk.img -- chroot _MOUNTPOINT_ sh -exc '
    touch /etc/cloud/cloud-init.disabled
    echo "root:root" | chpasswd'

# attached 'get-kernels' enables proposed, installs linux-virtual into disk.img
# and copies kernels out to out.d
$ ./get-kernels disk.img linux-virtual out.d
$ for i in out.d/*info; do echo == $i ==; cat $i; done
== out.d/build.info ==
build_name: server
serial: 20170106.1
== out.d/vmlinuz-4.4.0-57-generic.pkg-info ==
linux-image-4.4.0-57-generic: /boot/vmlinuz-4.4.0-57-generic
== out.d/vmlinuz-4.4.0-59-generic.pkg-info ==
linux-image-4.4.0-59-generic: /boot/vmlinuz-4.4.0-59-generic

$ kver=4.4.0-57 ; qemu-system-x86_64 -enable-kvm -m 512 \
   -drive file=disk.img,if=none,format=qcow2,id=nvme0 \
   -device nvme,drive=nvme0,serial=nvme-0 \
   -snapshot -nographic -echr 0x05 \
   -kernel out.d/vmlinuz-$kver-generic \
   -initrd out.d/initrd.img-$kver-generic \
   -append "root=LABEL=cloudimg-rootfs console=ttyS0"

# if you set 'kver' to 4.4.0-57 (the released kernel version)
# then you see a failure to find the root device from the initramfs.
# if you set 'kver' to 4.4.0-59 (from -proposed) then it finds root,
# and you can log in as root on the console with password 'root'.

tags: added: verification-done-xenial
removed: verification-needed-xenial
Dan Streetman (ddstreet) wrote :

Verified on a physical box also, using maxcpus=0: with 4.4.0-58 the nvme drive fails to initialize; with 4.4.0-59 it initializes.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-59.80

---------------
linux (4.4.0-59.80) xenial; urgency=low

  [ John Donnelly ]

  * Release Tracking Bug
    - LP: #1654282

  * [2.1.1] MAAS has nvme0n1 set as boot disk, curtin fails (LP: #1651602)
    - (fix) nvme: only require 1 interrupt vector, not 2+

linux (4.4.0-58.79) xenial; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1651402

  * Support ACPI probe for IIO sensor drivers from ST Micro (LP: #1650123)
    - SAUCE: iio: st_sensors: match sensors using ACPI handle
    - SAUCE: iio: st_accel: Support sensor i2c probe using acpi
    - SAUCE: iio: st_pressure: Support i2c probe using acpi
    - [Config] CONFIG_HTS221=m, CONFIG_HTS221_I2C=m, CONFIG_HTS221_SPI=m

  * Fix channel data parsing in ST Micro sensor IIO drivers (LP: #1650189)
    - SAUCE: iio: common: st_sensors: fix channel data parsing

  * ST Micro lng2dm 3-axis "femto" accelerometer support (LP: #1650112)
    - SAUCE: iio: st-accel: add support for lis2dh12
    - SAUCE: iio: st_sensors: support active-low interrupts
    - SAUCE: iio: accel: Add support for the h3lis331dl accelerometer
    - SAUCE: iio: st_sensors: verify interrupt event to status
    - SAUCE: iio: st_sensors: support open drain mode
    - SAUCE: iio:st_sensors: fix power regulator usage
    - SAUCE: iio: st_sensors: switch to a threaded interrupt
    - SAUCE: iio: accel: st_accel: Add lis3l02dq support
    - SAUCE: iio: st_sensors: fix scale configuration for h3lis331dl
    - SAUCE: iio: accel: st_accel: add support to lng2dm
    - SAUCE: iio: accel: st_accel: inline per-sensor data
    - SAUCE: Documentation: dt: iio: accel: add lng2dm sensor device binding

  * ST Micro hts221 relative humidity sensor support (LP: #1650116)
    - SAUCE: iio: humidity: add support to hts221 rh/temp combo device
    - SAUCE: Documentation: dt: iio: humidity: add hts221 sensor device binding
    - SAUCE: iio: humidity: remove
    - SAUCE: iio: humidity: Support acpi probe for hts211

  * crypto : tolerate new crypto hardware for z Systems (LP: #1644557)
    - s390/zcrypt: Introduce CEX6 toleration

  * Acer, Inc ID 5986:055a is useless after 14.04.2 installed. (LP: #1433906)
    - uvcvideo: uvc_scan_fallback() for webcams with broken chain

  * vmxnet3 driver could causes kernel panic with v4.4 if LRO enabled.
    (LP: #1650635)
    - vmxnet3: segCnt can be 1 for LRO packets

  * system freeze when swapping to encrypted swap partition (LP: #1647400)
    - mm, oom: rework oom detection
    - mm: throttle on IO only when there are too many dirty and writeback pages

  * Kernel Fixes to get TCMU File Backed Optical to work (LP: #1646204)
    - target/user: Use sense_reason_t in tcmu_queue_cmd_ring
    - target/user: Return an error if cmd data size is too large
    - target/user: Fix comments to not refer to data ring
    - SAUCE: (no-up) target/user: Fix use-after-free of tcmu_cmds if they are
      expired

  * CVE-2016-9756
    - KVM: x86: drop error recovery in em_jmp_far and em_ret_far

  * Dell Precision 5520 & 3520 freezes at login screent (LP: #1650054)
    - ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520

  * CVE-2016-979...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Chris Gregan (cgregan) wrote :

Kernel 4.4.0-59.80 pushed to cloud and MAAS images and retested on failing systems.

Failing systems continue to fail

Unexpected error while running command.
Command: ['curtin', 'block-meta', 'custom']
Exit code: 3
Reason: -
Stdout: b"no disk with serial 'CVMD51530020400AGN' found\n"
Stderr: ''

Changed in linux (Ubuntu Xenial):
status: Fix Released → Confirmed
Dan Streetman (ddstreet) wrote :

> Kernel 4.4.0-59.80 pushed to cloud and MAAS images and retested on failing systems.

Chris, this isn't the right bug for you. Please use your new bug 1653797.

Changed in linux (Ubuntu Xenial):
status: Confirmed → Fix Released
John Donnelly (jpdonnelly) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-yakkety' to 'verification-done-yakkety'. If the problem still exists, change the tag 'verification-needed-yakkety' to 'verification-failed-yakkety'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-yakkety
Dan Streetman (ddstreet) wrote :

Booted the 4.8.0-36-generic kernel with maxcpus=0 parameter and verified all NVMe drives were initialized.

tags: added: verification-done-yakkety
removed: verification-needed-yakkety
Mathew Hodson (mhodson)
Changed in linux (Ubuntu):
status: Invalid → Fix Committed
importance: Undecided → Critical
affects: curtin → ubuntu-translations
no longer affects: ubuntu-translations
affects: maas → ubuntu-translations
no longer affects: ubuntu-translations
Dan Streetman (ddstreet)
Changed in linux (Ubuntu):
status: Fix Committed → Invalid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.8.0-37.39

---------------
linux (4.8.0-37.39) yakkety; urgency=low

  [ Thadeu Lima de Souza Cascardo ]

  * Release Tracking Bug
    - LP: #1659381

  * Mouse cursor invisible or does not move (LP: #1646574)
    - drm/nouveau/disp/nv50-: split chid into chid.ctrl and chid.user
    - drm/nouveau/disp/nv50-: specify ctrl/user separately when constructing
      classes
    - drm/nouveau/disp/gp102: fix cursor/overlay immediate channel indices

 -- Benjamin M Romer <email address hidden> Wed, 25 Jan 2017 16:12:02 -0200

Changed in linux (Ubuntu):
status: Invalid → Fix Released