NVMe shutdown and suspend delay time is too short

Bug #1465136 reported by Alex Hung
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
High
Alex Hung
Utopic
Fix Released
Undecided
Unassigned
linux-lts-utopic (Ubuntu)
Fix Released
High
Alex Hung
Trusty
Fix Released
Undecided
Unassigned

Bug Description

A hardware vendor reported that NVMe didn't implement power management correctly during shutdown/restart and suspend/resume as below:

=================================
However, when it comes to shutdown/restart, the Power Management Events (PME) are not correctly implemented.

When the host (Linux/Driver) sends the PME_TURN_OFF message it should wait for PME_TO_ACK from the device.

Microsoft Windows versions typically wait for 5 seconds.

The Linux OSes doesn't wait for the PME_TO_ACK from the device and simply turns off the system instantly immediately after the shutdown/restart.

This is not the way to treat PCIe storage devices. The host should allow the device to properly shutdown. Otherwise the disk will go in to rebuild during next power on which takes a while resulting in OS finding the device not responding and disabling it and not loading the driver for it.
=================================

Revision history for this message
Alex Hung (alexhung) wrote :

A patch in newer kernel (3.19 and later) has a fix for this as below:

commit 2484f40780b97df1b5eb09e78ce4efaa78b21875
Author: Dan McLeran <email address hidden>
Date: Tue Jul 1 09:33:32 2014 -0600

    NVMe: Add shutdown timeout as module parameter.

    The current implementation hard-codes the shutdown timeout to 2 seconds.
    Some devices take longer than this to complete a normal shutdown.
    Changing the shutdown timeout to a module parameter with a default
    timeout of 5 seconds.

    Signed-off-by: Dan McLeran <email address hidden>
    Signed-off-by: Matthew Wilcox <email address hidden>
    Signed-off-by: Jens Axboe <email address hidden>

However, this patch is not included in 3.13 or 3.16

Changed in linux (Ubuntu):
status: New → In Progress
Revision history for this message
penalvch (penalvch) wrote :
tags: added: cherry-pick
Changed in linux (Ubuntu):
importance: Undecided → High
status: In Progress → Fix Released
tags: added: kernel-fixed-upstream
Alex Hung (alexhung)
Changed in linux-lts-trusty (Ubuntu):
assignee: nobody → Alex Hung (alexhung)
Changed in linux-lts-utopic (Ubuntu):
assignee: nobody → Alex Hung (alexhung)
Changed in linux-lts-trusty (Ubuntu):
importance: Undecided → High
Changed in linux-lts-utopic (Ubuntu):
importance: Undecided → High
Brad Figg (brad-figg)
no longer affects: linux-lts-trusty (Ubuntu Trusty)
no longer affects: linux-lts-utopic (Ubuntu Precise)
no longer affects: linux (Ubuntu Trusty)
no longer affects: linux (Ubuntu Precise)
no longer affects: linux-lts-utopic (Ubuntu Utopic)
no longer affects: linux-lts-trusty (Ubuntu Utopic)
no longer affects: linux-lts-trusty (Ubuntu)
no longer affects: linux-lts-trusty (Ubuntu Precise)
Brad Figg (brad-figg)
Changed in linux (Ubuntu Utopic):
status: New → Fix Committed
Brad Figg (brad-figg)
Changed in linux-lts-utopic (Ubuntu Trusty):
status: New → Fix Committed
Alex Hung (alexhung)
Changed in linux-lts-utopic (Ubuntu):
status: New → Fix Committed
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
tags: added: verification-needed-utopic
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-utopic' to 'verification-done-utopic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

Alex Hung (alexhung)
tags: added: verification-done-trusty verification-done-utopic
removed: verification-needed-trusty verification-needed-utopic
Alex Hung (alexhung)
Changed in linux-lts-utopic (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (6.4 KiB)

This bug was fixed in the package linux - 3.16.0-44.59

---------------
linux (3.16.0-44.59) utopic; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1472030

  [ Iyappan Subramanian ]

  * SAUCE: (no-up) drivers: net: xgene: fix: Out of order descriptor bytes
    read
    - LP: #1425576

  [ Upstream Kernel Changes ]

  * Revert "tools/vm: fix page-flags build"
    - LP: #1471170
  * NVMe: Add shutdown timeout as module parameter.
    - LP: #1465136
  * Drivers: hv: vmbus: Add support for VMBus panic notifier handler
    - LP: #1463584
  * Drivers: hv: vmbus: Correcting truncation error for constant
    HV_CRASH_CTL_CRASH_NOTIFY
    - LP: #1463584
  * KVM: nVMX: fix lifetime issues for vmcs02
    - LP: #1448269
  * KVM: nVMX: Fix nested vmexit ack intr before load vmcs01
    - LP: #1448269
  * mm/slab_common: support the slub_debug boot option on specific object
    size
    - LP: #1456952
  * kvm: x86: fix kvm_apic_has_events to check for NULL pointer
  * cpuidle: powernv: Populate cpuidle state details by querying the
    device-tree
    - LP: #1470404
  * cpuidle: powernv: Read target_residency value of idle states from DT if
    available
    - LP: #1470404
  * cpuidle: powernv: Avoid endianness conversions while parsing DT
    - LP: #1470404
  * cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state
    - LP: #1470404
  * iio: adis16400: Report pressure channel scale
    - LP: #1471170
  * iio: adis16400: Use != channel indices for the two voltage channels
    - LP: #1471170
  * iio: adis16400: Compute the scan mask from channel indices
    - LP: #1471170
  * iio: adis16400: Remove unused variable
    - LP: #1471170
  * iio: adis16400: Fix burst mode
    - LP: #1471170
  * iio: adis16400: Fix burst transfer for adis16448
    - LP: #1471170
  * USB: serial: ftdi_sio: Add support for a Motion Tracker Development
    Board
    - LP: #1471170
  * iio: adc: twl6030-gpadc: Fix modalias
    - LP: #1471170
  * serial: imx: Fix DMA handling for IDLE condition aborts
    - LP: #1471170
  * usb: dwc3: gadget: Fix incorrect DEPCMD and DGCMD status macros
    - LP: #1471170
  * ALSA: usb-audio: Add mic volume fix quirk for Logitech Quickcam Fusion
    - LP: #1471170
  * n_tty: Fix auditing support for cannonical mode
    - LP: #1471170
  * drm/i915/hsw: Fix workaround for server AUX channel clock divisor
    - LP: #1471170
  * x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers
    - LP: #1471170
  * lib: Fix strnlen_user() to not touch memory after specified maximum
    - LP: #1471170
  * Input: elantech - fix detection of touchpads where the revision matches
    a known rate
    - LP: #1471170
  * ALSA: hda/realtek - Add a fixup for another Acer Aspire 9420
    - LP: #1471170
  * ALSA: usb-audio: add MAYA44 USB+ mixer control names
    - LP: #1471170
  * ALSA: usb-audio: fix missing input volume controls in MAYA44 USB(+)
    - LP: #1471170
  * USB: cp210x: add ID for HubZ dual ZigBee and Z-Wave dongle
    - LP: #1471170
  * Input: elantech - add new icbody type
    - LP: #1471170
  * MIPS: Fix enabling of DEBUG_STACKOVERFLOW
    - LP: #1471170
  * xfrm: fix a race in xfrm_state_lookup_byspi
    ...

Read more...

Changed in linux (Ubuntu Utopic):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (6.4 KiB)

This bug was fixed in the package linux-lts-utopic - 3.16.0-44.59~14.04.1

---------------
linux-lts-utopic (3.16.0-44.59~14.04.1) trusty; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1472223

  [ Iyappan Subramanian ]

  * SAUCE: (no-up) drivers: net: xgene: fix: Out of order descriptor bytes
    read
    - LP: #1425576

  [ Upstream Kernel Changes ]

  * Revert "tools/vm: fix page-flags build"
    - LP: #1471170
  * NVMe: Add shutdown timeout as module parameter.
    - LP: #1465136
  * Drivers: hv: vmbus: Add support for VMBus panic notifier handler
    - LP: #1463584
  * Drivers: hv: vmbus: Correcting truncation error for constant
    HV_CRASH_CTL_CRASH_NOTIFY
    - LP: #1463584
  * KVM: nVMX: fix lifetime issues for vmcs02
    - LP: #1448269
  * KVM: nVMX: Fix nested vmexit ack intr before load vmcs01
    - LP: #1448269
  * mm/slab_common: support the slub_debug boot option on specific object
    size
    - LP: #1456952
  * kvm: x86: fix kvm_apic_has_events to check for NULL pointer
  * cpuidle: powernv: Populate cpuidle state details by querying the
    device-tree
    - LP: #1470404
  * cpuidle: powernv: Read target_residency value of idle states from DT if
    available
    - LP: #1470404
  * cpuidle: powernv: Avoid endianness conversions while parsing DT
    - LP: #1470404
  * cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state
    - LP: #1470404
  * iio: adis16400: Report pressure channel scale
    - LP: #1471170
  * iio: adis16400: Use != channel indices for the two voltage channels
    - LP: #1471170
  * iio: adis16400: Compute the scan mask from channel indices
    - LP: #1471170
  * iio: adis16400: Remove unused variable
    - LP: #1471170
  * iio: adis16400: Fix burst mode
    - LP: #1471170
  * iio: adis16400: Fix burst transfer for adis16448
    - LP: #1471170
  * USB: serial: ftdi_sio: Add support for a Motion Tracker Development
    Board
    - LP: #1471170
  * iio: adc: twl6030-gpadc: Fix modalias
    - LP: #1471170
  * serial: imx: Fix DMA handling for IDLE condition aborts
    - LP: #1471170
  * usb: dwc3: gadget: Fix incorrect DEPCMD and DGCMD status macros
    - LP: #1471170
  * ALSA: usb-audio: Add mic volume fix quirk for Logitech Quickcam Fusion
    - LP: #1471170
  * n_tty: Fix auditing support for cannonical mode
    - LP: #1471170
  * drm/i915/hsw: Fix workaround for server AUX channel clock divisor
    - LP: #1471170
  * x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers
    - LP: #1471170
  * lib: Fix strnlen_user() to not touch memory after specified maximum
    - LP: #1471170
  * Input: elantech - fix detection of touchpads where the revision matches
    a known rate
    - LP: #1471170
  * ALSA: hda/realtek - Add a fixup for another Acer Aspire 9420
    - LP: #1471170
  * ALSA: usb-audio: add MAYA44 USB+ mixer control names
    - LP: #1471170
  * ALSA: usb-audio: fix missing input volume controls in MAYA44 USB(+)
    - LP: #1471170
  * USB: cp210x: add ID for HubZ dual ZigBee and Z-Wave dongle
    - LP: #1471170
  * Input: elantech - add new icbody type
    - LP: #1471170
  * MIPS: Fix enabling of DEBUG_STACKOVERFLOW
    - LP: #1471170
  * xfrm: fix ...

Read more...

Changed in linux-lts-utopic (Ubuntu Trusty):
status: Fix Committed → Fix Released
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.