cache on NVMe inhibits proper suspend/hibernate - zfs l2arc

Bug #1670137 reported by mahmoh on 2017-03-05
This bug affects 2 people
Affects: zfs-linux (Ubuntu)
Assigned to: Colin Ian King

Bug Description

Please see: 2017-03-05 09:03 2017-03-05 14:03 UTC KernelOops linux-image-4.4.0-31-generic

Running ZFS as the root filesystem with an L2ARC cache enabled, the machine fails to suspend properly: the screen goes blank but the power stays on. When I disable the cache, suspend works fine. I am unsure whether this also happens with a non-root install of ZFS. A SLOG does not cause this problem.

Linux mp 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

ii zfs-initramfs all Native OpenZFS root filesystem capabilities for Linux
ii zfs-zed amd64 OpenZFS Event Daemon (zed)
ii zfsutils-linux amd64 Native OpenZFS management utilities for Linux

mahmoh (mahmoh) wrote :

Precision 5510 BIOS version: 1.2.19 (latest)

mahmoh (mahmoh) wrote :

Occurs during suspend to RAM. I attempted pm_trace (sudo sh -c "sync && echo 1 > /sys/power/pm_trace && pm-suspend"):
[ 1.196689] Key type encrypted registered
[ 1.196712] AppArmor: AppArmor sha1 policy hashing enabled
[ 1.442937] evm: HMAC attrs: 0x1
[ 1.444273] Magic number: 12:578:178
[ 1.444323] acpi device:0e: hash matches
[ 1.444345] platform: hash matches
[ 1.444671] rtc_cmos 00:02: setting system clock to 2016-12-22 12:09:40 UTC (1482408580)
[ 1.444925] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
[ 1.444927] EDD information not available.
[ 1.445014] PM: Hibernation image not present or could not be loaded.
[ 1.445962] Freeing unused kernel memory: 1480K (ffffffff81f42000 - ffffffff820b4000)

More data points:

mahmoh (mahmoh) wrote :

Same results with 4.8 kernel & pm_trace as above:

Linux mp 4.8.0-39-generic #42~16.04.1-Ubuntu SMP Mon Feb 20 15:06:07 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

2017-03-05 10:03 2017-03-05 15:03 UTC KernelOops linux-image-4.8.0-39-generic

mahmoh (mahmoh) wrote :

I think this is more of a symptom of the cache being located on an NVMe drive; NVMe drives are notorious for causing power problems. I'm closing this until I can validate on a non-NVMe drive.

Changed in zfs-linux (Ubuntu):
status: New → Invalid

Joseph Yasi (joe-yasi) wrote :

I am also seeing this issue with zfs and l2arc cache on an NVMe and 4.10.14 on zesty. I've seen it in the past with btrfs, and a bcache cache on NVMe as well, but not with the bcache cache on a SATA SSD. I will move my zfs l2arc cache to a SATA SSD to see if that fixes suspend. This is probably an issue in the NVMe kernel stack.

mahmoh (mahmoh) wrote :

I can confirm that the problem does not happen when the cache is on SATA rather than NVMe. I assume this needs to get pushed to the kernel then? Please advise.

   nvme0n1p5 OFFLINE 0 0 0
   sda3 ONLINE 0 0 0

Linux mp 4.8.0-54-generic #57~16.04.1-Ubuntu SMP Wed May 24 16:22:28 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Distributor ID: Ubuntu
Description: Ubuntu 16.04.2 LTS
Release: 16.04
Codename: xenial

Changed in zfs-linux (Ubuntu):
status: Invalid → New
summary: - zfs l2arc cache inhibits proper suspend
+ cache on NVMe inhibits proper suspend/hibernate - zfs l2arc

Colin Ian King (colin-king) wrote :

Hi mahmoh,

Out of interest, where did you get the info:

   nvme0n1p5 OFFLINE 0 0 0
   sda3 ONLINE 0 0 0

Changed in zfs-linux (Ubuntu):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Colin Ian King (colin-king)

Joseph Yasi (joe-yasi) wrote :

That info is from:
zpool status -v

I've worked around this suspend issue by writing a script to take the NVMe L2ARC cache offline during suspend, and bring it back online on resume.

The similar suspend issue I had with bcache occurred because a read can trigger a write to the cache after writes have been shut off for suspend. Potentially, ZFS could temporarily disable writes to the L2ARC during suspend and re-enable them on resume, so that it never attempts a write while the device cannot accept one.

Colin Ian King (colin-king) wrote :

@Joseph, out of interest for users seeing the same issue, can you supply the script so others can use this as a temporary workaround?

Joseph Yasi (joe-yasi) wrote :

I've attached my workaround script. To use:

Change the ZFSPOOLNAME variable in the script to the name of the ZFS pool with an NVMe L2ARC.
Change the NVMEL2ARCPART variable to the name of the cache partition.

sudo cp /lib/systemd/system-sleep/
sudo chmod +x /lib/systemd/system-sleep/
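
For readers without access to the attachment, a hook of this kind might look roughly like the sketch below. This is not the attached script, only a minimal illustration of the approach described above: systemd runs executables in /lib/systemd/system-sleep/ with "pre" as the first argument before sleeping and "post" after resuming, and the hook takes the cache device offline and online accordingly. The default pool name rpool and the overridable ZPOOL variable are assumptions made for illustration; nvme0n1p5 is the cache partition from the zpool status output earlier in this report.

```shell
#!/bin/sh
# Hypothetical sketch of a systemd system-sleep hook; NOT the attached script.
# systemd invokes hooks with $1 = "pre" (before sleep) or "post" (after resume).

ZFSPOOLNAME="${ZFSPOOLNAME:-rpool}"          # assumption: replace with your pool name
NVMEL2ARCPART="${NVMEL2ARCPART:-nvme0n1p5}"  # cache partition as listed by zpool status
ZPOOL="${ZPOOL:-zpool}"                      # assumption: overridable, e.g. ZPOOL=echo for a dry run

handle_sleep_event() {
    case "$1" in
        pre)
            # Take the NVMe L2ARC device offline before the system sleeps.
            "$ZPOOL" offline "$ZFSPOOLNAME" "$NVMEL2ARCPART"
            ;;
        post)
            # Bring the cache device back online after resume.
            "$ZPOOL" online "$ZFSPOOLNAME" "$NVMEL2ARCPART"
            ;;
    esac
}

handle_sleep_event "${1:-}"
```

Running the hook by hand with ZPOOL=echo and the argument pre or post prints the zpool command it would issue, which is a cheap way to check the pool and partition names before relying on it during a real suspend.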

Colin Ian King (colin-king) wrote :

@mahmoh, does Joseph's workaround script help?

Changed in zfs-linux (Ubuntu):
status: In Progress → Incomplete
importance: Medium → Low