OpenZFS write stalls under load

Bug #1899249 reported by Tyson Key
This bug affects 1 person
Affects               Status        Importance  Assigned to  Milestone
Native ZFS for Linux  New           Unknown     -            -
zfs-linux (Ubuntu)    Fix Released  Medium      Unassigned   -

Bug Description

Using a QNAP four-drive USB enclosure with a set of SSDs, on a Raspberry Pi 4 (8GB). ZFS deduplication and LZJB compression are enabled.

This issue occurs intermittently, after some time under load (it happens both with SMB access via Samba and when interacting with the system via SSH). It never occurred until a few months ago, and I sometimes have to force a reboot of the system (at the cost of some data loss) in order to use it again.

The "dmesg" log reports:

[25375.911590] z_wr_iss_h D 0 2161 2 0x00000028
[25375.911606] Call trace:
[25375.911627] __switch_to+0x104/0x170
[25375.911639] __schedule+0x30c/0x7c0
[25375.911647] schedule+0x3c/0xb8
[25375.911655] io_schedule+0x20/0x58
[25375.911668] rq_qos_wait+0x100/0x178
[25375.911677] wbt_wait+0xb4/0xf0
[25375.911687] __rq_qos_throttle+0x38/0x50
[25375.911700] blk_mq_make_request+0x128/0x610
[25375.911712] generic_make_request+0xb4/0x2d8
[25375.911722] submit_bio+0x48/0x218
[25375.911960] vdev_disk_io_start+0x670/0x9f8 [zfs]
[25375.912181] zio_vdev_io_start+0xdc/0x2b8 [zfs]
[25375.912400] zio_nowait+0xd4/0x170 [zfs]
[25375.912617] vdev_mirror_io_start+0xa8/0x1b0 [zfs]
[25375.912839] zio_vdev_io_start+0x248/0x2b8 [zfs]
[25375.913057] zio_execute+0xac/0x110 [zfs]
[25375.913096] taskq_thread+0x2f8/0x570 [spl]
[25375.913108] kthread+0xfc/0x128
[25375.913119] ret_from_fork+0x10/0x1c
[25375.913149] INFO: task txg_sync:2333 blocked for more than 120 seconds.
[25375.919916] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[25375.926848] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25375.934835] txg_sync D 0 2333 2 0x00000028
[25375.934850] Call trace:
[25375.934869] __switch_to+0x104/0x170
[25375.934879] __schedule+0x30c/0x7c0
[25375.934887] schedule+0x3c/0xb8
[25375.934899] schedule_timeout+0x9c/0x190
[25375.934908] io_schedule_timeout+0x28/0x48
[25375.934946] __cv_timedwait_common+0x1a8/0x1f8 [spl]
[25375.934982] __cv_timedwait_io+0x3c/0x50 [spl]
[25375.935205] zio_wait+0x130/0x2a0 [zfs]
[25375.935423] dsl_pool_sync+0x3fc/0x498 [zfs]
[25375.935650] spa_sync+0x538/0xe68 [zfs]
[25375.935867] txg_sync_thread+0x2c0/0x468 [zfs]
[25375.935911] thread_generic_wrapper+0x74/0xa0 [spl]
[25375.935924] kthread+0xfc/0x128
[25375.935935] ret_from_fork+0x10/0x1c
[25375.936017] INFO: task zbackup:75339 blocked for more than 120 seconds.
[25375.942780] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[25375.949710] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25375.957702] zbackup D 0 75339 5499 0x00000000
[25375.957716] Call trace:
[25375.957732] __switch_to+0x104/0x170
[25375.957742] __schedule+0x30c/0x7c0
[25375.957750] schedule+0x3c/0xb8
[25375.957789] cv_wait_common+0x188/0x1b0 [spl]
[25375.957823] __cv_wait+0x30/0x40 [spl]
[25375.958045] zil_commit_impl+0x234/0xd30 [zfs]
[25375.958263] zil_commit+0x48/0x70 [zfs]
[25375.958481] zfs_create+0x544/0x7d0 [zfs]
[25375.958698] zpl_create+0xb8/0x178 [zfs]
[25375.958711] lookup_open+0x4ec/0x6a8
[25375.958721] do_last+0x260/0x8c0
[25375.958730] path_openat+0x84/0x258
[25375.958739] do_filp_open+0x84/0x108
[25375.958752] do_sys_open+0x180/0x2b0
[25375.958763] __arm64_sys_openat+0x2c/0x38
[25375.958773] el0_svc_common.constprop.0+0x80/0x218
[25375.958781] el0_svc_handler+0x34/0xa0
[25375.958791] el0_svc+0x10/0x2cc
[25375.958801] INFO: task zbackup:95187 blocked for more than 120 seconds.
[25375.965564] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[25375.972492] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25375.980479] zbackup D 0 95187 5499 0x00000000
[25375.980493] Call trace:
[25375.980514] __switch_to+0x104/0x170
[25375.980525] __schedule+0x30c/0x7c0
[25375.980536] schedule+0x3c/0xb8
[25375.980578] cv_wait_common+0x188/0x1b0 [spl]
[25375.980612] __cv_wait+0x30/0x40 [spl]
[25375.980834] zil_commit_impl+0x234/0xd30 [zfs]
[25375.981052] zil_commit+0x48/0x70 [zfs]
[25375.981280] zfs_write+0xa3c/0xb90 [zfs]
[25375.981498] zpl_write_common_iovec+0xac/0x120 [zfs]
[25375.981726] zpl_iter_write+0xe4/0x150 [zfs]
[25375.981766] new_sync_write+0x100/0x1a8
[25375.981776] __vfs_write+0x74/0x90
[25375.981784] vfs_write+0xe4/0x1c8
[25375.981794] ksys_write+0x78/0x100
[25375.981803] __arm64_sys_write+0x24/0x30
[25375.981813] el0_svc_common.constprop.0+0x80/0x218
[25375.981821] el0_svc_handler+0x34/0xa0
[25375.981830] el0_svc+0x10/0x2cc
[25375.981842] INFO: task zbackup:95188 blocked for more than 120 seconds.
[25375.988615] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[25375.995543] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25376.003532] zbackup D 0 95188 5499 0x00000008
[25376.003545] Call trace:
[25376.003561] __switch_to+0x104/0x170
[25376.003571] __schedule+0x30c/0x7c0
[25376.003579] schedule+0x3c/0xb8
[25376.003619] cv_wait_common+0x188/0x1b0 [spl]
[25376.003654] __cv_wait+0x30/0x40 [spl]
[25376.003875] zil_commit_impl+0x234/0xd30 [zfs]
[25376.004092] zil_commit+0x48/0x70 [zfs]
[25376.004312] zfs_write+0xa3c/0xb90 [zfs]
[25376.004530] zpl_write_common_iovec+0xac/0x120 [zfs]
[25376.004748] zpl_iter_write+0xe4/0x150 [zfs]
[25376.004759] new_sync_write+0x100/0x1a8
[25376.004768] __vfs_write+0x74/0x90
[25376.004776] vfs_write+0xe4/0x1c8
[25376.004785] ksys_write+0x78/0x100
[25376.004794] __arm64_sys_write+0x24/0x30
[25376.004803] el0_svc_common.constprop.0+0x80/0x218
[25376.004811] el0_svc_handler+0x34/0xa0
[25376.004820] el0_svc+0x10/0x2cc
[25376.004917] INFO: task zbackup:95231 blocked for more than 120 seconds.
[25376.011678] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[25376.018587] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25376.026554] zbackup D 0 95231 5499 0x00000000
[25376.026562] Call trace:
[25376.026575] __switch_to+0x104/0x170
[25376.026582] __schedule+0x30c/0x7c0
[25376.026588] schedule+0x3c/0xb8
[25376.026615] cv_wait_common+0x188/0x1b0 [spl]
[25376.026636] __cv_wait+0x30/0x40 [spl]
[25376.026771] zil_commit_impl+0x234/0xd30 [zfs]
[25376.026904] zil_commit+0x48/0x70 [zfs]
[25376.027042] zfs_write+0xa3c/0xb90 [zfs]
[25376.027174] zpl_write_common_iovec+0xac/0x120 [zfs]
[25376.027309] zpl_iter_write+0xe4/0x150 [zfs]
[25376.027317] new_sync_write+0x100/0x1a8
[25376.027322] __vfs_write+0x74/0x90
[25376.027332] vfs_write+0xe4/0x1c8
[25376.027338] ksys_write+0x78/0x100
[25376.027344] __arm64_sys_write+0x24/0x30
[25376.027350] el0_svc_common.constprop.0+0x80/0x218
[25376.027355] el0_svc_handler+0x34/0xa0
[25376.027361] el0_svc+0x10/0x2cc
[25376.027374] INFO: task z_wr_int:95546 blocked for more than 120 seconds.
[25376.034203] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[25376.041113] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25376.049080] z_wr_int D 0 95546 2 0x00000028
[25376.049089] Call trace:
[25376.049100] __switch_to+0x104/0x170
[25376.049106] __schedule+0x30c/0x7c0
[25376.049111] schedule+0x3c/0xb8
[25376.049116] io_schedule+0x20/0x58
[25376.049123] rq_qos_wait+0x100/0x178
[25376.049129] wbt_wait+0xb4/0xf0
[25376.049135] __rq_qos_throttle+0x38/0x50
[25376.049143] blk_mq_make_request+0x128/0x610
[25376.049150] generic_make_request+0xb4/0x2d8
[25376.049156] submit_bio+0x48/0x218
[25376.049292] vdev_disk_io_start+0x670/0x9f8 [zfs]
[25376.049423] zio_vdev_io_start+0xdc/0x2b8 [zfs]
[25376.049554] zio_nowait+0xd4/0x170 [zfs]
[25376.049693] vdev_queue_io_done+0x1ec/0x2a0 [zfs]
[25376.049847] zio_vdev_io_done+0xec/0x220 [zfs]
[25376.049986] zio_execute+0xac/0x110 [zfs]
[25376.050010] taskq_thread+0x2f8/0x570 [spl]
[25376.050025] kthread+0xfc/0x128
[25376.050033] ret_from_fork+0x10/0x1c
[25376.050041] INFO: task z_wr_int:95553 blocked for more than 120 seconds.
[25376.056873] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[25376.063780] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25376.071746] z_wr_int D 0 95553 2 0x00000028
[25376.071754] Call trace:
[25376.071764] __switch_to+0x104/0x170
[25376.071770] __schedule+0x30c/0x7c0
[25376.071775] schedule+0x3c/0xb8
[25376.071780] io_schedule+0x20/0x58
[25376.071787] rq_qos_wait+0x100/0x178
[25376.071792] wbt_wait+0xb4/0xf0
[25376.071798] __rq_qos_throttle+0x38/0x50
[25376.071806] blk_mq_make_request+0x128/0x610
[25376.071813] generic_make_request+0xb4/0x2d8
[25376.071819] submit_bio+0x48/0x218
[25376.071955] vdev_disk_io_start+0x670/0x9f8 [zfs]
[25376.072086] zio_vdev_io_start+0xdc/0x2b8 [zfs]
[25376.072217] zio_nowait+0xd4/0x170 [zfs]
[25376.072346] vdev_queue_io_done+0x1ec/0x2a0 [zfs]
[25376.072476] zio_vdev_io_done+0xec/0x220 [zfs]
[25376.072606] zio_execute+0xac/0x110 [zfs]
[25376.072629] taskq_thread+0x2f8/0x570 [spl]
[25376.072637] kthread+0xfc/0x128
[25376.072643] ret_from_fork+0x10/0x1c
[25496.722315] INFO: task z_wr_iss_h:2161 blocked for more than 241 seconds.
[25496.729251] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[25496.736195] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25496.744197] z_wr_iss_h D 0 2161 2 0x00000028
[25496.744247] Call trace:
[25496.744268] __switch_to+0x104/0x170
[25496.744279] __schedule+0x30c/0x7c0
[25496.744288] schedule+0x3c/0xb8
[25496.744296] io_schedule+0x20/0x58
[25496.744308] rq_qos_wait+0x100/0x178
[25496.744317] wbt_wait+0xb4/0xf0
[25496.744327] __rq_qos_throttle+0x38/0x50
[25496.744340] blk_mq_make_request+0x128/0x610
[25496.744351] generic_make_request+0xb4/0x2d8
[25496.744362] submit_bio+0x48/0x218
[25496.744600] vdev_disk_io_start+0x670/0x9f8 [zfs]
[25496.744821] zio_vdev_io_start+0xdc/0x2b8 [zfs]
[25496.745039] zio_nowait+0xd4/0x170 [zfs]
[25496.745256] vdev_mirror_io_start+0xa8/0x1b0 [zfs]
[25496.745473] zio_vdev_io_start+0x248/0x2b8 [zfs]
[25496.745690] zio_execute+0xac/0x110 [zfs]
[25496.745730] taskq_thread+0x2f8/0x570 [spl]
[25496.745742] kthread+0xfc/0x128
[25496.745753] ret_from_fork+0x10/0x1c
[25496.745853] INFO: task zbackup:75339 blocked for more than 241 seconds.
[25496.752621] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[25496.759557] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[25496.767547] zbackup D 0 75339 5499 0x00000000
[25496.767561] Call trace:
[25496.767585] __switch_to+0x104/0x170
[25496.767595] __schedule+0x30c/0x7c0
[25496.767606] schedule+0x3c/0xb8
[25496.767646] cv_wait_common+0x188/0x1b0 [spl]
[25496.767682] __cv_wait+0x30/0x40 [spl]
[25496.767906] zil_commit_impl+0x234/0xd30 [zfs]
[25496.768126] zil_commit+0x48/0x70 [zfs]
[25496.768355] zfs_create+0x544/0x7d0 [zfs]
[25496.768574] zpl_create+0xb8/0x178 [zfs]
[25496.768595] lookup_open+0x4ec/0x6a8
[25496.768604] do_last+0x260/0x8c0
[25496.768614] path_openat+0x84/0x258
[25496.768624] do_filp_open+0x84/0x108
[25496.768636] do_sys_open+0x180/0x2b0
[25496.768647] __arm64_sys_openat+0x2c/0x38
[25496.768657] el0_svc_common.constprop.0+0x80/0x218
[25496.768665] el0_svc_handler+0x34/0xa0
[25496.768674] el0_svc+0x10/0x2cc

tyson@ubuntu:~$ lsb_release -rd
Description: Ubuntu 20.04.1 LTS
Release: 20.04

tyson@ubuntu:~$ apt-cache policy zfs-dkms
zfs-dkms:
  Installed: 0.8.3-1ubuntu12.4
  Candidate: 0.8.3-1ubuntu12.4
  Version table:
 *** 0.8.3-1ubuntu12.4 500
        500 http://ports.ubuntu.com/ubuntu-ports focal-updates/universe arm64 Packages
        500 http://ports.ubuntu.com/ubuntu-ports focal-updates/universe armhf Packages
        100 /var/lib/dpkg/status
     0.8.3-1ubuntu12 500
        500 http://ports.ubuntu.com/ubuntu-ports focal/universe arm64 Packages
        500 http://ports.ubuntu.com/ubuntu-ports focal/universe armhf Packages

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: zfs-dkms 0.8.3-1ubuntu12.4
ProcVersionSignature: Ubuntu 5.4.0-1018.20-raspi 5.4.55
Uname: Linux 5.4.0-1018-raspi aarch64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu27.8
Architecture: arm64
CasperMD5CheckResult: skip
Date: Sat Oct 10 08:50:34 2020
ImageMediaBuild: 20200423.1
PackageArchitecture: all
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
SourcePackage: zfs-linux
UpgradeStatus: No upgrade log present (probably fresh install)

Colin Ian King (colin-king) wrote:

It is worth noting that deduplication consumes a lot of memory.

Deduplication tables consume memory and eventually spill over onto disk; this causes extra read and write operations for every block of data on which deduplication is attempted, which in turn causes more I/O waits.

A system with large pools and comparatively little memory does not handle deduplication well.

So, if you run the following command on your pool (pool-ssd in this example):

sudo zpool status -D pool-ssd

...you will see something like the following:

  pool: pool-ssd
 state: ONLINE
  scan: scrub repaired 0B in 0 days 00:31:14 with 0 errors on Wed Sep 2 14:35:52 2020
config:

        NAME                                            STATE     READ WRITE CKSUM
        pool-ssd                                        ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            ata-INTEL_SSDSC2BW480H6_CVTR608104NW480EGN  ONLINE       0     0     0
            ata-INTEL_SSDSC2BW480H6_CVTR552500S0480EGN  ONLINE       0     0     0

errors: No known data errors

 dedup: DDT entries 4258028, size 448B on disk, 306B in core

bucket             allocated                       referenced
______  ______________________________  ______________________________
refcnt  blocks   LSIZE   PSIZE   DSIZE  blocks   LSIZE   PSIZE   DSIZE
------  ------   -----   -----   -----  ------   -----   -----   -----
     1   3.63M    464G    226G    226G   3.63M    464G    226G    226G
     2    376K   46.1G   24.3G   24.3G    822K    100G   53.2G   53.2G
     4   51.3K   5.60G   2.51G   2.51G    240K   26.3G   11.8G   11.8G
     8   7.49K    925M    544M    544M   77.9K   9.41G   5.47G   5.47G
    16   3.32K    415M    261M    261M   68.8K   8.41G   5.35G   5.35G
    32     657   78.7M   62.1M   62.1M   29.3K   3.51G   2.79G   2.79G
    64     389   46.2M   42.7M   42.7M   34.0K   4.03G   3.78G   3.78G
   128     248   30.1M   29.5M   29.5M   44.8K   5.44G   5.32G   5.32G
   256     339   42.1M   40.8M   40.8M    123K   15.4G   14.9G   14.9G
   512     374   46.8M   46.1M   46.1M    271K   33.9G   33.5G   33.5G
    1K     254   31.8M   31.3M   31.3M    355K   44.4G   43.8G   43.8G
    2K       5    640K    513K    513K   12.9K   1.61G   1.20G   1.20G
    4K       4    512K    512K    512K   22.7K   2.84G   2.84G   2.84G
    8K       8      1M    897K    897K   89.3K   11.2G   10.1G   10.1G
   64K       2    256K      6K      6K    226K   28.3G    679M    679M
 Total   4.06M    517G    254G    254G   5.99M    759G    421G    421G

---
Looking at the "dedup: DDT entries 4258028, size 448B on disk, 306B in core" line, we have:

4,258,028 dedup entries, each using 306 bytes in core, so that's ~1.2GB of memory for the dedup table alone. The in-memory DDT is capped at 1/4 the size of the ARC, so on a small-memory system such as a Raspberry Pi you may find that much of the dedup table cannot be held in memory and has to live on disk, and this increases the I/O waits.
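
As a quick check of that arithmetic, a minimal shell sketch (the entry count and per-entry in-core size are copied from the "zpool status -D" output above):

# 4258028 DDT entries x 306 bytes each in core
echo "$((4258028 * 306)) bytes"                # 1302956568 bytes
echo "$((4258028 * 306 / 1024 / 1024)) MiB"    # ~1242 MiB, i.e. ~1.2GB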

It may be an idea to disable dedup and see if that helps resolve your issue; a comparatively slow and memory-constrained system such as a Raspberry Pi does not cope well with deduplication on large pools.
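
For example, a hedged sketch using the pool name from above (note that setting dedup=off only affects newly written blocks; existing DDT entries remain until that data is rewritten or freed):

sudo zfs set dedup=off pool-ssd
sudo zfs get dedup pool-ssd    # confirm the property is now off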

Changed in zfs-linux (Ubuntu):
status: New → Incomplete
importance: Undecided → Medium
assignee: nobody → Colin Ian King (colin-king)
Tyson Key (vmlemon) wrote:

Thanks for your reply (I expected this to go into the ether, since it seems to be a very common I/O issue on this hardware platform).

I did try switching from LZJB to LZ4, and got slightly better performance and reliability, at least when testing backups of a Windows 10 machine using WSL and Borg over SSH. However, I can still lock the system up after several runs, or get it to a state where I can send commands over an open SSH session, but spawning a new one fails, since it can't access my home directory (which is stored on the ZFS array).

My last log, with Borg, is:

[ 22.536219] bcmgenet: Skipping UMAC reset
[ 22.537803] bcmgenet fd580000.ethernet: configuring instance for external RGMII
[ 22.538029] bcmgenet fd580000.ethernet eth0: Link is Down
[ 25.140217] new mount options do not match the existing superblock, will be ignored
[ 26.751285] brcmfmac: brcmf_cfg80211_set_power_mgmt: power save enabled
[ 27.651756] bcmgenet fd580000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[ 27.651790] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 29.696209] Started bpfilter
[ 29.701240] bpfilter: Loaded bpfilter_umh pid 3166
[ 29.858264] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[ 30.379027] kauditd_printk_skb: 27 callbacks suppressed
[ 30.379032] audit: type=1400 audit(1602433969.316:39): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="/usr/bin/lxc-start" pid=3259 comm="apparmor_parser"
[ 30.457589] audit: type=1400 audit(1602433969.396:40): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxc-container-default" pid=3266 comm="apparmor_parser"
[ 30.457603] audit: type=1400 audit(1602433969.396:41): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxc-container-default-cgns" pid=3266 comm="apparmor_parser"
[ 30.457620] audit: type=1400 audit(1602433969.396:42): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxc-container-default-with-mounting" pid=3266 comm="apparmor_parser"
[ 30.457629] audit: type=1400 audit(1602433969.396:43): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="lxc-container-default-with-nesting" pid=3266 comm="apparmor_parser"
[ 36.556387] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[ 36.556393] Bluetooth: BNEP filters: protocol multicast
[ 36.556407] Bluetooth: BNEP socket layer initialized
[ 74.720044] Adding 16777212k swap on /dev/zd0. Priority:-2 extents:1 across:16777212k SSFS
[10030.116676] INFO: task z_rd_int:2154 blocked for more than 120 seconds.
[10030.123432] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[10030.130389] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10030.138388] z_rd_int D 0 2154 2 0x00000028
[10030.138403] Call trace:
[10030.138424] __switch_to+0...

Tyson Key (vmlemon) wrote:

It looks like writing snapshots with "zfs snapshot" will sometimes stall, even when other commands, like "zpool status", occasionally still appear to work.

Tyson Key (vmlemon) wrote:

In the meantime, I'll see if I can temporarily disable pool-level deduplication and retry the backup run from the Windows machine.

Whilst it's not the perfect long-term solution, I might look into using offline deduplication for older, infrequently-accessed data in the pool (probably with Borg/ZBackup), since this machine is mostly used for bulk storage of archival/backup data, and for file/HTTP serving, on a small network.
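
For illustration, a hedged sketch of that kind of offline deduplication with BorgBackup (hypothetical pool, snapshot, and repository names; Borg chunks and deduplicates the stream itself, so the pool no longer needs dedup=on):

sudo zfs send tank@archive-2020-10 | borg create --stats BorgStore::archive-2020-10 -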

Tyson Key (vmlemon) wrote:

With deduplication disabled, it eventually gets further with the backup, but still stalls after being left running overnight:

[12084.274242] INFO: task z_wr_iss_h:2157 blocked for more than 120 seconds.
[12084.281171] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[12084.288126] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12084.296127] z_wr_iss_h D 0 2157 2 0x00000028
[12084.296142] Call trace:
[12084.296163] __switch_to+0x104/0x170
[12084.296174] __schedule+0x30c/0x7c0
[12084.296183] schedule+0x3c/0xb8
[12084.296191] io_schedule+0x20/0x58
[12084.296204] rq_qos_wait+0x100/0x178
[12084.296213] wbt_wait+0xb4/0xf0
[12084.296223] __rq_qos_throttle+0x38/0x50
[12084.296235] blk_mq_make_request+0x128/0x610
[12084.296247] generic_make_request+0xb4/0x2d8
[12084.296258] submit_bio+0x48/0x218
[12084.296496] vdev_disk_io_start+0x670/0x9f8 [zfs]
[12084.296716] zio_vdev_io_start+0xdc/0x2b8 [zfs]
[12084.296935] zio_nowait+0xd4/0x170 [zfs]
[12084.297151] vdev_mirror_io_start+0xa8/0x1b0 [zfs]
[12084.297369] zio_vdev_io_start+0x248/0x2b8 [zfs]
[12084.297586] zio_execute+0xac/0x110 [zfs]
[12084.297626] taskq_thread+0x2f8/0x570 [spl]
[12084.297639] kthread+0xfc/0x128
[12084.297650] ret_from_fork+0x10/0x1c
[12084.297678] INFO: task txg_quiesce:3077 blocked for more than 120 seconds.
[12084.304712] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[12084.311641] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12084.319632] txg_quiesce D 0 3077 2 0x00000028
[12084.319646] Call trace:
[12084.319668] __switch_to+0x104/0x170
[12084.319679] __schedule+0x30c/0x7c0
[12084.319691] schedule+0x3c/0xb8
[12084.319731] cv_wait_common+0x188/0x1b0 [spl]
[12084.319767] __cv_wait+0x30/0x40 [spl]
[12084.319990] txg_quiesce_thread+0x27c/0x380 [zfs]
[12084.320028] thread_generic_wrapper+0x74/0xa0 [spl]
[12084.320050] kthread+0xfc/0x128
[12084.320062] ret_from_fork+0x10/0x1c
[12084.320133] INFO: task borg:13631 blocked for more than 120 seconds.
[12084.326632] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[12084.333562] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12084.341550] borg D 0 13631 13630 0x00000000
[12084.341564] Call trace:
[12084.341578] __switch_to+0x104/0x170
[12084.341588] __schedule+0x30c/0x7c0
[12084.341596] schedule+0x3c/0xb8
[12084.341635] cv_wait_common+0x188/0x1b0 [spl]
[12084.341670] __cv_wait+0x30/0x40 [spl]
[12084.341895] zil_commit_impl+0x234/0xd30 [zfs]
[12084.342112] zil_commit+0x48/0x70 [zfs]
[12084.342331] zfs_write+0xa3c/0xb90 [zfs]
[12084.342588] zpl_write_common_iovec+0xac/0x120 [zfs]
[12084.342817] zpl_iter_write+0xe4/0x150 [zfs]
[12084.342830] new_sync_write+0x100/0x1a8
[12084.342839] __vfs_write+0x74/0x90
[12084.342855] vfs_write+0xe4/0x1c8
[12084.342865] ksys_write+0x78/0x100
[12084.342874] __arm64_sys_write+0x24/0x30
[12084.342884] el0_svc_common.constprop.0+0x80/0x218
[12084.342892] el0_svc_handler+0x34/0xa0
[12084.342902] el0_svc+0x10/0x2cc
[12084.342927] INFO: task z_wr_iss_h:114156 blocked for more than 120 seconds.
[12084.350038] ...

Tyson Key (vmlemon) wrote:

I also received this when trying to archive a large directory:

[ 2055.147509] INFO: task z_wr_iss_h:2169 blocked for more than 120 seconds.
[ 2055.154450] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[ 2055.161401] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2055.169403] z_wr_iss_h D 0 2169 2 0x00000028
[ 2055.169420] Call trace:
[ 2055.169441] __switch_to+0x104/0x170
[ 2055.169453] __schedule+0x30c/0x7c0
[ 2055.169462] schedule+0x3c/0xb8
[ 2055.169470] io_schedule+0x20/0x58
[ 2055.169483] rq_qos_wait+0x100/0x178
[ 2055.169492] wbt_wait+0xb4/0xf0
[ 2055.169502] __rq_qos_throttle+0x38/0x50
[ 2055.169514] blk_mq_make_request+0x128/0x610
[ 2055.169526] generic_make_request+0xb4/0x2d8
[ 2055.169537] submit_bio+0x48/0x218
[ 2055.169779] vdev_disk_io_start+0x670/0x9f8 [zfs]
[ 2055.170005] zio_vdev_io_start+0xdc/0x2b8 [zfs]
[ 2055.170229] zio_nowait+0xd4/0x170 [zfs]
[ 2055.170451] vdev_mirror_io_start+0xa8/0x1b0 [zfs]
[ 2055.170673] zio_vdev_io_start+0x248/0x2b8 [zfs]
[ 2055.170896] zio_execute+0xac/0x110 [zfs]
[ 2055.170937] taskq_thread+0x2f8/0x570 [spl]
[ 2055.170950] kthread+0xfc/0x128
[ 2055.170960] ret_from_fork+0x10/0x1c
[ 2055.170990] INFO: task txg_quiesce:2380 blocked for more than 120 seconds.
[ 2055.178025] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[ 2055.184958] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2055.192949] txg_quiesce D 0 2380 2 0x00000028
[ 2055.192963] Call trace:
[ 2055.192984] __switch_to+0x104/0x170
[ 2055.192995] __schedule+0x30c/0x7c0
[ 2055.193003] schedule+0x3c/0xb8
[ 2055.193046] cv_wait_common+0x188/0x1b0 [spl]
[ 2055.193082] __cv_wait+0x30/0x40 [spl]
[ 2055.193310] txg_quiesce_thread+0x27c/0x380 [zfs]
[ 2055.193349] thread_generic_wrapper+0x74/0xa0 [spl]
[ 2055.193370] kthread+0xfc/0x128
[ 2055.193381] ret_from_fork+0x10/0x1c
[ 2055.193438] INFO: task smbd:3662 blocked for more than 120 seconds.
[ 2055.199853] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[ 2055.206784] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2055.214772] smbd D 0 3662 3498 0x00000000
[ 2055.214786] Call trace:
[ 2055.214800] __switch_to+0x104/0x170
[ 2055.214810] __schedule+0x30c/0x7c0
[ 2055.214818] schedule+0x3c/0xb8
[ 2055.214856] cv_wait_common+0x188/0x1b0 [spl]
[ 2055.214892] __cv_wait+0x30/0x40 [spl]
[ 2055.215121] zil_commit_impl+0x234/0xd30 [zfs]
[ 2055.215344] zil_commit+0x48/0x70 [zfs]
[ 2055.215567] zfs_read+0x2f8/0x418 [zfs]
[ 2055.215831] zpl_read_common_iovec+0xa8/0x110 [zfs]
[ 2055.216065] zpl_read_common+0x4c/0x70 [zfs]
[ 2055.216288] zpl_xattr_get_dir+0x10c/0x128 [zfs]
[ 2055.216519] zpl_xattr_get+0xcc/0x1a8 [zfs]
[ 2055.216742] zpl_xattr_user_get+0x68/0x90 [zfs]
[ 2055.216764] __vfs_getxattr+0x60/0x80
[ 2055.216774] vfs_getxattr+0x140/0x160
[ 2055.216784] getxattr+0x9c/0x2c0
[ 2055.216794] path_getxattr+0x7c/0xd0
[ 2055.216805] __arm64_sys_getxattr+0x28/0x38
[ 2055.216815] el0_svc_common.constprop.0+0x80/0x218
[ 2055.216823] el0_svc_handler+0x34/0xa0
[ 2055.216833] el0_svc+0x10/0x2cc
[ 2...

Richard Laager (rlaager) wrote:

Did you destroy and recreate the pool after disabling dedup? Otherwise you still have the same dedup table and haven’t really accomplished much.

Tyson Key (vmlemon) wrote:

Unfortunately, I haven't got a storage device large enough to hold all of the data from the pool, and much of it cannot be recreated or restored from another source, so I won't be able to nuke the pool and rebuild it.

Richard Laager (rlaager) wrote:

You could shrink the DDT by making a copy of each file in place (with dedup off) and deleting the old file. That only requires enough extra space for a single file at a time. This assumes no snapshots.
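
A hedged illustration of that copy-in-place approach (hypothetical paths; with dedup now off, the new copy bypasses the DDT, and deleting the original drops its DDT references):

cp -a /tank/data/file.bin /tank/data/file.bin.new
mv /tank/data/file.bin.new /tank/data/file.bin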

If you need to preserve snapshots, another option would be to send|recv a dataset at a time. If you have enough free space for a copy of the largest dataset, this would work.
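
And a hedged sketch of the send|recv variant (hypothetical dataset and snapshot names; the received copy is written under the current dedup=off setting):

sudo zfs snapshot -r tank/data@migrate
sudo zfs send -R tank/data@migrate | sudo zfs recv tank/data-new
# after verifying the new dataset:
sudo zfs destroy -r tank/data
sudo zfs rename tank/data-new tank/data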

Tyson Key (vmlemon) wrote:

Thanks.

I was able to archive and delete 400GB of data without problems earlier today, which reduced the "REFER" of my dataset a little. However, it looks like I probably need to focus on archiving and removing some of the older snapshots if I want to trim down the memory utilisation of my pool.

When I arrived home from work, "dmesg" didn't have anything interesting to say in terms of panics or errors, which seems promising, but it's still early days yet.

It seems a little like a "luxury problem" that I find myself with more storage space and data than available RAM, given that usually the opposite is true...

Tyson Key (vmlemon) wrote:

Looks like I could archive about 1.3TB of a 1.64TB snapshot before things started to go bad again:

[110079.681102] INFO: task z_wr_iss_h:2171 blocked for more than 120 seconds.
[110079.688123] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[110079.695167] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[110079.703260] z_wr_iss_h D 0 2171 2 0x00000028
[110079.703276] Call trace:
[110079.703297] __switch_to+0x104/0x170
[110079.703308] __schedule+0x30c/0x7c0
[110079.703317] schedule+0x3c/0xb8
[110079.703325] io_schedule+0x20/0x58
[110079.703338] rq_qos_wait+0x100/0x178
[110079.703347] wbt_wait+0xb4/0xf0
[110079.703357] __rq_qos_throttle+0x38/0x50
[110079.703370] blk_mq_make_request+0x128/0x610
[110079.703382] generic_make_request+0xb4/0x2d8
[110079.703392] submit_bio+0x48/0x218
[110079.703629] vdev_disk_io_start+0x670/0x9f8 [zfs]
[110079.703850] zio_vdev_io_start+0xdc/0x2b8 [zfs]
[110079.704068] zio_nowait+0xd4/0x170 [zfs]
[110079.704285] vdev_mirror_io_start+0xa8/0x1b0 [zfs]
[110079.704503] zio_vdev_io_start+0x248/0x2b8 [zfs]
[110079.704720] zio_execute+0xac/0x110 [zfs]
[110079.704760] taskq_thread+0x2f8/0x570 [spl]
[110079.704773] kthread+0xfc/0x128
[110079.704784] ret_from_fork+0x10/0x1c
[110079.704795] INFO: task z_wr_int:2173 blocked for more than 120 seconds.
[110079.711650] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[110079.718668] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[110079.726745] z_wr_int D 0 2173 2 0x00000028
[110079.726760] Call trace:
[110079.726780] __switch_to+0x104/0x170
[110079.726790] __schedule+0x30c/0x7c0
[110079.726803] schedule+0x3c/0xb8
[110079.726811] io_schedule+0x20/0x58
[110079.726823] rq_qos_wait+0x100/0x178
[110079.726832] wbt_wait+0xb4/0xf0
[110079.726842] __rq_qos_throttle+0x38/0x50
[110079.726854] blk_mq_make_request+0x128/0x610
[110079.726866] generic_make_request+0xb4/0x2d8
[110079.726877] submit_bio+0x48/0x218
[110079.727100] vdev_disk_io_start+0x670/0x9f8 [zfs]
[110079.727320] zio_vdev_io_start+0xdc/0x2b8 [zfs]
[110079.727549] zio_nowait+0xd4/0x170 [zfs]
[110079.727766] vdev_queue_io_done+0x1ec/0x2a0 [zfs]
[110079.727991] zio_vdev_io_done+0xec/0x220 [zfs]
[110079.728209] zio_execute+0xac/0x110 [zfs]
[110079.728255] taskq_thread+0x2f8/0x570 [spl]
[110079.728268] kthread+0xfc/0x128
[110079.728279] ret_from_fork+0x10/0x1c
[110079.728309] INFO: task txg_sync:2388 blocked for more than 120 seconds.
[110079.735157] Tainted: P C OE 5.4.0-1018-raspi #20-Ubuntu
[110079.742175] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[110079.750306] txg_sync D 0 2388 2 0x00000028
[110079.750321] Call trace:
[110079.750337] __switch_to+0x104/0x170
[110079.750348] __schedule+0x30c/0x7c0
[110079.750356] schedule+0x3c/0xb8
[110079.750366] schedule_timeout+0x9c/0x190
[110079.750374] io_schedule_timeout+0x28/0x48
[110079.750412] __cv_timedwait_common+0x1a8/0x1f8 [spl]
[110079.750447] __cv_timedwait_io+0x3c/0x50 [spl]
[110079.750670] zio_wait+0x130/0x2a0 [zfs]
[110079.750890] dsl_pool_sync+0x3fc/0x498...

Colin Ian King (colin-king) wrote:

Is it possible to have the full dmesg, from boot to the point where you see the "INFO... blocked for more than 120 seconds" message?

Colin Ian King (colin-king) wrote:

...and does your Raspberry Pi have swap enabled?

Colin Ian King (colin-king) wrote:

Same issue as reported here: https://github.com/openzfs/zfs/issues/10522

Colin Ian King (colin-king) wrote:

So it may be that the write-back throttling (WBT) for the underlying devices is getting confused about what the exact throttle rates for these devices should be, and is somehow getting stuck. It may be worth experimenting with disabling the throttling to see if this gets I/O working again.

For example, to disable wbt for a device /dev/sda use:

echo 0 | sudo tee /sys/block/sda/queue/wbt_lat_usec

and if you need to reset it back to the default:

echo -1 | sudo tee /sys/block/sda/queue/wbt_lat_usec

...use the appropriate name for the block devices you have attached. It may even be worth setting wbt_lat_usec to 0 for all the block devices in your pool as early as possible after boot, and seeing if this helps. My hunch is that WBT is not tuned correctly for the attached non-spinning devices: it assumes they are reasonably fast, when in fact they may be rate-limited to something slower than expected because they are attached to a relatively slow host.
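
A minimal sketch of applying this to every such disk at once (assuming the pool members all appear as /dev/sd*; adjust the glob to match your devices), for example from /etc/rc.local or a small systemd unit run at boot:

for q in /sys/block/sd*/queue/wbt_lat_usec; do
    echo 0 | sudo tee "$q"    # 0 disables write-back throttling for that device
done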

Tyson Key (vmlemon) wrote:

Here's a full "dmesg" log, if it helps:

tyson@ubuntu:~$ dmesg
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd083]
[ 0.000000] Linux version 5.4.0-1018-raspi (buildd@bos02-arm64-052) (gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)) #20-Ubuntu SMP Sun Sep 6 05:11:16 UTC 2020 (Ubuntu 5.4.0-1018.20-raspi 5.4.55)
[ 0.000000] Machine model: Raspberry Pi 4 Model B Rev 1.4
[ 0.000000] efi: Getting EFI parameters from FDT:
[ 0.000000] efi: UEFI not found.
[ 0.000000] Reserved memory: created CMA memory pool at 0x000000002c000000, size 64 MiB
[ 0.000000] OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
[ 0.000000] On node 0 totalpages: 2061312
[ 0.000000] DMA zone: 3792 pages used for memmap
[ 0.000000] DMA zone: 0 pages reserved
[ 0.000000] DMA zone: 242688 pages, LIFO batch:63
[ 0.000000] DMA32 zone: 12288 pages used for memmap
[ 0.000000] DMA32 zone: 770048 pages, LIFO batch:63
[ 0.000000] Normal zone: 16384 pages used for memmap
[ 0.000000] Normal zone: 1048576 pages, LIFO batch:63
[ 0.000000] percpu: Embedded 32 pages/cpu s92120 r8192 d30760 u131072
[ 0.000000] pcpu-alloc: s92120 r8192 d30760 u131072 alloc=32*4096
[ 0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
[ 0.000000] Detected PIPT I-cache on CPU0
[ 0.000000] CPU features: detected: EL2 vector hardening
[ 0.000000] CPU features: kernel page table isolation forced ON by KASLR
[ 0.000000] CPU features: detected: Kernel page table isolation (KPTI)
[ 0.000000] ARM_SMCCC_ARCH_WORKAROUND_1 missing from firmware
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 2028848
[ 0.000000] Kernel command line: coherent_pool=1M 8250.nr_uarts=1 snd_bcm2835.enable_compat_alsa=0 snd_bcm2835.enable_hdmi=1 snd_bcm2835.enable_headphones=1 bcm2708_fb.fbwidth=0 bcm2708_fb.fbheight=0 bcm2708_fb.fbswap=1 smsc95xx.macaddr=DC:A6:32:B8:1A:28 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000 net.ifnames=0 dwc_otg.lpm_enable=0 console=ttyS0,115200 console=tty1 root=LABEL=writable rootfstype=ext4 elevator=deadline rootwait fixrtc quiet splash
[ 0.000000] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] software IO TLB: mapped [mem 0x37400000-0x3b400000] (64MB)
[ 0.000000] Memory: 7898136K/8245248K available (11772K kernel code, 1236K rwdata, 4244K rodata, 6144K init, 1076K bss, 281576K reserved, 65536K cma-reserved)
[ 0.000000] random: get_random_u64 called from kmem_cache_open+0x38/0x3e0 with crng_init=0
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[ 0.000000] ftrace: allocating 39001 entries in 153 pages
[ 0.000000] rcu: Hierarchical RCU implementation.
[ 0.000000] Tasks RCU enabled.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[ 0.000000] GIC: Using split EOI/Deactivate mode
[ 0.000000] arch_timer: cp15 timer(s) running ...

Tyson Key (vmlemon) wrote:

I'm using 16GB of swap on a ZVOL in the pool, but I can also test with swap on the internal MicroSD card's ext4 root partition, if that helps.

Tyson Key (vmlemon) wrote:

All of the drives in the pool are SSDs (2x Crucial BX500 (2TB), 1x SanDisk Ultra 3D (4TB), 1x Samsung 870 QVO (4TB)), but they obviously don't align perfectly in terms of performance characteristics. I'll test the WBT tunable to see if it makes a difference.

I also suspect that some power-management weirdness affecting the waking of USB devices is at play on this machine, although it's an Ubuntu Server installation without any special GUI or Raspberry Pi-specific components installed.

Tyson Key (vmlemon) wrote:

I've set the WBT value to 0 for all of the devices whilst another archive run takes place. The last run lasted just over 20 hours before hitting the I/O problem.

Tyson Key (vmlemon) wrote:

It needs more time under long-running I/O to be sure, but changing the WBT setting at least makes retrieving snapshots with "zfs list -t all" a little faster.

Colin Ian King (colin-king) wrote:

It may also be a good idea to disable any USB power management, too.
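
A hedged example (not spelled out in the thread) of turning off USB runtime power management (autosuspend) for all currently attached USB devices via sysfs:

for f in /sys/bus/usb/devices/*/power/control; do
    echo on | sudo tee "$f"    # "on" = always powered; "auto" restores the default
done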

Tyson Key (vmlemon) wrote:
Download full text (46.9 KiB)

And, after 18+ hours, I was able to archive a ZFS snapshot using BorgBackup, with the WBT settings suggested earlier:

tyson@ubuntu:~$ sudo zfs send Yaesu@Crucial-2TB-1951E22FA633 | borg create --stats BorgStore::Yaesu@Crucial-2TB-1951E22FA633 -
------------------------------------------------------------------------------
Archive name: Yaesu@Crucial-2TB-1951E22FA633
Archive fingerprint: 32d64e2a13eb543e4846cc34ac2c3609fe192127e3bf6d324a0fa9d61b988c19
Time (start): Tue, 2020-10-13 15:49:51
Time (end): Wed, 2020-10-14 10:01:46
Duration: 18 hours 11 minutes 54.87 seconds
Number of files: 1
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                1.89 TB              1.75 TB            617.36 GB
All archives:                7.44 TB              5.87 TB              1.99 TB

                       Unique chunks         Total chunks
Chunk index:                  842436              2607257
------------------------------------------------------------------------------

The "dmesg" log is clean, too:

tyson@ubuntu:~$ dmesg
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd083]
[ 0.000000] Linux version 5.4.0-1018-raspi (buildd@bos02-arm64-052) (gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)) #20-Ubuntu SMP Sun Sep 6 05:11:16 UTC 2020 (Ubuntu 5.4.0-1018.20-raspi 5.4.55)
[ 0.000000] Machine model: Raspberry Pi 4 Model B Rev 1.4
[ 0.000000] efi: Getting EFI parameters from FDT:
[ 0.000000] efi: UEFI not found.
[ 0.000000] Reserved memory: created CMA memory pool at 0x000000002c000000, size 64 MiB
[ 0.000000] OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
[ 0.000000] On node 0 totalpages: 2061312
[ 0.000000] DMA zone: 3792 pages used for memmap
[ 0.000000] DMA zone: 0 pages reserved
[ 0.000000] DMA zone: 242688 pages, LIFO batch:63
[ 0.000000] DMA32 zone: 12288 pages used for memmap
[ 0.000000] DMA32 zone: 770048 pages, LIFO batch:63
[ 0.000000] Normal zone: 16384 pages used for memmap
[ 0.000000] Normal zone: 1048576 pages, LIFO batch:63
[ 0.000000] percpu: Embedded 32 pages/cpu s92120 r8192 d30760 u131072
[ 0.000000] pcpu-alloc: s92120 r8192 d30760 u131072 alloc=32*4096
[ 0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
[ 0.000000] Detected PIPT I-cache on CPU0
[ 0.000000] CPU features: detected: EL2 vector hardening
[ 0.000000] CPU features: kernel page table isolation forced ON by KASLR
[ 0.000000] CPU features: detected: Kernel page table isolation (KPTI)
[ 0.000000] ARM_SMCCC_ARCH_WORKAROUND_1 missing from firmware
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 2028848
[ 0.000000] Kernel command line: coherent_pool=1M 8250.nr_uarts=1 snd_bcm2835.enable_compat_alsa=0 snd_bcm2835.enable_hdmi=1 snd_bcm2835.enable_headphones=1 bcm2708_fb.fbwidth=0 bcm2708_fb.fbheight=0 bcm2708_fb.fbswap=1 smsc95xx.macaddr=DC:A6:32:B8:1A:28 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000 net.ifnames=0 dwc_otg.lpm_enable=0 console=ttyS0,115200...

Colin Ian King (colin-king) wrote:

That's great news. Let's keep this bug open for the moment as "incomplete" and if you don't report back after ~6 weeks or so it will automatically be closed.

Tyson Key (vmlemon) wrote:

Curiously, it worked fine for quite some time, but I did manage to get this crash with tar earlier:

[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd083]
[ 0.000000] Linux version 5.4.0-1018-raspi (buildd@bos02-arm64-052) (gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)) #20-Ubuntu SMP Sun Sep 6 05:11:16 UTC 2020 (Ubuntu 5.4.0-1018.20-raspi 5.4.55)
[ 0.000000] Machine model: Raspberry Pi 4 Model B Rev 1.4
[ 0.000000] efi: Getting EFI parameters from FDT:
[ 0.000000] efi: UEFI not found.
[ 0.000000] Reserved memory: created CMA memory pool at 0x000000002c000000, size 64 MiB
[ 0.000000] OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
[ 0.000000] On node 0 totalpages: 2061312
[ 0.000000] DMA zone: 3792 pages used for memmap
[ 0.000000] DMA zone: 0 pages reserved
[ 0.000000] DMA zone: 242688 pages, LIFO batch:63
[ 0.000000] DMA32 zone: 12288 pages used for memmap
[ 0.000000] DMA32 zone: 770048 pages, LIFO batch:63
[ 0.000000] Normal zone: 16384 pages used for memmap
[ 0.000000] Normal zone: 1048576 pages, LIFO batch:63
[ 0.000000] percpu: Embedded 32 pages/cpu s92120 r8192 d30760 u131072
[ 0.000000] pcpu-alloc: s92120 r8192 d30760 u131072 alloc=32*4096
[ 0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
[ 0.000000] Detected PIPT I-cache on CPU0
[ 0.000000] CPU features: detected: EL2 vector hardening
[ 0.000000] CPU features: kernel page table isolation forced ON by KASLR
[ 0.000000] CPU features: detected: Kernel page table isolation (KPTI)
[ 0.000000] ARM_SMCCC_ARCH_WORKAROUND_1 missing from firmware
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 2028848
[ 0.000000] Kernel command line: coherent_pool=1M 8250.nr_uarts=1 snd_bcm2835.enable_compat_alsa=0 snd_bcm2835.enable_hdmi=1 snd_bcm2835.enable_headphones=1 bcm2708_fb.fbwidth=0 bcm2708_fb.fbheight=0 bcm2708_fb.fbswap=1 smsc95xx.macaddr=DC:A6:32:B8:1A:28 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000 net.ifnames=0 dwc_otg.lpm_enable=0 console=ttyS0,115200 console=tty1 root=LABEL=writable rootfstype=ext4 elevator=deadline rootwait fixrtc quiet splash
[ 0.000000] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] software IO TLB: mapped [mem 0x37400000-0x3b400000] (64MB)
[ 0.000000] Memory: 7898140K/8245248K available (11772K kernel code, 1236K rwdata, 4244K rodata, 6144K init, 1076K bss, 281572K reserved, 65536K cma-reserved)
[ 0.000000] random: get_random_u64 called from kmem_cache_open+0x38/0x3e0 with crng_init=0
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[ 0.000000] ftrace: allocating 39001 entries in 153 pages
[ 0.000000] rcu: Hierarchical RCU implementation.
[ 0.000000] Tasks RCU enabled.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[ 0.000000] GIC: Using split EOI/Deactivate mode
[ 0.0...

Colin Ian King (colin-king) wrote:

That warning message is from the Raspberry Pi firmware's Broadcom get_property sysfs interface: some user-space program has read an old, deprecated sysfs node, and the kernel is just warning that the hwmon sysfs interface should be used instead. It has no bearing on the ZFS or block-layer WBT settings.

Colin Ian King (colin-king) wrote:

@Tyson, hopefully the WBT changes helped to resolve this issue. If so, please let me know and I can close this bug.

Tyson Key (vmlemon) wrote:

Thanks, Colin. I intend to retest, with 20.10, but for now, with deduplication disabled, things seem stable.

Colin Ian King (colin-king) wrote:

OK, I'll close this bug for now. If it still bites please feel free to re-open the bug and I'll get back onto it.

Changed in zfs-linux (Ubuntu):
status: Incomplete → Fix Released
Changed in zfs:
status: Unknown → New
Changed in zfs-linux (Ubuntu):
assignee: Colin Ian King (colin-king) → nobody