Very slow reception of incremental OpenZFS backup

Bug #1969281 reported by BertN45
This bug affects 1 person
Affects: zfs-linux (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Since OpenZFS, the reception of incremental backups over ssh has been very slow.
FreeBSD 13.0 with OpenZFS running on a 2003 Pentium 4 HT 3.0 GHz (1.5 GB DDR) is faster than Ubuntu 21.10/22.04 with OpenZFS running on my laptop with an i5-2520M (8 GB DDR3).

FreeBSD runs at 21 MiB/s with a 10% variance; the limit is caused by a load of >90% on one P4 CPU thread.
On Ubuntu the transfer speed varies between 90 MiB/s and periods, seconds long, of 0 MiB/s. Much of the time the transfer speed is in the single digits.
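
To make the speed swings described above easier to reproduce and compare between the two systems, a small pass-through meter can be placed in the pipe between the sending and receiving side. This is only a sketch: the function name `meter` and its parameters are made up for illustration and are not part of OpenZFS or ssh.

```python
#!/usr/bin/env python3
"""Copy a byte stream unchanged and log the throughput of each time window."""
import sys
import time

MIB = 1024 * 1024


def meter(src, dst, log, interval=1.0, chunk_size=MIB, clock=time.monotonic):
    """Copy src to dst and write one line per window of at least `interval` seconds.

    Returns the list of per-window rates in MiB/s.
    """
    rates = []
    window_start = clock()
    window_bytes = 0
    while True:
        chunk = src.read(chunk_size)  # blocks while the sender is stalled
        if not chunk:
            break  # end of stream
        dst.write(chunk)
        window_bytes += len(chunk)
        now = clock()
        elapsed = now - window_start
        if elapsed >= interval:
            rate = window_bytes / MIB / elapsed
            rates.append(rate)
            # A stall shows up as a long window with a low average rate.
            log.write(f"window {elapsed:.1f}s: {rate:.1f} MiB/s\n")
            window_start, window_bytes = now, 0
    dst.flush()
    return rates
```

Appending the line `meter(sys.stdin.buffer, sys.stdout.buffer, sys.stderr)` turns the file into a filter, so that it could be used as, for example, `zfs send -i pool/data@old pool/data@new | python3 meter.py | ssh backuphost zfs receive backup/data` (pool, snapshot, and host names are hypothetical). Because the read blocks during a stall, no line is printed while nothing arrives; the stall is visible in the length of the next window.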

The problems occur when one or two of the four i5 CPU threads run at 100% while the CPU clock remains at 0.83 GHz. So it looks like an integration issue between OpenZFS and the Linux kernel.

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: zfsutils-linux 2.1.2-1ubuntu3
ProcVersionSignature: Ubuntu 5.15.0-25.25-generic 5.15.30
Uname: Linux 5.15.0-25-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu82
Architecture: amd64
CasperMD5CheckResult: pass
CurrentDesktop: ubuntu:GNOME
Date: Sat Apr 16 14:49:10 2022
InstallationDate: Installed on 2021-10-30 (168 days ago)
InstallationMedia: Ubuntu 21.10 "Impish Indri" - Release amd64 (20211012)
SourcePackage: zfs-linux
UpgradeStatus: No upgrade log present (probably fresh install)
modified.conffile..etc.sudoers.d.zfs: [inaccessible: [Errno 13] Permission denied: '/etc/sudoers.d/zfs']

Revision history for this message
BertN45 (lammert-nijhof) wrote :

The error only occurs on the backup of the large datasets, with incremental updates of, say, 40 GB to 80 GB. Those datasets are around 250 GB and 450 GB and contain virtual machines. On smaller datasets I have no issues, and at the beginning of the transfers there are no problems either. The performance gets worse over time.

Today I also moved one of the datasets to another partition, and that local transfer had no problems. So the error seems more related to the interface between ssh and OpenZFS. However, this dataset has only a few VMs and is around 25 GB, with 2 GB to 5 GB updates.

BertN45 (lammert-nijhof) wrote :

Let me try a wild guess: it looks like buffer fragmentation related to network/ssh/OpenZFS. The software might be desperately trying to defragment many small spaces, which would explain the 100% CPU load during the low-speed transfers. If the defragmentation happens in the kernel, that might also explain why the CPU frequency is not increased. Note that only Conky records that 100% CPU load, while the task manager does not notice anything and shows the load below 30% all the time.
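
One way to check the guess above is to split each CPU thread's time into user, kernel, and I/O-wait shares using the per-CPU counters in `/proc/stat`. The sketch below assumes the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal, ...); the function names are made up for illustration. Whether differences in how the two monitors count kernel or I/O-wait time explain the Conky versus task manager discrepancy is an assumption, not something this report confirms.

```python
import time


def parse_proc_stat(text):
    """Return {cpu_name: [tick counters]} for the per-CPU lines (cpu0, cpu1, ...)."""
    samples = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the aggregate "cpu" line and all non-CPU lines.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            samples[fields[0]] = [int(value) for value in fields[1:]]
    return samples


def breakdown(before, after):
    """Percent of elapsed ticks spent in user, kernel, and iowait, per CPU."""
    result = {}
    for cpu, new in after.items():
        delta = [n - o for n, o in zip(new, before[cpu])]
        # Guest time is already counted in user/nice, so sum only the first 8 fields.
        total = sum(delta[:8])
        if total == 0:
            continue
        result[cpu] = {
            "user": 100.0 * (delta[0] + delta[1]) / total,               # user + nice
            "kernel": 100.0 * (delta[2] + delta[5] + delta[6]) / total,  # system + irq + softirq
            "iowait": 100.0 * delta[4] / total,
        }
    return result


def sample(interval=1.0, path="/proc/stat"):
    """Read the counters twice, `interval` seconds apart, and return the breakdown."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return breakdown(before, after)
```

Calling `sample()` during a slow phase of the transfer shows where each thread's time goes. A thread that is close to 100% in the `kernel` column would be consistent with work done inside the kernel, for instance by the ZFS module, rather than in the ssh process.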
