zpool scrub malfunction after kernel upgrade

Bug #1860182 reported by Eric Brown on 2020-01-17
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
zfs-linux (Ubuntu)
Critical
Colin Ian King
Bionic
Undecided
Unassigned

Bug Description

== SRU Request [BIONIC] ==

The HWE kernel on bionic provides zfs 0.8.1 driver which includes an improved scrub however, the progress stats reported by the kernel are incompatible to the 0.7.x zfs driver.

== Fix ==

Use the new zfs 8.x pool_scan_stat_t extra fields to calculate
the scan progress when using zfs 8.x kernel drivers. Add detection of the kernel module version and use an approximation to the zfs 0.8.0 progress and rate reporting for newer kernels.

For 0.7.5 we can pass the larger 8.x port_scan_stat_t to 0.7.5
zfs w/o problems and ignore these new fields and continue
to use the 0.7.5 rate calculations.

== Test ==

Install the HWE kernel on Bionic, create some large ZFS pools and populate with a lot of data. Issue:

sudo zpool scrub poolname
and then look at the progress using

sudo zpool status

Without the fix, the progress stats are incorrect. With the fix the duration and rate stats as a fairly good approximation of the progress. Since the newer 0.8.x zfs does scanning now in two phases the older zfs tools will only report accurate stats for phase #2 of the scan to keep it roughly compatible with the 0.7.x zfs utils output.

== Regression Potential ==

This is a userspace reporting fix so the zpool status output is only affected by this fix when doing a scrub, so the impact of this fix is very small and limited.

------------------------------------

I ran a zpool scrub prior to upgrading my 18.04 to the latest HWE kernel (5.3.0-26-generic #28~18.04.1-Ubuntu) and it ran properly:

eric@eric-8700K:~$ zpool status
  pool: storagepool1
 state: ONLINE
  scan: scrub repaired 1M in 4h21m with 0 errors on Fri Jan 17 07:01:24 2020
config:

 NAME STATE READ WRITE CKSUM
 storagepool1 ONLINE 0 0 0
   mirror-0 ONLINE 0 0 0
     ata-WDC_WD20EZRZ-00Z5HB0_WD-WCC4M3YFRVJ3 ONLINE 0 0 0
     ata-ST2000DM001-1CH164_Z1E285A4 ONLINE 0 0 0
   mirror-1 ONLINE 0 0 0
     ata-WDC_WD20EZRZ-00Z5HB0_WD-WCC4M1DSASHD ONLINE 0 0 0
     ata-ST2000DM006-2DM164_Z4ZA3ENE ONLINE 0 0 0

I ran zpool scrub after upgrading the kernel and rebooting, and now it fails to work properly. It appeared to finish in about 5 minutes but did not, and says it is going slow:

eric@eric-8700K:~$ sudo zpool status
  pool: storagepool1
 state: ONLINE
  scan: scrub in progress since Fri Jan 17 15:32:07 2020
 1.89T scanned out of 1.89T at 589M/s, (scan is slow, no estimated time)
 0B repaired, 100.00% done
config:

 NAME STATE READ WRITE CKSUM
 storagepool1 ONLINE 0 0 0
   mirror-0 ONLINE 0 0 0
     ata-WDC_WD20EZRZ-00Z5HB0_WD-WCC4M3YFRVJ3 ONLINE 0 0 0
     ata-ST2000DM001-1CH164_Z1E285A4 ONLINE 0 0 0
   mirror-1 ONLINE 0 0 0
     ata-WDC_WD20EZRZ-00Z5HB0_WD-WCC4M1DSASHD ONLINE 0 0 0
     ata-ST2000DM006-2DM164_Z4ZA3ENE ONLINE 0 0 0

errors: No known data errors

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: zfsutils-linux 0.7.5-1ubuntu16.7
ProcVersionSignature: Ubuntu 5.3.0-26.28~18.04.1-generic 5.3.13
Uname: Linux 5.3.0-26-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.9-0ubuntu7.9
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Fri Jan 17 16:22:01 2020
InstallationDate: Installed on 2018-03-07 (681 days ago)
InstallationMedia: Ubuntu 17.10 "Artful Aardvark" - Release amd64 (20180105.1)
SourcePackage: zfs-linux
UpgradeStatus: Upgraded to bionic on 2018-08-02 (533 days ago)
modified.conffile..etc.sudoers.d.zfs: [inaccessible: [Errno 13] Permission denied: '/etc/sudoers.d/zfs']

Eric Brown (eberic) wrote :
Changed in zfs-linux (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
importance: Undecided → High
status: New → Triaged
Eric Brown (eberic) wrote :

Note: rebooting and running the older kernel (4.15.0-74-generic #84-Ubuntu) results in the scrub proceeding at the expected rate.

Richard Laager (rlaager) wrote :

You original scrub took just under 4.5 hours. Have you let the second scrub run anywhere near that long? If not, start there.

The new scrub code uses a two-phase approach. First it works through metadata determining what (on-disk) blocks to scrub. Second, it does the actual scrub. This allows ZFS to coalesce the blocks and do large, sequential reads in the second phase. This dramatically speeds up the total scrub time. In contrast, the original scrub code is doing a lot of small, random reads.

You might just be seeing the first phase completing in 5 minutes, but the second phase still needs to occur. Or, maybe it did part of the first phase but hit the RAM limit and needed to start the second phase.

If you've let it run for 4.5 hours and it's still showing that status, then I'd say something is wrong.

Eric Brown (eberic) wrote :

Okay thank you. I will let the scrub run to completion on kernel 4.15, then I will reboot back into 5.3 and scrub again and will allow it time to see what happens. Even if it does end up completing, the text of zpool status was still incorrect: as it was running it said 5 mins remaining, then said it was 100% done, but also that scan is slow (as above, and below).

scan: scrub in progress since Fri Jan 17 15:32:07 2020
 1.89T scanned out of 1.89T at 589M/s, (scan is slow, no estimated time)
 0B repaired, 100.00% done

Eric Brown (eberic) wrote :

Update: The scrub back on 4.15 went as expected. I rebooted with 5.3 kernel and tried the scrub again. I saw the same behaviour: the scrub started, with a very fast rate and short time estimate (3-5 minutes), and then reached 100% after about that time. However it continued to say it was in progress, despite saying 100.00% done. It did not calcualte a time estimate, saying the rate was slow, however, it did complete after 3h36 (so fast than it used to on kernel 4.15).

eric@eric-8700K:~$ zpool status
  pool: storagepool1
 state: ONLINE
  scan: scrub repaired 0B in 3h36m with 0 errors on Sat Jan 18 14:50:16 2020
config:

 NAME STATE READ WRITE CKSUM
 storagepool1 ONLINE 0 0 0
   mirror-0 ONLINE 0 0 0
     ata-WDC_WD20EZRZ-00Z5HB0_WD-WCC4M3YFRVJ3 ONLINE 0 0 0
     ata-ST2000DM001-1CH164_Z1E285A4 ONLINE 0 0 0
   mirror-1 ONLINE 0 0 0
     ata-WDC_WD20EZRZ-00Z5HB0_WD-WCC4M1DSASHD ONLINE 0 0 0
     ata-ST2000DM006-2DM164_Z4ZA3ENE ONLINE 0 0 0

errors: No known data errors

I started a scrub again to see whether it was just the first time with this kernel, but again the % complete seems inaccurate, and the time remaining is definitely inaccurate (0h0m shortly after starting).

So at the very least it is clear that zpool status is not accurately reporting the scrub process on kernel 5.3.

Changed in zfs-linux (Ubuntu):
importance: High → Medium

I believe there was something on the zfs user list that had a similar problem. Let the scrub run as long on the HWE kernel as the other and it looks like it will complete.

Eric Brown (eberic) wrote :

Yes, @gerald-drouillard, that's what I found. It will complete, but the progress and details reported by `zpool status` are incorrect.

Colin Ian King (colin-king) wrote :

OK, I'll look into this sometime this week. Thanks for the information.

Changed in zfs-linux (Ubuntu):
importance: Medium → High
status: Triaged → In Progress
Andrea Righi (arighi) wrote :

I'm trying to reproduce this issue, but scrub always completes in a reasonable amount of time in my case and `zpool status` seems to report the correct statistics. I've tried with different pool sizes (all smaller than 256GB). Are you using any special configuration for the zpool?

I was wondering if this change can be related to this problem:
https://github.com/zfsonlinux/zfs/pull/7738/commits/a6f89b0168499335fb4f6a11e18ac1832e07cbd1

As a test, could you try to install zfs-dkms from this ppa?
https://launchpad.net/~arighi/+archive/ubuntu/zfs-linux

I've changed the code touched by commit a6f89b0168499335fb4f6a11e18ac1832e07cbd1 to be more similar to the code before the change, without reverting the fix.

description: updated
Colin Ian King (colin-king) wrote :

I've uploaded a fixed package, it's now going to proceed via the normal SRU process.

Eric Brown (eberic) wrote :

Thanks for identifying and fixing the bug! I'll test it when it lands.

Colin Ian King (colin-king) wrote :

@SRU folk, can this expedited as this does affect ZFS scrubs with HWE kernels and it is a relatively simple userspace fix?

Changed in zfs-linux (Ubuntu):
importance: High → Critical
Timo Aaltonen (tjaalton) on 2020-03-20
Changed in zfs-linux (Ubuntu Bionic):
status: New → Fix Committed

All autopkgtests for the newly accepted zfs-linux (0.7.5-1ubuntu16.9) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

lxd/3.0.3-0ubuntu1~18.04.1 (arm64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#zfs-linux

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Colin Ian King (colin-king) wrote :

The autopkgtests are working fine now after a re-trigger.

Colin Ian King (colin-king) wrote :

autopkgtests tests passed - all looks sane.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers