ZFS unrecoverable error after upgrading from 20.04 to 22.04.1
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
zfs-linux (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
I have a server that has been running its data volume on ZFS under 20.04 without any problems. The volume uses ZFS encryption and a raidz1-0 configuration. I performed a scrub before the upgrade and it did not find any problems. After the reboot for the upgrade, I was greeted with the following message:
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: https:/
The volumes still do not have any checksum errors, but there are 5 zvols that are not accessible. zpool status displays a line similar to the one below for each of the five:
errors: Permanent errors have been detected in the following files:
I ran a scrub and it did not identify any problems, but the error messages are not there and the data is still not available. There are 10+ other zvols in the zpool that do not have any kind of problem. I have been unable to identify any correlation between the zvols that are failing.
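For anyone triaging the same symptom, a minimal Python sketch for pulling the affected entries out of saved `zpool status -v` output follows. It assumes the usual layout, where the entries appear as indented lines after the "errors: Permanent errors" header and end at the next blank line. The pool and zvol names in the sample are hypothetical, not the ones from this server.

```python
HEADER = "errors: Permanent errors"


def permanent_error_entries(status_text: str) -> list[str]:
    """Collect the entries listed under each 'errors: Permanent errors' header."""
    entries = []
    in_errors = False      # True while inside an error listing
    seen_entry = False     # True once the current listing has yielded an entry
    for line in status_text.splitlines():
        if line.lstrip().startswith(HEADER):
            in_errors, seen_entry = True, False
            continue
        if not in_errors:
            continue
        stripped = line.strip()
        if not stripped:
            # A blank line before the first entry is skipped;
            # a blank line after entries closes the listing.
            if seen_entry:
                in_errors = False
            continue
        entries.append(stripped)
        seen_entry = True
    return entries


# Hypothetical sample; the dataset names are placeholders.
SAMPLE = """\
  pool: tank
 state: ONLINE
errors: Permanent errors have been detected in the following files:

        tank/vols/vm1:<0x0>
        tank/vols/vm2:<0x0>
"""

if __name__ == "__main__":
    for entry in permanent_error_entries(SAMPLE):
        print(entry)
```

Comparing this list before and after a scrub makes it easier to see whether the set of affected zvols changes.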
I have seen people reporting similar problems on GitHub after the 20.04 to 22.04 upgrade (see https:/
I will try to downgrade the version of zfs in the system and report back.
ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: zfsutils-linux 2.1.4-0ubuntu0.1
ProcVersionSign
Uname: Linux 5.15.0-46-generic x86_64
NonfreeKernelMo
ApportVersion: 2.20.11-0ubuntu82.1
Architecture: amd64
CasperMD5CheckR
Date: Sat Aug 20 22:24:54 2022
ProcEnviron:
TERM=screen-
PATH=(custom, no user)
XDG_RUNTIME_
LANG=en_US.UTF-8
SHELL=/bin/bash
SourcePackage: zfs-linux
UpgradeStatus: Upgraded to jammy on 2022-08-20 (0 days ago)
modified.
I saw that downgrading the module was not a very realistic option so I decided to follow a different route.
It seems that the problem is the same as the one described here: https://github.com/openzfs/zfs/issues/13709. The solution that worked for me is described here: https://github.com/openzfs/zfs/issues/13709#issuecomment-1200430509.
It took a while to recover all the data because of the need for send / receive but it seems it is all back.
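As a rough illustration of the send / receive part only, here is a Python sketch that builds and prints the commands for copying one dataset to a new location via a snapshot. It is a generic snapshot, `zfs send`, `zfs receive` sequence and not the exact procedure from the linked comment. The dataset names and the snapshot name `migrate` are hypothetical, and nothing is executed. A non-raw send like this one also assumes the encryption key for the source is loaded.

```python
import shlex


def build_copy_plan(source: str, target: str, snapshot: str = "migrate"):
    """Return the argv lists for snapshotting `source` and copying it to `target`."""
    snap = f"{source}@{snapshot}"
    return [
        ["zfs", "snapshot", snap],   # create a point-in-time snapshot
        ["zfs", "send", snap],       # serialize the snapshot to stdout
        ["zfs", "receive", target],  # recreate it as a new dataset
    ]


def render(plan) -> list[str]:
    """Render the plan as shell lines, piping send into receive."""
    snapshot_cmd, send_cmd, recv_cmd = plan
    return [
        shlex.join(snapshot_cmd),
        f"{shlex.join(send_cmd)} | {shlex.join(recv_cmd)}",
    ]


if __name__ == "__main__":
    # Hypothetical dataset names; review the printed commands before running them.
    for line in render(build_copy_plan("tank/vols/vm1", "tank/recovered/vm1")):
        print(line)
```

Printing the commands instead of running them leaves room to check each dataset name first, which matters when the pool is already reporting errors.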
I would suggest adding some wording about this to the release notes, or even blocking the upgrade for users of native ZFS encryption until this is solved.
Thanks