echo 1 >> /sys/module/zfs/parameters/zfs_max_missing_tvds says permission error, unable to repair lost zfs pool data

Bug #1906542 reported by Joni-Pekka Kurronen
This bug affects 1 person

Affects: zfs-linux (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

root@jonipekka-desktop:~# echo 1 >> /sys/module/zfs/parameters/zfs_max_missing_tvds
-bash: /sys/module/zfs/parameters/zfs_max_missing_tvds: Permission denied
root@jonipekka-desktop:~#

https://www.delphix.com/blog/openzfs-pool-import-recovery

Import with missing top level vdevs

The changes to the pool configuration logic have enabled another great improvement: the ability to import a pool with missing or faulted top-level vdevs. Since some data will almost certainly be missing, a pool with missing top-level vdevs can only be imported read-only, and the failmode is set to “continue” (failmode=continue means that when encountering errors the pool will continue running, as opposed to being suspended or panicking).

To enable this feature, we’ve added a new global variable: zfs_max_missing_tvds, which defines how many missing top level vdevs we can tolerate before marking a pool as unopenable. It is set to 0 by default, and should be changed to other values only temporarily, while performing an extreme pool recovery.

Here as an example we create a pool with two vdevs and write some data to a first dataset; we then add a third vdev and write some data to a second dataset. Finally we physically remove the new vdev (simulating, for instance, a device failure) and try to import the pool using the new feature.
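
(A minimal sketch of that recovery flow, assuming a ZFS build that ships the tunable; the pool name "tank" is a placeholder.)

   # Temporarily tolerate one missing top-level vdev
   echo 1 > /sys/module/zfs/parameters/zfs_max_missing_tvds
   # A pool with a missing top-level vdev can only be imported read-only
   zpool import -o readonly=on tank
   # Copy off whatever is recoverable, then export and reset the tunable
   zpool export tank
   echo 0 > /sys/module/zfs/parameters/zfs_max_missing_tvds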

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: zfsutils-linux 0.7.5-1ubuntu16.10
ProcVersionSignature: Ubuntu 4.15.0-126.129-generic 4.15.18
Uname: Linux 4.15.0-126-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.9-0ubuntu7.20
Architecture: amd64
Date: Wed Dec 2 18:39:58 2020
InstallationDate: Installed on 2020-12-02 (0 days ago)
InstallationMedia: Ubuntu 18.04.1 LTS "Bionic Beaver" - Release amd64 (20180725)
SourcePackage: zfs-linux
UpgradeStatus: No upgrade log present (probably fresh install)

Joni-Pekka Kurronen (joni-kurronen) wrote:

Is there anyone who could help me get past this bug so I can rescue my ZFS pool data? As I understand it, the pool will otherwise be lost. I accidentally added a disk to the pool instead of attaching it as a mirror, which was the intention, and it cannot be removed even though there is no data on it!

Richard Laager (rlaager) wrote:

Why is the second disk missing? If you accidentally added it and ended up with a striped pool, as long as both disks are connected, you can import the pool normally. Then use the new device_removal feature to remove the new disk from the pool.

If you've done something crazy like pulled the disk and wiped it, then yeah, you're going to need to figure out how to import the pool read-only. I don't have any advice on that piece.
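
(For reference, assuming the pool still imports normally, the removal Richard describes would look roughly like this; the device name is the one shown later in this report and is used purely as an illustration.)

   # Import the striped pool normally, then evacuate and remove the
   # accidentally added top-level vdev (requires the device_removal feature)
   zpool import rpool
   zpool remove rpool ata-WDC_WD4005FZBX-00K5WB0_VBGDM25F-part4
   zpool status rpool    # shows evacuation progress until the removal completes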

Joni-Pekka Kurronen (joni-kurronen) wrote:

I tried the remove command before taking the disk out; I believed the system would then correct the problem. So basically the second disk has no data and is not corrupted.

I need that option because a read-only import with -f and -d does not fix the problem, so that I can then copy the data off the disk. I only have the essentials backed up, so there are many files I really need.

joni
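
(For context, the read-only forced import referred to above is roughly the following; the device directory is an assumption.)

   # Attempted recovery import: forced, read-only, scanning by-id device links
   zpool import -f -d /dev/disk/by-id -o readonly=on rpool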


Joni-Pekka Kurronen (joni-kurronen) wrote:

The new device_removal feature... where is it? It might work.
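
(One way to check whether the installed ZFS build supports it, as a rough illustration:)

   # Module version: device removal landed in ZFS on Linux 0.8
   modinfo zfs | grep -i '^version'
   # Feature flags known to this build; device_removal should be listed if supported
   zpool upgrade -v | grep -i removal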

Joni-Pekka Kurronen (joni-kurronen) wrote:

root@jonipekka-desktop:~# zpool import
   pool: rpool
     id: 5077426391014001687
  state: UNAVAIL
 status: One or more devices are faulted.
 action: The pool cannot be imported due to damaged devices or data.
 config:

 rpool UNAVAIL insufficient replicas
   ata-WDC_WD4005FZBX-00K5WB0_V6GAE1PR-part1 ONLINE
   ata-WDC_WD4005FZBX-00K5WB0_VBGDM25F-part4 FAULTED corrupted data
root@jonipekka-desktop:~#

Joni-Pekka Kurronen (joni-kurronen) wrote:

Do you mean this feature, which is coming... when?
https://github.com/openzfs/openzfs/pull/251

Richard Laager (rlaager) wrote:

device_removal only works if you can import the pool normally. That is what you should have used after you accidentally added the second disk as another top-level vdev. Whatever you have done in the interim, though, has resulted in the second device showing as FAULTED. Unless you can fix that, device_removal is not an option. I had hoped that you just had the second drive unplugged or something. But since the import is showing "corrupted data" for the second drive, that's probably not what happened.

This works for me on Ubuntu 20.04:
echo 1 >> /sys/module/zfs/parameters/zfs_max_missing_tvds

That setting does not exist on Ubuntu 18.04 (which you are running), so I get the same "Permission denied" error (because bash is trying to create that file, which you cannot do).
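
(A quick way to confirm whether the parameter exists on a given system before trying to write to it, as an illustration:)

   # The write only succeeds if the module actually exposes the parameter
   ls /sys/module/zfs/parameters/ | grep -i max_missing_tvds || echo "not supported by this ZFS module"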

I now see this is an rpool. Is your plan to reinstall? With 18.04 or 20.04?

If 18.04, then:
1. Download the 20.04.1 live image. Write it to a USB disk and boot into that.
2. In the live environment, install the ZFS tools: sudo apt install zfsutils-linux
3. echo 1 >> /sys/module/zfs/parameters/zfs_max_missing_tvds
4. mkdir /old
5. Import the old pool renaming it to rpool-old and mount filesystems:
   zpool import -o readonly=on -N -R /old rpool rpool-old
   zfs mount rpool-old/ROOT/ubuntu
   zfs mount -a
6. Confirm you can access your data. Take another backup, if desired. If you don't have space to back it up besides the new/second disk, then read on...
7. Follow the 18.04 Root-on-ZFS HOWTO using (only) the second disk. Be very careful not to partition or zpool create the disk with your data!!! For example, partition the second disk for the mirror scenario. But obviously you can't do zpool create with "mirror" because you have only one disk.
8. Once the new system is installed (i.e. after step 6.2), but before rebooting, copy data from /old to /mnt as needed.
9. Shut down. Disconnect the old disk. Boot up again.
10. Continue the install as normal.
11. When you are certain that everything is good, that the new disk is working properly (maybe do a scrub), and that you have all your data, then you can connect the old disk and do the zpool attach (ATTACH, not add) to attach the old disk to the new pool as a mirror. A rough sketch of that last step follows below.
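
(Illustrative only; the partition paths below are placeholders for whatever the HOWTO's partitioning produced, not the actual device names.)

   # Final step once the new pool is known-good and the data is safe:
   # ATTACH (mirror), never ADD (stripe), the old disk to the new pool
   zpool attach rpool /dev/disk/by-id/<new-disk-part> /dev/disk/by-id/<old-disk-part>
   zpool status rpool    # wait for the resilver to finish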

If 20.04, then I'd do this instead:
1. Unplug the disk with your data.
2. Follow the 20.04 Root-on-ZFS HOWTO using only the second disk. Follow the steps as if you were mirroring (since that is the ultimate goal) where possible. For example, partition the second disk for the mirror scenario. But obviously you can't do zpool create with "mirror" because you have only one disk.
3. Once the new, 20.04 system is working on the second disk and booting normally, connect the other, old drive. (This assumes you can connect it while the system is running.)
4. echo 1 >> /sys/module/zfs/parameters/zfs_max_missing_tvds
5. Import the old pool using its GUID renaming it to rpool-old and mount filesystems:
   zpool import -o readonly=on -N -R /mnt 5077426391014001687 rpool-old
   zfs mount rpool-old/ROOT/ubuntu
   zfs mount -a
6. Copy over data (see the sketch after this list).
7. zpool export rpool-old
8. When you are certain that everything is good and that new disk is working properly (maybe do a scrub) and you have all you...
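
(Step 6, "Copy over data", could for example be done like this; the source and destination paths are placeholders.)

   # Copy files from the read-only old pool (mounted under /mnt) into the new system,
   # preserving permissions, hard links, ACLs and extended attributes
   rsync -aHAX /mnt/home/ /home/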


Joni-Pekka Kurronen (joni-kurronen) wrote:

hi,

Does the new ZFS allow just removing the FAULTED device, so that I am left with the old, clean disk alone, scrub that, then repartition the FAULTED device (I had an incorrect partition size; there is a boot area as well), and then attach the FAULTED device as a new mirror disk, as was intended?

   zfs remove old-rpool -d faulted
   zfs scrub
   zfs add old-rpool old new

??? Then I would not have to copy anything?

joni


Rich (rincebrain) wrote (last edit):

(A bit delayed, but just for anyone finding this...)

No, you cannot remove a FAULTED normal data device - device_removal involves migrating all the data off the old one, which you cannot do if it's not there.

(Log and cache devices are different.)

You'll need to recreate the pool.
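
(Illustrative only, and only once the data has been copied off via the read-only recovery import described above; device paths are placeholders, and for a root pool you would normally follow the Root-on-ZFS HOWTO rather than a bare create.)

   # DESTROYS the old pool and anything still on it - only after the data is safe
   zpool destroy rpool-old
   # Recreate the pool as the two-way mirror that was originally intended
   zpool create rpool mirror /dev/disk/by-id/<first-disk-part> /dev/disk/by-id/<second-disk-part>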
