Lost compatibility for backup between Ubuntu 19.10 and FreeBSD 12.0

Bug #1854982 reported by BertN45
Affects: zfs-linux (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

After I tried to back up my datapools from Ubuntu 19.10 to FreeBSD 12.0, as I have done weekly since June, I found it no longer worked. The regression occurred after I reinstalled Ubuntu on my new NVMe drive. I also had to reorganize my own datapools/datasets, because they either moved to the NVMe drive or had to be located on 2 HDDs instead of 3. The one datapool that still works, the one containing my archives, is the only one that was NOT reorganized. I spent a whole long day trying to get the backup working again, but I failed. I compared the datapool and dataset properties but did not see any problem there, only a lot of new features and properties that were not present before and are not present in FreeBSD. I use FreeBSD for backup because the backup machine is an old 32-bit Pentium.

I have two complaints:
- the Ubuntu upgrade cost me compatibility with FreeBSD. OpenZFS? :(
- the system transfers the dataset and, at the end of a long transfer, decides to quit, and the error messages are completely useless and self-contradictory.

On the first try it says the dataset does exist, and on the second try it says it does NOT exist. One of the two is completely wrong. Some consistency and clearer error messages would be helpful for the user.
See the following set of strange error messages from two tries:

root@VM-Host-Ryzen:/home/bertadmin# /sbin/zfs send -c dpool/dummy@191130 | ssh 192.168.1.100 zfs receive zroot/hp-data/dummy
cannot receive new filesystem stream: destination 'zroot/hp-data/dummy' exists
must specify -F to overwrite it
root@VM-Host-Ryzen:/home/bertadmin# /sbin/zfs send -c dpool/dummy@191130 | ssh 192.168.1.100 zfs receive -F zroot/hp-data/dummy
cannot receive new filesystem stream: dataset does not exist

A second subset of my backup is stored on the laptop, and that still works. I also compared the properties with those of my laptop, which still has its original datapools from the beginning of the year. I aligned the properties of FreeBSD with those of my laptop, but it did not help.

I attach the properties of the datapool and dataset from both FreeBSD and Ubuntu.

ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: zfsutils-linux 0.8.1-1ubuntu14.1
ProcVersionSignature: Ubuntu 5.3.0-23.25-generic 5.3.7
Uname: Linux 5.3.0-23-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu8.2
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Tue Dec 3 13:35:08 2019
InstallationDate: Installed on 2019-11-30 (3 days ago)
InstallationMedia: Ubuntu 19.10 "Eoan Ermine" - Release amd64 (20191017)
SourcePackage: zfs-linux
UpgradeStatus: No upgrade log present (probably fresh install)
modified.conffile..etc.sudoers.d.zfs: [inaccessible: [Errno 13] Permission denied: '/etc/sudoers.d/zfs']

Revision history for this message
BertN45 (lammert-nijhof) wrote :
BertN45 (lammert-nijhof)
description: updated
Revision history for this message
BertN45 (lammert-nijhof) wrote :

I decided to add the properties of my Dec 2018 archives datapool and dataset from the Ubuntu PC. That dataset backup still works between Ubuntu 19.10 and FreeBSD 12.0; I sent an incremental update a few days ago. I noticed that the archives pool's feature@large_dnode is enabled but not active; maybe that is an indication of the type of problem? That feature is active on all other datapools of the laptop, desktop and FreeBSD.

Revision history for this message
Richard Laager (rlaager) wrote :

This is probably an issue of incompatible pool features. Check what you have active on the Ubuntu side:

zpool get all | grep feature | grep active

Then compare that to the chart here:
http://open-zfs.org/wiki/Feature_Flags

There is an as-yet-unimplemented proposal upstream to create a features “mask” to limit the features to those with broad cross-platform support.

If it’s not a features issue, I think there was some unintentional send compatibility break. I don’t have the specifics or a bug number, but a friend ran into a similar issue with 18.04 sending to 16.04.
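That comparison can be scripted; below is a minimal sketch that diffs the active features of two captured `zpool get all` outputs. The hostnames, file names, and sample feature lines are illustrative, not taken from this report:

```shell
# Pull the names of active features out of captured `zpool get all` output.
active_features() {
    grep 'feature@' | grep -w active | awk '{print $2}' | sort
}

# On the sender, e.g.: zpool get all rpool > linux.txt (sample data below)
cat > linux.txt <<'EOF'
rpool  feature@large_dnode         active  local
rpool  feature@userobj_accounting  active  local
rpool  feature@lz4_compress        active  local
EOF

# On the receiver, e.g.: zpool get all zroot > freebsd.txt
cat > freebsd.txt <<'EOF'
zroot  feature@large_dnode   active  local
zroot  feature@lz4_compress  active  local
EOF

active_features < linux.txt   > linux_active.txt
active_features < freebsd.txt > freebsd_active.txt

# Features active on the sender but not on the receiver:
comm -23 linux_active.txt freebsd_active.txt
```

Anything this prints is a candidate for the incompatibility, to be checked against the feature-flags chart above.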

Revision history for this message
BertN45 (lammert-nijhof) wrote :

I will add an overview of the features of all involved datasets.

The feature@userobj_accounting and feature@project_quota features are not supported by FreeBSD 12.0 and are not part of FreeBSD's feature list as attached, but they are active in Ubuntu. Why? I don't use them and don't intend to ever use them, so they should be enabled but not active. However, I would not expect these features to have any influence on send/receive.

feature@large_dnode is active in my FreeBSD pool, but according to your feature overview it is not supported by FreeBSD, so I assume that overview is not completely up to date.
However, I can send/receive with my archives datapool, as I said before. That archives datapool has feature@large_dnode enabled but not active.

So the incompatibility is in feature@large_dnode. Unfortunately that is the default chosen by Ubuntu for rpool, so if the issue is not solved, I can never back up the 250 GB of virtual machines stored on rpool. Rsync is completely unusable for this type of incremental backup of 10-40 GB VDI files.

I have seen that a new update of Ubuntu became available a few hours ago, so I will update my system this evening. A week ago I also planned to upgrade FreeBSD 12.0 to the new 12.1. I hope that one of the two will solve my problem.
I will inform you of the result.

Revision history for this message
Richard Laager (rlaager) wrote :

I'm not sure if userobj_accounting and/or project_quota have implications for send stream compatibility, but my hunch is that they do not. large_dnode is documented as being an issue, but since your receiver supports that, that's not it.

I'm not sure what the issue is, nor what a good next step would be. You might ask on IRC (#zfsonlinux on FreeNode) or the zfs-discuss mailing list. See: https://github.com/zfsonlinux/zfs/wiki/Mailing-Lists

Not that it helps now, but this will get somewhat better in the future, as FreeBSD is switching to the current ZFS-on-Linux codebase (to be renamed OpenZFS) as its upstream. So Linux and FreeBSD will have feature parity, outside of the usual time lag of release cycles.

Revision history for this message
Richard Laager (rlaager) wrote :

I received the email of your latest comment, but oddly I’m not seeing it here.

Before you go to all the work to rebuild the system, I think you should do some testing to determine exactly what thing is breaking the send stream compatibility. From your comment about your laptop, it sounds like you think it is large_dnode. It really shouldn’t be large_dnode because you said you have that feature on the receive side.

I would suggest creating some file-backed pools with different features. You can do that with something like:

truncate -s 1G test1.img
zpool create test1 $(pwd)/test1.img

To adjust the features, add -d to disable all features and then add various -o feature@something=enabled.

To actually use large dnodes, I believe you also have to set dnodesize=auto on a filesystem, with either “zfs create -o” or, for the root dataset, “zpool create -O” at the time of creation.
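Put together, the test-pool setup might look like the following. It is shown here as a dry run that only prints the commands, since pool creation needs root and a ZFS-capable kernel; the pool, file, and dataset names are made up:

```shell
# Dry run: echo each command instead of executing it.
run() { echo "$*"; }

run truncate -s 1G test1.img
# -d disables all features; re-enable only the one under test.
run zpool create -d -o feature@large_dnode=enabled test1 /tmp/test1.img
# The feature only flips from "enabled" to "active" once a dataset uses it:
run zfs create -o dnodesize=auto test1/ldn
```

Dropping the `run` prefix executes the plan for real; `zpool destroy test1` and deleting the image file clean up afterwards.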

Revision history for this message
BertN45 (lammert-nijhof) wrote :

I hid my errors :)

On the ZOL site, Richard Laager advised me to also look at the dnodesize property of the datasets, since both systems had the zpool feature large-dnode "active". He assumed that send/receive should then work. The FreeBSD system had the dnodesize of all datasets set to "legacy", so after setting my Ubuntu dataset dnodesize to "legacy" too, the problem with FreeBSD was solved and send/receive worked again :)
Richard Laager saved me a lot of work.
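The workaround amounts to the following; a sketch printed as a dry-run plan rather than executed (the snapshot and destination names are made up). The reload step is needed because existing objects keep their dnode size until the data is rewritten:

```shell
# Dry run: print the workaround commands instead of executing them.
run() { echo "$*"; }

for ds in dpool/dummy rpool/vms/Vbox; do
    # New objects will use legacy-sized dnodes from now on...
    run zfs set dnodesize=legacy "$ds"
    # ...but existing data must be reloaded to rewrite its dnodes,
    # e.g. via a local snapshot plus send/receive into a fresh dataset:
    run zfs snapshot "$ds@legacy-reload"
    run zfs send "$ds@legacy-reload" \| zfs receive "$ds-new"
done
```

Any copy mechanism that rewrites the files would do; send/receive is shown because it preserves properties and snapshots.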

Revision history for this message
Garrett Fields (fields-g) wrote :

At this point, does this need a code improvement to avoid this situation for others? If not, it looks like this bug report is ready to be closed/resolved.

Revision history for this message
BertN45 (lammert-nijhof) wrote : Re: [Bug 1854982] Re: Lost compatibility for backup between Ubuntu 19.10 and FreeBSD 12.0

The easy, lazy solution is to close this bug report. However, if you start to use this install option on servers in server farms, you might see this problem more frequently. The OpenZFS website has a matrix of which features are supported by which OSes. It would be relatively easy to implement that matrix in the installers and to ask during installation whether feature compatibility is required with another OS.

At least you should do something about the inconsistent and confusing error messages. Just read my bug report again carefully!

By the way, I could solve this large-dnode feature incompatibility by setting dnodesize=legacy on every dataset and reloading all those datasets. I don't care anymore, but a server administrator will not be amused when running into this type of issue with an Ubuntu Server.


Revision history for this message
Garrett Fields (fields-g) wrote :

Lazy is not the goal, nor is prematurely cleaning out the bug tracker. After the 2019-12-04 message, things seemed to be working well, so it seemed time to restate what we are trying to achieve.

It seems this situation was created by a user explicitly running "zfs set dnodesize=auto" or a "zpool create -O dnodesize=auto ..." and then trying to send to a dataset that doesn't support large dnodes. The default for the dataset property is 'legacy'. This is not caused by the pool's "feature@large_dnode" being "enabled", but by it being set to "active" because a dataset was explicitly told to use the feature. It takes effort beyond the defaults to get into this situation.

When ZFS on Linux is used as the receiver, the error message is "cannot receive new filesystem stream: pool must be upgraded to receive this stream", which seems more appropriate (tested on both 0.7.9 and 0.8.2). The poor error messages experienced in this bug report come from FreeBSD's "zfs receive" command, not ZOL's "zfs send". It would seem the error-message correction needs to be done on the FreeBSD side.

There is a project underway called ZFS on FreeBSD (ZOF). It ultimately will allow FreeBSD to compile from the same codebase as ZOL. If it isn't fixed in the current FreeBSD development, it will be incorporated then.

That being said, there are feature flags that, when enabled and activated by default, would prevent someone from importing the pool's disks using FreeBSD (vs. this bug's send/recv issue). Again, this may get fixed with ZOF as it is deployed on FreeBSD machines; it would take a new ZFS version being deployed on FreeBSD. The same thing occurs if a pool is made on a newer version of ZOL and then imported on an older, less capable version of ZOL. Cross-importing a pool to another OS (or an older edition of ZOL) is a semi-advanced scenario, and the mentioned feature matrix would benefit those users. As these are advanced scenarios, I don't know if or when user education should be done.

Revision history for this message
Richard Laager (rlaager) wrote :

The last we heard on this, FreeBSD was apparently not receiving the send stream, even though it supports large_dnode:

https://zfsonlinux.topicbox.com/groups/zfs-discuss/T187d60c7257e2eb6-M14bb2d52d4d5c230320a4f56/feature-incompatibility-between-ubuntu-19-10-and-freebsd-12-0

That's really bizarre. If it supports large_dnode, it should be able to receive that stream. Ideally, this needs more troubleshooting, particularly on the receive side. "It said (dataset does not exist) after a long transfer." is not particularly clear. I'd like to see a copy-and-paste of the actual `zfs recv` output, at a minimum.

@BertN45, if you want to keep troubleshooting, a good next step would be to boil this down to a reproducible test case. That is, create a list of specific commands that creates a dataset and sends it in a way that demonstrates the problem. That would help. We may need to flesh out the reproducer a bit more, e.g. by creating a pool on sparse files with particular feature flags.

Revision history for this message
BertN45 (lammert-nijhof) wrote :

According to the overview of features on the OpenZFS website (see the link provided by Richard Laager earlier in this bug report), FreeBSD 12 does not support "large dnode". However, FreeBSD did set the large_dnode feature to "active" and it is still set to "active". But FreeBSD does not handle those send/receive streams correctly: it reads the stream, builds up the dataset@snapshot, and at the end, after an hour or so, it gives the error message:

---------------------------------------------------------------
cannot receive new filesystem stream: destination 'zroot/hp-data/dummy' exists
must specify -F to overwrite it
----------------------------------------------------------------
That dataset did exist; I could see it grow in size during that hour with "zfs list", but when the transfer completed, it gave that error message. I guess FreeBSD creates the dataset and starts filling it, and at the end, instead of creating the snapshot, it tries to recreate the whole dataset :) And all transferred data disappears.

The fun part is that if you follow the advice in the error message, the system will say:
------------------------------------------------------------
cannot receive new filesystem stream: dataset does not exist
----------------------------------------------------------

Your remark that "the situation was created by a user explicitly running 'zfs set dnodesize=auto' or a 'zpool create -O dnodesize=auto'" was wrong. I did not set anything; those settings were chosen by the Ubuntu install process, and I only created some datasets of my own on that "rpool" datapool.
That is why I complained about the missing parameters in a future "install" or "create datapool" process. For me personally, it could also be solved by allowing me to choose the "rpool" size during install. In that case I would not store my user datasets in rpool, but create my own datapool on the NVMe SSD with the correct compatible feature settings, using the OpenZFS document.

I tried it again after updating FreeBSD to 12.1, but exactly the same error happened again. The commands and error messages are in the original bug report.

If you think it is a FreeBSD problem, please send the stuff to them.


Revision history for this message
Richard Laager (rlaager) wrote :

So, one of two things is true:
A) ZFS on Linux is generating the stream incorrectly.
B) FreeBSD is receiving the stream incorrectly.

I don't have a good answer as to how we might differentiate those two. Filing a bug report with FreeBSD might be a good next step. But like I said, a compact reproducer would go a long way.

Revision history for this message
BertN45 (lammert-nijhof) wrote :

What do you mean by "a compact reproducer"? I can reproduce the error easily, but I have no clue how to produce more info. I'm still relatively new to FreeBSD.


Revision history for this message
Garrett Fields (fields-g) wrote :

So these pools were created with the Ubuntu Ubiquity ZFS installer? I missed that because the pool names are hardcoded to bpool and rpool, and your message lists 'dpool/dummy' and 'zroot/hp-data/dummy'.

Also, in the linked email thread, you stated "The ZFS manual advised auto, if also using xattr=sa, so that is why I used auto for my own datapools/datasets."

Now the origin of the pool is clearer to me. Yes, I do see -O dnodesize=auto being set on (and inherited from) rpool in the Ubiquity root-on-ZFS installation. This would impact the ease of sending to non-large_dnode pools (or, in your case, to FreeBSD with its large_dnode problems).

Some simple tests to run:
Within FreeBSD, I'd be really surprised if large_dnode=active to large_dnode=enabled/active zfs send/recv doesn't work, but I'd start there.

Next, I'd try to send from FreeBSD large_dnode=active to Linux large_dnode=enabled/active. If it fails, what error is returned?

Also, like rlaager stated, we should do the original Linux large_dnode=active to FreeBSD large_dnode=enabled/active that gave you problems. This all will give us evidence for bug reports in FreeBSD and/or ZOL upstreams.
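The three experiments above can be written down as a checklist; here is a sketch that only prints the intended command for each direction (the hosts and dataset names are placeholders, and `-F` is assumed on the receive side as in the original report):

```shell
# Print one send/receive command line per test case; run them by hand
# on the appropriate machines.
print_test() {
    # $1=source host  $2=source snapshot  $3=dest host  $4=dest dataset
    echo "on $1: zfs send -c $2 | ssh $3 zfs receive -F $4"
}

print_test freebsd zroot/ldn-auto@t1 freebsd zroot/ldn-plain   # within FreeBSD
print_test freebsd zroot/ldn-auto@t1 linux   rpool/ldn-plain   # FreeBSD -> Linux
print_test linux   rpool/ldn-auto@t1 freebsd zroot/ldn-plain   # the failing case
```

Recording which of the three lines fails, and with which error, is the evidence the upstream bug reports would need.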

I'm on a mobile device, so I can build examples, if requested, at a later time.

Revision history for this message
BertN45 (lammert-nijhof) wrote :

"dpool" is another datapool created with Ubuntu 19.10 and it had the
same defaults with respect to "large-dnode" as rpool. My main problem
has been with rpool, since it took my whole nvme-SSD. By the way the
same happened in FreeBSD with zroot, during the install it also took
all space on my striped HDDs :)

Note that FreeBSD is a 32-bits version on an old Pentium 4 HT :)

By the way dpool (Ubuntu) is also striped over two 450GB partitions on a 500GB and a 1TB HDD. The second part of the 1TB HDD still had the
partition/datapool created by Ubuntu 18.04 with zfs 0.7.x release and
that one had no large-dnode problems.

I solved my send/receive problem by specifying on the Ubuntu system for
each dataset on rpool and dpool dnodesize=legacy and reloaded the
content of those datasets.

See the Ubuntu dnodesize overview in the attachment.

On FreeBSD, zroot has "large-dnode = active" and the dnodesize is as follows:
----------------------------------------------------------------
root@freebsd:~ # zfs get dnodesize
NAME                    PROPERTY   VALUE   SOURCE
bootpool                dnodesize  legacy  default
zroot                   dnodesize  legacy  default
zroot/ROOT              dnodesize  legacy  default
zroot/ROOT@upgrade12-1  dnodesize  -       -

zroot/hp-data           dnodesize  legacy  local
zroot/hp-data/ISO       dnodesize  legacy  inherited from zroot/hp-data
-----------------------------------------------------------------

I have created a separate dataset on FreeBSD with the same attributes as rpool/USERDATA on Ubuntu:

zroot/USERDATA  dnodesize  auto  local

Sending data to this dataset had the following result: see the send/receive results after the dnodesize overview in the attachment. Note that at the end I tried to create a new dataset zroot/USER.

-----------------------------------------------------------------------

And now the sends inside FreeBSD, both to a new USER dataset and to the existing USERDATA with dnodesize=auto:

root@freebsd:~ # zfs send -c zroot/var/log@upgrade12-1 | zfs receive zroot/USER
root@freebsd:~ # zfs send -c zroot/var/log@upgrade12-1 | zfs receive -F zroot/USERDATA
root@freebsd:~ #

zroot/USER was created and USERDATA existed with dnodesize=auto.
The result was as expected.

zroot/USER                  dnodesize  legacy  default
zroot/USER@upgrade12-1      dnodesize  -       -
zroot/USERDATA              dnodesize  auto    local
zroot/USERDATA@upgrade12-1  dnodesize  -       -

---------------------------------------------------------------------

And now send from FreeBSD to Ubuntu (see the attachment at the end for the command), and the result:

rpool/USER@upgrade12-1  0B  -  888K  -

rpool/USER              dnodesize  auto  inherited from rpool
rpool/USER@upgrade12-1  dnodesize  -     -

---------------------------------------------------------------------
Both systems have the large-dnode feature active!
And almost all combinations work:
- FreeBSD to FreeBSD from dnodesize=legacy to a dnodesize that is either legacy or auto
- Ubuntu to...


Revision history for this message
Richard Laager (rlaager) wrote :

In terms of a compact reproducer, does this work:

# Create a temp pool with large_dnode enabled:
truncate -s 1G lp1854982.img
sudo zpool create -d -o feature@large_dnode=enabled lp1854982 $(pwd)/lp1854982.img

# Create a dataset with dnodesize=auto
sudo zfs create -o dnodesize=auto lp1854982/ldn

# Create a send stream
sudo zfs snapshot lp1854982/ldn@snap
sudo zfs send lp1854982/ldn@snap > lp1854982-ldn.zfs

sudo zpool export lp1854982

cat lp1854982-ldn.zfs | ssh 192.168.1.100 zfs receive zroot/ldn

If that doesn't reproduce the problem, adjust it until it does. You were using `zfs send -c`, so maybe that's it. You may need to enable more pool features, etc.

But if this can be reproduced with an empty dataset on an empty pool, the send stream file is 8.5K (and far less compressed). Attach the script for reference and the send stream to a FreeBSD bug.

Revision history for this message
BertN45 (lammert-nijhof) wrote :

Garrett Fields also specified some tests, and the results of those tests were as specified here.

I used Ubuntu 19.10 and FreeBSD 12.1; I detected the issue running FreeBSD 12.0.
Both systems have the large-dnode feature active! Weekly I send the data with the -c parameter; there is, however, one uncompressed dataset that is sent without -c.

And the following combinations work.
Correct local transfers:
- FreeBSD 12.1 (dnodesize=legacy) to FreeBSD 12.1 (dnodesize=legacy)
- FreeBSD 12.1 (dnodesize=legacy) to FreeBSD 12.1 (dnodesize=auto)
- Ubuntu to Ubuntu, with and without -c; I do not remember any problem.

Correct remote transfers:
- FreeBSD 12.1 (dnodesize=legacy) to Ubuntu (dnodesize=auto)
- Ubuntu (dnodesize=legacy) to Laptop-Ubuntu (dnodesize=legacy), with and without -c
- Ubuntu (dnodesize=legacy) to FreeBSD 12.x (dnodesize=legacy), with and without -c
The last two I use weekly!

Failing transfers:
- Ubuntu 19.10 (dnodesize=auto) to FreeBSD 12.x (dnodesize=legacy)
- Ubuntu 19.10 (dnodesize=auto) to FreeBSD 12.1 (dnodesize=auto)

Note that during the transfers I can see the dataset growing in size using zfs list (or Conky in FreeBSD, which uses zfs list). At the end of the transfer, sometimes after hours or minutes, I get the error message and the dataset disappears :(

The settings selected as defaults by both development teams in splendid isolation do not work together: the one from Ubuntu 19.10 (dnodesize=auto) to FreeBSD 12.x (dnodesize=legacy).

This is all the relevant information you can get out of me. Time to get somebody involved who knows the corresponding program code. In the past I have solved bugs with less information available.

Like I said: GOOD LUCK in solving the bug.


Richard Laager (rlaager)
Changed in zfs-linux (Ubuntu):
status: New → Incomplete
Revision history for this message
BertN45 (lammert-nijhof) wrote :

I have now also filed a bug report, Bug 243730, for FreeBSD, with the following ending:

Ubuntu and FreeBSD chose different defaults for large dnodes and dnodesizes, but to solve bugs related to feature incompatibility, both groups have to communicate! The problem will not disappear completely just because you start using the same source: there will probably be months between release dates, so feature incompatibility will probably remain an issue.

My problem is bypassed, but the default dnodesize incompatibility between Ubuntu 19.10 and FreeBSD 12.1 remains.


Revision history for this message
Richard Laager (rlaager) wrote :

The FreeBSD bug report:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243730

Like I said, boiling this down to a test case would likely help a lot. Refusing to do so and blaming the people giving you free software and free support isn’t helpful.

Revision history for this message
BertN45 (lammert-nijhof) wrote :

I did not blame you; I only noticed a seemingly missing process in the communication between the two groups (ZOL/ZOF or Ubuntu/FreeBSD).
I said I had already tried your proposed case in a different way. I'm always willing to try something else, but I need to understand why. I'm not a monkey that has to follow your commands.

So I will not bother the free Ubuntu support anymore; I will stop filing bug reports and enjoy the free software without being lectured.

Revision history for this message
Garrett Fields (fields-g) wrote :

BertN45,

Thanks for the continued follow-up. Most of these ZFS variants run test suites to ensure that new code has limited ability to create additional issues. I would expect that cross-OS compatibility is not heavily tested (if at all). You were in a position to witness this problem, took the effort to report it, and did some preliminary follow-up tests. All of these helped tremendously.

Ubuntu uses code from the ZFS on Linux project. Ubuntu has chosen certain ways to use the ZOL code that fit its needs. Ubuntu's choices are reasonable and should work. I'm glad to see the FreeBSD bug report; this is likely the first time they are being made aware of the situation. As far as I know, the ZOL project does not yet know of the issue.

With this report filed with one of the two organizations that actually develop the code involved, an investigation can ensue. At this point, others can attempt to reproduce what you have detailed.

Thanks again for staying involved. Feel free to monitor and contribute to the issues as others continue to investigate.

Revision history for this message
Richard Laager (rlaager) wrote :

There does seem to be a real bug here. The problem is that we don’t know if it is on the ZoL side or the FreeBSD side. The immediate failure is that “zfs recv” on the FreeBSD side is failing to receive the stream. So that is the best place to start figuring out why. If it turns out that ZoL is generating an invalid stream, then we can take this to ZoL. Accordingly, my main goal here is to help you produce the best possible bug report for FreeBSD to help them troubleshoot. I don’t run FreeBSD, so I can’t test this myself to produce a test case. If you can produce a test case, with an example send stream that FreeBSD can’t receive, that gives them the best chance of finding the root cause.

Revision history for this message
Garrett Fields (fields-g) wrote :

I quickly set up a VM last night and tried to recreate the problem; I have been unsuccessful so far. I'm wondering if certain dnode data (perhaps the use of xattr=sa on the Linux source) might be triggering the issue?

Revision history for this message
BertN45 (lammert-nijhof) wrote :
Attachment: properties (12.6 KiB, text/plain)

The system uses xattr=sa, but I did not set it myself. As you can see in the attachment, it has been inherited from rpool everywhere, and it was set by the installer.

I attached the properties of:
- Home of Ubuntu, whose properties I did not touch at all (except canmount), and
- my main dataset with my VMs (rpool/vms/Vbox), where I only changed its dnodesize and snapdir.

The last one (rpool/vms/Vbox) I back up weekly to FreeBSD 12.1.
For my last try-outs I used the Home directory of Ubuntu, and that one failed.

The only significant difference I notice between the two datasets is dnodesize. I made the .zfs folder visible this year to restore a VM.

Remark: I use the experimental Ubuntu install with root on ZFS, but I also have an ext4 installation on the PC, so I dual boot; a kind of last resort for the experimental install. In, say, October and November, I had to change canmount during updates to be able to update ZFS on the ext4 installation, but I think that bug has been solved. Canmount is back at its default.


Revision history for this message
Colin Ian King (colin-king) wrote :

@folks, what's the current state of this bug? Has any progress been made on cornering this on the FreeBSD or Linux side?

Revision history for this message
BertN45 (lammert-nijhof) wrote :

Basically, I back up every weekend again from my 64-bit Ubuntu to my 32-bit FreeBSD. I solved the issue by setting the property dnodesize=legacy on the datasets I wanted to back up. I also reloaded those datasets once to get the storage with the right dnodesize everywhere.

The problem has been the incompatibility of the dnode-related datapool and dataset settings chosen by Ubuntu and FreeBSD. With OpenZFS release 2.0 it might be easier, especially if both systems can agree on using the same defaults.

Revision history for this message
Colin Ian King (colin-king) wrote :

@BertN45, thanks for the updated information. Let's keep this bug open until OpenZFS 2.0 lands and see if this resolves itself then.

Revision history for this message
Colin Ian King (colin-king) wrote :

ZFS 2.0.x is now the default for Ubuntu Hirsute 21.04. If anyone would like to check that this is now compatible with BSD then that would be useful so we can close this issue.

Changed in zfs-linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
BertN45 (lammert-nijhof) wrote :

I'm now using Ubuntu 21.04 Beta and FreeBSD 13.0-RC5 on my desktop and backup server. I checked, and it works without any issue, even after I changed the setting I had used to force compatibility: 'dnodesize' is back to 'auto' from 'legacy'.


Revision history for this message
Colin Ian King (colin-king) wrote :

Thanks. Given that this is resolved with the recent ZFS updates, and there is a workaround too, I'm going to close this bug.

Changed in zfs-linux (Ubuntu):
status: Incomplete → Fix Released