review-tools unsquashfs test fails on focal for a snap built on xenial

Bug #1936871 reported by Daniel Manrique
This bug affects 1 person
Affects                  Importance  Assigned to
review-tools             Undecided   Unassigned
squashfs-tools (Ubuntu)  Critical    Unassigned
  Trusty                 Undecided   Unassigned
  Xenial                 Undecided   Unassigned
  Bionic                 Undecided   Unassigned

Bug Description

We got a report of a snap that's failing the unsquashfs test.

See https://forum.snapcraft.io/t/building-wekan-snap-failed/25400.

I've traced this to a yet-unidentified difference between Ubuntu 18.04 and 20.04, possibly in their versions of unsquashfs/squashfs-tools. I installed review-tools with the required dependencies on both an 18.04 and a 20.04 system; running snap-review on a revision of the above snap passes on 18.04 and fails on 20.04, as seen in the forum thread.

The time at which the snap author reports their builds started failing matches the time frame of the snap store servers being updated from Ubuntu 18.04 to 20.04 (this happened on 2021-07-05 around 16:30 UTC).

So far this is the only snap we've had reports about.

Alex Murray (alexmurray) wrote :

Ubuntu 20.04 ships squashfs-tools 4.4, whereas 18.04 and 16.04 have 4.3. Upstream notes in https://github.com/plougher/squashfs-tools/blob/master/CHANGES#L9 that 4.4 makes reproducible builds the default for mksquashfs, so perhaps we should be using 4.4 in the review-tools snap and snapcraft instead, so we can all align on the same version? Or, since review-tools and snapcraft are both base: core18, can the store just use the review-tools snap directly?

Daniel Manrique (roadmr) wrote :

It's not trivial for the store to use a snap in its worker units. We can look into this. It would actually be easier to use a backport of bionic's squashfs-tools that could be installed on focal.

Dimitri John Ledkov (xnox) wrote :

I had to turn off reproducible mksquashfs by default somewhere, because it would create broken squashfs images that are rejected by older tooling. I can't remember where now.

IMHO that should be done wherever snaps are created. Or, yes, upgrade to a newer squashfs-tools to handle both new and old tools.

Daniel Manrique (roadmr) wrote :

I've been poking at this and, strangely, if I manually run the same command that review-tools runs, on the same unpacked squashfs root, I get a snap that matches the original one, whereas if I let review-tools run the command (ostensibly the same one), it produces a snap that's different. This is using mksquashfs from squashfs-tools 4.4 on focal.

I'm trying to identify where the differences could lie in a review-tools mksquashfs invocation vs. a non-review-tools one.

Daniel Manrique (roadmr) wrote :

Looks like the review-tools-repacked snap has an extra inode. This is the original snap:

Found a valid SQUASHFS 4:0 superblock on /src/click-reviewers-tools/review-tools/jtpboSYvTCEyHoutkkRo1SI9ioSMOUb3_1602.snap.
Creation or last append time Sun Jul 18 16:28:03 2021
Filesystem size 162647053 bytes (158835.01 Kbytes / 155.11 Mbytes)
Compression xz
Block size 131072
Filesystem is exportable via NFS
Inodes are compressed
Data is compressed
Uids/Gids (Id table) are compressed
Fragments are not stored
Xattrs are not stored
Duplicates are removed
Number of fragments 0
Number of inodes 59453
Number of ids 1

This is a repack I did manually. I'm literally compressing the same tree that review-tools would, by stopping the tools before the temp directory is removed:

mksquashfs /tmp/review-tools-r5srrr7b/squashfs-root /tmp/roadpack2.snap -fstime 1626625683 -noappend -comp xz -all-root -no-xattrs -no-fragments
and then -stat says:
Found a valid SQUASHFS 4:0 superblock on /src/click-reviewers-tools/review-tools/jtpboSYvTCEyHoutkkRo1SI9ioSMOUb3_1602.snap.
Creation or last append time Sun Jul 18 16:28:03 2021
Filesystem size 162647053 bytes (158835.01 Kbytes / 155.11 Mbytes)
Compression xz
Block size 131072
Filesystem is exportable via NFS
Inodes are compressed
Data is compressed
Uids/Gids (Id table) are compressed
Fragments are not stored
Xattrs are not stored
Duplicates are removed
Number of fragments 0
Number of inodes 59453
Number of ids 1

And this is the -stat output for the review-tools-repacked snap; note the different byte size and number of inodes:

Found a valid SQUASHFS 4:0 superblock on /tmp/review-tools-r5srrr7b/repack.snap.
Creation or last append time Sun Jul 18 16:28:03 2021
Filesystem size 162647101 bytes (158835.06 Kbytes / 155.11 Mbytes)
Compression xz
Block size 131072
Filesystem is exportable via NFS
Inodes are compressed
Data is compressed
Uids/Gids (Id table) are compressed
Fragments are not stored
Xattrs are not stored
Duplicates are removed
Number of fragments 0
Number of inodes 59454
Number of ids 1
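
The comparison of superblock summaries above can be automated. The following is only a sketch: the function names `superblock_summary` and `diff_superblocks` are made up for illustration, and it assumes `unsquashfs -s` (the short form of `-stat`) is available. The first output line is dropped because it embeds the image path and would always differ.

```shell
# Print the superblock summary of a squashfs image, minus the first line
# ("Found a valid SQUASHFS ... on <path>"), which always differs by path.
superblock_summary() {
    unsquashfs -s "$1" | grep -v '^Found a valid SQUASHFS'
}

# Show a unified diff of two images' superblock summaries.
# Returns diff's exit status: 0 if identical, 1 if they differ.
diff_superblocks() {
    tmp_a=$(mktemp) || return 2
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 2; }
    superblock_summary "$1" > "$tmp_a"
    superblock_summary "$2" > "$tmp_b"
    status=0
    diff -u "$tmp_a" "$tmp_b" || status=$?
    rm -f "$tmp_a" "$tmp_b"
    return "$status"
}
```

For example, `diff_superblocks original.snap repack.snap` (file names are placeholders) would print only the lines that differ, such as the filesystem size and the number of inodes.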

Next I'll compare detailed -info output for a tools-driven and an external run.

Daniel Manrique (roadmr) wrote :

This is about the only meaningful difference I could find in mksquashfs output. The rest of the differences are inode numbers.
--- manualrun.txt 2021-07-20 17:14:27.298876254 +0000
+++ reviewtoolsrun.txt 2021-07-20 16:47:51.394142282 +0000

-symbolic link /snap/hooks/post-refresh inode 0x118f6102f LINK
-directory /snap/hooks inode 0x7b5360639
+symbolic link /snap/hooks/post-refresh inode 0x7b5360639
+directory /snap/hooks inode 0x7b5360658

and later, the summary concurs on the link-inode difference:

 Exportable Squashfs 4.0 filesystem, xz compressed, data block size 131072
        compressed data, compressed metadata, no fragments,
        no xattrs, compressed ids
        duplicates are removed
-Filesystem size 158835.01 Kbytes (155.11 Mbytes)
- 23.66% of uncompressed filesystem size (671383.43 Kbytes)
-Inode table size 507080 bytes (495.20 Kbytes)
- 23.83% of uncompressed inode table size (2127797 bytes)
-Directory table size 498142 bytes (486.47 Kbytes)
- 39.45% of uncompressed directory table size (1262834 bytes)
+Filesystem size 158835.06 Kbytes (155.11 Mbytes)
+ 23.66% of uncompressed filesystem size (671383.48 Kbytes)
+Inode table size 507124 bytes (495.24 Kbytes)
+ 23.83% of uncompressed inode table size (2127828 bytes)
+Directory table size 498146 bytes (486.47 Kbytes)
+ 39.45% of uncompressed directory table size (1262846 bytes)
 Number of duplicate files found 24728
-Number of inodes 59453
+Number of inodes 59454
 Number of files 51630
-Number of symbolic links 181
+Number of symbolic links 182
 Number of device nodes 0
 Number of fifo nodes 0
 Number of socket nodes 0

Daniel Manrique (roadmr) wrote :

It's not review-tools related - instead, it looks like the first resquash of a freshly-unsquashed tree has the weird link/file, whereas the second resquash passes.

It would appear the mere act of reading or even stat'ing the wonky link/file (squashfs-root/snap/hooks/post-refresh) tickles it into being correct.

I wonder if the problem is filesystem-related at a lower level and affects only Focal?

Reproducer using bash only, to be run on a Focal system with squashfs-tools 4.4.

(adjust the /path/to/snap, get the snap from here, it's public: https://api.snapcraft.io/api/v1/snaps/download/jtpboSYvTCEyHoutkkRo1SI9ioSMOUb3_1588.snap)

mkdir -p /tmp/review-tools-test2oud6q5y
unsquashfs -no-progress -d /tmp/review-tools-test2oud6q5y/squashfs-root /path/to/jtpboSYvTCEyHoutkkRo1SI9ioSMOUb3_1588.snap
# Uncomment the following line to make things work / generate same checksum
# stat /tmp/review-tools-test2oud6q5y/squashfs-root/snap/hooks/post-refresh
echo "squashing"
mksquashfs /tmp/review-tools-test2oud6q5y/squashfs-root /tmp/review-tools-test2oud6q5y/repack.snap -fstime 1626625683 -noappend -comp xz -all-root -no-xattrs -no-fragments
mksquashfs /tmp/review-tools-test2oud6q5y/squashfs-root /tmp/review-tools-test2oud6q5y/repack2.snap -fstime 1626625683 -noappend -comp xz -all-root -no-xattrs -no-fragments
echo "comparing"
sha512sum /tmp/review-tools-test2oud6q5y/*.snap
echo "good sum is b615b403f..."
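
The final comparison can also be done mechanically instead of by eye. This is a small sketch, assuming `sha512sum` from GNU coreutils; the helper name `same_sha512` is hypothetical.

```shell
# Succeed only if both files are readable and have the same SHA-512 digest.
same_sha512() {
    sum_a=$(sha512sum < "$1" | cut -d ' ' -f 1)
    sum_b=$(sha512sum < "$2" | cut -d ' ' -f 1)
    # An empty digest means a file could not be read; treat that as a mismatch.
    [ -n "$sum_a" ] && [ "$sum_a" = "$sum_b" ]
}

# Usage with the two repacks produced by the reproducer:
#   if same_sha512 /tmp/review-tools-test2oud6q5y/repack.snap \
#                  /tmp/review-tools-test2oud6q5y/repack2.snap; then
#       echo "repacks are identical"
#   else
#       echo "repacks differ"
#   fi
```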

Daniel Manrique (roadmr) wrote :

I've narrowed it down further: if the squashfs contains a hard link to a symbolic link, was produced on xenial or bionic, and is tested on focal, then the test fails.

Produce on focal, test on focal - works
Produce on bionic, test on focal - FAILS
Produce on xenial, test on bionic - works
Produce on xenial, test on focal - FAILS

"a hard link to a symbolic link" is something like:

touch a-file
ln -s a-file a-symlink
mkdir a-dir
ln a-symlink hard-link-to-symlink
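
To confirm that the last `ln` really links to the symlink itself rather than to `a-file`, the inode numbers can be compared. A minimal sketch, assuming GNU coreutils on Linux; `-P` is spelled out because whether plain `ln` follows symlinks is implementation-defined, and the scratch directory is only for illustration.

```shell
# Build the example in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch a-file
ln -s a-file a-symlink
# -P: link to the symlink itself rather than to its target.
ln -P a-symlink hard-link-to-symlink

# stat does not follow symlinks by default, so these are the links' own inodes.
symlink_inode=$(stat -c %i a-symlink)
hardlink_inode=$(stat -c %i hard-link-to-symlink)
link_count=$(stat -c %h a-symlink)

echo "a-symlink inode:            $symlink_inode"
echo "hard-link-to-symlink inode: $hardlink_inode"
echo "link count:                 $link_count"
```

Both names should report the same inode number and a link count of 2, and both are still symlinks pointing at `a-file`.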

I'm attaching a simple script that allows testing/reproducing this under various combinations of packer and tester Ubuntu releases by setting the PACKER and TESTER variables. It requires having lxd installed with a non-zfs storage backend.

Daniel Manrique (roadmr) wrote :

Given the tickling behavior I described in comment 7, a workaround which e.g. review-tools could implement is doing the equivalent of "stat $file" for all files that are symlinks, before doing mksquashfs:

unsquashfs -d /tmp/squashfs-root whatever.snap
find /tmp/squashfs-root -type l -exec stat {} \;
mksquashfs /tmp/squashfs-root repack.snap ...
# compare checksums, they should match now
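
A possible refinement of this workaround, sketched with made-up function names: first list the symlinks that have more than one hard link (the case that triggers the problem), then stat every symlink before repacking. Whether stat'ing only the hard-linked ones would be enough has not been established in this thread, so the second helper still touches all symlinks.

```shell
# Print every symlink under $1 that has more than one hard link.
# find does not follow symlinks by default, so -links tests the link itself.
list_hardlinked_symlinks() {
    find "$1" -type l -links +1 -print
}

# lstat every symlink under $1 before repacking; the output is discarded.
stat_symlinks() {
    find "$1" -type l -exec stat {} + > /dev/null
}

# Usage on an unpacked tree:
#   list_hardlinked_symlinks /tmp/squashfs-root
#   stat_symlinks /tmp/squashfs-root
```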

Dimitri John Ledkov (xnox) wrote :

We must not have broken squashfs-tools. So we should find what/when got changed or fixed and backport those fixes.

Changed in squashfs-tools (Ubuntu):
importance: Undecided → Critical
tags: added: rls-bb-incoming rls-ff-incoming
Daniel Manrique (roadmr) wrote :

The correct test script, sorry for the old crappy one

tags: added: fr-1534
tags: removed: rls-bb-incoming rls-ff-incoming
Alex Murray (alexmurray) wrote :

@roadmr - are we still seeing instances of this in the store dashboard?

Daniel Manrique (roadmr) wrote :

I haven't received any further reports from snap developers. I don't know if a recent squashfs-tools update on focal took care of the issue. The test script above is self-contained so it should be easy to run on a fully-updated Focal system to see if there's still a problem.

Note that lack of reports doesn't mean it isn't an issue anymore - the circumstance was rather unique and the sole reporters (wekan) worked around it as seen in the forum thread, but if the underlying problem is still present it could come back to bite someone else.
