mksquashfs hangs
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
squashfs-tools (Ubuntu) | New | Undecided | Unassigned |
Bug Description
Binary package hint: squashfs-tools
# lsb_release -rd
Description: Ubuntu 10.10
Release: 10.10
# apt-cache policy squashfs-tools
squashfs-tools:
Installed: 1:4.0-8
Candidate: 1:4.0-8
Version table:
*** 1:4.0-8 0
500 http://
100 /var/lib/
I attempted to make a series of squashfs filesystems (many times!). I'm using The Transparent Archivist (ftp://ftp.
Sometimes -- not very often, maybe once in 30-60 invocations -- mksquashfs simply hangs. Having made part of the filesystem it's supposed to be making, it stops for no apparent reason. It continues to use a small amount of CPU time after that -- very occasionally rising to the top of the "top" display -- but it makes no observable progress on the filesystem. Once I left it running for 4 days; it made no progress and never resumed work in all that time.
Reading about mksquashfs's earlier bugs led me to edit the Transparent Archivist's source code in such a way as to add '-no-sparse' to the invocation parameters of the call to mksquashfs. This change did not fix the problem; the occasional hanging continued just as before.
I also read Philip Lougher's remarks in Launchpad about problems in earlier Ubuntu distros with mixing kernel module versions and mksquashfs versions incorrectly, with respect to lzma support/defaulting. One fix for that earlier "bug" was to invoke mksquashfs with -no-lzma. The version of mksquashfs in my 10.10 distro does not recognize that argument, so there's no way to try that fix.
It's important that you understand that, while I can reliably reproduce the error, it often takes many invocations of mksquashfs to do so, and the number of invocations cannot be predicted, EVEN FOR EXACTLY THE SAME DATA. When mksquashfs hangs and I abort it manually, I can restart mksquashfs with exactly the same source data and invocation parameters, and so far it has always worked just fine when I do that.
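Because the hang shows up only once in many runs, a small watchdog can take over the manual babysitting. The following is a minimal sketch in Python, assuming that a growing output file is a fair proxy for "observable progress"; mksquashfs may legitimately pause between writes, so the stall threshold should be generous. The names `run_until_stalled`, `stall_seconds` and `poll_seconds` are hypothetical and not part of squashfs-tools.

```python
import os
import subprocess
import time


def run_until_stalled(cmd, output_path, stall_seconds=600.0, poll_seconds=5.0):
    """Run cmd and kill it if output_path stops changing size.

    Returns ("finished", returncode) if the command exits by itself, or
    ("stalled", None) if the output size did not change for stall_seconds.
    Assumption: output size growth is a usable progress signal.
    """
    proc = subprocess.Popen(cmd)
    last_size = -1
    last_change = time.monotonic()
    while True:
        try:
            # Wait one polling interval for the command to exit.
            proc.wait(timeout=poll_seconds)
            return ("finished", proc.returncode)
        except subprocess.TimeoutExpired:
            pass  # still running; check for progress below
        try:
            size = os.path.getsize(output_path)
        except OSError:
            size = -1  # output file not created yet
        now = time.monotonic()
        if size != last_size:
            # The file changed size, so restart the stall clock.
            last_size, last_change = size, now
        elif now - last_change >= stall_seconds:
            proc.kill()
            proc.wait()
            return ("stalled", None)
```

A call might look like `run_until_stalled(["mksquashfs", "/data/src", "/data/out.sqsh"], "/data/out.sqsh")`, where both paths are placeholders. Prefixing the command list with `"ionice", "-c", "3"` would reproduce the I/O priority setup described in this report. Looping over such calls and logging every `"stalled"` result would give a rough measure of how often the hang occurs.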
You're about to ask me, "Is your hardware stable?". The answer is "Yes". If this is a hardware problem, it is one that appears ONLY during mksquashfs processing and at no other time. No other applications are affected by it, and no rebooting is necessary even when it does happen. Everything else works perfectly.
Realizing that mksquashfs uses all 8 threads of my i7-950 processor, I have checked its temperature during maximum mksquashfs-mediated loads, and it always stays below 55 C, and usually below 45 C. All other processes continue to work without a hitch, so I don't think this is a hardware problem.
Could "ionice -c 3" be implicated? It's something to think about, anyway. I need to use that in order to keep the computer useful for small interactive tasks during these giant, days-long archiving jobs. However, the hanging problem was occurring before I started using ionice.
-no-sparse isn't going to have any effect here: it was a workaround for some sparse file handling bugs that were fixed in Mksquashfs 3.4 (i.e. some time before Mksquashfs 4.0, which you're using). Likewise, the -no-lzma mess was due to a mismatch between Ubuntu's patched Mksquashfs (inherited from Debian) and Ubuntu's kernel Squashfs code, which, not being derived from Debian, lacked their lzma patches. Again, this is not relevant here because it is not a squashfs-tools/kernel code interoperability problem, and in any case that problem went away a couple of Ubuntu releases ago.
So what is the problem? From your description it sounds like a multi-threading synchronisation problem. One extremely rare synchronisation bug has come to light since Mksquashfs 4.0, along with some other bugs that could possibly cause Mksquashfs to get sufficiently confused to hang. On the other hand, this is the first hang reported against Mksquashfs 4.0 in the nearly two years since its release, and on that basis the threading code seems to be very stable and almost bug free. You may of course be extremely unlucky and have a hardware/source filesystem combination that triggers one of the bugs fixed in Mksquashfs 4.1, or an unknown bug.
I would suggest your first step is to download squashfs-tools 4.1 from squashfs.sourceforge.net, and see if the problem still occurs.
If the problem still occurs then your second step should be to raise a bug on the Squashfs bug tracker (squashfs.sourceforge.net).
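If a report does get filed, a snapshot of each thread's state at the moment of a hang may help test the synchronisation hypothesis above. The following is a minimal, Linux-only sketch in Python that reads the per-thread `stat` files under `/proc`; the helper name `thread_states` is hypothetical. Threads blocked waiting on each other would usually be expected to show as `S` (sleeping) rather than `R` (running) or `D` (uninterruptible I/O wait), though the state letters alone do not prove a deadlock.

```python
import os


def thread_states(pid):
    """Map each thread ID of pid to its one-letter scheduler state (Linux only)."""
    states = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as f:
                stat = f.read()
        except OSError:
            continue  # thread exited between listdir() and open()
        # The command name sits in parentheses and may itself contain spaces
        # or ")", so split on the last ")" and take the field that follows.
        states[int(tid)] = stat.rsplit(")", 1)[1].split()[0]
    return states
```

Calling `thread_states(pid)` with the process ID of a hung mksquashfs (found with `pidof mksquashfs` or in "top") returns a dictionary such as one entry per worker thread, which can be pasted into the bug report alongside the version information.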