ubuntu-image created vfat eats itself after a while (mcopy creates wrong subdirs)

Bug #1619718 reported by Oliver Grawert
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Snappy
Fix Released
Critical
Unassigned
Ubuntu Image
Fix Released
Critical
Unassigned
mtools (Ubuntu)
Fix Released
Critical
Steve Langasek
Xenial
Fix Released
Critical
Steve Langasek
ubuntu-image (Ubuntu)
Fix Released
Critical
Unassigned

Bug Description

[SRU Justification]
Since trusty, using mcopy to create directories will result in corrupted FAT entries due to uninitialized memory. This leads to data loss when an fsck is run on the filesystem.

[Test case]
1. Install dosfstools and mtools from xenial.
2. Run the script from https://bugs.launchpad.net/ubuntu/+source/mtools/+bug/1619718/comments/2
3. Observe that you are prompted to fix corrupted filesystem entries
4. Install mtools from xenial-proposed
5. Run the reproducer.sh script again
6. Observe that the newly-created filesystem passes fsck with no errors.

[Regresion potential]
This is a one-line fix to correct uninitialized memory. The only regression potential is that associated with rebuilding the tools with a new toolchain (the last rebuild of these binary packages was in vivid, with gcc-4.9).

[Original bug description]

on arm images the vfat used for the boot partition corrupts itself after a few reboots (it is noteworthy that on uboot/arm installations we write a lot more to the vfat. i would actually expect this to happen on x86 systems as well, just a lot later, i.e. after a few kernel updates)

there seem to be various differences in how ubuntu-image creates the vfat ...

in ubuntu-device-flash we use:
mkfs.vfat -F 32 -S 512 -n ... against a loop mounted partition, the -S 512 value gets actually matched against blockdev --getss (and adjusted accordingly if needed)

ubuntu-image currently only calls:
"mkfs.vfat" without any options against an img file that represents the vfat content.

u-d-f also just uses cp against the loop mounted partition while ubuntu-image uses mtools' mcopy to put the files in place ...

the result is that at random boots fsck suddenly starts to rename files in an 8+3 scheme like:
http://paste.ubuntu.com/23124260/ or http://paste.ubuntu.com/23124139/ which results in http://paste.ubuntu.com/23123498/

i spent the whole day trying to tweak the mkfs options but to no avail, the corruption always happens at some point, so my suspicion goes more towards mtools/mcopy now

http://paste.ubuntu.com/23123789/ is the original code that u-d-f uses which does not show such behaviour (i have upgraded the u-d-f rpi2 image over the last 6 weeks daily with new ubuntu-core and occasionally with new pi2-kernel packages without any corruption)

Oliver Grawert (ogra)
Changed in snappy:
importance: Undecided → Critical
Changed in ubuntu-image:
importance: Undecided → Critical
Revision history for this message
Michael Vogt (mvo) wrote :

Fwiw, I suspect mtools (or the way its used) is broken. If I run:
$ sudo ./ubuntu-image --workdir /tmp/u-i --channel edge ~/devel/go/src/launchpad.net/goget-ubuntu-touch/ubuntu-device-flash/canonical-pi2.model -o pi2.img
$ sudo fsck.vfat /tmp/u-i/.images/part0.img
fsck.fat 3.0.28 (2015-05-16)
/pi2-kernel_11.snap/DTBS
  Has a large number of bad entries. (2/5)
Drop directory ? (y/n) y
Reclaimed 88 unused clusters (45056 bytes).
Free cluster summary wrong (223996 vs. really 224084)
1) Correct
2) Don't correct
...

Revision history for this message
Michael Vogt (mvo) wrote :
Revision history for this message
Michael Vogt (mvo) wrote :
Revision history for this message
Barry Warsaw (barry) wrote : Re: [Bug 1619718] Re: ubuntu-image created vfat eats itself after a while

On Sep 05, 2016, at 10:45 AM, Michael Vogt wrote:

>Fwiw, I pushed a commit that does not use mcopy here:
>https://github.com/CanonicalLtd/ubuntu-
>image/commit/2a7dbab0109533b8f82932fedf52ad43b94a0368

Just noting that if this is a bug in mcopy we should (eventually) engage with
upstream to try to fix that.

Revision history for this message
Steve Langasek (vorlon) wrote :

Looking at the fsck output, it appears the problem is specific to the subdirectories we're creating. Although the EFI system partition we create for x86 also has subdirectories, that filesystem shows no corruption; but Michael's reproducer script shows that it's the directory inode that's broken:

Starting check/repair pass.
/TOIMG/.
  Bad short file name (.).
1) Drop file
2) Rename file
3) Auto-rename
4) Keep it
? ^C

summary: - ubuntu-image created vfat eats itself after a while
+ ubuntu-image created vfat eats itself after a while (mcopy creates wrong
+ subdirs)
Revision history for this message
Steve Langasek (vorlon) wrote :

Patch against mtools upstream that fixes the problem found here: https://lists.gnu.org/archive/html/info-mtools/2014-08/msg00000.html

Steve Langasek (vorlon)
Changed in mtools (Ubuntu):
status: New → In Progress
Changed in mtools (Ubuntu Xenial):
status: New → In Progress
Changed in mtools (Ubuntu):
importance: Undecided → Critical
Changed in mtools (Ubuntu Xenial):
importance: Undecided → Critical
Changed in mtools (Ubuntu):
assignee: nobody → Steve Langasek (vorlon)
Changed in mtools (Ubuntu Xenial):
assignee: nobody → Steve Langasek (vorlon)
Changed in mtools (Ubuntu):
status: In Progress → Fix Committed
Steve Langasek (vorlon)
description: updated
Steve Langasek (vorlon)
Changed in ubuntu-image:
status: New → Invalid
Steve Langasek (vorlon)
Changed in ubuntu-image (Ubuntu Xenial):
status: New → Invalid
Changed in ubuntu-image (Ubuntu):
status: New → Fix Committed
Revision history for this message
Adam Conrad (adconrad) wrote : Please test proposed package

Hello Oliver, or anyone else affected,

Accepted mtools into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/mtools/4.0.18-2ubuntu0.16.04 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in mtools (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ubuntu-image - 0.6ubuntu3

---------------
ubuntu-image (0.6ubuntu3) yakkety; urgency=medium

  * Make the mtools a versioned dep for real, not a versioned build-dep.

 -- Steve Langasek <email address hidden> Thu, 08 Sep 2016 00:45:37 -0700

Changed in ubuntu-image (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mtools - 4.0.18-2ubuntu1

---------------
mtools (4.0.18-2ubuntu1) yakkety; urgency=medium

  * debian/patches/initialize-direntry.patch: initialize direntry with
    memset to correct invalid bitfields. Thanks to
    Ronny Nilsson <email address hidden>. LP: #1619718.

 -- Steve Langasek <email address hidden> Wed, 07 Sep 2016 23:40:07 -0700

Changed in mtools (Ubuntu):
status: Fix Committed → Fix Released
Steve Langasek (vorlon)
Changed in snappy:
status: New → Fix Released
Revision history for this message
Steve Langasek (vorlon) wrote :

I've run through the test case on xenial:

$ ./reproduce.sh
128+0 records in
128+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 0.213815 s, 628 MB/s
mkfs.fat 3.0.28 (2015-05-16)
Copying toimg
Copying baz
Copying foo
Copying bar
fsck.fat 3.0.28 (2015-05-16)
Checking we can access the last sector of the filesystem
Boot sector contents:
System ID "mkfs.fat"
Media byte 0xf8 (hard disk)
       512 bytes per logical sector
       512 bytes per cluster
        32 reserved sectors
First FAT starts at byte 16384 (sector 32)
         2 FATs, 32 bit entries
   1032704 bytes per FAT (= 2017 sectors)
Root directory start at cluster 2 (arbitrary size)
Data area starts at byte 2081792 (sector 4066)
    258078 data clusters (132135936 bytes)
32 sectors/track, 64 heads
         0 hidden sectors
    262144 sectors total
Starting check/repair pass.
/TOIMG/FOO/.
  Bad short file name (.).
1) Drop file
2) Rename file
3) Auto-rename
4) Keep it
? ^C
$ sudo apt install mtools
Reading package lists... Done
Building dependency tree
Reading state information... Done
Suggested packages:
  floppyd
The following packages will be upgraded:
  mtools
1 upgraded, 0 newly installed, 0 to remove and 11 not upgraded.
Need to get 175 kB of archives.
After this operation, 10.2 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu xenial-proposed/main amd64 mtools amd64 4.0.18-2ubuntu0.16.04 [175 kB]
Fetched 175 kB in 0s (179 kB/s)
debconf: delaying package configuration, since apt-utils is not installed
(Reading database ... 21878 files and directories currently installed.)
Preparing to unpack .../mtools_4.0.18-2ubuntu0.16.04_amd64.deb ...
Unpacking mtools (4.0.18-2ubuntu0.16.04) over (4.0.18-2) ...
Processing triggers for man-db (2.7.5-1) ...
Not building database; man-db/auto-update is not 'true'.
Setting up mtools (4.0.18-2ubuntu0.16.04) ...
$ ./reproduce.sh
128+0 records in
128+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 0.158155 s, 849 MB/s
mkfs.fat 3.0.28 (2015-05-16)
Copying toimg
Copying baz
Copying foo
Copying bar
fsck.fat 3.0.28 (2015-05-16)
Checking we can access the last sector of the filesystem
Boot sector contents:
System ID "mkfs.fat"
Media byte 0xf8 (hard disk)
       512 bytes per logical sector
       512 bytes per cluster
        32 reserved sectors
First FAT starts at byte 16384 (sector 32)
         2 FATs, 32 bit entries
   1032704 bytes per FAT (= 2017 sectors)
Root directory start at cluster 2 (arbitrary size)
Data area starts at byte 2081792 (sector 4066)
    258078 data clusters (132135936 bytes)
32 sectors/track, 64 heads
         0 hidden sectors
    262144 sectors total
Starting check/repair pass.
Checking for unused clusters.
Checking free cluster summary.
Starting verification pass.
Checking for unused clusters.
test.img: 4 files, 4/258078 clusters
$

Looks good.

description: updated
tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mtools - 4.0.18-2ubuntu0.16.04

---------------
mtools (4.0.18-2ubuntu0.16.04) xenial; urgency=medium

  * debian/patches/initialize-direntry.patch: initialize direntry with
    memset to correct invalid bitfields. Thanks to
    Ronny Nilsson <email address hidden>. LP: #1619718.

 -- Steve Langasek <email address hidden> Wed, 07 Sep 2016 23:40:07 -0700

Changed in mtools (Ubuntu Xenial):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for mtools has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Mathew Hodson (mhodson)
Changed in ubuntu-image:
status: Invalid → Fix Released
no longer affects: ubuntu-image (Ubuntu Xenial)
Changed in ubuntu-image (Ubuntu):
importance: Undecided → Critical
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.