2019-04-11 18:34:15 |
Dimitri John Ledkov |
bug |
|
|
added bug |
2019-04-11 19:00:06 |
Ubuntu Kernel Bot |
linux (Ubuntu): status |
New |
Incomplete |
|
2019-04-11 19:00:07 |
Ubuntu Kernel Bot |
tags |
|
disco |
|
2019-04-16 22:38:54 |
Michael Hudson-Doyle |
bug |
|
|
added subscriber Michael Hudson-Doyle |
2019-07-15 04:17:29 |
Launchpad Janitor |
linux (Ubuntu): status |
Incomplete |
Expired |
|
2019-10-03 15:11:29 |
Dimitri John Ledkov |
linux (Ubuntu): status |
Expired |
New |
|
2019-10-03 15:30:06 |
Ubuntu Kernel Bot |
linux (Ubuntu): status |
New |
Incomplete |
|
2019-10-22 14:11:43 |
Dimitri John Ledkov |
description |
Apr 11 18:32:52 ubuntu-server kernel: SQUASHFS error: squashfs_read_data failed to read block 0x6ff3660032757063
Apr 11 18:32:52 ubuntu-server kernel: SQUASHFS error: Unable to read metadata cache entry [6ff3660032757063]
Apr 11 18:32:55 ubuntu-server kernel: SQUASHFS error: squashfs_read_data failed to read block 0x6261746d79732e
Apr 11 18:32:55 ubuntu-server kernel: SQUASHFS error: Unable to read metadata cache entry [6261746d79732e]
Apr 11 18:33:05 ubuntu-server kernel: SQUASHFS error: squashfs_read_data failed to read block 0x6ff366df00333a37
Apr 11 18:33:05 ubuntu-server kernel: SQUASHFS error: Unable to read metadata cache entry [6ff366df00333a37]
Happens when booting e.g. subiquity disco image. v5.0.0-8-generic kernel |
1) Download focal subiquity daily image
2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
3) Before --- insert the following options
bebroken debug init=/bin/bash
4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
5) you will be dropped into pivoted root filesystem, before systemd is execed as pid one
6) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. the cdrom is mounted, the squashfs images are attached with losetup from there, a multi-lower overlay is set up from them and moved to /root, and finally pivot-root to /root makes it /. The underlying layers are moved into /cow for your convenience.
7) At this point, modifying zero-byte files that exist in the lowest layer but not in the middle one will, in certain cases, leave them corrupted after / is remounted.
8) Exhibit A:
$ cat /etc/machine-id
(no output)
$ systemd-machine-id-setup
$ cat /etc/machine-id
(some machine id)
$ mount -o remount /
$ cat /etc/machine-id
I/O error
with overlay errors in dmesg
Similarly one can reproduce this with /etc/.pwd.lock & executing systemd-sysusers.
systemd-machine-id-setup is probably the easiest to trace. It does a simple open, truncate, lseek, write. On boot, the actual remount is done by starting a unit which calls /lib/systemd/systemd-remount-fs
Lots of things break once machine-id and .pwd.lock are corrupted, e.g. unable to DHCP, connect to D-Bus, add/remove/change users or groups, etc.
We were unable to recreate the issue outside of booting with casper, i.e. statically on a regular host machine without pivot-root. Hopefully, booting to a quiet state with nothing running is sufficient to reproduce it.
Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`; this will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.
Currently, we are shipping two hacks in casper to "rm" the offending files, and create them again on the upper rw layer. They then survive remount without i/o errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. |
|
2019-10-22 14:11:47 |
Dimitri John Ledkov |
linux (Ubuntu): status |
Incomplete |
New |
|
2019-10-22 14:36:12 |
Dimitri John Ledkov |
bug task added |
|
linux-hwe (Ubuntu) |
|
2019-10-22 14:36:53 |
Dimitri John Ledkov |
nominated for series |
|
Ubuntu Bionic |
|
2019-10-22 14:36:53 |
Dimitri John Ledkov |
bug task added |
|
linux (Ubuntu Bionic) |
|
2019-10-22 14:36:53 |
Dimitri John Ledkov |
bug task added |
|
linux-hwe (Ubuntu Bionic) |
|
2019-10-22 14:36:59 |
Dimitri John Ledkov |
linux-hwe (Ubuntu Bionic): milestone |
|
ubuntu-18.04.4 |
|
2019-10-22 14:37:04 |
Dimitri John Ledkov |
linux-hwe (Ubuntu Bionic): importance |
Undecided |
Critical |
|
2019-10-22 14:37:12 |
Dimitri John Ledkov |
bug task deleted |
linux (Ubuntu Bionic) |
|
|
2019-10-22 14:37:22 |
Dimitri John Ledkov |
linux (Ubuntu): milestone |
|
ubuntu-20.01 |
|
2019-10-22 14:37:28 |
Dimitri John Ledkov |
linux (Ubuntu): importance |
Undecided |
Critical |
|
2019-10-22 14:37:33 |
Dimitri John Ledkov |
linux-hwe (Ubuntu): status |
New |
Invalid |
|
2019-10-22 14:37:38 |
Dimitri John Ledkov |
linux (Ubuntu): status |
New |
Confirmed |
|
2019-10-22 14:37:40 |
Dimitri John Ledkov |
linux-hwe (Ubuntu Bionic): status |
New |
Confirmed |
|
2019-10-23 16:34:28 |
Steve Langasek |
summary |
why does booting any livefs squashfs has kernel complaining about unable to read metadata something rather |
why does booting any livefs squashfs cause the kernel to complain about being unable to read metadata |
|
2019-10-23 16:35:18 |
Brian Murray |
summary |
why does booting any livefs squashfs cause the kernel to complain about being unable to read metadata |
why does booting any livefs squashfs cause the kernel to complain about being unable to read metadata‽ |
|
2019-10-23 16:35:46 |
Brian Murray |
tags |
disco |
disco rls-ff-incoming |
|
2019-11-01 15:50:14 |
Dimitri John Ledkov |
description |
1) Download focal subiquity daily image
2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
3) Before --- insert the following options
bebroken debug init=/bin/bash
4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
5) you will be dropped into pivoted root filesystem, before systemd is execed as pid one
6) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. the cdrom is mounted, the squashfs images are attached with losetup from there, a multi-lower overlay is set up from them and moved to /root, and finally pivot-root to /root makes it /. The underlying layers are moved into /cow for your convenience.
7) At this point, modifying zero-byte files that exist in the lowest layer but not in the middle one will, in certain cases, leave them corrupted after / is remounted.
8) Exhibit A:
$ cat /etc/machine-id
(no output)
$ systemd-machine-id-setup
$ cat /etc/machine-id
(some machine id)
$ mount -o remount /
$ cat /etc/machine-id
I/O error
with overlay errors in dmesg
Similarly one can reproduce this with /etc/.pwd.lock & executing systemd-sysusers.
systemd-machine-id-setup is probably the easiest to trace. It does a simple open, truncate, lseek, write. On boot, the actual remount is done by starting a unit which calls /lib/systemd/systemd-remount-fs
Lots of things break once machine-id and .pwd.lock are corrupted, e.g. unable to DHCP, connect to D-Bus, add/remove/change users or groups, etc.
We were unable to recreate the issue outside of booting with casper, i.e. statically on a regular host machine without pivot-root. Hopefully, booting to a quiet state with nothing running is sufficient to reproduce it.
Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`; this will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.
Currently, we are shipping two hacks in casper to "rm" the offending files, and create them again on the upper rw layer. They then survive remount without i/o errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. |
1) Download focal subiquity pending image, or eoan release image
2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
3) Before --- insert the following options
break=top debug init=/bin/bash
4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
5) in the initramfs execute:
rm /scripts/casper-bottom/25adduser
exit
6) you will be dropped into pivoted root filesystem, before systemd is execed as pid one
7) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. the cdrom is mounted, the squashfs images are attached with losetup from there, a multi-lower overlay is set up from them and moved to /root, and finally pivot-root to /root makes it /. The underlying layers are moved into /cow for your convenience.
8) At this point, modifying zero-byte files that exist in the lowest layer but not in the middle one will, in certain cases, leave them corrupted after / is remounted.
9) Corruption examples
(On both focal & eoan)
cat /etc/.pwd.lock
systemd-sysusers
cat /etc/.pwd.lock
/usr/lib/systemd/systemd-remount-fs
cat /etc/.pwd.lock
overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
cat: /etc/.pwd.lock: Input/output error
(Only on eoan)
cat /etc/machine-id
systemd-machine-id-setup
cat /etc/machine-id
(some machine uuid)
mount -o remount /
cat /etc/machine-id
I/O error
with overlay errors in dmesg
Lots of things break once machine-id and .pwd.lock are corrupted, e.g. unable to DHCP, connect to D-Bus, add/remove/change users or groups, etc.
We were unable to recreate the issue outside of booting with casper, i.e. statically on a regular host machine without pivot-root. Hopefully, booting to a quiet state with nothing running is sufficient to reproduce it.
Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`; this will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.
Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files, and create them again on the upper rw layer. They then survive remount without i/o errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. |
|
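The workaround mentioned in the description above (remove the offending file, then create it again so that it lives only on the upper rw layer) can be sketched as follows. This is an illustration only, not the code shipped in casper's 25adduser script; the function name `recreate_on_upper` and its optional mode argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: drop a file as seen through the overlay mount and
# recreate it empty, so the new file is created fresh on the upper layer
# instead of being copied up from a lower layer.
#   $1: path of the file inside the overlay mount
#   $2: optional permission mode to apply to the new file (assumption)
recreate_on_upper() {
    rm -f "$1" || return 1      # remove the existing (lower-layer) file
    : > "$1" || return 1        # create a new, empty file in its place
    if [ -n "$2" ]; then
        chmod "$2" "$1" || return 1
    fi
}

# Example (not executed here):
#   recreate_on_upper /etc/.pwd.lock
#   recreate_on_upper /etc/machine-id
```

The files are recreated empty because both are zero-byte in the lower layer according to the description.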
2019-11-01 15:55:12 |
Dimitri John Ledkov |
description |
1) Download focal subiquity pending image, or eoan release image
2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
3) Before --- insert the following options
break=top debug init=/bin/bash
4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
5) in the initramfs execute:
rm /scripts/casper-bottom/25adduser
exit
6) you will be dropped into pivoted root filesystem, before systemd is execed as pid one
7) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. the cdrom is mounted, the squashfs images are attached with losetup from there, a multi-lower overlay is set up from them and moved to /root, and finally pivot-root to /root makes it /. The underlying layers are moved into /cow for your convenience.
8) At this point, modifying zero-byte files that exist in the lowest layer but not in the middle one will, in certain cases, leave them corrupted after / is remounted.
9) Corruption examples
(On both focal & eoan)
cat /etc/.pwd.lock
systemd-sysusers
cat /etc/.pwd.lock
/usr/lib/systemd/systemd-remount-fs
cat /etc/.pwd.lock
overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
cat: /etc/.pwd.lock: Input/output error
(Only on eoan)
cat /etc/machine-id
systemd-machine-id-setup
cat /etc/machine-id
(some machine uuid)
mount -o remount /
cat /etc/machine-id
I/O error
with overlay errors in dmesg
Lots of things break once machine-id and .pwd.lock are corrupted, e.g. unable to DHCP, connect to D-Bus, add/remove/change users or groups, etc.
We were unable to recreate the issue outside of booting with casper, i.e. statically on a regular host machine without pivot-root. Hopefully, booting to a quiet state with nothing running is sufficient to reproduce it.
Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`; this will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.
Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files, and create them again on the upper rw layer. They then survive remount without i/o errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. |
1) Download focal subiquity pending image, or eoan release image
2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
3) Before --- insert the following options
break=top debug init=/bin/bash
4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
5) in the initramfs execute:
rm /scripts/casper-bottom/25adduser
exit
6) you will be dropped into pivoted root filesystem, before systemd is execed as pid one
7) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. the cdrom is mounted, the squashfs images are attached with losetup from there, a multi-lower overlay is set up from them and moved to /root, and finally pivot-root to /root makes it /. The underlying layers are moved into /cow for your convenience.
8) At this point, modifying zero-byte files that exist in the lowest layer but not in the middle one will, in certain cases, leave them corrupted after / is remounted.
9) Corruption examples
(On both focal & eoan)
cat /etc/.pwd.lock
systemd-sysusers
cat /etc/.pwd.lock
mount -o remount /
cat /etc/.pwd.lock
overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
cat: /etc/.pwd.lock: Input/output error
(Only on eoan)
cat /etc/machine-id
systemd-machine-id-setup
cat /etc/machine-id
mount -o remount /
cat /etc/machine-id
overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
cat: /etc/machine-id: Input/output error
Lots of things break once machine-id and .pwd.lock are corrupted, e.g. unable to DHCP, connect to D-Bus, add/remove/change users or groups, etc.
We were unable to recreate the issue outside of booting with casper, i.e. statically on a regular host machine without pivot-root. Hopefully, booting to a quiet state with nothing running is sufficient to reproduce it.
Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`; this will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.
Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files, and create them again on the upper rw layer. They then survive remount without i/o errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. |
|
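The "invalid origin" messages in the description above carry two hexadecimal `ftype` values. Assuming these are the inode mode masked with S_IFMT (the file-type bits), as the values suggest, they can be decoded with a small helper. The function name `decode_ftype` is hypothetical; the constants are the standard S_IFMT file-type values.

```shell
#!/bin/sh
# Hypothetical helper: translate an S_IFMT file-type value, given in
# hexadecimal without a 0x prefix, into a human-readable name.
decode_ftype() {
    case "$1" in
        1000)      echo "FIFO" ;;
        2000)      echo "character device" ;;
        4000)      echo "directory" ;;
        6000)      echo "block device" ;;
        8000)      echo "regular file" ;;
        a000|A000) echo "symbolic link" ;;
        c000|C000) echo "socket" ;;
        *)         echo "unknown" ;;
    esac
}

# The values reported for etc/.pwd.lock and etc/machine-id:
echo "ftype=8000 -> $(decode_ftype 8000)"
echo "origin ftype=4000 -> $(decode_ftype 4000)"
```

Under that assumption, the message says the upper file is a regular file while the origin it resolved to in a lower layer is a directory, which is consistent with the origin lookup landing on the wrong lower inode.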
2019-11-01 15:56:09 |
Dimitri John Ledkov |
summary |
why does booting any livefs squashfs cause the kernel to complain about being unable to read metadata‽ |
remount of multilower moved pivoted-root overlayfs root, results in I/O errors on some modified files |
|
2019-11-01 17:16:46 |
Terry Rudd |
bug |
|
|
added subscriber Terry Rudd |
2019-11-01 17:21:32 |
Colin Ian King |
linux (Ubuntu): assignee |
|
Colin Ian King (colin-king) |
|
2019-11-01 17:21:36 |
Colin Ian King |
linux-hwe (Ubuntu Bionic): assignee |
|
Colin Ian King (colin-king) |
|
2019-11-02 02:49:27 |
Dimitri John Ledkov |
description |
1) Download focal subiquity pending image, or eoan release image
2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
3) Before --- insert the following options
break=top debug init=/bin/bash
4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
5) in the initramfs execute:
rm /scripts/casper-bottom/25adduser
exit
6) you will be dropped into pivoted root filesystem, before systemd is execed as pid one
7) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. the cdrom is mounted, the squashfs images are attached with losetup from there, a multi-lower overlay is set up from them and moved to /root, and finally pivot-root to /root makes it /. The underlying layers are moved into /cow for your convenience.
8) At this point, modifying zero-byte files that exist in the lowest layer but not in the middle one will, in certain cases, leave them corrupted after / is remounted.
9) Corruption examples
(On both focal & eoan)
cat /etc/.pwd.lock
systemd-sysusers
cat /etc/.pwd.lock
mount -o remount /
cat /etc/.pwd.lock
overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
cat: /etc/.pwd.lock: Input/output error
(Only on eoan)
cat /etc/machine-id
systemd-machine-id-setup
cat /etc/machine-id
mount -o remount /
cat /etc/machine-id
overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
cat: /etc/machine-id: Input/output error
Lots of things break once machine-id and .pwd.lock are corrupted, e.g. unable to DHCP, connect to D-Bus, add/remove/change users or groups, etc.
We were unable to recreate the issue outside of booting with casper, i.e. statically on a regular host machine without pivot-root. Hopefully, booting to a quiet state with nothing running is sufficient to reproduce it.
Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`; this will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.
Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files, and create them again on the upper rw layer. They then survive remount without i/o errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. |
1) Download focal subiquity pending image, or eoan release image
2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
3) After --- insert the following options
break=top debug init=/bin/bash
4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
5) in the initramfs execute:
rm /scripts/casper-bottom/25adduser
exit
6) you will be dropped into pivoted root filesystem, before systemd is execed as pid one
7) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. the cdrom is mounted, the squashfs images are attached with losetup from there, a multi-lower overlay is set up from them and moved to /root, and finally pivot-root to /root makes it /. The underlying layers are moved into /cow for your convenience.
8) At this point, modifying zero-byte files that exist in the lowest layer but not in the middle one will, in certain cases, leave them corrupted after / is remounted.
9) Corruption examples
(On both focal & eoan)
cat /etc/.pwd.lock
systemd-sysusers
cat /etc/.pwd.lock
mount -o remount /
cat /etc/.pwd.lock
overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
cat: /etc/.pwd.lock: Input/output error
(Only on eoan)
cat /etc/machine-id
systemd-machine-id-setup
cat /etc/machine-id
mount -o remount /
cat /etc/machine-id
overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
cat: /etc/machine-id: Input/output error
Lots of things break once machine-id and .pwd.lock are corrupted, e.g. unable to DHCP, connect to D-Bus, add/remove/change users or groups, etc.
We were unable to recreate the issue outside of booting with casper, i.e. statically on a regular host machine without pivot-root. Hopefully, booting to a quiet state with nothing running is sufficient to reproduce it.
Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`; this will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.
Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files, and create them again on the upper rw layer. They then survive remount without i/o errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. |
|
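When reproducing the steps above, it can help to list every file the kernel has flagged. A minimal sketch, assuming the kernel log lines have the same shape as the "overlayfs: invalid origin (...)" messages quoted in the description; the function name `list_invalid_origins` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print one line per
# "overlayfs: invalid origin" message as: <path> <ftype> <origin ftype>
list_invalid_origins() {
    sed -n 's/.*overlayfs: invalid origin (\([^,]*\), ftype=\([0-9a-fA-F]*\), origin ftype=\([0-9a-fA-F]*\)).*/\1 \2 \3/p'
}

# Example (not executed here, reading the kernel log may need privileges):
#   dmesg | list_invalid_origins
```

Lines that do not match the pattern are dropped, so the output is empty when no such message was logged.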
2019-11-04 17:39:07 |
Colin Ian King |
attachment added |
|
repro.sh https://bugs.launchpad.net/ubuntu/bionic/+source/linux-hwe/+bug/1824407/+attachment/5302762/+files/repro.sh |
|
2019-11-22 12:09:02 |
Colin Ian King |
description |
1) Download focal subiquity pending image, or eoan release image
2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
3) After --- insert the following options
break=top debug init=/bin/bash
4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
5) in the initramfs execute:
rm /scripts/casper-bottom/25adduser
exit
6) you will be dropped into pivoted root filesystem, before systemd is execed as pid one
7) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. the cdrom is mounted, the squashfs images are attached with losetup from there, a multi-lower overlay is set up from them and moved to /root, and finally pivot-root to /root makes it /. The underlying layers are moved into /cow for your convenience.
8) At this point, modifying zero-byte files that exist in the lowest layer but not in the middle one will, in certain cases, leave them corrupted after / is remounted.
9) Corruption examples
(On both focal & eoan)
cat /etc/.pwd.lock
systemd-sysusers
cat /etc/.pwd.lock
mount -o remount /
cat /etc/.pwd.lock
overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
cat: /etc/.pwd.lock: Input/output error
(Only on eoan)
cat /etc/machine-id
systemd-machine-id-setup
cat /etc/machine-id
mount -o remount /
cat /etc/machine-id
overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
cat: /etc/machine-id: Input/output error
Lots of things break once machine-id and .pwd.lock are corrupted, e.g. unable to DHCP, connect to D-Bus, add/remove/change users or groups, etc.
We were unable to recreate the issue outside of booting with casper, i.e. statically on a regular host machine without pivot-root. Hopefully, booting to a quiet state with nothing running is sufficient to reproduce it.
Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`; this will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.
Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files, and create them again on the upper rw layer. They then survive remount without i/o errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. |
== SRU Justification Disco, Eoan, Focal ==
Multiple squashfs filesystems with overlayfs cause file corruption issues
when modifying zero sized files
== Fix ==
The current fix is pending in https://github.com/amir73il/linux/commit/b2d4f0ea5af42e16e154254de99da064f3ac551a
== Test case ==
With an Ubuntu ISO on the cdrom drive, use:
#!/bin/bash -x
mkdir -p /cdrom
mount -t iso9660 -o ro,noatime /dev/sr0 /cdrom
sleep 1
mkdir -p /cow
mount -t tmpfs -o 'rw,noatime,mode=755' tmpfs /cow
sleep 1
mkdir -p /cow/upper
mkdir -p /cow/work
modprobe -q -b overlay
sleep 1
modprobe -q -b loop
sleep 1
dev=$(losetup -f)
mkdir -p /filesystem.squashfs
losetup $dev /cdrom/casper/filesystem.squashfs
mount -t squashfs -o ro,noatime $dev /filesystem.squashfs
sleep 1
dev=$(losetup -f)
mkdir -p /installer.squashfs
losetup $dev /cdrom/casper/installer.squashfs
mount -t squashfs -o ro,noatime $dev /installer.squashfs
sleep 1
mkdir -p /root-tmp
mount -t overlay -o 'upperdir=/cow/upper,lowerdir=/installer.squashfs:/filesystem.squashfs,workdir=/cow/work' /cow /root-tmp
FILE=/root-tmp/etc/.pwd.lock
echo foo > $FILE
cat $FILE
sync
#
# dropping caches or remounting causes the bug
#
echo 3 > /proc/sys/vm/drop_caches
cat $FILE
Without the fix, the cat of the file will produce an error. With the fix, the cat will work correctly.
== Regression Potential ==
There is an unhandled corner case:
- two filesystems, A and B, both have null uuid
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B
However, this case is already an issue without the fix and will be addressed by subsequent fixes once upstream accepts them. I think the risk is minimal, considering nobody is complaining about these corner cases with the current broken overlayfs squashfs layering.
-----------------------
1) Download focal subiquity pending image, or eoan release image
2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
3) After --- insert the following options
break=top debug init=/bin/bash
4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
5) in the initramfs execute:
rm /scripts/casper-bottom/25adduser
exit
6) you will be dropped into pivoted root filesystem, before systemd is execed as pid one
7) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. the cdrom is mounted, the squashfs images are attached with losetup from there, a multi-lower overlay is set up from them and moved to /root, and finally pivot-root to /root makes it /. The underlying layers are moved into /cow for your convenience.
8) At this point, modifying zero-byte files that exist in the lowest layer but not in the middle one will, in certain cases, leave them corrupted after / is remounted.
9) Corruption examples
(On both focal & eoan)
cat /etc/.pwd.lock
systemd-sysusers
cat /etc/.pwd.lock
mount -o remount /
cat /etc/.pwd.lock
overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
cat: /etc/.pwd.lock: Input/output error
(Only on eoan)
cat /etc/machine-id
systemd-machine-id-setup
cat /etc/machine-id
mount -o remount /
cat /etc/machine-id
overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
cat: /etc/machine-id: Input/output error
Lots of things break once machine-id and .pwd.lock are corrupted, e.g. unable to DHCP, connect to D-Bus, add/remove/change users or groups, etc.
We were unable to recreate the issue outside of booting with casper, i.e. statically on a regular host machine without pivot-root. Hopefully, booting to a quiet state with nothing running is sufficient to reproduce it.
Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`; this will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.
Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files, and create them again on the upper rw layer. They then survive remount without i/o errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. |
|
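The test case in the description above ends with a bare `cat $FILE`, leaving the pass/fail judgement to the reader. A minimal sketch of an explicit check that could follow the cache drop; the function name `verify_readable` and the PASS/FAIL wording are hypothetical and not part of the attached repro.sh.

```shell
#!/bin/sh
# Hypothetical helper: report whether a file can be read in full.
# Returns 0 and prints PASS if reading succeeds, otherwise prints FAIL
# and returns 1 (for example on an Input/output error).
verify_readable() {
    if cat "$1" > /dev/null 2>&1; then
        echo "PASS: $1 is readable"
    else
        echo "FAIL: $1 could not be read"
        return 1
    fi
}

# Example (not executed here), after 'echo 3 > /proc/sys/vm/drop_caches':
#   verify_readable /root-tmp/etc/.pwd.lock
```

On an affected kernel the check would be expected to print FAIL for the file, and PASS once the fix is applied.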
2019-11-22 12:09:15 |
Colin Ian King |
description |
== SRU Justification Disco, Eoan, Focal ==
Multiple squashfs filesystems with overlayfs cause file corruption issues
when modifying zero sized files
== Fix ==
The current fix is pending in https://github.com/amir73il/linux/commit/b2d4f0ea5af42e16e154254de99da064f3ac551a
== Test case ==
With an Ubuntu ISO on the cdrom drive, use:
#!/bin/bash -x
mkdir -p /cdrom
mount -t iso9660 -o ro,noatime /dev/sr0 /cdrom
sleep 1
mkdir -p /cow
mount -t tmpfs -o 'rw,noatime,mode=755' tmpfs /cow
sleep 1
mkdir -p /cow/upper
mkdir -p /cow/work
modprobe -q -b overlay
sleep 1
modprobe -q -b loop
sleep 1
dev=$(losetup -f)
mkdir -p /filesystem.squashfs
losetup $dev /cdrom/casper/filesystem.squashfs
mount -t squashfs -o ro,noatime $dev /filesystem.squashfs
sleep 1
dev=$(losetup -f)
mkdir -p /installer.squashfs
losetup $dev /cdrom/casper/installer.squashfs
mount -t squashfs -o ro,noatime $dev /installer.squashfs
sleep 1
mkdir -p /root-tmp
mount -t overlay -o 'upperdir=/cow/upper,lowerdir=/installer.squashfs:/filesystem.squashfs,workdir=/cow/work' /cow /root-tmp
FILE=/root-tmp/etc/.pwd.lock
echo foo > $FILE
cat $FILE
sync
#
# dropping caches or remounting causes the bug
#
echo 3 > /proc/sys/vm/drop_caches
cat $FILE
Without the fix, the cat of the file will produce an error. With the fix, the cat will work correctly.
== Regression Potential ==
There is an unhandled corner case:
- two filesystems, A and B, both have null uuid
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B
However, this case is already an issue without the fix and will be addressed by subsequent fixes once upstream accepts them. I think the risk is minimal, considering nobody is complaining about these corner cases with the current broken overlayfs squashfs layering.
-----------------------
1) Download focal subiquity pending image, or eoan release image
2) boot, and press ESC and edit boot command line (F6 in bios, e in UEFI)
3) After --- insert the following options
break=top debug init=/bin/bash
4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
5) in the initramfs execute:
rm /scripts/casper-bottom/25adduser
exit
6) you will be dropped into pivoted root filesystem, before systemd is execed as pid one
7) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. the cdrom is mounted, the squashfs images are attached with losetup from there, a multi-lower overlay is set up from them and moved to /root, and finally pivot-root to /root makes it /. The underlying layers are moved into /cow for your convenience.
8) At this point, modifying zero-byte files that exist in the lowest layer but not in the middle one will, in certain cases, leave them corrupted after / is remounted.
9) Corruption examples
(On both focal & eoan)
cat /etc/.pwd.lock
systemd-sysusers
cat /etc/.pwd.lock
mount -o remount /
cat /etc/.pwd.lock
overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
cat: /etc/.pwd.lock: Input/output error
(Only on eoan)
cat /etc/machine-id
systemd-machine-id-setup
cat /etc/machine-id
mount -o remount /
cat /etc/machine-id
overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
cat: /etc/machine-id: Input/output error
Lots of things break once machine-id and .pwd.lock are corrupted, e.g. unable to DHCP, connect to D-Bus, add/remove/change users or groups, etc.
We were unable to recreate the issue outside of booting with casper, i.e. statically on a regular host machine without pivot-root. Hopefully, booting to a quiet state with nothing running is sufficient to reproduce it.
Instead of booting with `bebroken init=/bin/bash` you can boot with `bebroken systemd.mask=systemd-remount-fs.service`; this will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.
Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files, and create them again on the upper rw layer. They then survive remount without i/o errors. However, we'd rather not ship those hacks, and have kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. |
== SRU Justification Disco, Eoan, Focal ==
Multiple squashfs filesystems with overlayfs cause file corruption issues
when modifying zero sized files
== Fix ==
The current fix is pending in https://github.com/amir73il/linux/commit/b2d4f0ea5af42e16e154254de99da064f3ac551a
== Test case ==
With an Ubuntu ISO on the cdrom drive, use:
#!/bin/bash -x
mkdir -p /cdrom
mount -t iso9660 -o ro,noatime /dev/sr0 /cdrom
sleep 1
mkdir -p /cow
mount -t tmpfs -o 'rw,noatime,mode=755' tmpfs /cow
sleep 1
mkdir -p /cow/upper
mkdir -p /cow/work
modprobe -q -b overlay
sleep 1
modprobe -q -b loop
sleep 1
dev=$(losetup -f)
mkdir -p /filesystem.squashfs
losetup $dev /cdrom/casper/filesystem.squashfs
mount -t squashfs -o ro,noatime $dev /filesystem.squashfs
sleep 1
dev=$(losetup -f)
mkdir -p /installer.squashfs
losetup $dev /cdrom/casper/installer.squashfs
mount -t squashfs -o ro,noatime $dev /installer.squashfs
sleep 1
mkdir -p /root-tmp
mount -t overlay -o 'upperdir=/cow/upper,lowerdir=/installer.squashfs:/filesystem.squashfs,workdir=/cow/work' /cow /root-tmp
FILE=/root-tmp/etc/.pwd.lock
echo foo > $FILE
cat $FILE
sync
#
# dropping caches or remounting causes the bug
#
echo 3 > /proc/sys/vm/drop_caches
cat $FILE
Without the fix, the final cat of the file will produce an error. With the fix, the cat will work correctly.
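The pass/fail criterion above can be wrapped in a small helper so that the result is unambiguous when the script runs unattended. This is only a minimal sketch: the function name `check_readback` is hypothetical and is not part of the original test case.

```shell
# Hypothetical helper: read FILE back and compare it with the expected content.
# Prints PASS or FAIL and returns 0 or 1 accordingly.
check_readback() {
    local file=$1 expected=$2 actual
    # A failing read (e.g. Input/output error) is reported as a failure.
    if ! actual=$(cat "$file" 2>/dev/null); then
        echo "FAIL: cannot read $file"
        return 1
    fi
    if [ "$actual" = "$expected" ]; then
        echo "PASS: $file reads back correctly"
    else
        echo "FAIL: $file has unexpected content"
        return 1
    fi
}
```

In the test script, `check_readback "$FILE" foo` could stand in for the final `cat $FILE` after the caches are dropped.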
== Regression Potential ==
There is an unhandled corner case:
- two filesystems, A and B, both have null uuid
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B
However, this corner case is already an issue without the fix, and it will be addressed by subsequent fixes once they are OK with upstream. I think the risk is minimal, considering nobody is complaining about these corner cases with the current broken overlayfs squashfs layering.
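To see whether a given setup has the layout described in this corner case, it can help to check which lower layers share a filesystem with the upper layer. The sketch below compares device numbers using GNU coreutils `stat -c %d`; the function names `same_fs` and `report_layers` are hypothetical. It only shows the layout and cannot tell whether a filesystem has a null UUID.

```shell
# Hypothetical helper: succeed when both paths live on the same filesystem,
# judged by the device number that GNU stat reports.
same_fs() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

# Report which lower layers share a filesystem with the upper layer.
# Usage: report_layers UPPER LOWER...
report_layers() {
    local upper=$1 lower
    shift
    for lower in "$@"; do
        if same_fs "$upper" "$lower"; then
            echo "$lower: same filesystem as upper"
        else
            echo "$lower: different filesystem"
        fi
    done
}
```

For the test case layout this would be called as `report_layers /cow/upper /installer.squashfs /filesystem.squashfs`.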
-----------------------
1) Download focal subiquity pending image, or eoan release image
2) Boot, press ESC, and edit the boot command line (F6 in BIOS, e in UEFI)
3) After --- insert the following options
break=top debug init=/bin/bash
4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
5) in the initramfs execute:
rm /scripts/casper-bottom/25adduser
exit
6) You will be dropped into the pivoted root filesystem, before systemd is exec'd as PID 1
7) /run/initramfs/ will contain a debug log showing how everything was mounted, i.e. cdrom mounted, squashfs losetup from there, then a multi-lower overlay set up from them and moved to /root, and finally pivot-root to /root so that it ends up as /. Underlying layers are moved into /cow for your convenience.
8) At this point, modifying zero-byte-length files that exist in the lowest layer but not in the middle one, in certain ways, will result in them being corrupted after / is remounted.
9) Corruption examples
(On both focal & eoan)
cat /etc/.pwd.lock
systemd-sysusers
cat /etc/.pwd.lock
mount -o remount /
cat /etc/.pwd.lock
overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
cat: /etc/.pwd.lock: Input/output error
(Only on eoan)
cat /etc/machine-id
systemd-machine-id-setup
cat /etc/machine-id
mount -o remount /
cat /etc/machine-id
overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
cat: /etc/machine-id: Input/output error
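The `ftype` values in the overlayfs message are the file-type bits of the inode mode printed in hexadecimal: 8000 is a regular file and 4000 is a directory. In other words, overlayfs is rejecting a regular file whose recorded origin resolved to a directory. A minimal sketch for decoding such values follows; the function name `decode_ftype` is hypothetical.

```shell
# Hypothetical helper: translate a hexadecimal ftype value from the
# "invalid origin" message into a file type name.
decode_ftype() {
    local bits
    # Mask with S_IFMT (0xf000) and print the result back as lowercase hex.
    bits=$(printf '%x' $(( 0x$1 & 0xf000 )))
    case $bits in
        8000) echo "regular file" ;;
        4000) echo "directory" ;;
        a000) echo "symbolic link" ;;
        6000) echo "block device" ;;
        2000) echo "character device" ;;
        1000) echo "FIFO" ;;
        c000) echo "socket" ;;
        *)    echo "unknown (0x$bits)" ;;
    esac
}
```

Here `decode_ftype 8000` prints `regular file` and `decode_ftype 4000` prints `directory`.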
Lots of things break once machine-id and .pwd.lock are corrupted. For example, the system is unable to dhcp, connect to dbus, add/remove/change users or groups, etc.
We were unable to recreate the issue outside of booting things with casper, i.e. statically on a regular host machine without pivot-root. But hopefully booting to a quiet state with nothing running is sufficient to reproduce this.
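Step 8 narrows the affected files to zero-length files that are present in the lowest layer but absent from the middle one. A minimal sketch for listing such candidates is shown here; the function name `list_candidates` is hypothetical, and the layer directories have to be supplied by the caller.

```shell
# Hypothetical helper: list zero-length regular files that exist in the lowest
# layer but have no entry at the same path in the middle layer.
# Usage: list_candidates LOWEST_DIR MIDDLE_DIR
list_candidates() {
    local lowest=$1 middle=$2 rel
    ( cd "$lowest" && find . -xdev -type f -size 0 ) | while IFS= read -r rel; do
        rel=${rel#./}
        # -e follows symlinks; -L also catches dangling symlinks in the middle layer.
        if [ ! -e "$middle/$rel" ] && [ ! -L "$middle/$rel" ]; then
            echo "$rel"
        fi
    done
}
```

With the mount points from the test case, where /filesystem.squashfs is the lowest layer and /installer.squashfs sits above it, the call would be `list_candidates /filesystem.squashfs /installer.squashfs`.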
Instead of booting with `bebroken init=/bin/bash`, you can boot with `bebroken systemd.mask=systemd-remount-fs.service`. This will complete the boot with /etc/machine-id & .pwd.lock modified, meaning that a remount of / will cause I/O errors on those files.
Currently, we are shipping two hacks in casper's 25adduser script to "rm" the offending files and create them again on the upper rw layer. They then survive remount without I/O errors. However, we'd rather not ship those hacks, and would prefer to have the kernel overlay fixed to work correctly with multi-lower-dir and not corrupt files upon remounting /. |
|
2019-11-25 23:20:31 |
Colin Ian King |
nominated for series |
|
Ubuntu Focal |
|
2019-11-25 23:20:31 |
Colin Ian King |
bug task added |
|
linux (Ubuntu Focal) |
|
2019-11-25 23:20:31 |
Colin Ian King |
bug task added |
|
linux-hwe (Ubuntu Focal) |
|
2019-11-25 23:20:31 |
Colin Ian King |
nominated for series |
|
Ubuntu Eoan |
|
2019-11-25 23:20:31 |
Colin Ian King |
bug task added |
|
linux (Ubuntu Eoan) |
|
2019-11-25 23:20:31 |
Colin Ian King |
bug task added |
|
linux-hwe (Ubuntu Eoan) |
|
2019-11-25 23:20:31 |
Colin Ian King |
nominated for series |
|
Ubuntu Disco |
|
2019-11-25 23:20:31 |
Colin Ian King |
bug task added |
|
linux (Ubuntu Disco) |
|
2019-11-25 23:20:31 |
Colin Ian King |
bug task added |
|
linux-hwe (Ubuntu Disco) |
|
2019-11-25 23:20:56 |
Colin Ian King |
bug task deleted |
linux-hwe (Ubuntu Focal) |
|
|
2019-11-25 23:21:03 |
Colin Ian King |
bug task deleted |
linux-hwe (Ubuntu Eoan) |
|
|
2019-11-25 23:21:08 |
Colin Ian King |
bug task deleted |
linux-hwe (Ubuntu Disco) |
|
|
2019-11-25 23:21:15 |
Colin Ian King |
linux (Ubuntu Focal): status |
Confirmed |
In Progress |
|
2019-11-25 23:21:20 |
Colin Ian King |
linux-hwe (Ubuntu Bionic): status |
Confirmed |
In Progress |
|
2019-11-28 15:48:02 |
Stefan Bader |
linux (Ubuntu Eoan): importance |
Undecided |
Critical |
|
2019-11-28 15:48:06 |
Stefan Bader |
linux (Ubuntu Disco): importance |
Undecided |
Critical |
|
2019-11-28 15:57:10 |
Stefan Bader |
linux (Ubuntu Eoan): status |
New |
Fix Committed |
|
2019-11-28 15:57:15 |
Stefan Bader |
linux (Ubuntu Disco): status |
New |
Fix Committed |
|
2019-12-03 15:42:24 |
Ubuntu Kernel Bot |
tags |
disco rls-ff-incoming |
disco rls-ff-incoming verification-needed-disco |
|
2019-12-04 11:04:17 |
Colin Ian King |
tags |
disco rls-ff-incoming verification-needed-disco |
disco rls-ff-incoming verification-done-disco |
|
2019-12-05 11:27:13 |
Ubuntu Kernel Bot |
tags |
disco rls-ff-incoming verification-done-disco |
disco rls-ff-incoming verification-done-disco verification-needed-eoan |
|
2019-12-08 15:50:48 |
Colin Ian King |
tags |
disco rls-ff-incoming verification-done-disco verification-needed-eoan |
disco rls-ff-incoming verification-done-disco verification-done-eoan |
|
2019-12-11 16:59:27 |
Dimitri John Ledkov |
linux (Ubuntu Focal): status |
In Progress |
Fix Committed |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
linux (Ubuntu Eoan): status |
Fix Committed |
Fix Released |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-14895 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-14896 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-14897 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-14901 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-18660 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-19055 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-19072 |
|
2020-01-06 13:12:44 |
Launchpad Janitor |
linux (Ubuntu Disco): status |
Fix Committed |
Fix Released |
|
2020-01-06 13:12:44 |
Launchpad Janitor |
cve linked |
|
2019-2214 |
|
2020-01-06 22:31:20 |
Launchpad Janitor |
linux (Ubuntu Focal): status |
Fix Committed |
Fix Released |
|
2020-01-06 22:31:20 |
Launchpad Janitor |
cve linked |
|
2019-19078 |
|
2020-01-06 22:31:20 |
Launchpad Janitor |
cve linked |
|
2019-19332 |
|
2020-01-16 20:55:18 |
Launchpad Janitor |
linux-hwe (Ubuntu Bionic): status |
In Progress |
Fix Released |
|
2024-04-10 05:47:58 |
Rahul Verma |
bug |
|
|
added subscriber Rahul Verma |