The 'new' persistent live method starting in 19.10 no longer works

Bug #1863672 reported by sudodus
This bug affects 3 people
Affects: casper (Ubuntu)
Importance: High
Assigned to: Unassigned

Bug Description

I am ISO-testing Focal Fossa, and I create and maintain tools to make live and persistent live USB drives. The Lubuntu Focal daily live image dated 2020-02-12 works. But the current Lubuntu Focal image dated 2020-02-16 (zsynced approx. one hour ago) does not work with exactly the same method: mkusb-plug edits the iso file to replace 'quiet splash' with 'persistent' padded with spaces to the same length (12 characters), and creates an ext partition behind it.
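The same-length replacement described above can be sketched in shell. This is a minimal illustration, not mkusb-plug's actual code: the function name `patch_boot_option` and its arguments are assumptions, and it relies on GNU sed processing the file as raw bytes under `LC_ALL=C`. Because both strings are 12 bytes long, the file size and every offset inside the image stay unchanged.

```shell
#!/bin/sh
# Sketch only: replace 'quiet splash' with 'persistent' padded to the
# same 12 bytes, so the image keeps its size and layout.
# patch_boot_option is a hypothetical helper, not part of mkusb-plug.
patch_boot_option() {
    src=$1
    dst=$2
    old='quiet splash'
    new='persistent  '    # 10 letters + 2 spaces = 12 bytes

    # Refuse to continue if the two strings ever differ in length.
    [ "${#old}" -eq "${#new}" ] || return 1

    # LC_ALL=C makes sed work on bytes instead of multibyte characters.
    LC_ALL=C sed "s/$old/$new/g" "$src" > "$dst"
}
```

Calling `patch_boot_option original.iso patched.iso` (hypothetical file names) writes a patched copy and leaves the original untouched.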

I noticed that the version of casper has changed from 1.438build1 (when it worked) to 1.439 so I suspect that this has caused the failure.

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: casper 1.439
ProcVersionSignature: Ubuntu 5.4.0-14.17-generic 5.4.18
Uname: Linux 5.4.0-14-generic x86_64
ApportVersion: 2.20.11-0ubuntu16
Architecture: amd64
CasperVersion: 1.439
CurrentDesktop: LXQt
Date: Mon Feb 17 20:54:54 2020
LiveMediaBuild: Lubuntu 20.04 LTS "Focal Fossa" - Alpha amd64 (20200216)
SourcePackage: casper
UpgradeStatus: No upgrade log present (probably fresh install)
mtime.conffile..etc.casper.conf: 2020-02-17T20:52:45.667205

sudodus (nio-wiklund) wrote :

sudodus (nio-wiklund) wrote :

I add a couple of text files with output showing what differs between the working version and the no longer working version of Lubuntu Focal Fossa. First the working one from Feb. 12.

sudodus (nio-wiklund) wrote :

And then the no longer working version from Feb. 16.

Michael Hudson-Doyle (mwhudson) wrote :

Hmm. You're almost certainly right that it was the casper upload that caused this.

I changed casper so that it creates a filesystem with the label "writable" but it should still mount a pre-existing filesystem with label "casper-rw" (and there are autopkgtests that it does this). There could be a race in this area I suppose.

Can I get you to try a few things?

1. attach casper.log from a failing boot (possibly after adding verbose to the kernel command line)
2. change your script to create a filesystem with label writable instead and see if that works

sudodus (nio-wiklund) wrote :

Strange results from the continued testing:

The Xubuntu iso file dated 2020-02-17 01:04 fails too (like the Lubuntu iso file from Feb. 16).

The Ubuntu iso file dated 2020-02-17 07:50 succeeds (there is persistence with this method)

Lubuntu is updated:

The Lubuntu iso file dated 2020-02-17 17:40 fails too (like the Lubuntu iso file from Feb. 16).

So I am no longer so sure that we should blame only casper. There could be some (other?) change that must be ported to the community flavours in order to make things work again.

sudodus (nio-wiklund) wrote :

The Lubuntu iso file dated 2020-02-17 17:40 fails too. I added casper.log from this one, with the boot options persistent and verbose.

The live drive is still the same in my Toshiba laptop, where I run the tests: /dev/sdc.

Michael Hudson-Doyle (mwhudson) wrote :

I bet it's a race, and with the new casper it sometimes works and sometimes doesn't (which of course can drive anyone trying to figure out what is going on crazy).

sudodus (nio-wiklund) wrote :

Here is the logfile for the boot options writable and verbose (Lubuntu still failing). Do you want results from the working Ubuntu with the same date? That way we might help the developers of Lubuntu and Xubuntu (and maybe also the other community flavours, that I have not tested).

sudodus (nio-wiklund) wrote :

and a file corresponding to the file in comment #2

Michael Hudson-Doyle (mwhudson) wrote :

"Warning: Unable to find the persistent medium" is the message indicating failure.

Do you know how to make changes to the initramfs? (I know it's not the simplest procedure)

If you do, can you insert "ls -l /dev/disk/by-label/" just above the "if [ ! -e "/dev/disk/by-label/${root_persistence}" ] && [ -e "/dev/disk/by-label/casper-rw" ]; then" line in mountroot?

If you don't, can you explain exactly what you are testing and I'll try to reproduce here.
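The check quoted above can be sketched as a small function that runs outside the initramfs. This is an illustration under stated assumptions rather than casper's actual code: the function name `choose_persistence_label`, its directory parameter, and the body of the `then` branch are guesses at the intent (prefer the configured label, fall back to a pre-existing 'casper-rw'). The `ls -l` line is the requested debug output, sent to stderr.

```shell
#!/bin/sh
# Sketch only, not the real casper script.
# $1: directory with by-label symlinks (defaults to /dev/disk/by-label)
# $2: preferred label (defaults to 'writable', the label the new casper creates)
choose_persistence_label() {
    by_label=${1:-/dev/disk/by-label}
    root_persistence=${2:-writable}

    # Debug aid requested in the comment above: show which labels exist
    # at the moment the check runs.
    ls -l "$by_label/" >&2 || true

    # Assumed fallback: use an existing 'casper-rw' filesystem when no
    # filesystem with the preferred label is present.
    if [ ! -e "$by_label/${root_persistence}" ] && [ -e "$by_label/casper-rw" ]; then
        root_persistence=casper-rw
    fi

    echo "$root_persistence"
}
```

If the 'casper-rw' entry has not yet appeared when the check runs, neither test sees it and the preferred label is kept, which is the kind of timing dependence a race would produce.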

sudodus (nio-wiklund) wrote :

Sorry for the misunderstanding; I read your instructions carelessly. I get it now: Lubuntu works when I change the label to 'writable' (and keep the boot option persistent). To make things complete I will give you the log file for this case too.

sudodus (nio-wiklund) wrote :

and a file corresponding to the file in comment #2

sudodus (nio-wiklund) wrote :

So it seems to me that the problem for the community flavours is the label. They are not accepting the old casper-rw. By the way, what about home-rw? Will it be recognized by this new version?

sudodus (nio-wiklund) wrote :

No, I don't know how to make changes to the initramfs in a live system. I did it in an installed system long ago, and I can follow instructions, at least when I am not too tired :-P

So that is one possibility.

The other one is that you install and use mkusb-plug according to this link (either via PPA or via a tarball).

https://help.ubuntu.com/community/mkusb/plug

Then use the current daily Lubuntu (and for comparison maybe Xubuntu and Ubuntu) focal iso files.

I create the persistent live drives on 16 GB USB 3 pendrives, but you can use any removable drive (a memory card or eSATA drive works too).

And then I boot another computer (dedicated for testing). I happened to boot in UEFI mode, but I guess that the same thing might happen in BIOS mode. Anyway, the persistence should work in both boot modes.

But it is getting late here in Sweden and I have to sleep. I will be back tomorrow and help you the way you want. Good night :-)

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in casper (Ubuntu):
status: New → Confirmed
C.S.Cameron (cscameron) wrote :

Made a Live drive using balenaEtcher.
Changed the label on the focal-desktop-amd64 20200223 writable partition to home-rw. This resulted in a partition that was persistent for home stuff, wallpaper, etc., but not for newly installed programs, as would be expected for a home-rw partition.

Made a persistent Focal drive using mkusb.
Relabeled the "casper-rw" partition to "writable".
Persistence did not work.

The writable label only seems to work for persistence with Live USB.

C.S.Cameron (cscameron) wrote :

Tried mkusb persistent using focal-desktop-amd64 20200223, changing the partition label from "casper-rw" to "writable". This time it worked. I now have a persistent partition labeled writable. Home stuff and installed programs are persistent.

Created an additional partition labeled home-rw. Home stuff went there while downloaded programs went to writable, similar to having casper-rw + home-rw partitions.

sudodus (nio-wiklund) wrote :

@C.S.Cameron,

Thanks for testing mkusb including the combo "writable & home-rw" :-)

I will wait a while to see if 'casper-rw' will be revived (alongside 'writable') in Focal, or if I have to make a check for the version of Ubuntu or casper in my programs to select between 'casper-rw' and 'writable'.
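Such a version check could look roughly like the sketch below. It assumes, based only on the observations in this report, that casper 1.438build1 and older need 'casper-rw' while 1.439 and newer work with 'writable'. The function name `label_for_casper` is hypothetical, and the threshold would need revisiting if 'casper-rw' support is restored.

```shell
#!/bin/sh
# Sketch only: choose the persistence label from a casper version string
# such as '1.438build1' or '1.441'.
label_for_casper() {
    version=$1
    major=${version%%.*}        # text before the first dot, e.g. '1'
    minor=${version#*.}         # text after the first dot, e.g. '438build1'
    minor=${minor%%[!0-9]*}     # keep only the leading digits, e.g. '438'

    # Assumption from this bug report: 1.439 is the first version that
    # needs the label 'writable'.
    if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "${minor:-0}" -ge 439 ]; }; then
        echo writable
    else
        echo casper-rw
    fi
}
```

With the versions mentioned in this report, `label_for_casper 1.438build1` selects 'casper-rw', while `label_for_casper 1.439` selects 'writable'.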

sudodus (nio-wiklund) wrote :

> I noticed that the version of casper has changed from 1.438build1 (when it worked) to 1.439 so I suspect that this has caused the failure.

Today I tested the Lubuntu Focal daily iso file with casper version 1.441. It is still affected by this bug: there is no persistence with the label 'casper-rw'. It works when I re-label the partition for persistence to 'writable'.

I noticed that there are some new output lines during boot, that I guess are for debugging casper, so I have hope that this bug will soon be fixed :-)

Michael Hudson-Doyle (mwhudson) wrote : Re: [Bug 1863672] Re: The 'new' persistent live method starting in 19.10 no longer works

Argh sorry I've been meaning to look into this but other things keep coming
up. To be clear: an existing casper-rw partition *should* be being used.
It's a bug that it is not.

On Fri, 28 Feb 2020, 21:25 sudodus, <email address hidden> wrote:

> > I noticed that the version of casper has changed from 1.438build1
> > (when it worked) to 1.439 so I suspect that this has caused the
> > failure.
>
> Today I tested the Lubuntu Focal daily iso file with casper version
> 1.441. It is still affected by this bug: There is no persistence with
> the label 'casper-rw'. It works when I re-labeled the partition for
> persistence to 'writable'.
>
> I noticed that there are some new output lines during boot, that I guess
> are for debugging casper, so I have hope that this bug will soon be
> fixed :-)

Ubuntu QA Website (ubuntuqa) wrote :

This bug has been reported on the Ubuntu ISO testing tracker.

A list of all reports related to this bug can be found here:
http://iso.qa.ubuntu.com/qatracker/reports/bugs/1863672

tags: added: iso-testing
Michael Hudson-Doyle (mwhudson) wrote :

So I tried to reproduce this in a VM and failed. This is what I did:

$ cd ~/isos/
$ zsync http://cdimage.ubuntu.com/ubuntu-server/daily-live/pending/focal-live-server-amd64.iso.zsync
[...]
$ cd ~/images/
$ truncate -s 1G casper-rw.img
$ truncate -s 1G root.img
$ parted --script --align optimal casper-rw.img -- mklabel gpt mkpart primary ext4 2MiB -2048s
$ sudo losetup --show -Pf casper-rw.img
/dev/loop39
$ sudo mkfs.ext4 -L casper-rw /dev/loop39p1
mke2fs 1.45.5 (07-Jan-2020)
...
$ sudo losetup -d /dev/loop39
$ kvm -m 1024 -boot d -cdrom ~/isos/focal-live-server-amd64.iso -hda ~/images/casper-rw.img -hdb ~/images/root.img
[boot image until installer starts]
$ sudo losetup --show -Pf casper-rw.img
/dev/loop39
$ ls /mnt
install-logs-2020-03-02.0 lost+found

This shows that the casper-rw filesystem was mounted as expected.

Can someone provide very very detailed instructions on what they did that did not work?

sudodus (nio-wiklund) wrote :

@Michael Hudson-Doyle,

The bug might not appear in your VM.

I think a very very detailed instruction to do what I did is to

1. Work in *buntu 18.04.x LTS which is updated and full-upgraded to be up to date

2. Install mkusb-plug according to comment #14

3. zsync any current Ubuntu desktop Focal daily iso file (Lubuntu and Xubuntu have the smallest iso files). You may want to try with a corresponding Ubuntu 19.10 iso file to see how it should work.

4. I have several brand names and sizes of USB pendrives, but I often use Sandisk Extreme 16 GB. The bug appears with different brand names and sizes, so I think you will get similar results with the USB pendrives that are available.

5. Use mkusb-plug to create a persistent live drive in a USB pendrive. There is a wizard-style user interface, and I think you will use it the right way. Otherwise you can ask.

6. Move the pendrive to another computer and boot it. For me it will boot live-only, but after re-labeling the drive to 'writable' and reboot there will be persistence. The bug appears both in the old style BIOS mode and UEFI mode.

As described earlier, this bug appeared some weeks ago. Persistence works with 19.10 and with Focal with casper version 1.438build1 when using the old style label 'casper-rw'.

sudodus (nio-wiklund) wrote :

typing error in the previous comment:
change 'after re-labeling the drive to 'writable' and reboot'
to 'after re-labeling the partition for persistence to 'writable' and reboot'

sudodus (nio-wiklund) wrote :

@Michael Hudson-Doyle,

After some more testing I have to add some (maybe confusing, maybe helpful) information:

A - The bug appears in

1. a Toshiba laptop with an Intel i5 generation 3 CPU (in BIOS mode and UEFI mode)

http://www.toshiba.se/laptops/satellite-pro/c850/satellite-pro-c850-19w/

This Toshiba laptop has been my main testing computer for several years.

2. a Dell laptop with an Intel i5 generation 4 CPU (in BIOS mode and UEFI mode)

https://www.cnet.com/products/dell-latitude-e7240-12-5-core-i5-4310u-8-gb-ram-128-gb-ssd-english/specs/

3. A Lenovo laptop with an Intel i3 generation 2 CPU (tested only in UEFI mode)

https://shop.lenovo.com/ISS_Static/ww/wci/products/us/laptop/thinkpad/x-series/x131e-intel/X131e-Datasheet-Intel.pdf

B - The bug does *not* appear in

1. an HP Probook 6450b laptop with an Intel i5 M520, tested only in BIOS mode

2. a Lenovo V130-14IKB laptop with an Intel i5 generation 7 CPU (rather new), tested only in UEFI mode

3. a Dell laptop with an Intel i7 generation 4 CPU (in BIOS mode and UEFI mode)

https://www.cnet.com/products/dell-precision-mobile-workstation-m4800-15-6-core-i7-4810mq-8-gb-ram-256-gb-ssd-english/specs/

---

I think that there is some kind of race condition as you suspected already in comment #4.

Michael Hudson-Doyle (mwhudson) wrote :

Yes, so roughly speaking it works on faster machines and fails on slower machines. (I also tested your steps on my laptop, and it worked there -- but my laptop is a t480s with nvme so that's consistent). I do have a slower machine that I use for testing but I'm travelling and so don't have access to it right now.

Do you think you could try one more time on a non-working machine, remove quiet from the kernel command line, and capture the output from "dmesg" and "journalctl" and attach them to this bug? I don't know if it'll be useful but we might be able to see something going on.

sudodus (nio-wiklund) wrote :

'quiet' is always removed when mkusb-plug makes a persistent live drive, because 'quiet splash' is replaced by 'persistent '. Anyway, I ran the same Xubuntu Focal system that I tested according to comment #25 in the Toshiba laptop

http://www.toshiba.se/laptops/satellite-pro/c850/satellite-pro-c850-19w/

in UEFI mode, and it was live-only. The output files, that you want, are attached here and in the next comment.

sudodus (nio-wiklund) wrote :

sudodus (nio-wiklund) wrote :

@Michael Hudson-Doyle,

Why have you changed to 'writable' as the primary label of the partition for persistence?

How committed are you to use 'writable' instead of 'casper-rw'?

By the way, are you aware of the current Debian corresponding label, 'persistence'? Debian uses the boot option 'persistence' too.

-o-

Would it be possible to return to the classic 'casper-rw'? That would solve the problems for all existing tools to create persistent live drives with Ubuntu and Ubuntu community flavours.

C.S.Cameron (cscameron) wrote :

I agree with Sudodus, when I started using Ubuntu the persistent partition was labeled casper-cow.
I have finally gotten used to casper-rw.
Why change it to writable? It seems to be identical to casper-rw and works the same way with home-rw.
Are we going to change the name of home-rw to settingable?
If it ain't broke, don't fix it.
Best regards

Brian Murray (brian-murray) wrote :

'casper-rw' is not a name that is easily understood, while 'writable' is. Given that the partition is being used more (with the new server installer), a decision was taken to rename it to something more understandable. Additionally, 'writable' is also used in the images we produce for Raspberry Pi, so we also gain a consistent naming scheme.

tags: added: rls-ff-incoming
sudodus (nio-wiklund) wrote :

@Brian Murray,

As a developer and maintainer of software to create persistent live drives, I have to consider the race condition affecting several computers when using the old 'casper-rw' label. Can I expect that you will let someone spend time trying to fix the bug, or should I modify my software to use 'writable' for Focal Fossa?

Michael Hudson-Doyle (mwhudson) wrote :

I definitely hope to fix the bug before release. But not being able to reproduce it does make it hard.

sudodus (nio-wiklund) wrote :

@Michael Hudson-Doyle,

I am willing to help you (to respond as soon as possible to requests for testing in my computers). But I am also ready to release tools that can identify 20.04 and modify the label accordingly.

Akeo (pbatard) wrote :

Please do not go to release with this issue!

This is a MAJOR regression compared to 19.10, and considering the amount of pain the previous persistent issue with LTS has caused (just go to reddit, superuser or askubuntu and search for "mounting /cow on /root/cow failed" reports to get an idea of just how many Ubuntu users are being negatively impacted by this), if Ubuntu does value its userbase, it does not want a repeat of the 18.04 persistence fiasco for the next LTS.

I will provide steps on how to replicate the issue below, but first of all, I want to express my puzzlement with regards to unilaterally deciding to introduce a new 'writable' label for persistent partitions. Why????

"writable" provides no hints about persistence, and is way too generic to actually be understood. Unless you are using an optical media, where it's the hardware itself that is preventing write access, tagging a partition as "writeable" yields nothing, since, in the absolute, EVERY partition is of course writeable. So this tells nothing of value to users when it comes to persistence. Why not take a page from Debian here and just use 'persistence' for the label? At least, if people don't understand why their partition is called that way, they can perform a search and get a good overview of what persistent means. But with 'writable', you get nothing of the sort.

And seriously, for crying out loud, could you please coordinate with other distros when it comes to persistence. It's bad enough that Ubuntu chose to use a different kernel option ('persistent' vs. Debian's 'persistence') but you had a golden opportunity to bridge a gap by following what Debian was doing (by introducing the new 'persistence' partition label), yet chose once again to disregard the common good and do it your way.

Which brings me to the main point, the 'casper-rw' race condition is a MAJOR issue, which is far from being limited to slow machines. I am seeing it consistently happening on all of the UEFI platforms I am testing with, be it relatively slow dual-core based platform from >5 years ago to a recent 6-cores platform bought in January this year. So can you *PLEASE* test on real hardware instead of on VMs? I am confident that if you do test it on real hardware by creating a media in the manner highlighted below, you will consistently see the issue.

Below are the exact commands I used to create a UEFI-bootable persistent USB Flash Drive, which you should be able to follow more or less exactly to replicate the problem (tested with the latest focal-desktop-amd64.iso daily build from 2020.04.03):

-----------------------------------------------------------------------------------------
root@nano:/# ## Make sure to change the following disk to your USB media
root@nano:/# export TARGET_DISK=/dev/sda
root@nano:/# ## The following two commands erase the partition tables
root@nano:/# dd if=/dev/zero of=$TARGET_DISK bs=512 count=34
34+0 records in
34+0 records out
17408 bytes (17 kB, 17 KiB) copied, 0.00602524 s, 2.9 MB/s
root@nano:/# dd if=/dev/zero of=$TARGET_DISK bs=512 count=34 seek=$((`blockdev --getsz $TARGET_DISK` - 34))
34+0 records in
34+0 records out
17408 bytes (17 kB, 17 KiB) copied, ...

Changed in casper (Ubuntu):
importance: Undecided → High
Michael Hudson-Doyle (mwhudson) wrote :

First of all, I'm sorry I appear to have been neglecting this bug. It's hard to debug something you can't see! And I've had lots of other things on my plate unfortunately.

I don't think anyone can claim with a straight face that casper-rw is anything other than opaque. We could have an argument about whether 'writable' is better or not but that's not very interesting.

And, well, I still can't reproduce this. I have tried, several times, on real hardware. It's a bit harder to test on hw currently because my usual test machine is in an office I can't travel to thanks to the COVID-19 lockdown, but I tried your instructions and booted the resulting disk on my machine and it still mounted the casper-rw partition.

So let's try something else. I've uploaded a hacked up initrd to https://people.canonical.com/~mwh/casper-initrd.gz. If you can drop this over the one that came on the ISO and boot there should be two files present: /run/udev-ls.txt and /run/udev-dbg.txt. Can you boot a failing system and attach the files to the bug? udev-dbg.txt will be a few hundred kilobytes, the other one will be small.

sudodus (nio-wiklund) wrote :

@Michael Hudson-Doyle,

I have tried but cannot boot with your casper-initrd file. I extracted it (with gunzip) and pointed to it in grub.cfg with a custom menuentry.

I come to Busybox:
(initramfs) unable to find a medium containing a live file system

The 'standard' menuentry works (with persistence and 'casper-rw') in this computer, my Dell Precision, where there is no race condition. Still, I fail with your casper-initrd.

1. Please describe exactly what you mean by 'drop this over the one that came on the ISO'.

2. At what stage and where should I look for the files /run/udev-ls.txt and /run/udev-dbg.txt?

sudodus (nio-wiklund) wrote :

I tried (and failed) with the current Focal iso files of Ubuntu and Lubuntu. Should I try with Ubuntu 19.10 or specifically with Ubuntu Focal Beta?

Akeo (pbatard) wrote :

Hi Michael, thanks for looking into this.

> I don't think anyone can claim with a straight face that casper-rw is
> anything other than opaque.

It is still a lot more searchable than 'writable'. As I explained above, 'writable' yields next to nothing in terms of letting users understand why a partition might be labelled that way. 'casper-rw' does. And, considering that the label is something that the casper scripts are picking up, I don't really see how 'casper-rw' is that obtuse, because it pretty much tells you without looking anything up that it's associated with something called 'casper' and ensuring that something is both readable and writeable.

> We could have an argument about whether
> 'writable' is better or not but that's not very interesting.

We absolutely *SHOULD* because this is the crux of the issue here!

For one thing, as I mentioned, you could have used 'persistence', as other distros do, and not introduce a completely new label out of the blue.

And also, the problem is that:
- You decided to introduce support for a new label that I don't think anybody asked for.
- It seems to me like you unilaterally picked the label name that you *liked* best, without consulting much of anybody else (but I'd be more than happy to stand corrected on that).
- This introduced new *UNWARRANTED* complexity in the casper scripts, which resulted precisely in the problem we are being faced with today.

In other words, you tried to fix a problem that didn't exist, and in doing so, broke existing behaviours.

I will therefore assert the following:
- You should revert the casper scripts to the working 19.10 version (that only supports 'casper-rw') *NOW*. As far as I can tell, there is exactly zero urgency to introduce a new persistent label for 20.04, unless you can point to specific use cases that were left stranded by the inability to use something other than 'casper-rw' in 18.04 or 19.10.
- You should move the introduction of a new label to after 20.04 has been released, because, again, there is no real urgency in adding support for a new label, and if this issue demonstrates anything, it's that there should be some consultation as well as proper testing before this option is made public.
- Unless you can demonstrate otherwise, I don't believe it should be that big a deal to revert the casper script changes that pertain to additional label support from 20.04 to 19.10, but this needs to be done *now*, so that we can use the little time that is left before the release to test and ensure that the reversal doesn't introduce a new issue.

> And, well, I still can't reproduce this.

At this stage, I don't think it matters. Multiple people other than you can, so it's not something that you can release in its present state and "hope for the best".

I can consistently see the issue on intel NUC computers (I have 2 of these), which are not even PCs built with custom parts, and therefore I pretty much expect every user of a NUC platform to be unable to use 'casper-rw' persistence in 20.04.

Even if you can't reproduce the issue right now, you do have to err on the cautious side and assume that you are the exception, and that this is going to affect the m...

sudodus (nio-wiklund) wrote :

> So can we please revert support for 'writable' in 20.04, and use the next
> release to try to fix the issues with trying to support 2 partition labels?

...

> Again, unless you have confidence that you know precisely what the issue is,
> and should be able to fix it right now, I would advocate against trying to
> rush a fix just days before release, and instead revert the scripts, and use
> the leftover time to ensure that the reverted scripts still work as
> expected. This is an LTS release. This is not the time to introduce
> potentially breaking changes and/or behaviours that have not been
> thoroughly tested.

I agree with these words by @Akeo. Please revert back to 'casper-rw' as the one and only name of the file and label of the partition for persistence in Ubuntu 20.04 LTS.

Michael Hudson-Doyle (mwhudson) wrote :

I'll have a conversation with the release team about this. In the mean time, I've updated https://people.canonical.com/~mwh/casper-initrd.gz with one that might work better (no need to uncompress it btw).

Akeo (pbatard) wrote :

Thanks Michael.

As you know, I would strongly favour reverting the problematic changes so that everybody has more time to analyse where the issue lies (and especially why some machines don't seem to be affected), but I think I have made my position clear already, so I'll let you and the release team decide what you think is best at this stage.

With regards to the updated initrd, this one seems to work much better, so I have attached the files I get. This test was performed on a very recent Intel NUC 10i7FNK where the persistent partition failed to mount. As you will see, nothing got written to udev-ls.txt, but hopefully you can get enough data from the other file.

If you need more data, please let me know.

Akeo (pbatard) wrote :

Oh, and since I managed to get one occurrence where persistence worked with the same drive (all I did was reboot a few times), I'm attaching the same data from a session where 'casper-rw' was mounted. The most obvious difference I can see is that this time, udev-ls.txt was not empty.

Michael Hudson-Doyle (mwhudson) wrote :

Hmm this is interesting, and maybe points to a bug in initramfs-tools :( That /dev/disk/by-label directory should be populated by the time init-top/udev exits.

Can you download and try again (added a bit more logging)? The zip of all of /run is perfect.

Akeo (pbatard) wrote :

Here you go. If you really need the full content of /run let me know.

Michael Hudson-Doyle (mwhudson) wrote :

Hi, I've uploaded yet another initrd -- I think this one might fix the bug, hopefully. It seems that, for whatever reason, it takes a long time for the block device to show up at all in your system (it seems to be the kernel being slow rather than udev lagging behind, which is what I suspected at first). I guess it also means that the chances of encountering this bug might depend on the model of the motherboard of the system or something equally crazy.

So what I've done is to just delay the check for the /dev/disk/by-label/casper-rw to later. It's a bit unsatisfactory but I'm reasonably confident this will work (because if the device hasn't appeared by this point, persistence would not work with a writable filesystem label either). Please let me know how it goes.
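For comparison, a different way to cope with a late-appearing device is to poll for it with a timeout. The sketch below is not the change described above (which only moves the existing check to a later point); it illustrates the general technique, and the function name `wait_for_label`, its parameters, and the one-second polling interval are all assumptions.

```shell
#!/bin/sh
# Sketch only: wait until a filesystem label shows up, or give up.
# $1: label to wait for, e.g. 'casper-rw'
# $2: timeout in seconds (default 10)
# $3: directory with by-label symlinks (default /dev/disk/by-label)
wait_for_label() {
    label=$1
    timeout=${2:-10}
    by_label=${3:-/dev/disk/by-label}
    waited=0

    while [ ! -e "$by_label/$label" ]; do
        # Stop once the timeout is used up; the caller can then fall
        # back to another label or to a live-only boot.
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Polling trades a possibly longer boot on machines without a persistence partition for tolerance of slow device detection, which is one reason moving the check later may be the simpler choice.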

Michael Hudson-Doyle (mwhudson) wrote :

Here's the debdiff of my proposed fix

Akeo (pbatard) wrote :

Thanks for the update.

Initial testing seems to indicate that the extra delaying appears to work, but I still need to check this out a bit more.

I will point out that during one boot (after rebooting the same machine a few times), I got the following fatal error:

ln: /tmp/mountroot-fail-hooks.d/scripts/init-premount/lvm2: No such file or directory

I've only seen it once so far, using the new initrd, and, at the moment, I don't exactly believe that this is related, but since it mentions mount scripts, I'm not entirely sure.

Another thing I've been consistently observing (I'll open a new issue -- haven't had a chance to do that yet) is that, when persistence is active, and regardless of whether the partition is labelled 'casper-rw' or 'writable', I'm seeing "access beyond end of device" for /dev/sda and "I/O error while writing superblock" for /dev/sda2 (persistent partition) on powerdown/reboot. On occasion, this actually seems to freeze powerdown altogether and I have to perform a hard reset, whereas none of this happens when persistence is not in use. If it can be replicated, you should see it if you follow the persistent partition process I gave above (again, using either 'writable' or 'casper-rw').

All this to say that it still seems to me like, even if you are doing a good job fixing problems, the persistent partition handling process of 20.04 appears to be a lot more brittle than it was in 19.10, so I would still advise you to consider reverting, and wait for a non LTS release to "upgrade" it...

Oh, and I still have to URGE you to change the newly allowed label for persistent partitions from 'writable' to 'persistence', which is what Debian and derivatives use.

By supporting 2 labels, you have a unique opportunity to bridge a gap that has been causing a lot of pain for many Linux users, by UNIFYING the means in which users are advised to create a persistent partition, and finally stop this utter nonsense of having this or that distro doing something entirely different for persistence, when the fundamental underlying user-process (add option XYZ to the kernel boot options and create an ext# partition with label ABC) is the same and has no valid reason to deviate that much from one distro to another.

So can you please at least switch to using 'persistence' instead of 'writable' for the 20.04 release? If you need additional validation that Debian Live uses 'persistence' as its persistent partition label (since, from regularly testing the automated persistent media creation process of my app with the latest Debian Live I believe I do have a pretty accurate view of what label Debian requires), you can find some at https://unix.stackexchange.com/a/538665/314167.

I'll post some more about my testing as well as provide a link to that additional issue I'm planning to open when I get a chance.

Revision history for this message
sudodus (nio-wiklund) wrote :

> I'm seeing "access beyond end of device" for /dev/sda and
> "I/O error while writing superblock" for /dev/sda2
> (persistent partition) on powerdown/reboot.

I can confirm this. I have tested in several computers and it happens in all of them.

Revision history for this message
sudodus (nio-wiklund) wrote :

I tested with the current casper-initrd with a current Lubuntu Focal persistent live system, and it works with a 'casper-rw' partition also in the computers where it did not work before as described in post #25:

1. a Toshiba laptop with an Intel i5 generation 3 CPU (in BIOS mode and UEFI mode)

http://www.toshiba.se/laptops/satellite-pro/c850/satellite-pro-c850-19w/

This Toshiba laptop has been my main testing computer for several years.

2. a Dell laptop with an Intel i5 generation 4 CPU (in BIOS mode and UEFI mode)

https://www.cnet.com/products/dell-latitude-e7240-12-5-core-i5-4310u-8-gb-ram-128-gb-ssd-english/specs/

3. A Lenovo laptop with an Intel i3 generation 2 CPU (tested only in UEFI mode)

https://shop.lenovo.com/ISS_Static/ww/wci/products/us/laptop/thinkpad/x-series/x131e-intel/X131e-Datasheet-Intel.pdf

-o-

@Michael Hudson-Doyle,

Do you want me to attach debug files from one of these computers?

-o-

I noticed that when there was unallocated drive space, the system did not pick the existing 'casper-rw' partition, but created a 'writable' partition in the unallocated drive space and used that partition.
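
To check which of these labels actually exist on a drive, something like the following sketch may help. It assumes that lsblk from util-linux with JSON output is available. The function names are hypothetical, and this is not how casper itself selects a partition; it only lists candidates.

```python
import json
import subprocess

# Labels discussed in this thread; adjust if your setup differs.
PERSISTENCE_LABELS = {"casper-rw", "home-rw", "writable"}


def find_persistence_partitions(lsblk_json: str) -> list[tuple[str, str]]:
    """Return (device name, label) pairs whose label is a persistence label."""
    found = []

    def walk(devices):
        # lsblk nests partitions under their parent disk as "children".
        for dev in devices:
            label = dev.get("label")
            if label in PERSISTENCE_LABELS:
                found.append((dev["name"], label))
            walk(dev.get("children", []))

    walk(json.loads(lsblk_json).get("blockdevices", []))
    return found


def read_lsblk_json() -> str:
    """Run lsblk and return its JSON output (assumes lsblk supports --json)."""
    return subprocess.run(
        ["lsblk", "--json", "-o", "NAME,LABEL"],
        check=True, capture_output=True, text=True,
    ).stdout
```

Calling find_persistence_partitions(read_lsblk_json()) on the running live system should show whether both a 'casper-rw' and a newly created 'writable' partition are present on the drive.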

So this is a step forward, but provides no guarantee that it will work in other computers. I think the time until release is too short for extensive testing, so I still want this 'writable' business to be reverted. Please revert back to 'casper-rw' as the one and only name of the file and label of the partition for persistence in Ubuntu 20.04 LTS.

tags: added: patch
Revision history for this message
Akeo (pbatard) wrote :

Completed my testing on the 3 UEFI machines I have that were displaying the symptoms before (2 intel NUCs and one ASUS based UEFI PCs, ranging from 2-cores to 6-cores, and ranging from 2010 to 2020), and I am happy to report that, no matter how many times I boot the system, the 'casper-rw' labelled persistent partition always appears to mount properly.

I have also not seen that "ln: /tmp/mountroot-fail-hooks.d/scripts/init-premount/lvm2: No such file or directory" error manifest itself again, so I am hoping this was just a fluke.

I have however seen the "access beyond end of device" and "I/O error while writing superblock" errors every single time during powerdown/reboot, so I have created https://bugs.launchpad.net/ubuntu/+source/casper/+bug/1871454 for this issue.

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Thanks for testing the casper-rw fallback. I'll get that uploaded.

I see "ln: /tmp/mountroot-fail-hooks.d/scripts/init-premount/lvm2: No such file or directory" message sometimes, I think it's harmless (although I guess someone should look into it).

The superblock stuff does sound a bit worrying -- but let's leave that for the other bug.

Changed in casper (Ubuntu):
status: Confirmed → In Progress
Revision history for this message
C.S.Cameron (cscameron) wrote :

@Akeo Not sure renaming the persistent partitions to 'persistence' works. There are two kinds of persistent file or partition in Ubuntu. casper-rw is the catch-all and takes new programs, data and settings when there is no home-rw persistent partition. When there is a home-rw partition, home-rw takes the user data and settings like /home does in a full install.

Revision history for this message
Akeo (pbatard) wrote :

I'll let Michael elaborate but I really don't see how that wouldn't work.

20.04 is introducing a *NEW* alternative 'writable' label for persistent partitions. This is a brand new label that is being introduced.

So my take on this is that, if we are introducing a brand new label for persistent partition, we might as well use 'persistence' instead of 'writable', because 'persistence' is what Debian live uses.

We are at the precise point where we do have a choice in a new alternative label name being introduced, so my point is that, rather than FRAGMENT the persistent landscape further, by having each distro do whatever the heck they want with no regards for what the others do, we should try to BRIDGE it.

As such, I will assert that if you do have an issue with using 'persistence' as a label, then you also have an issue with using 'writable' as a label (which is what Michael proposes), and are voting to not have a new alternate label being introduced at all, which is actually something I can rally with, since, unless it is done to bridge the gap, I really don't see the point of introducing yet another label for persistent partitions...

Revision history for this message
C.S.Cameron (cscameron) wrote :

The persistent file/partition names casper-rw and home-rw should be retained.
However home-rw partitions still work when the other persistent file/partition is named writable.
It is nice having the option for a home-rw file/partition. It is interchangeable with the /home partition in a full install and can be copied back and forth, if having duplicate homes on the desktop and the persistent thumb drive is desired.

Revision history for this message
Akeo (pbatard) wrote :

> However home-rw partitions still work when the other persistent file/partition is named writable.

Again, why would this not work when the other partition is named 'persistence' instead of 'writable'?

I still fail to see the point you are trying to make. Can you please elaborate on the issue you think using 'persistence' instead of 'writable' could bring?

Revision history for this message
C.S.Cameron (cscameron) wrote :

Ubuntu has been using casper-rw and home-rw ever since it was changed from casper-cow back around 7.04.

If it ain't broke, don't fix it.

Revision history for this message
Akeo (pbatard) wrote :

> If it ain't broke, don't fix it.

Then that makes 3 of us making that point now. ;)

Revision history for this message
C.S.Cameron (cscameron) wrote :

Sort of like asking the British to change the name of the pound to dollar.

Revision history for this message
sudodus (nio-wiklund) wrote :

... or like asking the US to change the name of the 'dollar' to 'valuable' ;-)

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package casper - 1.445

---------------
casper (1.445) focal; urgency=medium

  * Fix segfault in casper-md5check when plymouth is not installed (i.e.
    Ubuntu Server).

 -- Michael Hudson-Doyle <email address hidden> Thu, 09 Apr 2020 10:48:20 +1200

Changed in casper (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
sudodus (nio-wiklund) wrote :

It seems to me that the description of the bug-fix does not match this bug.

Are you describing a fix for another bug, or is it only the description, that is wrong or difficult to understand?

Revision history for this message
Akeo (pbatard) wrote :

So it seems our request not to use 'writable' is being ignored still, and that Ubuntu is just going to go ahead and introduce support for a completely new label that, not only no other distro appears to be using, but also nobody has really been asking for:

https://launchpadlibrarian.net/473795039/casper_1.443_1.445.diff.gz

As far as I am concerned, this issue is *NOT* fixed with the current 1.445 proposal.

Can you at least, in light of the feedback, try to justify your reason for still wanting to go with 'writable'?

Revision history for this message
sudodus (nio-wiklund) wrote :

By the way, today and during the weekend (Easter) 3 more persons are testing your 'hack', casper-initrd.gz. We have found a few more computers that are affected by this bug, and your hack works in those computers. You find some results via

https://ubuntuforums.org/showthread.php?t=2440260

So it looks good, and I hope that it will continue looking good during the Easter holiday ;-)

Revision history for this message
sudodus (nio-wiklund) wrote :

The package casper version 1.445 is bundled with the current Lubuntu iso file dated 2020-04-09. I tested it (made a persistent live USB-connected SSD) in one of my computers that were affected by this bug, the Toshiba. Persistence works with the label 'casper-rw' :-)

So my alert in comment #62 can be dismissed. The new version of casper squashes the bug reported here even though the description tells us about plymouth and Ubuntu Server. Maybe it squashes a bug there too.

(The bug *discussed* here is still alive.)

Revision history for this message
C.S.Cameron (cscameron) wrote :

For those who have had a problem with scrolling when shutting down, I noticed that pressing enter shuts things down.

Revision history for this message
Chris Guiver (guiverc) wrote :

Not sure how this will paste/read, but my findings..
(original can be viewed at https://docs.google.com/spreadsheets/d/18-XMDjy5lWiHFqFww_865jR6hbcHf00MI_Z92PayDMM/edit#gid=0 )

hp dc7900, 1x dell 755, dell 790

box (copied from list; most detail from lshw) | debug (/cow size) | start (/cow size)

hp dc7700 (c2d-e6320, 5gb, nvidia quadro nvs 290) | 4.3/4.7gb | 4.3/4.7gb
hp dc7900 (c2d-e8400, 4gb, intel 4 series integrated i915) | 4.3/4.7gb | 1.8/1.8gb
dell [optiplex] 745 (c2d-6600, 6gb, amd/ati radeon rv516/x1300/x1550) | 4.2/4.7gb | 4.2/4.7gb
dell [optiplex] 755 (c2d-e6850, 5gb, amd/ati radeon rv516/x1300/x1550) | 4.4/4.7gb | 2.4/2.4gb
dell [optiplex] 755 (c2d-e8300, 8gb, amd/ati radeon rv610/radeon hd2400 pro/xt) | 4.4/4.7gb | 4.4/4.7gb
dell [optiplex] 780 (c2q-q9400, 8gb?, amd/ati cedar radeon hd 5000/6000/7350/8350) | 4.4/4.7gb | 4.4/4.7gb
dell [optiplex] 960 (c2q-q9400, 8gb, amd/ati cedar radeon hd 5000/6000/7350/8350) | 4.4/4.7gb | 4.2/4.7gb
hp 8200 elite sff (i5-2400, 8gb, nvidia quadro 600) | 4.3/4.7gb | 4.3/4.7gb
dell vostro 430 (i7-870, 12gb, ??) | - | -
dell [optiplex] 990 (i7-2600, 16gb, nvidia geforce gt 6600 gt) | 4.3/4.7gb | 7.8/7.8gb
lenovo thinkpad sl510 (c2d-t6570, 2gb ram, i915) | 4.4/4.7gb | 4.4/4.7gb
lenovo thinkpad x201 (i5-m520, 4gb, i915) | 4.4/4.7gb | 4.4/4.7gb
motion computing j3400 (c2d-u9400, 4gb, intel mobile 4 series) | 4.2/4.7gb | 4.2/4.7gb
sony vaio ultrabook (i5-9400u, 4gb, intel haswell-ULT) | 4.2/4.7gb | 4.2/4.7gb

(all details are in https://ubuntuforums.org/showthread.php?t=2440260 mentioned previously)

Revision history for this message
Akeo (pbatard) wrote :

Since there has been no follow-up from Michael on this critical question, I have opened a request to drop or alter 'writable' as a new persistent label in https://bugs.launchpad.net/ubuntu/+source/casper/+bug/1872065.

Revision history for this message
sudodus (nio-wiklund) wrote :

@ Michael Hudson-Doyle,

When booting a live system of Ubuntu 20.04 there is file and disk checking, not just with a live USB drive but with a persistent USB drive as well.

If I am not fast enough with Ctrl-C, it runs until it is over 80% complete. Very irritating. If it ran only once, it would not be so bad.

Is there some way to get rid of this? For example some boot option or some command in grub.cfg?

Revision history for this message
C.S.Cameron (cscameron) wrote :

I think it would be best if the disk check only ran once, on the first boot of a new live or persistent USB drive, and then vanished forever. Is there really any need for it to run on every boot?

It also seems that the Try Ubuntu / Install Ubuntu screen is back with a vengeance. It disappeared from persistent drives for a while, but I see it has returned to Rufus and mkusb persistent USB drives. It is easy to get rid of in Rufus, by overwriting syslinux.cfg, but I have not been able to disable it in mkusb.

Revision history for this message
sudodus (nio-wiklund) wrote :

@ C.S.Cameron (cscameron),

I think I understand now. You can remove the boot option maybe-ubiquity from the 'linux line' of grub.cfg:

from

menuentry "Ubuntu - persistent live" {
        search --set=root --fs-uuid 2020-04-23-07-51-42-00
        set gfxpayload=keep
        linux ($root)/casper/vmlinuz file=/cdrom/preseed/ubuntu.seed maybe-ubiquity quiet splash persistent ---
        initrd ($root)/casper/initrd
}

to

menuentry "Ubuntu - persistent live" {
        search --set=root --fs-uuid 2020-04-23-07-51-42-00
        set gfxpayload=keep
        linux ($root)/casper/vmlinuz file=/cdrom/preseed/ubuntu.seed quiet splash persistent ---
        initrd ($root)/casper/initrd
}

If you confirm that this is what you mean, I can modify that in mkusb-dus (the shellscript dus-persistent), maybe also in mkusb-plug (the shellscript mkusb-sedd).

When dropping the option maybe-ubiquity, the option to select language (via grub) will also be dropped. Maybe it is worthwhile to keep this option anyway, or to create an extra menuentry for the purpose to select language?
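
The edit described above (dropping maybe-ubiquity from the 'linux line' of grub.cfg) can be sketched as follows. This is a minimal illustration in Python, not the code used in mkusb (dus-persistent and mkusb-sedd are shell scripts). The function name drop_boot_option is hypothetical, and it assumes the kernel line starts with the word 'linux', as in the menuentry shown above.

```python
def drop_boot_option(grub_cfg: str, option: str = "maybe-ubiquity") -> str:
    """Remove `option` from every 'linux' line of a grub.cfg text."""
    result = []
    for line in grub_cfg.splitlines(keepends=True):
        stripped = line.lstrip()
        # Only touch kernel lines; menuentry, search, initrd etc. stay as-is.
        if stripped.startswith(("linux ", "linux\t")):
            indent = line[: len(line) - len(stripped)]
            newline = "\n" if line.endswith("\n") else ""
            words = [word for word in stripped.split() if word != option]
            line = indent + " ".join(words) + newline
        result.append(line)
    return "".join(result)
```

Because only whole words are removed, other options such as 'quiet', 'splash' and 'persistent' are left in place and in the same order.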

Revision history for this message
C.S.Cameron (cscameron) wrote :

@sudodus:
20.04 is the first Ubuntu version to have "maybe-ubiquity" in grub.cfg. Previously, Ubuntu's grub had an "Install Ubuntu" menuentry option.

From Ubuntu 19.10:

linux /casper/vmlinuz file=/cdrom/preseed/ubuntu.seed only-ubiquity quiet splash ---

This would make the Try/Install screen redundant.

Revision history for this message
robert key (rob54321) wrote :

I have also had this persistence problem with ubuntu 20.04.
Casper will not load a file labelled Casper-rw. It does use a partition labelled Casper-rw.
Ubuntu 19.04 persistence works with files and partitions labelled casper-rw.
Rob

Revision history for this message
robert key (rob54321) wrote :

I changed the file name from casper-rw to writable and now it works. Thanks.
Took me 5 days to realise there was a bug :)
Rob

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

On Tue, 9 Jun 2020 at 21:31, robert key <email address hidden> wrote:

> I have also had this persistence problem with ubuntu 20.04.
> Casper will not load a file labelled Casper-rw. It does use a partition
> labelled Casper-rw.
> Ubuntu 19.04 persistence works with files and partitions labelled
> casper-rw.
>

Can you file another bug about that?

Revision history for this message
robert key (rob54321) wrote :

Will do.
Rob

Revision history for this message
sudodus (nio-wiklund) wrote :

@robert key,

Please post a link here to that new bug report.
