cloud-initramfs-copymods hides the full list of modules from the system

Bug #1958260 reported by Ernst Persson
This bug affects 6 people
Affects                          Status      Importance  Assigned to  Milestone
cloud-initramfs-tools            New         Undecided   Unassigned
cloud-initramfs-tools (Ubuntu)   Incomplete  High        Dave Jones

Bug Description

After my system has booted, many modules like
/lib/modules/5.4.0-1065-azure/kernel/net/ipv4/netfilter/ip_tables.ko
are not available.

This is because the copymods system has hidden them with its mount on top:
mount | grep copymods
copymods on /lib/modules type tmpfs (rw,relatime)

So this system doesn't seem to be working as it should...
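
The check above can be scripted. This is a minimal sketch, assuming the tmpfs shows up in the mount table with the source name "copymods" as in the output above; the helper name copymods_active and its optional file argument are hypothetical and exist only to make the check testable.

```shell
# Hypothetical helper: succeeds (exit 0) if a tmpfs whose source is
# "copymods" appears in the given mount table.
# The optional argument defaults to /proc/mounts.
copymods_active() {
    mounts_file="${1:-/proc/mounts}"
    # Mount table fields: source mountpoint fstype options dump pass
    awk '$1 == "copymods" && $3 == "tmpfs" { found = 1 }
         END { exit !found }' "$mounts_file"
}
```

Calling copymods_active with no argument inspects the running system; a zero exit status means /lib/modules is currently shadowed by the tmpfs.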

If I do:
sudo apt install --reinstall linux-modules-extra-5.4.0-1065-azure linux-modules-5.4.0-1065-azure

they reappear in the tmpfs but on the next boot they are gone again.

This seems to have happened to others in the past:
https://unix.stackexchange.com/questions/405146/removed-lib-modules-folder-after-every-reboot

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: cloud-initramfs-copymods 0.40ubuntu1.1
ProcVersionSignature: Ubuntu 5.4.0-1065.68~18.04.1-azure 5.4.157
Uname: Linux 5.4.0-1065-azure x86_64
ApportVersion: 2.20.9-0ubuntu7.27
Architecture: amd64
Date: Tue Jan 18 15:02:34 2022
PackageArchitecture: all
SourcePackage: cloud-initramfs-tools
UpgradeStatus: No upgrade log present (probably fresh install)

Bryce Harrington (bryce) wrote :

A similar-sounding issue was reported on focal: LP: #1901464, but it lacked the /proc/cmdline.

Another issue that may be related:
    https://bugs.launchpad.net/cloud-initramfs-tools/+bug/1766723

There are also changes in 0.47ubuntu1 that may address this issue and might be worth backporting.

Changed in cloud-initramfs-tools (Ubuntu):
importance: Undecided → High
status: New → Triaged
Bryce Harrington (bryce)
tags: added: server-todo
Ernst Persson (ernstp) wrote :

I looked for a nested directory like /lib/modules/4.15.0-15-generic/4.15.0-15-generic/ but I didn't have that.

Christian Ehrhardt  (paelzer) wrote :

Well, IMHO the question is why it is active in your case ...

cloud-initramfs-copymods has a rather clear-cut use case:
- If booted with an external kernel/initrd, make the modules from the initrd available in /lib/modules/
- If not, do nothing

That is why "usually" (tm) one has the package installed and it does nothing.
The links that were added all revolve around removing cloud-initramfs-copymods, but IMHO, as stated above, the question is why it was active when it should not have been.

Because when booting a "normal" local kernel/initrd it should do nothing.
A random booted server image I had has it installed and no tmpfs mount on /lib/modules, just as I'd expect. If I may ask, let us try to understand why it is active for Ernst; maybe from that we can find a condition it fails to check so that it works only in the right cases.

Usually it works like this:
0. Detect the kernel version: myver=$(uname -r)
0. $rootmnt is usually empty; if it were not, that would also change where it acts, which isn't the case here

1. If $rootmnt/lib/modules/$myver exists, do nothing
1b. Only if the kernel command line option copymods=force is set does it act regardless of the above
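
The steps above can be sketched as a small shell function. This is an illustration of the described decision only, not the actual copymods script; the function name should_copy_modules and its three arguments (root mount prefix, kernel version, kernel command line) are assumptions made for the example.

```shell
# Sketch of the decision described above (hypothetical function, not the
# real init-bottom script). Returns 0 if copymods would act, 1 if not.
should_copy_modules() {
    rootmnt="$1"   # root mount prefix
    myver="$2"     # kernel version, e.g. output of `uname -r`
    cmdline="$3"   # contents of /proc/cmdline

    # Step 1b: copymods=force on the kernel command line overrides step 1.
    # $cmdline is left unquoted on purpose so it splits into parameters.
    for param in $cmdline; do
        [ "$param" = "copymods=force" ] && return 0
    done

    # Step 1: modules for this kernel already exist on disk, do nothing.
    [ -d "$rootmnt/lib/modules/$myver" ] && return 1

    # Otherwise the modules are missing, so copymods would act.
    return 0
}
```

On a running system it could be called as should_copy_modules "" "$(uname -r)" "$(cat /proc/cmdline)", keeping in mind that the directory check then sees whatever is currently mounted on /lib/modules.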

Hence Lucas asked for the content of /proc/cmdline in that other case - @Ernst do you have something set in there?

I wondered how this could happen ...
If one accidentally removed the kernel modules manually and then rebooted, it would act.
From that moment on, any later (re)install of the kernel modules will land in that tmpfs directory, so after a reboot you'd be back to the start.

$ uname -a
Linux j 5.15.0-17-generic #17-Ubuntu SMP Thu Jan 13 16:27:23 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ mount | grep /lib/modules
$ sudo rm -rf /lib/modules/*
$ sudo reboot
...
$ mount | grep /lib/modules
copymods on /usr/lib/modules type tmpfs (rw,relatime,inode64)

So this caused the problem, but only due to mis-administrating the system.
And since the initrd does not contain ALL modules that could be in /lib/modules, that explains the bug report that "some are missing": we only get the subset that is in the initrd.

In this case we are actually happy that we have some modules thanks to cloud-initramfs-copymods :-)

@Ernst:
If you have NOT set the kernel command line option copymods=force, by accident or through any external source, was there by any chance such an accidental deletion of /lib/modules?

P.S. If that is what happened, the way to resolve it is not to "just reinstall the mods" but to remove the tmpfs before that. That isn't easy with the modules open from there.
Most likely you would e.g. disconnect your network or disk when trying to unload all modules.
Instead, use a bind mount to reach the original place "behind" that tmpfs.

# First get all modules (not just the initrd content) into the tmpfs
$ sudo apt install --reinstall linux-modules-5.15.0-17-generic
$ sudo mkdir /mnt/helper
$ sudo mount --bind /usr/lib/ /mnt/helper
# now /mnt/helper/modules/ is the real /usr/lib/modules/ on disk, so copy the tmpfs there
$ sudo cp -a /usr/lib/modules/5.15.0-17-generic /mnt/helper/modules/
$ sudo reboot
...
$ mount | grep /lib/modules
#empty now and we have all the extended modules available

You might need to adapt that to your kernel version, but otherwise - if you got into t...

Changed in cloud-initramfs-tools (Ubuntu):
status: Triaged → Incomplete
Ernst Persson (ernstp) wrote :

Hi!

So this is a pretty standard Ubuntu Server 18.04 Azure machine, I've just installed a couple of extra packages.

My boot cmdline right now is

BOOT_IMAGE=/boot/vmlinuz-5.4.0-1067-azure root=UUID=136d7771-527b-490b-b111-f851f974c604 ro console=tty1 console=ttyS0 earlyprintk=ttyS0

I removed cloud-initramfs-copymods and reinstalled the modules and the server came back to normal.

I then tried to install copymods again and reboot but I can no longer reproduce the issue.

Jason (jasmas) wrote :

This is a problem for Raspberry Pi and has been reported as bugs with docker (missing ip_tables module) and other packages around the internet. It seems this has been a problem for a few releases. Raspberry Pi images have this package installed by default and it is part of the -server meta package. I believe this is because the initrd and kernel are, at least initially, external in /boot/firmware. I am not sure about the logic of copymods, but it does not seem to account for the system being updated after boot.
If additional non-default module packages like -extras get installed, they are installed in the tmpfs because copymods has overlaid a tmpfs after boot. If the module packages include hooks to copy into the initrd, those modules will work, but not all module packages include these hooks, nor should they if the additional modules are not required before root is mounted.
Would it not be more appropriate for the copymods union to overlay the actual disk? I guess I understand why this too would be difficult, but overlaying a tmpfs with apt et al. unaware seems like it is going to keep causing problems like this unless copymods includes some task to write the modules in the tmpfs back to disk/flash on shutdown.

Jason (jasmas) wrote :

I should have noted that I also triggered this problem in Ubuntu 22.04.1 LTS. I'm pretty sure copymods is getting triggered somehow when it should not and then causing the problem.
In my case it could possibly also be related to the use of overlayroot=tmpfs, but I have not been able to reproduce.

Lucas Kanashiro (lucaskanashiro) wrote :

Jason, could you clarify how exactly you reproduced the issue? Are you running Ubuntu on a Raspberry Pi? If yes, which Ubuntu release? Are you using docker to reproduce the issue? Please try to provide detailed steps to reach the buggy state you are describing so we can better investigate your situation.

Jason (jasmas) wrote :

I just confirmed I reproduced this issue simply by upgrading the Raspberry Pi. I had a new kernel update today and when I applied it and rebooted, copymods was activated. I believe it has to do with the stages of the Raspberry Pi and some error in its detection.
Once it is triggered by a kernel upgrade on Raspberry Pi, I cannot find a way to stop it being triggered other than uninstalling copymods and manually reinstalling the kernel and modules. After that it works fine.
This is clearly some problem with the Raspberry Pi distribution of Ubuntu.

Jason (jasmas) wrote :

I am currently in a failed state. What could I gather that would be of some help?

Jason (jasmas) wrote :

$ uname -a
Linux ubuntu 5.15.0-1015-raspi #17-Ubuntu SMP PREEMPT Mon Sep 12 13:14:51 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux

$ ls -la /lib/modules
total 4
drwxrwxrwt 3 root root 60 Sep 27 17:41 .
drwxr-xr-x 86 root root 4096 Sep 27 17:43 ..
drwxr-xr-x 3 root root 300 Sep 27 17:46 5.15.0-1015-raspi

$ mount | grep copy
copymods on /usr/lib/modules type tmpfs (rw,relatime,inode64)

?????
that's the logic I see in the script, so I'm not exactly sure what to do from here or why it is being triggered

Jason (jasmas) wrote :

Why is there a linux-firmware-raspi, linux-headers-raspi, linux-image-raspi, linux-modules-extra-raspi, but no linux-modules-raspi?
It makes it hard to reinstall the modules without pointing to a specific versioned package.
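
One possible workaround is to derive the versioned package name from the running kernel. This is a small sketch that assumes the linux-modules-<version> naming seen in the reinstall commands earlier in this report; the helper name modules_package is hypothetical.

```shell
# Hypothetical helper: print the versioned modules package name for a
# kernel release. Defaults to the running kernel reported by `uname -r`.
modules_package() {
    printf 'linux-modules-%s\n' "${1:-$(uname -r)}"
}

# Example (not executed here):
#   sudo apt install --reinstall "$(modules_package)"
```

Note that a reinstall done while the copymods tmpfs is mounted still lands in the tmpfs, as discussed above.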

Juerg Haefliger (juergh) wrote :

@Jason can you post the logs from the time when you upgraded the kernel and rebooted?

Christian Ehrhardt  (paelzer) wrote :

Since we couldn't yet reproduce or understand why it is active in some of those environments, but some RPi 4 cases were mentioned recently, I asked Dave to have a look with his array of RPi setups to see if he can make sense of it in a repeatable fashion.

Changed in cloud-initramfs-tools (Ubuntu):
assignee: nobody → Dave Jones (waveform)
Wai Keat, Chong (iwantsimplelife) wrote :

I have an Ubuntu 22.04 system that is exhibiting this issue. The original install is within a VM, and I have cloned the VM using Clonezilla onto a real PC.

In the VM there is no problem, but after cloning, it shows this error.

As the VM is not running UEFI, what I did was first install Ubuntu 22.04 onto the real PC, then clone the "/" partition over it, leaving the boot and UEFI partitions untouched on the real PC.

If I reinstall the linux modules package, the files show up, but they disappear again after a reboot.

The copymods tmpfs is also mounted over my /lib/modules.

Does anyone have any new insight on this issue?

tags: removed: server-todo
Masahiro Yamada (myamada) wrote :

Hi.

I was also hit by this issue.

I think I did something like the following:

[1] I booted the Ubuntu system from an external kernel and initrd.
[2] copymods mounted tmpfs because /root/lib/modules/$(uname -r) was missing.
[3] I installed a new linux-modules package by running "sudo dpkg -i <package>" or "sudo apt upgrade".
[4] The modules from that package were installed onto the tmpfs, hence lost after the reboot.
[5] I booted Ubuntu with the new kernel, but the modules were missing, and copymods was active again.

Maybe it is my "mis-administrating the system".

Once you get into a situation that activates copymods, the problem is sticky - you cannot fix it by rebooting the system or by reinstalling a kernel because /lib/modules/ is hidden by tmpfs.

Perhaps, people might want the kernel parameter "copymods=disable" instead of "copymods=force".

I do not think there are many situations that call for enabling it forcibly, but there is good reason, as many people have reported, to disable it forcibly.

Anyway, I did "sudo apt remove cloud-initramfs-copymods" to avoid stumbling on this again.

Stefan (stsichler) wrote :

Hi,

I'm also affected by this bug on my RasPi in exactly the way Masahiro described.
In my case, it was initially caused by an incomplete/aborted kernel update (because of a disk full error), which temporarily caused the kernel version and modules version to become inconsistent, so copymods correctly switched itself on at reboot. Now I'm stuck, because all re-installs of the modules and all further module updates end up in the copymods tmpfs.
The only way to escape that situation seems to be to manually unmount the copymods tmpfs from /lib/modules and then 'apt install --reinstall' the current linux-modules(-extra) packages.

I think in order to mitigate this bug in the first place, the /usr/share/initramfs-tools/scripts/init-bottom/copymods script may be adapted to mount the initramfs modules in a tmpfs on /lib/modules/$(uname -r) only instead of whole /lib/modules.
This way, the situation would at least be automatically fixed by the next kernel update, because the updated linux-modules(-extra) packages would no longer decompress into the copymods tmpfs.
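
To make the difference concrete, here is a dry-run sketch that only prints the mount command each approach would issue and mounts nothing. The function name copymods_mount_cmd, the strategy labels, and the exact form of the mount command are assumptions for illustration; the real script may build its mount differently.

```shell
# Hypothetical dry-run: print (do not run) the tmpfs mount for each strategy.
#   whole        tmpfs over all of /lib/modules (behaviour reported in this bug)
#   per-version  tmpfs over /lib/modules/<version> only (the proposal above)
copymods_mount_cmd() {
    strategy="$1"; rootmnt="$2"; myver="$3"
    case "$strategy" in
        whole)       target="$rootmnt/lib/modules" ;;
        per-version) target="$rootmnt/lib/modules/$myver" ;;
        *) echo "unknown strategy: $strategy" >&2; return 2 ;;
    esac
    # Assumed command form; "copymods" is the mount source name seen in
    # the mount output earlier in this report.
    echo "mount -t tmpfs copymods $target"
}
```

With the per-version target, a later kernel package would unpack into a sibling directory on the real disk instead of into the tmpfs, which is the self-healing effect described above.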

Robie Basak (racb) wrote :

> I think in order to mitigate this bug in the first place, the /usr/share/initramfs-tools/scripts/init-bottom/copymods script may be adapted to mount the initramfs modules in a tmpfs on /lib/modules/$(uname -r) only instead of whole /lib/modules

That sounds like a reasonable enhancement proposal. I suggest someone proposes a patch for that, and then it can be reviewed.

For this bug though, I think it's still unclear what steps can reproduce this issue where the steps are something that the platform can be reasonably expected to support, so I'm leaving the bug status as Incomplete.
