Kernel modules missing from ephemeral environment

Bug #1795530 reported by Joshua Powers
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MAAS
Triaged
Critical
Unassigned

Bug Description

Summary:
When working with a zfsroot deployed node I am unable to boot into recovery mode and recover/fix the zfsroot due to not zfs kernel modules.

Actual Result:
$ zpool list
The ZFS modules are not loaded.
Try running '/sbin/modprobe zfs' as root to load them.
$ lsmod | grep zfs | wc -l
0

Steps to Reproduce:
1. Using MAAS 2.4.2 (7034-g2f5deb8b8-0ubuntu1)
2. Deploy system using 18.04 LTS on zfsroot
3. Launch recovery mode (4.15.0-34-generic)
4. Try to use zfs utilities and find no zfs module

Revision history for this message
Andres Rodriguez (andreserl) wrote :

Hi Josh,

What release is it being used for rescue mode? The reason I ask is because the ephemeral environment is the same when deploying a machine vs when running rescue mode with the exception of the kernel, which in this case, you may be using the commissioning release.

That said, I wonder then, if there's no zfs on the initrd, how is curtin able to partition zfs ? or i guess it doesn't really need it in the kernel module?

Changed in maas:
status: New → Incomplete
Changed in maas-images:
importance: Undecided → Medium
status: New → Triaged
Revision history for this message
Joshua Powers (powersj) wrote :
Download full text (3.3 KiB)

> What release is it being used for rescue mode?

Looks like bionic

I went to reproduce this last night and I deployed two systems again with Bionic, selecting the GA kernel. Once I logged into the deployed systems I got the following kernels:

ubuntu@nexus:~$ uname -a
Linux nexus 4.15.0-36-lowlatency #39-Ubuntu SMP PREEMPT Tue Sep 25 00:16:08 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@falcon:~$ uname -a
Linux falcon 4.15.0-36-lowlatency #39-Ubuntu SMP PREEMPT Tue Sep 25 00:16:08 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

I did not select lowlatency, so I am confused why that was used.

When I booted into rescue mode I saw the following kernel:
ubuntu@nexus:~$ uname -a
Linux nexus 4.15.0-34-generic #37-Ubuntu SMP Mon Aug 27 15:21:48 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@falcon:~$ uname -a
Linux falcon 4.15.0-34-generic #37-Ubuntu SMP Mon Aug 27 15:21:48 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Logs and package versions below:

maas.log
http://paste.ubuntu.com/p/hNjQtVcqCr/

rackd.log
http://paste.ubuntu.com/p/fthSWGhk9s/

regiond.log
http://paste.ubuntu.com/p/yvyqkh5fpp/

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-===============================-==============================-============-=================================================
ii maas 2.4.2-7034-g2f5deb8b8-0ubuntu1 all "Metal as a Service" is a physical cloud and IPAM
ii maas-cli 2.4.2-7034-g2f5deb8b8-0ubuntu1 all MAAS client and command-line interface
un maas-cluster-controller <none> <none> (no description available)
ii maas-common 2.4.2-7034-g2f5deb8b8-0ubuntu1 all MAAS server common files
ii maas-dhcp 2.4.2-7034-g2f5deb8b8-0ubuntu1 all MAAS DHCP server
un maas-dns <none> <none> (no description available)
ii maas-proxy 2.4.2-7034-g2f5deb8b8-0ubuntu1 all MAAS Caching Proxy
ii maas-rack-controller 2.4.2-7034-g2f5deb8b8-0ubuntu1 all Rack Controller for MAAS
ii maas-region-api 2.4.2-7034-g2f5deb8b8-0ubuntu1 all Region controller API service for MAAS
ii maas-region-controller 2.4.2-7034-g2f5deb8b8-0ubuntu1 all Region Controller for MAAS
un maas-region-controller-min <none> <none> (no description available)
un python-django-maas <none> <none> (no description available)
un python-maas-client <none> <none> (no description available)
un python-maas-provisioningserver <none> <none> (no description available)
ii python3-django-maas 2.4.2-7034-g2f5deb8b8-0ubuntu1 all MAAS server Django web framework (Python 3)
ii python3-maas-client 2.4.2-7034-g...

Read more...

Revision history for this message
Lee Trager (ltrager) wrote :

/lib/modules is missing in all ephemeral environments. This means ALL kernel modules are missing. If they haven't been loaded in the initrd there is no way to load anything. During deployment Curtin is installing the running kernel locally before starting.

In Brussels we discovered that /lib/modules is missing from the SquashFS image from http://cloud-images.ubuntu.com/ which is causing the iSCSI service to take 90s to start. I can't find that bug but I believe is why all kernel modules are missing.

summary: - rescue mode has no zfs
+ Kernel modules missing from ephemeral environment
Changed in maas:
status: Incomplete → Triaged
importance: Undecided → Critical
no longer affects: maas-images
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.