Ubuntu 22.04 Server Autoinstall Traceback crash trying to install packages

Bug #2022856 reported by Hari Sekhon
This bug affects 1 person
Affects: subiquity (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Autoinstall crashes with the error below on arm64 when the user-data is supplied on a local cidata ISO. The exact same user-data and package selection works for Ubuntu 22.04 installations on x86_64 with the user-data served over HTTP:

Title: install failed crashed with CalledProcessError
Traceback:
 Traceback (most recent call last):
   File "/snap/subiquity/4381/lib/python3.8/site-packages/subiquity/server/controllers/install.py", line 290, in install
     await self.postinstall(context=context)
   File "/snap/subiquity/4381/lib/python3.8/site-packages/subiquitycore/context.py", line 148, in decorated_async
     return await meth(self, **kw)
   File "/snap/subiquity/4381/lib/python3.8/site-packages/subiquity/server/controllers/install.py", line 313, in postinstall
     await self.install_package(context=context, package=package)
   File "/snap/subiquity/4381/lib/python3.8/site-packages/subiquitycore/context.py", line 148, in decorated_async
     return await meth(self, **kw)
   File "/snap/subiquity/4381/lib/python3.8/site-packages/subiquity/server/controllers/install.py", line 340, in install_package
     await run_curtin_command(
   File "/snap/subiquity/4381/lib/python3.8/site-packages/subiquity/server/curtin.py", line 203, in run_curtin_command
     return await cmd.wait()
   File "/snap/subiquity/4381/lib/python3.8/site-packages/subiquity/server/curtin.py", line 116, in wait
     result = await self.runner.wait(self.proc)
   File "/snap/subiquity/4381/lib/python3.8/site-packages/subiquity/server/runner.py", line 84, in wait
     raise subprocess.CalledProcessError(proc.returncode, proc.args)
 subprocess.CalledProcessError: Command '['systemd-run', '--wait', '--same-dir', '--property', 'SyslogIdentifier=subiquity_log.2256', '--setenv', 'PATH=/snap/subiquity/4381/bin:/snap/subiquity/4381/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/subiquity/4381/bin:/snap/subiquity/4381/sbin', '--setenv', 'PYTHONPATH=:/snap/subiquity/4381/lib/python3.8/site-packages', '--setenv', 'PYTHON=/snap/subiquity/4381/usr/bin/python3.8', '--setenv', 'SNAP=/snap/subiquity/4381', '--', '/snap/subiquity/4381/usr/bin/python3.8', '-m', 'curtin', '--showtrace', '-vvv', '--set', 'json:reporting={"subiquity": {"type": "journald", "identifier": "curtin_event.2256.11"}}', 'system-install', '-t', '/target', '--', 'net-tools']' returned non-zero exit status 100.
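
The exception at the bottom of this traceback carries only the exit status and the argument list of the failed child process, not its output, which is why the journal is needed to see what actually went wrong. A minimal Python sketch of that pattern follows; `run_checked` and `demo` are hypothetical names for illustration, not subiquity's code, and the child process is a stand-in that simply exits with status 100. (apt-get documents 100 as its error exit status, so the failure plausibly originates from apt inside the target, but that is an assumption the journal would need to confirm.)

```python
import asyncio
import subprocess
import sys


async def run_checked(cmd):
    """Run cmd and raise CalledProcessError on a non-zero exit status.

    Hypothetical helper for illustration; not subiquity's runner.
    """
    proc = await asyncio.create_subprocess_exec(*cmd)
    await proc.wait()
    if proc.returncode != 0:
        # Only the status and argv are attached; the child's output is not.
        raise subprocess.CalledProcessError(proc.returncode, cmd)
    return proc.returncode


def demo():
    """Run a stand-in child that exits with status 100 and report the status."""
    failing = [sys.executable, "-c", "import sys; sys.exit(100)"]
    try:
        asyncio.run(run_checked(failing))
    except subprocess.CalledProcessError as exc:
        return exc.returncode
    return 0
```

Calling `demo()` shows that the caller learns the exit status but nothing about why the command failed.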

You can reproduce this easily on any Apple Silicon (M1/M2) Mac from my public GitHub repo, which contains the autoinstall user-data and HashiCorp Packer config for 100% reproducibility:

git clone https://github.com/HariSekhon/Packer-templates pack
cd pack
make install tart # Tart is only for new ARM processor Macs M1/M2
make ubuntu-tart # downloads the arm64 ISO, creates the cidata ISO image with the autoinstall user-data, and then runs Packer to do the build

On an x86_64 system with VirtualBox installed, you can see the exact same autoinstall user-data, served over HTTP, successfully install the same version of Ubuntu 22.04:

git clone https://github.com/HariSekhon/Packer-templates pack
cd pack
make install vbox # run only on Mac to install packer and virtualbox, otherwise install them yourself
make ubuntu-vbox

Tags: crash

Hari Sekhon (harisekhon) wrote :

To rule out the local boot medium as the cause, I made an alternate configuration that serves the autoinstall user-data over the network, as in the working x86_64 installs on VirtualBox. The exact same crash occurs with the user-data provided over the network too.

You can reproduce this second case by running the alternate make target:

make ubuntu-tart-http

This target also starts a simple `python3 -m http.server` from the installers/ directory. In the first case I had already confirmed from tty2 on the installer that `curl http://192.168.64.1:8000/user-data` works (192.168.64.1 being the address of the Tart VM host), which verified that the web server was serving the user-data.
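
As a rough complement to the manual `curl` check, the sketch below fetches the served file and confirms it at least looks like autoinstall user-data: a `#cloud-config` first line and a top-level `autoinstall:` key. The function names are hypothetical, and the check is deliberately shallow (it does not parse or validate the YAML).

```python
import urllib.request


def looks_like_autoinstall(text):
    """Return True if text starts with '#cloud-config' and has a top-level 'autoinstall:' key."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "#cloud-config":
        return False
    # A top-level key starts in column 0; indented keys are ignored.
    return any(line.startswith("autoinstall:") for line in lines)


def fetch_user_data(url, timeout=5):
    """Download the user-data document from the given URL as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the web server running, `looks_like_autoinstall(fetch_user_data("http://192.168.64.1:8000/user-data"))` should return `True` from any host that can reach that address.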

Hari Sekhon (harisekhon) wrote :

These appear to be bugs with the ARM version of the Ubuntu 22.04 subiquity installer, but I've also raised it with the Tart virtualization project in case anyone there has seen this behaviour before:

https://github.com/cirruslabs/packer-plugin-tart/issues/72

Hari Sekhon (harisekhon) wrote :

I've also tried the newer Ubuntu 23.04, which fails too. You can reproduce it using:

make ubuntu-23-tart

results in:

Can't find a SQUASHFS superblock on loop6

Failed to mount snap-lxd-23646.mount

Michael Hudson-Doyle (mwhudson) wrote :

Unfortunately it's not really possible to see what the problem is from the information you have posted. Can you extract the crash file that will have been produced (or the journal) somehow?

Hari Sekhon (harisekhon) wrote :

Do you mean the Crash.log that I attached above with the original post?

Michael Hudson-Doyle (mwhudson) wrote :

Oops not sure how I missed that, sorry. Looks like we might need the journal to understand what's going on though (which is another bug really)

Hari Sekhon (harisekhon) wrote :

I'm now also getting a crash when trying this with qemu (qemu-system-x86_64) on my Intel Mac. I'm uploading the crash log from that run too, in case it helps.

The config to reproduce it is in the same GitHub repo:

https://github.com/HariSekhon/Packer-templates

packer build ubuntu-x86_64.qemu.pkr.hcl

Hari Sekhon (harisekhon) wrote :

Crash log on Intel Mac qemu

Hari Sekhon (harisekhon) wrote :

Crash meta from Intel Mac qemu

Hari Sekhon (harisekhon) wrote :

I've repeated the last build to collect the journal for you and have attached some different variations of the journalctl log in the hope that this helps.

Hari Sekhon (harisekhon) wrote :

Screenshot from the failing build in qemu on my Intel Mac x86_64.

It looks like the error is that it fails to update the installer, even though this works on the same machine in VirtualBox.

Reproducible via:

packer build ubuntu-x86_64.qemu.pkr.hcl

The exact same build using the exact same autoinstall user-data and the same 22.04 amd64 ISO gets past that point and fails later, so it appears I am hitting different installer bugs on different virtualization platforms. I might expect that if they present different virtualized hardware and therefore use different device drivers. This has diverged a bit now, though, as the original issue was a traceback while trying to install a basic package.

Hari Sekhon (harisekhon) wrote :

On the Qemu x86_64 build, changing the autoinstall user-data to set the refresh-installer update option to 'no' gets past the error in the above screenshot, but then hits a different error just afterwards, shown in this screenshot.

Hari Sekhon (harisekhon) wrote :

From the run on Qemu x86_64 with update installer set to no

Hari Sekhon (harisekhon) wrote :

Another day, another Python traceback failure in Subiquity running the installer in VirtualBox, this time an input/output error

Michael Hudson-Doyle (mwhudson) wrote :

The "NoneType has no attribute 'id'" error is probably a (poorly reported) problem in your config.

Hari Sekhon (harisekhon) wrote :

Another general Exception Traceback on Qemu x86_64, appears related to refreshing autoinstaller.

Hari Sekhon (harisekhon) wrote :

@Michael - the entire autoinstaller config is here:

https://github.com/HariSekhon/Packer-templates/blob/main/installers/autoinstall-user-data

Any ideas how I can narrow it down, given that the Python equivalent of a null pointer exception is masking it?

Hari Sekhon (harisekhon) wrote :

So I commented out most of the autoinstall user-data (the whole packages section, the whole snaps section, the ssh keys, and the whole late command section) and ran it again on x86_64 qemu. It still crashed out, this time in relation to the filesystem:

 2023-06-13 00:57:38,888 ERROR root:37 finish: subiquity/Filesystem/apply_autoinstall_config/convert_autoinstall_config: FAIL: 'NoneType' object has no attribute 'id'
 2023-06-13 00:57:38,891 ERROR root:37 finish: subiquity/Filesystem/apply_autoinstall_config: FAIL: 'NoneType' object has no attribute 'id'
 2023-06-13 00:57:38,897 ERROR root:37 finish: subiquity/apply_autoinstall_config: FAIL: 'NoneType' object has no attribute 'id'
 2023-06-13 00:57:38,901 ERROR subiquity.server.server:424 top level error
 Traceback (most recent call last):
   File "/snap/subiquity/4380/lib/python3.8/site-packages/subiquity/server/server.py", line 666, in start
     await self.apply_autoinstall_config()
   File "/snap/subiquity/4380/lib/python3.8/site-packages/subiquitycore/context.py", line 148, in decorated_async
     return await meth(self, **kw)
   File "/snap/subiquity/4380/lib/python3.8/site-packages/subiquity/server/server.py", line 475, in apply_autoinstall_config
     await controller.apply_autoinstall_config()
   File "/snap/subiquity/4380/lib/python3.8/site-packages/subiquitycore/context.py", line 148, in decorated_async
     return await meth(self, **kw)
   File "/snap/subiquity/4380/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 141, in apply_autoinstall_config
     self.convert_autoinstall_config(context=context)
   File "/snap/subiquity/4380/lib/python3.8/site-packages/subiquitycore/context.py", line 142, in decorated_sync
     return meth(self, **kw)
   File "/snap/subiquity/4380/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 638, in convert_autoinstall_config
     self.run_autoinstall_guided(self.ai_data['layout'])
   File "/snap/subiquity/4380/lib/python3.8/site-packages/subiquity/server/controllers/filesystem.py", line 612, in run_autoinstall_guided
     target = GuidedStorageTargetReformat(disk_id=disk.id)
 AttributeError: 'NoneType' object has no attribute 'id'
 2023-06-13 00:57:38,926 INFO subiquity.common.errorreport:406 saving crash report 'unknown error crashed with AttributeError' to /var/crash/1686617858.918541193.unknown.crash
 2023-06-13 00:57:38,932 INFO root:37 start: subiquity/ErrorReporter/1686617858.918541193.unknown/add_info:

Full crash log is attached.

Given that the storage section in the autoinstall user-data is only this:

  storage:
    layout:
      name: lvm
      match:
        size: largest

I'm not sure what else I can cut... perhaps in this case it's caused by Qemu's storage, or by no storage being detected?

Michael Hudson-Doyle (mwhudson) wrote :

Oh right, there are no disks other than the install ISO in your VM. Where are you expecting the install to go? If you want to overwrite the install media you can put "toram" on the kernel command line, which will copy the ISO to ram and make the install media available. This does require a fairly large amount of RAM though.

Hari Sekhon (harisekhon) wrote :

The qemu config in the Packer .hcl file (if you want to reproduce it) specifies a disk size, so it should create a disk and install there.

But even if the disk wasn't detected for some reason, surely the installer shouldn't crash with None type traceback, that's the equivalent of an unhandled null pointer exception or array out of bounds type programming error...

I guess the first point is that the code probably needs to be strengthened to check for these assumptions and give more user friendly error messages that will aid in better debugging.
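
To make that suggestion concrete, here is a minimal sketch of the kind of guard meant. It is not subiquity's actual code: `Disk`, `NoMatchingDiskError`, `largest_disk` and `reformat_target_id` are hypothetical names, and the selection logic only mimics a `size: largest` match. The point is that an empty disk list produces a descriptive error instead of `'NoneType' object has no attribute 'id'`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Disk:
    """Hypothetical stand-in for a detected disk; not subiquity's model class."""
    id: str
    size: int  # size in bytes


class NoMatchingDiskError(Exception):
    """Raised when the storage layout's match rule selects no disk."""


def largest_disk(disks: List[Disk]) -> Optional[Disk]:
    """Return the largest disk, or None if no disks were detected."""
    return max(disks, key=lambda d: d.size, default=None)


def reformat_target_id(disks: List[Disk]) -> str:
    """Return the id of the disk to reformat, failing with a clear message."""
    disk = largest_disk(disks)
    if disk is None:
        # Explicit check instead of letting disk.id raise AttributeError.
        raise NoMatchingDiskError(
            "storage layout: no disk matched 'size: largest'; "
            "were any installable disks detected?"
        )
    return disk.id
```

With two disks of different sizes the function returns the id of the larger one; with an empty list it raises `NoMatchingDiskError` carrying a message that points at the storage configuration.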

I think there are several bugs in this thread though as the bug I started with at the top was an exception around installing packages that install fine with the same package names on an already running Ubuntu of the same version...
