partition number required when preserving existing partitions

Bug #1885638 reported by ov2k
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
curtin
Incomplete
Undecided
Unassigned
subiquity
New
Undecided
Unassigned

Bug Description

When preserving existing partitions, the partition number must be explicitly provided. When existing partitions are not preserved, partition numbers do not need to be provided for curtin to create the same disk layout. The curtin documentation indicates partition numbers are not required.

A user-data file is attached that demonstrates the problem. For anybody trying to use it, it is designed to be run on an EFI system. Partitioning is done via an early command.

At the time the error occurs, neither /var/log/curtin nor /var/log/installer/subiquity-curtin-install.conf exists, so I don't think curtin is involved.

The actual error message is:

2020-06-29 23:03:10,964 INFO root:39 start: subiquity/InstallProgress/install/curtin_install: installing system
 2020-06-29 23:03:10,965 DEBUG subiquitycore.controller.installprogress:259 curtin_install
 2020-06-29 23:03:10,965 DEBUG subiquity.models.subiquity:324 merging config from <subiquity.models.subiquity.DebconfSelectionsModel object at 0x7fd472f2c400>
 2020-06-29 23:03:10,966 DEBUG subiquity.models.subiquity:324 merging config from <subiquity.models.filesystem.FilesystemModel object at 0x7fd472f2c828>
 2020-06-29 23:03:10,966 DEBUG subiquity.models.filesystem:1591 mountpoints {'/boot/efi': 'espmount', '/': 'rootmount'}
 2020-06-29 23:03:10,966 ERROR root:39 finish: subiquity/InstallProgress/install/curtin_install: FAIL: '<' not supported between instances of 'NoneType' and 'NoneType'
 2020-06-29 23:03:10,967 ERROR subiquitycore.controller.installprogress:145 curtin_error
 Traceback (most recent call last):
   File "/snap/subiquity/1938/lib/python3.6/site-packages/subiquity/controllers/installprogress.py", line 304, in install
     await self.curtin_install(context=context)
   File "/snap/subiquity/1938/lib/python3.6/site-packages/subiquitycore/context.py", line 142, in decorated_async
     return await meth(self, **kw)
   File "/snap/subiquity/1938/lib/python3.6/site-packages/subiquity/controllers/installprogress.py", line 268, in curtin_install
     curtin_cmd = self._get_curtin_command()
   File "/snap/subiquity/1938/lib/python3.6/site-packages/subiquity/controllers/installprogress.py", line 236, in _get_curtin_command
     self.model.render(syslog_identifier=ident))
   File "/snap/subiquity/1938/lib/python3.6/site-packages/subiquity/models/subiquity.py", line 325, in render
     merge_config(config, model.render())
   File "/snap/subiquity/1938/lib/python3.6/site-packages/subiquity/models/filesystem.py", line 1618, in render
     'config': self._render_actions(),
   File "/snap/subiquity/1938/lib/python3.6/site-packages/subiquity/models/filesystem.py", line 1601, in _render_actions
     if can_emit(obj):
   File "/snap/subiquity/1938/lib/python3.6/site-packages/subiquity/models/filesystem.py", line 1568, in can_emit
     if p._number < obj._number and p.id not in emitted_ids:
 TypeError: '<' not supported between instances of 'NoneType' and 'NoneType'

Revision history for this message
ov2k (ov2k) wrote :

Ignore the originally attached user-data. That was for a different bug. The correct (broken) user-data is attached here.

Revision history for this message
ov2k (ov2k) wrote :

A user-data file is attached that demonstrates the workaround of providing partition numbers.

Revision history for this message
ov2k (ov2k) wrote :

A user-data file is attached that demonstrates that partition numbers are not required when partitions are not preserved.

Revision history for this message
Ryan Harper (raharper) wrote :

IMO Curtin really should always require partition numbers.

In subiquity, the existing system is probed and a storage config yaml is generated describing the detected system at boot; This allows subiquity/curtin to ensure the preserving a particular partition (say 3 on sda) remains the same.

In your scenario where you're "editing the system configuration" via early_stages is breaking the contract between subiquity's probe and the config it sends when preserving existing layouts to curtin.

Are you working around something that's not working that can be done with the early partitioning?

Changed in curtin:
status: New → Incomplete
Revision history for this message
ov2k (ov2k) wrote :

As mentioned earlier, subiquity crashes before even starting curtin. So part of this bug is purely a subiquity parsing issue. The correct behavior depends on whether partition numbers are supposed to be required in curtin or not.

If the partition number is supposed to be required in curtin, then I think the documentation needs to be fixed. There might also need to be some validation checks in curtin, because it works without partition numbers when existing partitions are not preserved. Also, proper enforcement of this rule might break existing working autoinstall configurations.

As for the early script, most of it is just to create and document a known configuration for the purposes of demonstrating the issue. That part can be implemented as a cloud-init bootcmd or a boot hook that runs before subiquity even starts. That should completely separate it from the autoinstall. The rest of it doesn't modify the disk and is a workaround for a separate bug:

https://bugs.launchpad.net/subiquity/+bug/1885635

I'm also under the impression subiquity's probe occurs after the early commands run. Isn't that sort of the point of /autoinstall.yaml? The first storage config reported in the debug logs, the "extracted (unmerged) storage config", is consistent with having been run after the early commands.

Just for fun, I've attached a user-data file that partitions the disk through cloud-init before subiquity runs, as described earlier.

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

I think the rule (for now, at least) is: if you set preserve: true you need to set number as well. I'll try to find a place to document this.

IMO curtin is a touch incoherent about whether "partition number" refers to ordering by offset, by slot in the partition table or by order in the list of actions. I think it should always refer to the slot in the partition table (that's where the number the kernel uses comes from after all) but it's all a bit confusing.

Revision history for this message
ov2k (ov2k) wrote :

Would it be difficult to have subiquity return a more descriptive error message when the partition number is missing and the partition is to be preserved? The error message when size is missing is much easier to interpret.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.