Handle structure parts with size and/or offset < 1MiB

Bug #1630709 reported by Barry Warsaw
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Ubuntu Image
Fix Released
High
Łukasz Zemczak

Bug Description

When calculating the partition spec for sgdisk, we have this bit of code:

            partdef = '{}M:+{}M'.format(
                part.offset // MiB(1), part.size // MiB(1))

This will fail if part.size is < 1MiB because floor division will return a size of zero and thus a spec of e.g. 4M:+0M.

We should either fail early, e.g. in the parser, or min the size value to 1MiB.

Tags: tech-debt
Revision history for this message
Steve Langasek (vorlon) wrote :

But see also bug #1621151.

Barry Warsaw (barry)
Changed in ubuntu-image:
milestone: none → 0.13
Barry Warsaw (barry)
Changed in ubuntu-image:
milestone: 0.13 → 0.14
Changed in ubuntu-image:
assignee: nobody → Łukasz Zemczak (sil2100)
Changed in ubuntu-image:
status: New → In Progress
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Looking at this realistically how I would propose dealing with this is making sure 1KiB is the minimum everywhere in the code (since we have some assumptions during copy_blob() calls too) and just add a check for any size being smaller than that.

Revision history for this message
Barry Warsaw (barry) wrote : Re: [Bug 1630709] Re: Handle structure parts with size < 1MiB

On Jan 04, 2017, at 11:41 AM, Łukasz Zemczak wrote:

>Looking at this realistically how I would propose dealing with this is
>making sure 1KiB is the minimum everywhere in the code (since we have
>some assumptions during copy_blob() calls too) and just add a check for
>any size being smaller than that.

I'm sure this is insane (and maybe slangasek can disabuse me of this folly)
but I've been thinking that maybe we want to implement sgdisk/sfdisk in
Python. It wouldn't need to be completely compatible, but given a spec or
defacto standard layout, we could probably just write the bytes however we
want them. We'd need to be able to read and translate the bytes back out (I
like the --json format) for diagnostics.

For the use cases we care about, we can probably get functionally identical to
sgdisk/sfdisk by comparing their effects to our tool. I'd write our tool as a
standalone library so it can be imported instead of executed via a subprocess
(so that would be another win).

This would also give us consistency between MBR/GPT.

These pages have lots of technical detail, although we may want to UTSL or go
back to original specs where available. MBR in particular seems to mean
several different things.

https://en.wikipedia.org/wiki/Master_boot_record
https://en.wikipedia.org/wiki/GUID_Partition_Table

Revision history for this message
Barry Warsaw (barry) wrote :

Note too that if my suggestion is deemed insane, then switching to 1K seems
reasonable. I think we should dup #1653949 to this bug (since this is the
earlier one) and just handle both requests at the same time.

summary: - Handle structure parts with size < 1MiB
+ Handle structure parts with size and offset < 1MiB
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Re: Handle structure parts with size and offset < 1MiB

Yeah, I suppose this could be doable! The plus side of it is that having our own implementation is that we stop relying on additional external tools and can get more testing done. The thing that I don't like though is the idea that we'll be reimplementing the wheel. The latter is not super bad I suppose, we just need to think if the effort is worth the price.

Maybe we could use some readily available python libraries that can handle MBR/GPT structures to save some work? I'm pretty sure someone already did something like that. That being said, I would certainly like to do some tech spec digging to implement it from zero - just don't want to duplicate work unnecessarily.

Anyway, Steve, what do you think?

summary: - Handle structure parts with size and offset < 1MiB
+ Handle structure parts with size and/or offset < 1MiB
Revision history for this message
Barry Warsaw (barry) wrote :
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Alright, so for now I assume this is the way we want to go. I'll start reading up and tinkering on getting the functionality incorporated into our code-base.

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Looking through the available solutions, I tried out pyparted (python3-parted) and it seems to be sufficient for our usage regarding the GPT. I'll do some tests to see if the generated partition table is the same and compatible as when generated through sgdisk (although I can't think of a reason why not), but creating and modifying the GPT through pyparted is really straightforward. And we won't have to re-invent the wheel here so that's another win.

As for the MBR, well, here sadly I could not really find any existing libraries/tools in Python for modifying the partition tables. There's a lot of different solutions for reading those, yes, so we can use those as an entry point. Still, this means we'll have to write this part of the tooling ourselves from scratch. For our uses this shouldn't be hard since from what I see the format of the resulting MBRs that we generate through sfdisk is really simple.

Barry Warsaw (barry)
Changed in ubuntu-image:
milestone: 0.14 → 0.15
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

...actually scratch out the last part I wrote. pyparted bindings seem to support MBR type partition tables as well. Not sure how complete the support is - need to look into it still.

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

One serious complication that popped up when I was reimplementing the partition logic. We won't be able to get rid of sgdisk completely yet sadly as pyparted (as well as its native API in parted) do not support operation on GPT UUIDs. It's always generated automatically and there is no way I saw to actually change it. So in the first solution what I'll do is create the partitions in Python and then call sgdisk to set the GUID of the partition to the expected value (geh).

Future solutions? We could either add UUID-modification support to parted and pyparted or simply try and implement this ourselves in Python (for our usage). I suppose the latter is more realistic.

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

...continuing my rant here: parted (and pyparted) does GUID handling 'the other way around', e.g. since it tries to be label agnostic, it only sets the GUID internally based on the partition flags given. For instance, if a partition has the PARTITION_BOOT flag set, the GUID is set to 'EFI System'. If the PARTITION_BIOS_GRUB flag is set, the GUID is 'BIOS boot partition'. Checking the parted code it doesn't seem straightforward to export the GUID direct API outside.

But this gives me a few other ideas, depending on how people are using the GUIDs. One of them would be to do a wrapper and do a GUID -> flags binding for the most popular ones. But I guess there can also be custom GUIDs, right? I guess in that case we would need to use a tool like sgdisk anyway...

This being said, I'll continue looking for an easy way to just set the GUID as we want.

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Just a status report on what I'm working on right now. In both solutions (manually overwriting GUIDs and using sfdisk) there's an additional problem that appeared. It seems libparted tries to be a bit too smart - what popped up during testing is that whenever, when using the GPT label, the partition tables get committed to disk libparted tries to be overly cleaver and clobbers the MBR hybrid partition. This actually breaks some of our use-cases with the normal partition handling. I'm working that around by postponing the mbr partition copies to lated (in the case of a two-step process it's nothing unnatural), but looking now on how to do it more nicely.

In overall it seems libparted isn't the perfect solution we were seeking. It's a very high-level API for partition management while we need to be able to do exactly what we want. Sadly, this is the best we can get without reimplementing everything from scratch. I hoped for some libfdisk python bindings but there are none.

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

The bug is fixed in this PR: https://github.com/CanonicalLtd/ubuntu-image/pull/112

There will be some minor changes following up on that like making sure that the sizes and offsets are a multiple of the sector size (since we can't get better granularity than that). And I want to add a few other tests. Anyway, I'll set a separate bug for that.

Barry Warsaw (barry)
Changed in ubuntu-image:
status: In Progress → Fix Committed
importance: Undecided → High
Barry Warsaw (barry)
Changed in ubuntu-image:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.