In 3.5.0 HA a new MAAS installation has no available architecture for deployments after the images are in sync

Bug #2058377 reported by Jacopo Rota
This bug affects 1 person
Affects   Status          Importance   Assigned to   Milestone
MAAS      Fix Committed   High         Jacopo Rota
3.5       Fix Committed   High         Jacopo Rota

Bug Description

Install MAAS HA on more than 2 nodes and wait for the images to be in sync. Both the CLI and the UI then fail because no architectures are available.

CLI:

$ maas root machines create hostname=node1 power_type=virsh architecture=amd64/generic mac_addresses=XXXXX power_parameters_power_address=qemu+ssh://ubuntu@10.244.40.1/system zone=zone2 power_parameters_power_id=node1 testing_scripts=none

-> {"architecture": ["'amd64/generic' is not a valid architecture. It should be one of: ''."]}

UI: see screenshot

The same works fine when MAAS is on a single node (no HA)

Jacopo Rota (r00ta) wrote :
no longer affects: maas/3.6
Changed in maas:
milestone: 3.5.0 → 3.6.0
milestone: 3.6.0 → 3.6.x
Changed in maas:
milestone: 3.6.x → 3.6.0
Jacopo Rota (r00ta) wrote :

The issue comes from

>>> BootResource.objects.all()
<QuerySet [
 <BootResource: <BootResource name=grub-efi-signed/uefi, arch=amd64/generic, kflavor=None, base= rtype=0>>,
 <BootResource: <BootResource name=ubuntu/focal, arch=amd64/hwe-20.04-lowlatency-edge, kflavor=lowlatency, base= rtype=0>>,
 <BootResource: <BootResource name=grub-efi/uefi, arch=arm64/generic, kflavor=None, base= rtype=0>>,
 <BootResource: <BootResource name=grub-ieee1275/open-firmware, arch=ppc64el/generic, kflavor=None, base= rtype=0>>,
 <BootResource: <BootResource name=pxelinux/pxe, arch=i386/generic, kflavor=None, base= rtype=0>>,
 <BootResource: <BootResource name=ubuntu/focal, arch=amd64/ga-20.04, kflavor=generic, base= rtype=0>>,
 <BootResource: <BootResource name=ubuntu/focal, arch=amd64/ga-20.04-lowlatency, kflavor=lowlatency, base= rtype=0>>,
 <BootResource: <BootResource name=ubuntu/focal, arch=amd64/hwe-20.04, kflavor=generic, base= rtype=0>>,
 <BootResource: <BootResource name=ubuntu/focal, arch=amd64/hwe-20.04-edge, kflavor=generic, base= rtype=0>>,
 <BootResource: <BootResource name=ubuntu/focal, arch=amd64/hwe-20.04-lowlatency, kflavor=lowlatency, base= rtype=0>>
]>

>>> BootResource.objects.get_usable_architectures()
[]

Jacopo Rota (r00ta) wrote :

All the architectures are excluded at https://github.com/maas/maas/blob/3f1cde4bee61c42f9b07acfcee1708eb1c1db73d/src/maasserver/models/bootresource.py#L592 because, apparently, this condition is true for every resource set:

`resource_set.sync_size != (n_regions * resource_set.files_size)`
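
To make the comparison concrete, here is a minimal sketch of the check in isolation. The function name `is_fully_synced` and the sample numbers are hypothetical; only the expression itself comes from the linked line.

```python
def is_fully_synced(sync_size: int, files_size: int, n_regions: int) -> bool:
    """Return True when the synced bytes match one full copy per region.

    Mirrors the comparison quoted above; a resource set failing this check
    is treated as incomplete and its architecture is not reported as usable.
    """
    return sync_size == n_regions * files_size


# Hypothetical numbers: a 100-byte set replicated to 3 regions.
print(is_fully_synced(sync_size=300, files_size=100, n_regions=3))  # expected case
# If files_size were somehow reported as 300 instead of 100, the check fails.
print(is_fully_synced(sync_size=300, files_size=300, n_regions=3))
```

With a single region the two sides can only differ if the sync is really incomplete, which is consistent with the non-HA install working.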

Jacopo Rota (r00ta)
Changed in maas:
assignee: nobody → Jacopo Rota (r00ta)
Jacopo Rota (r00ta) wrote :

The issue is here

            resource_sets = self.sets.order_by("-id").annotate(
                files_count=Count("files__id", distinct=True),
                files_size=Sum("files__size"),
                sync_size=Sum("files__bootresourcefilesync__size"),
            )

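as the query LEFT OUTER JOINs bootresourcefilesync, so each file row is repeated once per bootresourcefilesync record and `files_size` is summed over the duplicated rows. With more than one region `files_size` is therefore already inflated by the number of sync records, and the comparison against `n_regions * files_size` fails even when every region is fully in sync. On a single node there is one sync record per file, nothing is duplicated, and the check passes.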

>>> print(b.sets.order_by("-id").annotate(files_count=Count("files__id", distinct=True),files_size=Sum("files__size"),sync_size=Sum("files__bootresourcefilesync__size"),).query)
SELECT "maasserver_bootresourceset"."id",
       "maasserver_bootresourceset"."created",
       "maasserver_bootresourceset"."updated",
       "maasserver_bootresourceset"."resource_id",
       "maasserver_bootresourceset"."version",
       "maasserver_bootresourceset"."label",
       COUNT(DISTINCT "maasserver_bootresourcefile"."id") AS "files_count",
       SUM("maasserver_bootresourcefile"."size") AS "files_size",
       SUM("maasserver_bootresourcefilesync"."size") AS "sync_size"
FROM "maasserver_bootresourceset"
LEFT OUTER JOIN "maasserver_bootresourcefile"
  ON ("maasserver_bootresourceset"."id" = "maasserver_bootresourcefile"."resource_set_id")
LEFT OUTER JOIN "maasserver_bootresourcefilesync"
  ON ("maasserver_bootresourcefile"."id" = "maasserver_bootresourcefilesync"."file_id")
WHERE "maasserver_bootresourceset"."resource_id" = 6
GROUP BY "maasserver_bootresourceset"."id"
ORDER BY "maasserver_bootresourceset"."id" DESC
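
The row duplication can be reproduced outside MAAS with a small sqlite3 sketch. Everything here is an assumption made for illustration: the three tables are cut down to the columns the query touches, their names are shortened, and each sync record is assumed to carry the full file size. `JOINED` has the same shape as the generated SQL above. `SEPARATE` aggregates each table on its own and is only one possible way to avoid the fan-out, not necessarily the committed fix.

```python
import sqlite3

# Simplified, hypothetical schema: only the columns used by the aggregation.
SCHEMA = """
CREATE TABLE bootresourceset (id INTEGER PRIMARY KEY);
CREATE TABLE bootresourcefile (
    id INTEGER PRIMARY KEY,
    resource_set_id INTEGER REFERENCES bootresourceset(id),
    size INTEGER
);
CREATE TABLE bootresourcefilesync (
    id INTEGER PRIMARY KEY,
    file_id INTEGER REFERENCES bootresourcefile(id),
    size INTEGER
);
"""

# Same shape as the ORM-generated query: two chained LEFT OUTER JOINs.
JOINED = """
SELECT COUNT(DISTINCT f.id), SUM(f.size), SUM(s.size)
FROM bootresourceset AS rs
LEFT OUTER JOIN bootresourcefile AS f ON rs.id = f.resource_set_id
LEFT OUTER JOIN bootresourcefilesync AS s ON f.id = s.file_id
WHERE rs.id = ?
GROUP BY rs.id
"""

# Alternative sketch: aggregate each table in its own subquery.
SEPARATE = """
SELECT
    (SELECT COUNT(*) FROM bootresourcefile WHERE resource_set_id = rs.id),
    (SELECT SUM(size) FROM bootresourcefile WHERE resource_set_id = rs.id),
    (SELECT SUM(s.size) FROM bootresourcefilesync AS s
       JOIN bootresourcefile AS f ON f.id = s.file_id
      WHERE f.resource_set_id = rs.id)
FROM bootresourceset AS rs
WHERE rs.id = ?
"""


def build_db(file_sizes, n_regions):
    """Create one resource set whose files are fully synced to n_regions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO bootresourceset (id) VALUES (1)")
    for size in file_sizes:
        cur = conn.execute(
            "INSERT INTO bootresourcefile (resource_set_id, size) VALUES (1, ?)",
            (size,),
        )
        for _ in range(n_regions):
            conn.execute(
                "INSERT INTO bootresourcefilesync (file_id, size) VALUES (?, ?)",
                (cur.lastrowid, size),
            )
    return conn


def aggregate(conn, query):
    """Return (files_count, files_size, sync_size) for resource set 1."""
    return conn.execute(query, (1,)).fetchone()


if __name__ == "__main__":
    for n_regions in (1, 3):
        conn = build_db([100, 50], n_regions)
        print(n_regions, aggregate(conn, JOINED), aggregate(conn, SEPARATE))
```

With three regions the joined query reports a `files_size` equal to `sync_size`, so multiplying it by `n_regions` again can never match, while the separately aggregated `files_size` stays at the real total.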

Changed in maas:
status: Triaged → Fix Committed