MAAS fails to over-commit resources correctly on virsh KVM hosts

Bug #1982423 reported by James
This bug affects 2 people
Affects         Status    Importance  Assigned to  Milestone
Canonical Juju  Invalid   Undecided   Unassigned
MAAS            Triaged   Medium      Unassigned

Bug Description

I'm using snap version/build 3.2.0-11989-g.84a255c14

I deployed a libvirt KVM (not LXD) host using MAAS on both a Dell R710 and a Dell R630. In both cases, the physical node deployed successfully (Ubuntu 20.04, hwe-20.04 kernel). When I set an over-commit ratio, regardless of its size, the GUI reports it correctly, but the CLI shows no difference. Possibly related: if I create a VM on this host, commissioning fails. Previously we were using MAAS 2.9 (I don't have the exact build, sorry), and libvirt worked perfectly fine there.

For example, the Dell R710 physically has 16 cores and 144 GB of RAM.

Steps to reproduce:
1. Confirm that a server is discovered and in the ready state.
2. Select the Deploy option, OS: Ubuntu, Release: Ubuntu 20.04 LTS "Focal Fossa", Kernel: focal (hwe-20.04), select the 'Register as MAAS KVM host' and select the 'libvirt' option.
3. Start deployment for machine.
4. Update CPU & Memory Overcommit (see screenshot png)
5. Log into CLI and issue `pods read` command.
6. CLI available resources are unchanged:

```
    {
        "name": "R1-710-26",
        "architectures": [
            "amd64/generic"
        ],
        "zone": {
            "name": "default",
            "description": "",
            "id": 1,
            "resource_uri": "/MAAS/api/2.0/zones/default/"
        },
        "capabilities": [
            "composable",
            "dynamic_local_storage",
            "over_commit",
            "storage_pools"
        ],
        "total": {
            "cores": 16,
            "memory": 145035,
            "local_storage": 1967300759552
        },
        "id": 42,
        "storage_pools": [
            {
                "id": "cfa9a856-ff02-4b93-83a4-0389fc8f23ea",
                "name": "maas",
                "type": "dir",
                "path": "/var/lib/libvirt/maas-images",
                "total": 1967300759552,
                "used": 0,
                "available": 1967300759552,
                "default": true
            }
        ],
        "tags": [
            "pod-console-logging",
            "virtual"
        ],
        "available": {
            "cores": 16,
            "memory": 145035,
            "local_storage": 1967300759552
        },
        "memory_over_commit_ratio": 10.0,
        "used": {
            "cores": 0,
            "memory": 0,
            "local_storage": 0
        },
        "host": {
            "system_id": "hxq3k4",
            "__incomplete__": true
        },
        "default_macvlan_mode": "",
        "version": "6.0.0",
        "pool": {
            "name": "virtual_machine_pool",
            "description": "",
            "id": 1,
            "resource_uri": "/MAAS/api/2.0/resourcepool/1/"
        },
        "type": "virsh",
        "cpu_over_commit_ratio": 10.0,
        "resource_uri": "/MAAS/api/2.0/vm-hosts/42/"
    }
```
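Since the pod advertises the `over_commit` capability and both ratios are set to 10.0, the expectation is that `available` = `total` × ratio − `used`. A quick sketch (values copied from the `pods read` output above) of what the CLI would report if the ratios were applied:

```python
# Sketch: expected "available" figures if MAAS applied the overcommit
# ratios. Values are copied from the `pods read` JSON above.
pod = {
    "total": {"cores": 16, "memory": 145035},
    "used": {"cores": 0, "memory": 0},
    "cpu_over_commit_ratio": 10.0,
    "memory_over_commit_ratio": 10.0,
}

expected_cores = (pod["total"]["cores"] * pod["cpu_over_commit_ratio"]
                  - pod["used"]["cores"])
expected_memory = (pod["total"]["memory"] * pod["memory_over_commit_ratio"]
                   - pod["used"]["memory"])

print(expected_cores)   # 160.0, not the 16 the CLI reports
print(expected_memory)  # 1450350.0, not 145035
```

Instead, the `available` block above repeats the raw physical totals, matching the reported symptom that the CLI ignores the configured ratios.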

Revision history for this message
James (james-o-benson) wrote :

This shows what is available via the GUI.

Bill Wear (billwear)
Changed in maas:
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
James (james-o-benson) wrote :

Update: I deployed an LXD host and compared the CLI with the GUI; the information appears to be incorrect there as well. The LXD host physically has what is reported in the "Total" section:

"cores": 40,
"memory": 131072,
"local_storage": 983349022720

Currently there are VMs loaded occupying 54 cores, 144 GB of RAM, and 630 GB of disk.

The CLI reports:
    "available": {
        "cores": -14,
        "memory": -16384,
        "local_storage": 353349022720
    },
But available resources should show as 400 total − 54 used = 346 cores, 1.25 TB RAM total − 0.144 TB used = 1.106 TB RAM, and 353349022720 bytes of disk.
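Applying the same `total × ratio − used` formula to the LXD host's figures (a 10× overcommit ratio is assumed here, inferred from the 40 → 400 cores and 128 GiB → 1.25 TB totals quoted above):

```python
# Hypothetical check for the LXD host; 10x overcommit assumed from the
# quoted totals (40 cores -> 400, 131072 MiB -> ~1.25 TiB).
ratio = 10.0

total_cores, used_cores = 40, 54
print(total_cores * ratio - used_cores)  # 346.0 cores available

total_mem_mib = 131072            # 128 GiB reported as total
used_mem_mib = 144 * 1024         # 144 GiB in use, in MiB
print(total_mem_mib * ratio - used_mem_mib)  # 1163264.0 MiB, ~1.1 TiB
```

The CLI's negative figures (−14 cores, −16384 MiB) are exactly what you get if the raw physical totals are used without multiplying by the ratio first.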

CLI Output of `pod read`:
https://paste.ubuntu.com/p/Hx7TQwPtrc/

Revision history for this message
Jerzy Husakowski (jhusakowski) wrote :

It appears that MAAS does not take overcommit factor into account when reporting resource utilisation via the API.

Changed in maas:
milestone: none → 3.5.0
Revision history for this message
Michael Fischer (michaelandrewfischer) wrote :

```
$ snap list maas
Name  Version                  Rev    Tracking    Publisher   Notes
maas  3.4.0-14321-g.1027c7664  32469  3.4/stable  canonical✓  -
```

When using Juju (`juju deploy charmed-kubernetes`), MAAS does not restrict the creation of VM resources beyond the overcommit settings. I have configured each of 3 virsh KVM hosts with CPU and memory overcommit ratios of 1. When composing via the UI, I am not able to create VMs beyond the overcommit value; however, Juju heavily over-allocates memory on the KVM host.

Changed in juju:
status: New → Invalid
Changed in maas:
milestone: 3.5.0 → 3.5.x