[2.5] Commissioning results in an alias interface automatically created

Bug #1803188 reported by Andres Rodriguez
This bug affects 4 people
Affects    Status         Importance  Assigned to    Milestone
MAAS       Fix Released   High        Mike Pontillo
MAAS 2.4   Fix Committed  High        Mike Pontillo

Bug Description

I commissioned a machine (which happened to be a VM inside a MAAS-deployed pod), and the interface obtained two different IP addresses during commissioning.

The network commissioning script captured this and created a new interface on the machine once it was in the Ready state. The commissioning output for the interfaces shows:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:0f:2d:94 brd ff:ff:ff:ff:ff:ff
    inet 10.90.90.225/24 brd 10.90.90.255 scope global ens4
       valid_lft forever preferred_lft forever
    inet 10.90.90.198/24 brd 10.90.90.255 scope global secondary ens4
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe0f:2d94/64 scope link
       valid_lft forever preferred_lft forever

For the resulting machine, please see the attached screenshot.

That said, this doesn't seem like an issue isolated to machines inside a pod, because I /think/ I've seen it on other machines as well.


Revision history for this message
Andres Rodriguez (andreserl) wrote :
Changed in maas:
importance: Undecided → High
status: New → Triaged
milestone: none → 2.5.0rc1
assignee: nobody → Mike Pontillo (mpontillo)
description: updated
Mike Pontillo (mpontillo) wrote :

Just thinking out loud here, but I'm wondering if this could be related to bug #1749019. I know we've had issues with DHCP IP addresses from the PXE environment not playing well with subsequently-acquired addresses from ISC DHCP, but that's the most recent thing that has changed in this area...

Andres Rodriguez (andreserl) wrote :

I haven't explored why it obtained 2 different IP addresses, but it could also be that during commissioning we try to bring up other interfaces to see if we can discover networks. I wonder if this has regressed and it is trying to DHCP-scan the interface used for PXE, which could cause it to obtain a new IP.

Or rather, there was a bug where cloud-initramfs-tools (I think it was there) would copy the network config from the initrd into the ephemeral environment, causing the machine not to re-DHCP. That caused a few regressions, such as the IP lease not being renewed because the network configuration was "statically" configured.

Mike Pontillo (mpontillo) wrote :

Yes, we should take a look at the handoff between the IP address the machine gets in the pre-boot environment and the next DHCP client that will be taking over the lease. I would be willing to bet that's part of the issue.

The most likely reasons I can think of for this to go wrong:

 - The DHCP server cannot match up the lease acquired at PXE boot time with the DHCP request from the ephemeral environment (possibly if it uses a different client identifier).

 - The lease expires between PXE boot time and the time of the DHCP request in the ephemeral environment. For example, in a small dynamic range with many machines booting, it could have expired and been handed to a different machine.

Andres Rodriguez (andreserl) wrote :

Sure. That said, the fact that a single interface with 2 addresses results in an alias after commissioning is still an issue (regardless of what other issues may be happening in the commissioning environment).

Changed in maas:
status: Triaged → In Progress
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released
Jason Hobbs (jason-hobbs) wrote :

We are seeing this on 2.4.3 also:

Fetching Juju GUI 2.14.0
Waiting for address
Attempting to connect to 10.244.40.201:22
Attempting to connect to 10.244.40.202:22
Connected to 10.244.40.202
Running machine configuration script...
Bootstrap agent now started
Contacting Juju controller at 10.244.40.201 to verify accessibility...
Bootstrap complete, "foundations-maas" controller now available
Controller machines are in the "controller" model
Initial model "default" added
ERROR juju-ha-space is not set and a unique usable address was not found for machines: 0
run "juju config juju-ha-space=<name>" to set a space for Mongo peer communication

Jason Hobbs (jason-hobbs) wrote :

The failure in comment #6 happened around 2018-12-11-11:38:26

Nicolas Pochet (npochet) wrote :

When using MAAS 2.5.3 and composing a machine on a pod, MAAS creates an alias for the first interface.

The command used to compose the VM:

maas root pod compose 9 cores=2 memory=1024 interfaces='eth0:space=oam-space;eth1:space=maas2'
Success.
Machine-readable output follows: {
    "system_id": "ar68rw",
    "resource_uri": "/MAAS/api/2.0/machines/ar68rw/"
}

When inspecting the interfaces for this machine afterwards:
maas root interfaces read ar68rw
Success.
Machine-readable output follows:
[
    {
        "system_id": "ar68rw",
        "parents": [],
        "effective_mtu": 1500,
        "id": 55,
        "name": "eth0",
        "type": "physical",
        "vlan": {
            "vid": 1,
            "mtu": 1500,
            "dhcp_on": true,
            "external_dhcp": null,
            "relay_vlan": null,
            "id": 5001,
            "name": "untagged",
            "primary_rack": "k87hss",
            "fabric": "default",
            "fabric_id": 1,
            "space": "oam-space",
            "secondary_rack": null,
            "resource_uri": "/MAAS/api/2.0/vlans/5001/"
        },
        "tags": [],
        "discovered": [
            {
                "subnet": {
                    "name": "oam",
                    "vlan": {
                        "vid": 1,
                        "mtu": 1500,
                        "dhcp_on": true,
                        "external_dhcp": null,
                        "relay_vlan": null,
                        "id": 5001,
                        "name": "untagged",
                        "primary_rack": "k87hss",
                        "fabric": "default",
                        "fabric_id": 1,
                        "space": "oam-space",
                        "secondary_rack": null,
                        "resource_uri": "/MAAS/api/2.0/vlans/5001/"
                    },
                    "cidr": "192.168.105.0/24",
                    "rdns_mode": 2,
                    "gateway_ip": "192.168.105.1",
                    "dns_servers": [],
                    "allow_dns": true,
                    "allow_proxy": true,
                    "active_discovery": false,
                    "managed": true,
                    "id": 1,
                    "space": "oam-space",
                    "resource_uri": "/MAAS/api/2.0/subnets/1/"
                },
                "ip_address": "192.168.105.14"
            }
        ],
        "vendor": null,
        "enabled": true,
        "mac_address": "52:54:00:6a:e2:ff",
        "params": "",
        "links": [
            {
                "id": 276,
                "mode": "auto",
                "subnet": {
                    "name": "oam",
                    "vlan": {
                        "vid": 1,
                        "mtu": 1500,
       ...

Read more...
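The alias shows up in the `interfaces read` output as a physical interface carrying more than one entry in its "links" list. Since the JSON above is truncated, here is a minimal sketch, assuming the field names shown in the output, of flagging such interfaces (illustrative only, not part of MAAS):

```python
# Sketch: given the JSON from `maas $PROFILE interfaces read $SYSTEM_ID`,
# flag physical interfaces that carry more than one link (i.e. an alias).
# Field names follow the truncated output above; illustrative only.
import json

def aliased_interfaces(interfaces_json: str) -> list:
    """Return names of physical interfaces with more than one link."""
    interfaces = json.loads(interfaces_json)
    return [
        iface["name"]
        for iface in interfaces
        if iface.get("type") == "physical" and len(iface.get("links", [])) > 1
    ]
```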
