MAAS fails to create node because a node with the hostname exists

Bug #2059715 reported by Amjad Chami

This bug affects 1 person
Affects: MAAS
Status: Incomplete
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

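MAAS appears to restart while a command to compose a VM on a vm-host is running. The command then fails, apparently because it is issued twice: the second attempt finds that a node with the requested hostname already exists.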

Console log:
2024-03-28-11:16:55 foundationcloudengine.layers.configuremaas INFO Creating graylog-3 in sunset
2024-03-28-11:16:55 root DEBUG [localhost]: maas root vm-host compose 2 hostname=graylog-3 cores=2 memory=4096 storage=40.0 zone=3
2024-03-28-11:17:06 root ERROR [localhost] Command failed: maas root vm-host compose 2 hostname=graylog-3 cores=2 memory=4096 storage=40.0 zone=3
2024-03-28-11:17:06 root ERROR 1[localhost] STDOUT follows:
{"hostname": ["Node with hostname \"graylog-3\" already exists"]}

In the logs:
2024-03-28T11:16:55+00:00 sunset maas.node: [info] juju-upgrade-2: Status transition from TESTING to READY
2024-03-28T11:16:55+00:00 sunset maas.service_monitor: [info] Service 'maas-http' has been restarted. Its current state is 'on' and 'running'.

testrun: https://solutions.qa.canonical.com/testruns/459947f6-df64-4af7-97ca-0337e42455fe
artifacts: https://oil-jenkins.canonical.com/artifacts/459947f6-df64-4af7-97ca-0337e42455fe/index.html

Jacopo Rota (r00ta) wrote :

Hi @Amjad, was your script actually retrying the request because the first attempt appeared to fail? Or did you get
```
2024-03-28-11:17:06 root ERROR [localhost] Command failed: maas root vm-host compose 2 hostname=graylog-3 cores=2 memory=4096 storage=40.0 zone=3
2024-03-28-11:17:06 root ERROR 1[localhost] STDOUT follows:
{"hostname": ["Node with hostname \"graylog-3\" already exists"]}
```
after a single call to `maas root vm-host compose 2`?

Amjad Chami (amjad-chami) wrote :

I think it was a single call; if there had been multiple calls, they would have shown up in the FCE debug output.

Stamatis Katsaounis (skatsaounis) wrote :

Hi Amjad, I can see logic inside foundation engine that retries a command 10 times, and by default the output_mode is set to quiet, so there is a high chance that we lost part of the output (search for `remotehelpers/run_cmd`). You can start from the add_vm_host_vm function and follow the chain of execution down to run_cmd. Could you update the script to use something like output_mode="live" and retry?
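
To illustrate why a quiet retry loop can hide the first failure, here is a minimal sketch. It is not the foundation engine's actual `run_cmd`; the names `run_with_retries`, `run_once`, `runner` and `log` are hypothetical, and only the retry count of 10 and the `quiet`/`live` output modes are taken from the comment above.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

Result = Tuple[int, str]  # (exit code, combined stdout/stderr)


def run_once(cmd: Sequence[str]) -> Result:
    """Run a command once and capture its combined output."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.returncode, proc.stdout


def run_with_retries(
    cmd: Sequence[str],
    retries: int = 10,
    output_mode: str = "quiet",
    runner: Callable[[Sequence[str]], Result] = run_once,
    log: Callable[[str], None] = print,
) -> List[Result]:
    """Retry cmd until it succeeds or the attempts are exhausted.

    Hypothetical helper: in "live" mode every attempt is logged as it
    happens; in any other mode only the final attempt is logged, so
    earlier failures never reach the console.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    results: List[Result] = []
    for attempt in range(1, retries + 1):
        code, output = runner(cmd)
        results.append((code, output))
        if output_mode == "live":
            log(f"attempt {attempt}: exit={code} output={output.strip()}")
        if code == 0:
            break  # stop retrying after the first success
    if output_mode != "live":
        code, output = results[-1]
        log(f"final: exit={code} output={output.strip()}")
    return results
```

With a loop like this in quiet mode, a first attempt that creates the node but fails (for example, if the HTTP service restarts mid-request) would be followed by a second attempt that reports the hostname as already existing, and only that last error would be printed. Running with `output_mode="live"` would show each attempt and confirm or rule out this scenario.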

Changed in maas:
status: New → Incomplete