Comment 0 for bug 2060803

Hemanth Nakkina (hemanth-n) wrote :

Deployed sunbeam with the MAAS provider.

1. sunbeam cluster bootstrap --> successful
2. sunbeam cluster deploy --> failed as the node has no storage role.

$ sunbeam cluster deploy
Deployments needs at least one of each role to work correctly:
        control: 1
        compute: 1
        storage: 0
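The role check that produces the error above can be sketched as follows (hypothetical helper; the real validation lives in sunbeam's MAAS provider code and may be structured differently):

```python
from collections import Counter

REQUIRED_ROLES = ("control", "compute", "storage")

def missing_roles(machines):
    """Count how many machines carry each required role and return
    the roles that no machine provides (with their zero counts)."""
    counts = Counter()
    for machine in machines:
        for role in machine.get("roles", []):
            if role in REQUIRED_ROLES:
                counts[role] += 1
    return {role: counts[role] for role in REQUIRED_ROLES if counts[role] == 0}

# A single machine without the storage role, as in the failed deploy above
machines = [{"hostname": "frank-sloth", "roles": ["control", "compute"]}]
# missing_roles(machines) reports that storage has zero machines
```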

3. Added the role storage to the machine.
4. sunbeam cluster deploy --> hangs forever

Sunbeam tried to acquire a new node from MAAS instead of reusing the already-deployed one.

$ juju models
Controller: sb2-controller

Model               Cloud/Region  Type  Status     Machines  Cores  Units  Access  Last connection
controller          sb2/default   maas  available  1         2      2      admin   just now
openstack-machines  sb2/default   maas  available  2         2      -      admin   4 minutes ago
ubuntu@sunbeam:~$ juju status -m openstack-machines
Model               Controller      Cloud/Region  Version  SLA          Timestamp
openstack-machines  sb2-controller  sb2/default   3.4.2    unsupported  11:38:10Z

Machine  State    Address      Inst id      Base          AZ       Message
0        started  10.40.0.206  frank-sloth  ubuntu@22.04  default  Deployed
1        down                  pending      ubuntu@22.04           failed to acquire node: No machine with system ID hdabk4 available.

Expected: sunbeam should pick the existing `machine-0`, or skip deploying the machine altogether.
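The expected reuse of the already-deployed machine could be sketched as below (hypothetical helper and data shape; sunbeam's actual lookup against the Juju model will differ):

```python
def machine_id_for_host(juju_machines, hostname):
    """Return the Juju machine id for a host already present in the
    model, or None if it is not there.

    `juju_machines` is assumed to map machine ids to instance names,
    e.g. {"0": "frank-sloth"} as shown by `juju status` above."""
    for machine_id, instance in juju_machines.items():
        if instance == hostname:
            return machine_id
    return None

# frank-sloth is already machine 0, so no new node should be acquired;
# only when the lookup returns None would a MAAS allocation be needed
existing = machine_id_for_host({"0": "frank-sloth"}, "frank-sloth")
```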

Relevant logs:
11:33:02,411 sunbeam.jobs.common DEBUG Skipping step Add infrastructure model
11:33:02,413 sunbeam.jobs.common DEBUG Starting step 'Add machines'
11:33:03,301 sunbeam.provider.maas.steps DEBUG Machines fetched: [{'system_id': 'hdabk4', 'hostname': 'frank-sloth', 'roles': ['compute', 'storage', 'control'], 'zone': 'default', 'status': 'Deployed', 'root_disk': {'name': 'vda', 'tags': ['rotary', '1rpm'], 'root_partition': {'size': 42941284352}}, 'storage': {'ceph': []}, 'spaces': ['admin-space'], 'nics': [{'id': 39, 'name': 'ens3', 'mac_address': 'fa:16:3e:4a:d9:2c', 'tags': []}], 'cores': 2, 'memory': 4096}, {'system_id': '4wy4dd', 'hostname': 'good-dane', 'roles': ['juju-controller'], 'zone': 'default', 'status': 'Deployed', 'root_disk': {'name': 'vda', 'tags': ['rotary', '1rpm'], 'root_partition': {'size': 42941284352}}, 'storage': {'ceph': []}, 'spaces': ['admin-space'], 'nics': [{'id': 40, 'name': 'ens3', 'mac_address': 'fa:16:3e:11:37:47', 'tags': []}], 'cores': 2, 'memory': 4096}]
11:33:03,301 sunbeam.provider.maas.steps DEBUG Machines containing worker roles: [{'system_id': 'hdabk4', 'hostname': 'frank-sloth', 'roles': ['compute', 'storage', 'control'], 'zone': 'default', 'status': 'Deployed', 'root_disk': {'name': 'vda', 'tags': ['rotary', '1rpm'], 'root_partition': {'size': 42941284352}}, 'storage': {'ceph': []}, 'spaces': ['admin-space'], 'nics': [{'id': 39, 'name': 'ens3', 'mac_address': 'fa:16:3e:4a:d9:2c', 'tags': []}], 'cores': 2, 'memory': 4096}]
11:33:03,301 sunbeam.clusterd.service DEBUG [get] https://10.40.0.205:7000/1.0/nodes, args={'allow_redirects': True}
11:33:03,306 urllib3.connectionpool DEBUG https://10.40.0.205:7000 "GET /1.0/nodes HTTP/1.1" 200 193
11:33:03,307 sunbeam.clusterd.service DEBUG Response(<Response [200]>) = {"type":"sync","status":"Success","status_code":200,"operation":"","error_code":0,"error":"","metadata":[{"name":"frank-sloth","role":["compute","control"],"machineid":0,"systemid":"hdabk4"}]}

11:33:03,307 sunbeam.jobs.common DEBUG Running step Add machines
11:33:03,307 sunbeam.clusterd.service DEBUG [put] https://10.40.0.205:7000/1.0/nodes/frank-sloth, args={'data': '{"role": ["compute", "storage", "control"], "machineid": -1, "systemid": ""}'}
11:33:03,315 urllib3.connectionpool DEBUG https://10.40.0.205:7000 "PUT /1.0/nodes/frank-sloth HTTP/1.1" 200 108
11:33:03,316 sunbeam.clusterd.service DEBUG Response(<Response [200]>) = {"type":"sync","status":"Success","status_code":200,"operation":"","error_code":0,"error":"","metadata":{}}

11:33:03,317 sunbeam.jobs.common DEBUG Finished running step 'Add machines'. Result: ResultType.COMPLETED
11:33:03,320 sunbeam.jobs.common DEBUG Starting step 'Deploy machines'
11:33:03,321 sunbeam.clusterd.service DEBUG [get] https://10.40.0.205:7000/1.0/nodes, args={'allow_redirects': True}
11:33:03,326 urllib3.connectionpool DEBUG https://10.40.0.205:7000 "GET /1.0/nodes HTTP/1.1" 200 203
11:33:03,327 sunbeam.clusterd.service DEBUG Response(<Response [200]>) = {"type":"sync","status":"Success","status_code":200,"operation":"","error_code":0,"error":"","metadata":[{"name":"frank-sloth","role":["compute","control","storage"],"machineid":0,"systemid":"hdabk4"}]}

11:33:03,566 connector DEBUG Connector: closing controller connection
11:33:03,570 sunbeam.jobs.common DEBUG Running step Deploy machines
11:33:03,571 sunbeam.provider.maas.steps DEBUG Adding machine frank-sloth to model openstack-machines
11:33:03,792 connector DEBUG Connector: closing controller connection
11:33:03,846 sunbeam.clusterd.service DEBUG [put] https://10.40.0.205:7000/1.0/nodes/frank-sloth, args={'data': '{"role": null, "machineid": 1, "systemid": ""}'}
11:33:03,853 urllib3.connectionpool DEBUG https://10.40.0.205:7000 "PUT /1.0/nodes/frank-sloth HTTP/1.1" 200 108
11:33:03,853 sunbeam.clusterd.service DEBUG Response(<Response [200]>) = {"type":"sync","status":"Success","status_code":200,"operation":"","error_code":0,"error":"","metadata":{}}
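In the logs above, clusterd initially records frank-sloth with only `["compute","control"]`, and the 'Add machines' step then PUTs the full role list fetched from MAAS. The comparison that would drive such an update might look like this (hypothetical helper; the real logic is in `sunbeam.provider.maas.steps` and is only inferred here from the log lines):

```python
def needs_role_update(clusterd_node, maas_machine):
    """True when the role set stored in clusterd lags behind MAAS,
    e.g. after 'storage' was added to the machine post-bootstrap."""
    stored = set(clusterd_node.get("role") or [])
    current = set(maas_machine.get("roles") or [])
    return stored != current

# clusterd record before the PUT, versus the roles MAAS reports
node = {"name": "frank-sloth", "role": ["compute", "control"]}
machine = {"hostname": "frank-sloth", "roles": ["compute", "storage", "control"]}
```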

Additional notes:
A timeout needs to be passed to `wait_all_machines_deployed` so the step fails instead of waiting indefinitely: https://github.com/canonical/snap-openstack/blob/e753afbc73efefd6746f96373f5cc1008e334798/sunbeam-python/sunbeam/provider/maas/steps.py#L1294
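A minimal way to bound that wait, assuming the helper is (or wraps) a coroutine that polls Juju; the stand-in below only mimics the real function's behavior, and its actual signature in steps.py may differ:

```python
import asyncio

async def wait_all_machines_deployed(model):
    """Stand-in for the real helper in sunbeam/provider/maas/steps.py:
    polls until every machine is started. For a machine stuck in
    'pending' (as in this bug) the loop never exits."""
    while True:
        await asyncio.sleep(1)

async def deploy_with_timeout(model, timeout=1800):
    """Bound the wait so a machine that never comes up fails the
    'Deploy machines' step instead of hanging the deploy forever."""
    try:
        await asyncio.wait_for(wait_all_machines_deployed(model), timeout=timeout)
    except asyncio.TimeoutError:
        raise TimeoutError(
            f"machines in model {model!r} did not deploy within {timeout}s"
        )
```

With a timeout in place, the hang in step 4 above would surface as a step failure the operator can act on, rather than an indefinite wait.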