cli-overcloud-node-provision not passing capabilities when reserving instances
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ironic | Invalid | Undecided | Unassigned |
tripleo | Invalid | Undecided | Unassigned |
Bug Description
Description
===========
I believe we should pass capabilities when reserving instances with metalsmith [a].
If we don't, instances can be scheduled on any available node, ignoring the scheduling hints.
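To illustrate why the capabilities matter at reservation time, here is a minimal sketch of capability-based node filtering. It is not metalsmith's actual code: the helper names (parse_capabilities, node_matches, pick_node) and the two-node inventory are hypothetical, and the capability strings are made-up values in the usual Ironic "key:value,key:value" form.

```python
def parse_capabilities(raw):
    """Parse a 'key:value,key:value' capabilities string into a dict."""
    caps = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition(":")
        if sep:  # skip malformed entries that have no ':' separator
            caps[key.strip()] = value.strip()
    return caps


def node_matches(node_properties, requested):
    """Return True if the node carries every requested capability."""
    raw = node_properties.get("capabilities", "")
    caps = raw if isinstance(raw, dict) else parse_capabilities(raw)
    return all(caps.get(key) == value for key, value in requested.items())


def pick_node(nodes, requested):
    """Return the name of the first node matching the request, or None."""
    for name, properties in nodes.items():
        if node_matches(properties, requested):
            return name
    return None


# Hypothetical inventory, for illustration only.
nodes = {
    "compute-1": {"capabilities": "boot_option:local,profile:compute"},
    "ceph-0": {"capabilities": "boot_option:local,profile:ceph-storage"},
}

# With the capability passed, only the matching node is eligible.
with_caps = pick_node(nodes, {"profile": "ceph-storage"})

# With no capabilities passed, every node matches and the first one wins,
# which is the kind of misplacement described in this report.
without_caps = pick_node(nodes, {})
```

In this sketch an empty request matches every node, so a ceph instance can land on a compute node simply because that node comes first.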
Steps to reproduce
==================
export VIRTHOST=r720-2
export LIBGUESTFS_
cloud_config=
work_dir=
release=master
STANDARD_ARGS="-w $work_dir -R $release --no-clone --tags all --nodes $cloud_config"
bash quickstart.sh $STANDARD_ARGS -p quickstart.yml $VIRTHOST
bash quickstart.sh $STANDARD_ARGS -I --teardown none -p quickstart-
bash quickstart.sh $STANDARD_ARGS -I --teardown none -p quickstart-
bash quickstart.sh $STANDARD_ARGS -I --teardown none -p quickstart-
Expected result
===============
Capabilities should be honored so that each role is scheduled on the right node.
Actual result
=============
Bad scheduling [1]
Environment
===========
Master branch with tripleo-quickstart
Logs & Configs
==============
For example, this node [2] was imported with profile:compute but was provisioned as a ceph instance. This is a problem because the disk layout is not the same.
[1]
~~~
Created port overcloud-
Created port overcloud-
Created port overcloud-
Created port overcloud-
Created port overcloud-
Created port overcloud-
Created port overcloud-
Created port overcloud-
Attached port overcloud-
Attached port overcloud-
Attached port overcloud-
Provisioning started on node ceph-2 (UUID 86c40998-
Attached port overcloud-
Attached port overcloud-
Attached port overcloud-
Attached port overcloud-
Provisioning started on node ceph-1 (UUID ba7e64c7-
Attached port overcloud-
Provisioning started on node compute-1 (UUID 65ac0317-
Provisioning started on node ceph-0 (UUID 94fa1ec5-
Provisioning started on node control-1 (UUID e3ca3943-
Provisioning started on node control-2 (UUID eeafb6b3-
Provisioning started on node compute-0 (UUID c12beec4-
Provisioning started on node control-0 (UUID 38dbad09-
~~~
[2]
~~~
(undercloud) [stack@undercloud metalsmith]$ openstack baremetal allocation show 226d3b86-
+------
| Field | Value |
+------
| candidate_nodes | [] |
| created_at | 2020-12-
| extra | {} |
| last_error | None |
| name | overcloud-
| node_uuid | 65ac0317-
| owner | None |
| resource_class | baremetal |
| state | active |
| traits | [] |
| updated_at | 2020-12-
| uuid | 226d3b86-
+------
(undercloud) [stack@undercloud ~]$ openstack baremetal node show 65ac0317-
| instance_info | {'traits': [], 'capabilities': {'boot_option': 'local'}, 'display_name': 'overcloud-
| properties | {'cpus': '6', 'memory_mb': '16384', 'local_gb': '49', 'cpu_arch': 'x86_64', 'capabilities': 'boot_option:
tags: added: tripleo-ansible
tags: added: metalsmith
tags: added: deployment quickstart
Changed in tripleo:
status: New → Triaged
milestone: none → wallaby-2
Changed in ironic:
status: New → Invalid
After looking deeper into this, the cause might be that the scheduling relies on profile capabilities, which are a nova concept. I'll try with real scheduler hints and see if that works.