Adopting heat server resource baremetal instances fails

Bug #1929555 reported by Harald Jensås
This bug affects 1 person
Affects   Status         Importance   Assigned to      Milestone
tripleo   Fix Released   High         Harald Jensås

Bug Description

NOTE: The output below includes extra debug info.

Issue: KeyError when creating/adopting overcloud node instance ports:

2021-05-21 10:44:30.516403 | fa163e74-25e0-1a2f-3ab5-00000000001c | OK | Provision instance network ports log | localhost | result={
    "changed": false,
    "instance_network_ports.logging": "Traceback (most recent call last):\n File \"/tmp/ansible_tripleo_overcloud_network_ports_payload_9e4fxprf/ansible_tripleo_overcloud_network_ports_payload.zip/ansible/modules/tripleo_overcloud_network_ports.py\", line 599, in run_module\n File \"/tmp/ansible_tripleo_overcloud_network_ports_payload_9e4fxprf/ansible_tripleo_overcloud_network_ports_payload.zip/ansible/modules/tripleo_overcloud_network_ports.py\", line 540, in tag_metalsmith_managed_ports\nKeyError: 'baremetal-26390-leaf1-1'\n"

# Generated with the following on 2021-05-21T10:39:59.963721
#
# openstack overcloud node extract provisioned --stack overcloud --roles-file /home/centos/overcloud/my_roles_data.yaml --output /home/centos/overcloud/baremetal_deployment.yaml
#

- name: Controller
  count: 3
  hostname_format: '%stackname%-controller-%index%'
  defaults:
    network_config:
      default_route_network:
      - External
      network_deployment_actions:
      - CREATE
      - UPDATE
      physical_bridge_name: br-ex
      public_interface_name: nic1
      template: templates/multiple_nics/multiple_nics_dvr.j2
    networks:
    - network: ctlplane
      subnet: ctlplane-leaf1
      vif: true
    - network: external
      subnet: external_subnet
    - network: internal_api
      subnet: internal_api_subnet
    - network: storage
      subnet: storage_subnet
    - network: storage_mgmt
      subnet: storage_mgmt_subnet
    - network: tenant
      subnet: tenant_subnet
  instances:
  - hostname: overcloud-controller-0
    name: baremetal-26390-leaf1-1
  - hostname: overcloud-controller-1
    name: baremetal-26390-leaf1-2
  - hostname: overcloud-controller-2
    name: baremetal-26390-leaf1-0
- name: Compute
  count: 1
  hostname_format: '%stackname%-novacompute-%index%'
  defaults:
    network_config:
      network_deployment_actions:
      - CREATE
      - UPDATE
      physical_bridge_name: br-ex
      public_interface_name: nic1
      template: templates/multiple_nics/multiple_nics_dvr.j2
    networks:
    - network: ctlplane
      subnet: ctlplane-leaf2
      vif: true
    - network: internal_api
      subnet: internal_api_subnet02
    - network: storage
      subnet: storage_subnet02
    - network: tenant
      subnet: tenant_subnet02
  instances:
  - hostname: overcloud-novacompute-0
    name: baremetal-26390-leaf2-0

The "Existing instances" output does not have the correct hostnames: each hostname is reported as the baremetal node name.

2021-05-21 10:44:18.638171 | fa163e74-25e0-1a2f-3ab5-000000000015 | TASK | DEBUG Existing instances
2021-05-21 10:44:18.710546 | fa163e74-25e0-1a2f-3ab5-000000000015 | OK | DEBUG Existing instances | localhost | result={
    "baremetal_existing.instances": [
        {
            "hostname": "baremetal-26390-leaf1-1",
            "id": "7caa634d-d97b-4914-a3f7-1a2dcd501d79",
            "name": "baremetal-26390-leaf1-1"
        },
        {
            "hostname": "baremetal-26390-leaf1-2",
            "id": "2252b722-c293-49cc-af65-01d7fe8374d3",
            "name": "baremetal-26390-leaf1-2"
        },
        {
            "hostname": "baremetal-26390-leaf1-0",
            "id": "854fa682-cc99-4edb-b9fe-51d869e8d4b9",
            "name": "baremetal-26390-leaf1-0"
        },
        {
            "hostname": "baremetal-26390-leaf2-0",
            "id": "fe8ddf28-ae33-4e85-83b0-6c7fe4702da2",
            "name": "baremetal-26390-leaf2-0"
        }
    ],
    "changed": false
}

The hostname in existing instances does not match the hostnames in the YAML definition exported from the stack.
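
The mismatch can be reproduced in isolation. The following is a minimal sketch, not the actual tripleo_overcloud_network_ports code: it assumes the failing step looks up each existing instance's reported hostname in a mapping keyed by the hostnames from the YAML definition. The names definition_instances, existing_instances, and find_unmatched are hypothetical; the data is copied from the output above.

```python
# Hostname -> node name, as listed in the exported baremetal_deployment.yaml.
definition_instances = {
    "overcloud-controller-0": "baremetal-26390-leaf1-1",
    "overcloud-controller-1": "baremetal-26390-leaf1-2",
    "overcloud-controller-2": "baremetal-26390-leaf1-0",
    "overcloud-novacompute-0": "baremetal-26390-leaf2-0",
}

# "Existing instances" as reported on the first run: hostname equals node name.
existing_instances = [
    {"hostname": "baremetal-26390-leaf1-1", "name": "baremetal-26390-leaf1-1"},
    {"hostname": "baremetal-26390-leaf1-2", "name": "baremetal-26390-leaf1-2"},
    {"hostname": "baremetal-26390-leaf1-0", "name": "baremetal-26390-leaf1-0"},
    {"hostname": "baremetal-26390-leaf2-0", "name": "baremetal-26390-leaf2-0"},
]


def find_unmatched(definition, existing):
    """Return reported hostnames that do not appear in the definition."""
    return [
        inst["hostname"]
        for inst in existing
        if inst["hostname"] not in definition
    ]


unmatched = find_unmatched(definition_instances, existing_instances)

# A direct lookup by the reported hostname (assumed access pattern) raises
# KeyError with the node name as the missing key.
try:
    definition_instances[existing_instances[0]["hostname"]]
    lookup_error = None
except KeyError as exc:
    lookup_error = str(exc)

print(unmatched)
print(lookup_error)
```

With the first-run data, none of the four reported hostnames matches a key in the definition, so any lookup of this kind fails on the first instance it visits.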

Workaround: Re-run the command. The "Check Existing" task creates baremetal allocations with the correct hostname during the first run, so the second run finds matching hostnames and succeeds.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (master)
Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/792972
Committed: https://opendev.org/openstack/tripleo-ansible/commit/f661f34c90b747772219f8a68a65128dafe871e1
Submitter: "Zuul (22348)"
Branch: master

commit f661f34c90b747772219f8a68a65128dafe871e1
Author: Harald Jensås <email address hidden>
Date: Tue May 25 15:12:21 2021 +0200

    Refresh instance after adding allocation

    When the baremetal deployment workflow does the
    "Check Existing" task it will create a baremetal
    allocation for provisioned nodes that does not
    already have one.

    After creating the allocation, refresh the instance
    to ensure the hostname is correctly represented in
    the object returned in the list of "found" nodes.

    Closes-Bug: #1929555
    Change-Id: I0d43f42c0d0b092218f78b32ac3bfd198c988de0
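
The pattern described in the commit message can be sketched as follows. This is an illustration under stated assumptions, not the merged change: FakeBaremetal, Instance, and check_existing are hypothetical stand-ins, and the fallback of the hostname to the node name when no allocation exists is assumed from the "Existing instances" output in the bug description.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Instance:
    """Snapshot of a provisioned node as seen by the caller (hypothetical)."""
    node_name: str
    hostname: str


@dataclass
class FakeBaremetal:
    """In-memory stand-in for the baremetal service (hypothetical)."""
    # Maps node name -> hostname recorded in the allocation.
    allocations: Dict[str, str] = field(default_factory=dict)

    def show_instance(self, node_name: str) -> Instance:
        # Assumption: without an allocation the hostname falls back to
        # the node name, as seen in the debug output.
        return Instance(node_name, self.allocations.get(node_name, node_name))

    def create_allocation(self, node_name: str, hostname: str) -> None:
        self.allocations[node_name] = hostname


def check_existing(api, node_name, hostname, refresh=True):
    """Return the instance, creating an allocation if it has none."""
    instance = api.show_instance(node_name)
    if node_name not in api.allocations:
        api.create_allocation(node_name, hostname)
        if refresh:
            # Re-read so the returned object reflects the new allocation.
            instance = api.show_instance(node_name)
    return instance


node, host = "baremetal-26390-leaf1-1", "overcloud-controller-0"

# Without the refresh, the first run returns the stale snapshot.
api = FakeBaremetal()
stale = check_existing(api, node, host, refresh=False)
# A second run sees the allocation created by the first (the workaround).
rerun = check_existing(api, node, host, refresh=False)
# With the refresh, the first run already returns the correct hostname.
fresh = check_existing(FakeBaremetal(), node, host, refresh=True)

print(stale.hostname, rerun.hostname, fresh.hostname)
```

In this model the stale snapshot explains both the failure and why re-running the command works: the allocation already exists on the second pass, so the instance is read with the correct hostname.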

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/795002

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 4.0.0

This issue was fixed in the openstack/tripleo-ansible 4.0.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/795002
Committed: https://opendev.org/openstack/tripleo-ansible/commit/e7186bf54b94b541f48ff84482108341c9cf037f
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit e7186bf54b94b541f48ff84482108341c9cf037f
Author: Harald Jensås <email address hidden>
Date: Tue May 25 15:12:21 2021 +0200

    Refresh instance after adding allocation

    When the baremetal deployment workflow does the
    "Check Existing" task it will create a baremetal
    allocation for provisioned nodes that does not
    already have one.

    After creating the allocation, refresh the instance
    to ensure the hostname is correctly represented in
    the object returned in the list of "found" nodes.

    Closes-Bug: #1929555
    Change-Id: I0d43f42c0d0b092218f78b32ac3bfd198c988de0
    (cherry picked from commit f661f34c90b747772219f8a68a65128dafe871e1)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 3.2.0

This issue was fixed in the openstack/tripleo-ansible 3.2.0 release.
