openstack overcloud node provision fails with "Failed to connect to the host via ssh: ssh: Could not resolve hostname" for a node that was undeployed using the metalsmith undeploy command

Bug #1969356 reported by swogat pradhan
Affects   Status   Importance   Assigned to   Milestone
tripleo   New      Undecided    Unassigned

Bug Description

After running metalsmith undeploy <failed node>, I ran:

openstack overcloud node provision --stack overcloud --network-config --overcloud-ssh-key /home/stack/sshkey/id_rsa --output overcloud-baremetal-deployed.yaml overcloud-baremetal-deploy.yaml

The node provisioning starts and then fails after some time in the growvols play:

PLAY [Overcloud Node Grow Volumes] *********************************************
2022-04-18 18:34:36.538550 | 48d539a1-1679-3cd6-4618-000000000013 | TASK | Wait for provisioned nodes to boot
[WARNING]: Reset is not implemented for this connection
[WARNING]: Reset is not implemented for this connection
[WARNING]: Reset is not implemented for this connection
[WARNING]: Reset is not implemented for this connection
[WARNING]: Reset is not implemented for this connection
[WARNING]: Reset is not implemented for this connection
2022-04-18 18:34:47.491774 | 48d539a1-1679-3cd6-4618-000000000013 | OK | Wait for provisioned nodes to boot | overcloud-novecompute-1
2022-04-18 18:34:47.494887 | 48d539a1-1679-3cd6-4618-000000000013 | TIMING | Wait for provisioned nodes to boot | overcloud-novecompute-1 | 0:00:10.998182 | 10.87s
2022-04-18 18:34:47.546795 | 48d539a1-1679-3cd6-4618-000000000013 | OK | Wait for provisioned nodes to boot | overcloud-novacompute-2
2022-04-18 18:34:47.548397 | 48d539a1-1679-3cd6-4618-000000000013 | TIMING | Wait for provisioned nodes to boot | overcloud-novacompute-2 | 0:00:11.051706 | 10.98s
2022-04-18 18:34:47.588085 | 48d539a1-1679-3cd6-4618-000000000013 | OK | Wait for provisioned nodes to boot | overcloud-novacompute-3
2022-04-18 18:34:47.589687 | 48d539a1-1679-3cd6-4618-000000000013 | TIMING | Wait for provisioned nodes to boot | overcloud-novacompute-3 | 0:00:11.093003 | 11.00s
2022-04-18 18:34:47.611106 | 48d539a1-1679-3cd6-4618-000000000013 | OK | Wait for provisioned nodes to boot | overcloud-novacompute-0
2022-04-18 18:34:47.612727 | 48d539a1-1679-3cd6-4618-000000000013 | TIMING | Wait for provisioned nodes to boot | overcloud-novacompute-0 | 0:00:11.116033 | 11.07s
2022-04-18 18:34:47.723571 | 48d539a1-1679-3cd6-4618-000000000013 | OK | Wait for provisioned nodes to boot | overcloud-novacompute-5
2022-04-18 18:34:47.725272 | 48d539a1-1679-3cd6-4618-000000000013 | TIMING | Wait for provisioned nodes to boot | overcloud-novacompute-5 | 0:00:11.228587 | 11.12s
2022-04-18 18:34:47.769155 | 48d539a1-1679-3cd6-4618-000000000013 | OK | Wait for provisioned nodes to boot | overcloud-novacompute-1
2022-04-18 18:34:47.770876 | 48d539a1-1679-3cd6-4618-000000000013 | TIMING | Wait for provisioned nodes to boot | overcloud-novacompute-1 | 0:00:11.274187 | 11.22s
2022-04-18 18:34:47.784141 | 48d539a1-1679-3cd6-4618-000000000015 | TASK | Find the growvols utility
2022-04-18 18:34:48.861226 | 48d539a1-1679-3cd6-4618-000000000015 | CHANGED | Find the growvols utility | overcloud-novacompute-0
2022-04-18 18:34:48.864844 | 48d539a1-1679-3cd6-4618-000000000015 | TIMING | Find the growvols utility | overcloud-novacompute-0 | 0:00:12.368129 | 1.08s
2022-04-18 18:34:48.898703 | 48d539a1-1679-3cd6-4618-000000000015 | CHANGED | Find the growvols utility | overcloud-novacompute-2
2022-04-18 18:34:48.900942 | 48d539a1-1679-3cd6-4618-000000000015 | TIMING | Find the growvols utility | overcloud-novacompute-2 | 0:00:12.404254 | 0.98s
2022-04-18 18:34:48.903167 | 48d539a1-1679-3cd6-4618-000000000015 | CHANGED | Find the growvols utility | overcloud-novacompute-3
2022-04-18 18:34:48.905050 | 48d539a1-1679-3cd6-4618-000000000015 | TIMING | Find the growvols utility | overcloud-novacompute-3 | 0:00:12.408365 | 0.95s
2022-04-18 18:34:48.931621 | 48d539a1-1679-3cd6-4618-000000000015 | CHANGED | Find the growvols utility | overcloud-novecompute-1
2022-04-18 18:34:48.933842 | 48d539a1-1679-3cd6-4618-000000000015 | TIMING | Find the growvols utility | overcloud-novecompute-1 | 0:00:12.437153 | 0.84s
[WARNING]: Unhandled error in Python interpreter discovery for host overcloud-
novacompute-5: Failed to connect to the host via ssh: ssh: Could not resolve
hostname overcloud-novacompute-5: Name or service not known
[WARNING]: Unhandled error in Python interpreter discovery for host overcloud-
novacompute-1: Failed to connect to the host via ssh: ssh: Could not resolve
hostname overcloud-novacompute-1: Name or service not known
2022-04-18 18:34:57.088190 | 48d539a1-1679-3cd6-4618-000000000015 | UNREACHABLE | Find the growvols utility | overcloud-novacompute-5
2022-04-18 18:34:57.089999 | 48d539a1-1679-3cd6-4618-000000000015 | TIMING | Find the growvols utility | overcloud-novacompute-5 | 0:00:20.593313 | 9.03s
2022-04-18 18:35:01.245531 | 48d539a1-1679-3cd6-4618-000000000015 | UNREACHABLE | Find the growvols utility | overcloud-novacompute-1
2022-04-18 18:35:01.247346 | 48d539a1-1679-3cd6-4618-000000000015 | TIMING | Find the growvols utility | overcloud-novacompute-1 | 0:00:24.750655 | 13.42s

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
overcloud-novacompute-0 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
overcloud-novacompute-1 : ok=1 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
overcloud-novacompute-2 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
overcloud-novacompute-3 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
overcloud-novacompute-5 : ok=1 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
overcloud-novecompute-1 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
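
To pull the failing hosts out of a long run without reading the whole log, the PLAY RECAP can be parsed mechanically. The following is only a minimal sketch: the function names `parse_recap` and `unreachable_hosts` are made up for illustration and are not part of TripleO or Ansible, and the sample text is the recap copied from the log above.

```python
import re

# PLAY RECAP lines copied verbatim from the log in this report.
RECAP = """\
overcloud-novacompute-0 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
overcloud-novacompute-1 : ok=1 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
overcloud-novacompute-2 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
overcloud-novacompute-3 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
overcloud-novacompute-5 : ok=1 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
overcloud-novecompute-1 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
"""

# "<host> : key=value key=value ..." as printed by Ansible's recap.
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<stats>(?:\w+=\d+\s*)+)$")


def parse_recap(text):
    """Return {host: {counter_name: int}} for every recap line in text."""
    result = {}
    for line in text.splitlines():
        match = RECAP_LINE.match(line.strip())
        if not match:
            continue  # skip banners, blank lines, and anything else
        pairs = (item.split("=") for item in match.group("stats").split())
        result[match.group("host")] = {key: int(value) for key, value in pairs}
    return result


def unreachable_hosts(text):
    """Return the sorted hosts whose 'unreachable' counter is non-zero."""
    recap = parse_recap(text)
    return sorted(h for h, s in recap.items() if s.get("unreachable", 0) > 0)


print(unreachable_hosts(RECAP))
# ['overcloud-novacompute-1', 'overcloud-novacompute-5']
```

For this run it reports exactly the two hosts that were undeployed with metalsmith.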

In this scenario I used the metalsmith undeploy command for overcloud-novacompute-5 and overcloud-novacompute-1.
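
The "Could not resolve hostname" warnings can be checked independently of Ansible by asking the resolver directly from the undercloud. This is a rough sketch, not a TripleO tool: `unresolvable` and `fake_resolve` are hypothetical helpers, and the 192.0.2.x addresses in the fake resolver are documentation placeholders, not values from my environment.

```python
import socket


def unresolvable(hostnames, resolve=socket.gethostbyname):
    """Return the hostnames for which name resolution fails.

    `resolve` is injectable so the check can be exercised without DNS.
    """
    missing = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing


def fake_resolve(name):
    """Stand-in resolver for illustration; the addresses are placeholders."""
    known = {
        "overcloud-novacompute-0": "192.0.2.10",
        "overcloud-novacompute-2": "192.0.2.12",
    }
    if name not in known:
        raise socket.gaierror("Name or service not known")
    return known[name]


hosts = [
    "overcloud-novacompute-0",
    "overcloud-novacompute-5",
    "overcloud-novacompute-1",
]
print(unresolvable(hosts, resolve=fake_resolve))
# ['overcloud-novacompute-5', 'overcloud-novacompute-1']
```

On the undercloud, calling `unresolvable(hosts)` without the `resolve` argument uses the system resolver, which is what ssh relies on when it reports "Name or service not known".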

This issue should not occur, because I have removed both of those compute nodes (novacompute-5 and novacompute-1) from the overcloud-baremetal-deploy.yaml file that is passed to the openstack overcloud node provision command.

overcloud-baremetal-deploy.yaml:
- name: Controller
  count: 2
  defaults:
    networks:
    - network: ctlplane
      vif: true
    - network: external
      subnet: external_subnet
    - network: internal_api
      subnet: internal_api_subnet
    - network: storage
      subnet: storage_subnet
    - network: storage_mgmt
      subnet: storage_mgmt_subnet
    - network: tenant
      subnet: tenant_subnet
    network_config:
      template: /home/stack/templates/controller.j2
      default_route_network:
      - external
  instances:
    #- hostname: overcloud-controller-0
    #name: dc2-controller1
  - hostname: overcloud-controller-1
    name: dc2-controller2
    #- hostname: overcloud-controller-2
    #name: dc1-controller1
  - hostname: overcloud-controller-3
    name: dc1-controller2

- name: Compute
  count: 4
  defaults:
    networks:
    - network: ctlplane
      vif: true
    - network: internal_api
      subnet: internal_api_subnet
    - network: tenant
      subnet: tenant_subnet
    - network: storage
      subnet: storage_subnet
    network_config:
      template: /home/stack/templates/compute.j2
  instances:
  - hostname: overcloud-novacompute-0
    name: dc2-compute1
  - hostname: overcloud-novecompute-1
    name: dc2-compute2
  - hostname: overcloud-novacompute-2
    name: dc1-compute1
  - hostname: overcloud-novacompute-3
    name: dc1-compute2

- name: CephStorage
  count: 3
  defaults:
    networks:
    - network: ctlplane
      vif: true
    - network: internal_api
      subnet: internal_api_subnet
    - network: storage
      subnet: storage_subnet
    - network: storage_mgmt
      subnet: storage_mgmt_subnet
    network_config:
      template: /home/stack/templates/ceph-storage.j2
  instances:
  - hostname: overcloud-cephstorage-0
    name: dc2-ceph1
    # - hostname: overcloud-cephstorage-1
    # name: dc2-ceph2
  - hostname: overcloud-cephstorage-2
    name: dc1-ceph1
  - hostname: overcloud-cephstorage-3
    name: dc1-ceph2
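
To make the mismatch concrete, the hosts the growvols play ran against can be compared with the uncommented hostname entries in the file. The sketch below is illustrative only: `active_hostnames` and `stale_hosts` are hypothetical names, the YAML is a trimmed copy of the Compute role above, and a line-based regex is used instead of a YAML parser to keep it dependency-free (an assumption that only holds for simple `- hostname: value` entries like these).

```python
import re

# Trimmed copy of the Compute role from overcloud-baremetal-deploy.yaml above.
DEPLOY_YAML = """\
- name: Compute
  count: 4
  instances:
  - hostname: overcloud-novacompute-0
    name: dc2-compute1
  - hostname: overcloud-novecompute-1
    name: dc2-compute2
  - hostname: overcloud-novacompute-2
    name: dc1-compute1
  - hostname: overcloud-novacompute-3
    name: dc1-compute2
"""

# Hosts listed in the PLAY RECAP of the failed run.
PLAY_HOSTS = [
    "overcloud-novacompute-0",
    "overcloud-novacompute-1",
    "overcloud-novacompute-2",
    "overcloud-novacompute-3",
    "overcloud-novacompute-5",
    "overcloud-novecompute-1",
]

# Matches "- hostname: <value>"; lines starting with '#' do not match.
HOSTNAME = re.compile(r"^\s*-\s*hostname:\s*(\S+)\s*$")


def active_hostnames(yaml_text):
    """Return the set of uncommented hostname values in yaml_text."""
    names = set()
    for line in yaml_text.splitlines():
        match = HOSTNAME.match(line)
        if match:
            names.add(match.group(1))
    return names


def stale_hosts(play_hosts, yaml_text):
    """Return hosts the play targeted that the YAML does not define."""
    return sorted(set(play_hosts) - active_hostnames(yaml_text))


print(stale_hosts(PLAY_HOSTS, DEPLOY_YAML))
# ['overcloud-novacompute-1', 'overcloud-novacompute-5']
```

Both hosts reported as unreachable are absent from the file, so the play appears to be getting them from somewhere other than this YAML. Note that the file spells one entry overcloud-novecompute-1, which is a different hostname from overcloud-novacompute-1.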

I have also tried adding the failed nodes back to the YAML file with provisioned: false, as shown below:
    - hostname: overcloud-novacompute-5
      name: dc2-compute3
      provisioned: false
    - hostname: overcloud-novacompute-1
      name: dc1-controller1
      provisioned: false

The issue still persists.
When I try openstack overcloud node delete or unprovision, the output says 'No nodes to unprovision'.

Steps to Reproduce:
1. Install OpenStack TripleO (Wallaby release).
2. Introspect the nodes.
3. Provision the networks using openstack overcloud network provision.
4. Provision the nodes using openstack overcloud node provision (if any node fails, or if you stop the command halfway, follow the next step).
5. Run metalsmith undeploy <failed node> (metalsmith list should show deploying for that particular node).
