B&R: Restore fails when mgmt/cluster_host networks are not isolated

Bug #2045691 reported by Joshua Kraitberg
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Joshua Kraitberg
Milestone: (none)

Bug Description

Brief Description
-----------------
The restore process appears to fail at '_upgrade_downgrade_kube_networking'. After that point, the following error is thrown repeatedly:

ERROR oslo.service.wsgi [-] Could not bind to 192.168.202.1:6385: OSError: [Errno 99] Cannot assign requested address
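
For context, Errno 99 (EADDRNOTAVAIL) is what Linux returns when a process tries to bind a socket to an address that is not configured on any local interface. The following is a minimal Python sketch of that failure mode, not sysinv code: the helper name `can_bind` is hypothetical, and 192.0.2.1 is a documentation-only address assumed to be unassigned on the host.

```python
import errno
import socket


def can_bind(address: str, port: int = 0) -> bool:
    """Return True if a TCP socket can be bound to address:port on this host.

    Hypothetical helper for illustration only; it is not part of sysinv.
    Port 0 asks the kernel for any free port, so only the address matters.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, port))
        except OSError as exc:
            if exc.errno == errno.EADDRNOTAVAIL:
                # The address is not assigned to any local interface,
                # which is the condition reported as Errno 99 on Linux.
                return False
            raise
    return True


if __name__ == "__main__":
    # Loopback is normally present; 192.0.2.1 (TEST-NET-1) is assumed absent.
    print("127.0.0.1 bindable:", can_bind("127.0.0.1"))
    print("192.0.2.1 bindable:", can_bind("192.0.2.1"))
```

Running a check like this against 192.168.202.1 on the affected controller would show whether that address is actually present when sysinv-api starts.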

Severity
-----------------

Major

Steps to Reproduce
-----------------

apply WRA
apply oidc apps
apply metrics server
Restore system with: ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=/home/sysadmin/" -e "ansible_become_pass=Li69nux*" -e "admin_password=Li69nux*" -e "backup_filename=localhost_platform_backup_2023_11_08_21_21_08.tgz" -e "restore_registry_filesystem=true" -e "restore_mode=optimized"
[sysadmin@controller-0 helm(keystone_admin)]$ system application-list
+--------------------------+----------+-------------------------------------------+------------------+---------+-----------+
| application | version | manifest name | manifest file | status | progress |
+--------------------------+----------+-------------------------------------------+------------------+---------+-----------+
| cert-manager | 22.12-8 | cert-manager-fluxcd-manifests | fluxcd-manifests | applied | completed |
| metrics-server | 22.12-1 | metrics-server-fluxcd-manifests | fluxcd-manifests | applied | completed |
| nginx-ingress-controller | 22.12-1 | nginx-ingress-controller-fluxcd-manifests | fluxcd-manifests | applied | completed |
| oidc-auth-apps | 22.12-6 | oidc-auth-apps-fluxcd-manifests | fluxcd-manifests | applied | completed |
| platform-integ-apps | 22.12-66 | platform-integ-apps-fluxcd-manifests | fluxcd-manifests | applied | completed |
| wr-analytics | 23.09-0 | wr-analytics-fluxcd-manifests | fluxcd-manifests | applied | completed |
+--------------------------+----------+-------------------------------------------+------------------+---------+-----------+

Expected Behavior
-----------------

Restore completes. System is healthy.

Actual Behavior
-----------------

system CLI commands fail with:

[sysadmin@controller-0 ~(keystone_admin)]$ system host-list
Unable to establish connection to http://192.168.204.1:6385/v1/ihosts: HTTPConnectionPool(host='192.168.204.1', port=6385): Max retries exceeded with url: /v1/ihosts (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f0f7125cdc0>: Failed to establish a new connection: [Errno 111] Connection refused'))
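
The "Connection refused" (Errno 111) above means nothing is listening on the sysinv API port. A small probe along the following lines can confirm that independently of the system CLI. This is only a sketch: the function name `is_listening` is hypothetical, and the address and port in the usage example are taken from the error message above.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration; it is not part of sysinv.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (Errno 111), timeouts and
        # unreachable hosts; all mean the service cannot be reached.
        return False


if __name__ == "__main__":
    # Address and port as reported by the failing 'system host-list' call.
    print("sysinv-api reachable:", is_listening("192.168.204.1", 6385))
```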

Reproducibility
-----------------

Seen once

System Configuration
-----------------

AIO-SX IPv4

wrcp-synce4-xr11

Load info (eg: 2022-03-10_20-00-07)
-----------------
2023-10-26_17-34-12

Last Pass
-----------------

Unknown on this lab

Timestamp/Logs
-----------------

/var/log/sysinv.log

skipping: [localhost] => (item=None)
skipping: [localhost] => (item=None)
skipping: [localhost] => (item=None)
skipping: [localhost]

PLAY RECAP *********************************************************************
localhost : ok=73 changed=12 unreachable=0 failed=0 skipped=91 rescued=0 ignored=0

Wednesday 08 November 2023 22:00:15 +0000 (0:00:00.132) 0:01:19.229 ****
===============================================================================
common/push-docker-images : Download images and push to local registry -- 14.33s
Create the upgrade overrides file -------------------------------------- 10.15s
<snip>
Fail if kubernetes_version is not defined ------------------------------- 0.11s
.
sysinv 2023-11-08 22:00:15.499 22718 INFO sysinv.conductor.manager [-] _upgrade_downgrade_kube_networking executing playbook: /usr/share/ansible/stx-ansible/playbooks/upgrade-k8s-networking.yml for version v1.24.4
sysinv 2023-11-08 22:00:17.823 22718 INFO sysinv.conductor.manager [-] Preparing for restore procedure.
sysinv 2023-11-08 22:00:44.081 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 ihost_patch_start_2023-11-08-22-00-44 patch
sysinv 2023-11-08 22:00:44.081 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 1. delta_handle ['action']
sysinv 2023-11-08 22:00:44.111 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 ihost check_unlock_application
sysinv 2023-11-08 22:00:44.118 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 ihost check_unlock_controller
sysinv 2023-11-08 22:00:44.290 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 ihost check_unlock_worker
sysinv 2023-11-08 22:00:44.643 14683 INFO sysinv.api.controllers.v1.host [-] Memory: Total=125979 MiB, Allocated=16144 MiB, 2M: 57989 pages None pages pending, 1G: 113 pages None pages pending
sysinv 2023-11-08 22:00:44.655 14683 INFO sysinv.api.controllers.v1.host [-] host(controller-0) node(1): vm_mem_mib=115979,
sysinv 2023-11-08 22:00:44.655 14683 INFO sysinv.api.controllers.v1.host [-] Updating mem values of host(controller-0) node(1): {'vm_hugepages_nr_4K': 28117760}
sysinv 2023-11-08 22:00:45.688 14683 INFO sysinv.common.rest_api [-] GET cmd:http://192.168.204.1:5491/v1/query_hosts/ hdr:None payload:None
sysinv 2023-11-08 22:00:45.714 14683 WARNING sysinv.common.rest_api [-] HTTP Error e.code=503 e=HTTP Error 503: Service Unavailable: urllib.error.HTTPError: HTTP Error 503: Service Unavailable
sysinv 2023-11-08 22:00:45.715 14683 WARNING sysinv.api.controllers.v1.host [-] No response from patch api controller-0 e='NoneType' object is not subscriptable
sysinv 2023-11-08 22:00:45.728 22718 INFO sysinv.conductor.kube_app [-] lifecycle hook for application platform-integ-apps (22.12-66) started {'mode': 'manual', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'mtc-action', 'extra': {'app_status': 'restore-requested', 'action': 'unlock'}}.
sysinv 2023-11-08 22:00:45.729 22718 INFO sysinv.conductor.kube_app [-] lifecycle hook for application cert-manager (22.12-8) started {'mode': 'manual', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'mtc-action', 'extra': {'app_status': 'restore-requested', 'action': 'unlock'}}.
sysinv 2023-11-08 22:00:45.729 22718 INFO sysinv.conductor.kube_app [-] lifecycle hook for application nginx-ingress-controller (22.12-1) started {'mode': 'manual', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'mtc-action', 'extra': {'app_status': 'restore-requested', 'action': 'unlock'}}.
sysinv 2023-11-08 22:00:45.730 22718 INFO sysinv.conductor.kube_app [-] lifecycle hook for application wr-analytics (23.09-0) started {'mode': 'manual', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'mtc-action', 'extra': {'app_status': 'restore-requested', 'action': 'unlock'}}.
sysinv 2023-11-08 22:00:45.740 22718 INFO sysinv.conductor.kube_app [-] lifecycle hook for application oidc-auth-apps (22.12-6) started {'mode': 'manual', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'mtc-action', 'extra': {'app_status': 'restore-requested', 'action': 'unlock'}}.
sysinv 2023-11-08 22:00:45.740 22718 INFO sysinv.conductor.kube_app [-] lifecycle hook for application metrics-server (22.12-1) started {'mode': 'manual', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'mtc-action', 'extra': {'app_status': 'restore-requested', 'action': 'unlock'}}.
sysinv 2023-11-08 22:00:45.741 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 _semantic_mtc_check_action unlock
sysinv 2023-11-08 22:00:45.742 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 action=unlock ihost_val_prenotify: {'ihost_action': 'unlock'} ihost_val: {'ihost_action': 'unlock', 'task': 'Unlocking'}
sysinv 2023-11-08 22:00:45.742 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 host.update.ihost_val_prenotify {'ihost_action': 'unlock'}
sysinv 2023-11-08 22:00:45.742 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 action_check action=unlock, notify_vim=False notify_mtce=True rc=True
sysinv 2023-11-08 22:00:45.743 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 post action_check hostupdate action=unlock notify_vim=False notify_mtc=True skip_notify_mtce=False
sysinv 2023-11-08 22:00:45.743 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 stage_action unlock
sysinv 2023-11-08 22:00:45.743 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 _handle_unlock_action
sysinv 2023-11-08 22:00:45.744 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 Action staged: unlock
sysinv 2023-11-08 22:00:45.744 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 post action_stage hostupdate action=unlock notify_vim=False notify_mtc=True skip_notify_mtce=False
sysinv 2023-11-08 22:00:45.744 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 2. delta_handle ['action']
sysinv 2023-11-08 22:00:45.745 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 post delta_handle hostupdate action=unlock notify_vim=False notify_mtc=True skip_notify_mtce=False
sysinv 2023-11-08 22:00:45.745 14683 INFO sysinv.api.controllers.v1.host [-] update ihost_val_prenotify: {'ihost_action': 'unlock'}
sysinv 2023-11-08 22:00:45.764 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 apply ihost_val {'ihost_action': 'unlock', 'task': 'Unlocking'}
sysinv 2023-11-08 22:00:45.764 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 Perform configure_ihost.
sysinv 2023-11-08 22:00:45.766 22718 INFO sysinv.conductor.manager [-] configure_ihost controller-0
sysinv 2023-11-08 22:00:46.904 22718 INFO sysinv.puppet.sssd [-] UPDATE Remote LDAP Domains
sysinv 2023-11-08 22:00:46.921 22718 INFO sysinv.puppet.puppet [-] Updating hiera for host: controller-0 with config_uuid: None
sysinv 2023-11-08 22:00:51.637 22718 INFO sysinv.puppet.kubernetes [-] get_kubernetes_join_cmd join_cmd=kubeadm join 192.168.205.1:6443 --token a2uylh.ir6uds6eaanfj73o --discovery-token-ca-cert-hash sha256:716d9ae9cbcf7a10bdb0ec6e49fa7bffb20c4ed940e046b2df4e25f23af67248 --control-plane --certificate-key 1349ceaa98186c47798e070e31b0a7ff4839c24f32b2b7295de0a7f4d8d99627 --apiserver-advertise-address 192.168.205.2 --cri-socket /var/run/containerd/containerd.sock
sysinv 2023-11-08 22:00:53.022 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 Action unlock perform notify_mtce
sysinv 2023-11-08 22:00:53.022 14683 INFO sysinv.api.controllers.v1.mtce_api [-] number of calls to rest_api_request=1 (max_retry=3)
sysinv 2023-11-08 22:00:53.022 14683 INFO sysinv.common.rest_api [-] PATCH cmd:http://localhost:2112/v1/hosts/fbf0bc19-148a-45d5-b78e-fa2a779a7e8b hdr:{'Content-type': 'application/json', 'User-Agent': 'sysinv/1.0'} payload:{"id": 1, "isystem_uuid": "d1011a79-f4e1-4ee8-b10f-b6940811dbc9", "peer_id": null, "recordtype": "standard", "hostname": "controller-0", "personality": "controller", "subfunctions": "controller,worker", "subfunction_oper": "disabled", "subfunction_avail": "online", "apparmor": "disabled", "reserved": false, "uuid": "fbf0bc19-148a-45d5-b78e-fa2a779a7e8b", "invprovision": "provisioned", "mgmt_mac": "00:00:00:00:00:00", "mgmt_ip": "192.168.204.2", "bm_ip": "128.224.52.255", "bm_mac": null, "bm_type": "dynamic", "bm_username": "sysadmin", "location": {}, "serialid": null, "administrative": "locked", "operational": "disabled", "availability": "online", "inv_state": "inventoried", "mtce_info": "{bmc_protocol:redfish}", "action": "unlock", "config_status": null, "capabilities": {"is_max_cpu_configurable": "configurable", "min_cpu_mhz_allowed": null, "max_cpu_mhz_allowed": null, "cstates_available": "C1,C1E,C6,POLL", "stor_function": "monitor"}, "clock_synchronization": "ptp", "boot_device": "/dev/disk/by-path/pci-0000:00:17.0-ata-1.0", "rootfs_device": "/dev/disk/by-path/pci-0000:00:17.0-ata-1.0", "hw_settle": "0", "install_output": "text", "console": "ttyS0,115200n8", "tboot": "", "vsc_controllers": null, "ttys_dcd": false, "software_load": "22.12", "target_load": "22.12", "install_state": null, "install_state_info": null, "iscsi_initiator_name": "iqn.1993-08.org.debian:01:381fbaa76550", "device_image_update": null, "reboot_needed": false, "max_cpu_mhz_configured": null, "max_cpu_mhz_allowed": null, "operation": "modify"}
sysinv 2023-11-08 22:00:53.073 14683 INFO sysinv.api.controllers.v1.host [-] host controller-0 ihost_patch_end_2023-11-08-22-00-53 patch
sysinv 2023-11-08 22:00:53.091 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 ihost_patch_start_2023-11-08-22-00-53 patch
sysinv 2023-11-08 22:00:53.091 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 1. delta_handle ['subfunction_avail', 'administrative', 'availability']
sysinv 2023-11-08 22:00:53.091 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 2. delta_handle ['administrative']
sysinv 2023-11-08 22:00:53.091 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 post delta_handle hostupdate action=None notify_vim=False notify_mtc=False skip_notify_mtce=False
sysinv 2023-11-08 22:00:53.148 14683 INFO sysinv.api.controllers.v1.host [-] host controller-0 ihost_patch_end_2023-11-08-22-00-53 patch
sysinv 2023-11-08 22:00:53.195 14683 INFO sysinv.api.controllers.v1.host [-] controller-0 1. delta_handle ['uptime', 'task']
sysinv 2023-11-08 22:00:58.631 13862 INFO sysinv.agent.manager [-] Agent config applied install
sysinv 2023-11-08 22:00:58.660 22718 INFO sysinv.conductor.manager [-] _remove_config_from_reboot_config_list host: fbf0bc19-148a-45d5-b78e-fa2a779a7e8b,config_uuid: install
sysinv 2023-11-08 22:00:58.675 22718 WARNING sysinv.conductor.manager [-] controller-0: iconfig out of date: target 3f912841-f53b-482b-a8ae-e6fb2c065dca, applied install
sysinv 2023-11-08 22:00:58.675 22718 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: install vs. target: 3f912841-f53b-482b-a8ae-e6fb2c065dca.
sysinv 2023-11-08 22:00:58.729 13862 INFO sysinv.agent.manager [-] iscsi initiator name = iqn.1993-08.org.debian:01:381fbaa76550
sysinv 2023-11-08 22:00:59.460 22718 INFO sysinv.conductor.manager [-] Updating platform data for host: fbf0bc19-148a-45d5-b78e-fa2a779a7e8b with: {'availability': 'available', 'config_applied': 'install', 'iscsi_initiator_name': 'iqn.1993-08.org.debian:01:381fbaa76550', 'first_report': True}
sysinv 2023-11-08 22:00:59.557 22718 INFO sysinv.conductor.manager [-] _remove_config_from_reboot_config_list host: fbf0bc19-148a-45d5-b78e-fa2a779a7e8b,config_uuid: install
sysinv 2023-11-08 22:00:59.557 22718 WARNING sysinv.conductor.manager [-] controller-0: iconfig out of date: target 3f912841-f53b-482b-a8ae-e6fb2c065dca, applied install
sysinv 2023-11-08 22:00:59.558 22718 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: install vs. target: 3f912841-f53b-482b-a8ae-e6fb2c065dca.
sysinv 2023-11-08 22:00:59.563 22718 INFO sysinv.conductor.manager [-] Evaluating apps reapply {'type': 'host-availability-updated', 'availability': 'available'}
sysinv 2023-11-08 22:00:59.572 22718 INFO sysinv.conductor.manager [-] Apps reapply order: ['platform-integ-apps', 'cert-manager', 'oidc-auth-apps']
sysinv 2023-11-08 22:00:59.573 22718 INFO sysinv.conductor.kube_app [-] lifecycle hook for application platform-integ-apps (22.12-66) started {'mode': 'auto', 'lifecycle_type': 'check', 'operation': 'evaluate-reapply', 'extra': {'trigger': {'type': 'host-availability-updated', 'availability': 'available'}}}.
sysinv 2023-11-08 22:00:59.575 22718 INFO sysinv.conductor.manager [-] Evaluating app reapply of platform-integ-apps
sysinv 2023-11-08 22:00:59.576 22718 INFO sysinv.conductor.manager [-] platform-integ-apps app active:True status:restore-requested does not warrant re-apply
sysinv 2023-11-08 22:00:59.576 22718 INFO sysinv.conductor.kube_app [-] lifecycle hook for application cert-manager (22.12-8) started {'mode': 'auto', 'lifecycle_type': 'check', 'operation': 'evaluate-reapply', 'extra': {'trigger': {'type': 'host-availability-updated', 'availability': 'available'}}}.
sysinv 2023-11-08 22:00:59.576 22718 INFO sysinv.conductor.manager [-] Evaluate reapply for cert-manager rejected: trigger field availability expected services-enabled but got available
sysinv 2023-11-08 22:00:59.577 22718 INFO sysinv.conductor.kube_app [-] lifecycle hook for application oidc-auth-apps (22.12-6) started {'mode': 'auto', 'lifecycle_type': 'check', 'operation': 'evaluate-reapply', 'extra': {'trigger': {'type': 'host-availability-updated', 'availability': 'available'}}}.
sysinv 2023-11-08 22:00:59.580 22718 INFO sysinv.conductor.manager [-] Evaluating app reapply of oidc-auth-apps
sysinv 2023-11-08 22:00:59.580 22718 INFO sysinv.conductor.manager [-] oidc-auth-apps app active:True status:restore-requested does not warrant re-apply
sysinv 2023-11-08 22:00:59.660 13862 INFO sysinv.agent.manager [-] Sysinv Agent platform update by host: {'availability': 'available', 'config_applied': 'install', 'iscsi_initiator_name': 'iqn.1993-08.org.debian:01:381fbaa76550', 'first_report': True}
sysinv 2023-11-08 22:01:25.357 13905 INFO oslo_service.service [-] Caught SIGTERM, stopping children
sysinv 2023-11-08 22:01:25.359 13905 INFO oslo.service.wsgi [-] Stopping WSGI server.
sysinv 2023-11-08 22:01:25.360 13905 INFO oslo_service.service [-] Waiting on 1 children to exit
sysinv 2023-11-08 22:01:25.361 14683 INFO oslo.service.wsgi [-] Stopping WSGI server.
sysinv 2023-11-08 22:01:25.375 13905 INFO oslo_service.service [-] Child 14683 exited with status 0
sysinv 2023-11-08 22:01:25.376 13905 INFO oslo_service.service [-] Caught SIGTERM, stopping children
sysinv 2023-11-08 22:01:25.377 13905 INFO oslo.service.wsgi [-] Stopping WSGI server.
sysinv 2023-11-08 22:01:25.377 13905 INFO oslo_service.service [-] Waiting on 1 children to exit
sysinv 2023-11-08 22:01:25.377 14684 INFO oslo.service.wsgi [-] Stopping WSGI server.
sysinv 2023-11-08 22:01:25.382 13905 INFO oslo_service.service [-] Child 14684 exited with status 0
sysinv 2023-11-08 22:01:25.845 13862 INFO sysinv.openstack.common.service [-] Caught SIGTERM, exiting
sysinv 2023-11-08 22:01:26.522 22718 INFO sysinv.openstack.common.service [-] Caught SIGTERM, exiting
sysinv 2023-11-08 22:05:31.201 3148 INFO sysinv.agent.lldp.manager [-] Configured sysinv LLDP agent drivers: ['lldpd']
sysinv 2023-11-08 22:05:31.326 3148 INFO sysinv.agent.lldp.manager [-] Loaded sysinv LLDP agent drivers: ['lldpd']
sysinv 2023-11-08 22:05:31.326 3148 INFO sysinv.agent.lldp.manager [-] Registered sysinv LLDP agent drivers: ['lldpd']
sysinv 2023-11-08 22:05:31.330 3148 INFO sysinv.agent.manager [-] _report_to_conductor initial_reports_required={'numa', 'port', 'cpu', 'memory', 'pci_device', 'lvg', 'pv', 'disk'}
sysinv 2023-11-08 22:05:31.330 3148 INFO sysinv.agent.manager [-] Sysinv Agent audit running inv_get_and_report.
sysinv 2023-11-08 22:05:31.339 3148 INFO sysinv.zmq_rpc.zmq_rpc [-] Starting zmq server at tcp://[192.168.204.2]:9502
sysinv 2023-11-08 22:05:32.190 3148 WARNING sysinv.agent.pci [-] Enabling device eno8303 to query link speed
sysinv 2023-11-08 22:05:33.770 3148 WARNING sysinv.agent.pci [-] eno8303 did not become operational after 16 attempts
sysinv 2023-11-08 22:05:33.772 3148 WARNING sysinv.agent.pci [-] Port speed detected as -1 for: eno8303 (link operstate: down)
sysinv 2023-11-08 22:05:33.772 3148 WARNING sysinv.agent.pci [-] Disabling device eno8303 after querying link speed
sysinv 2023-11-08 22:05:34.513 3148 WARNING sysinv.agent.pci [-] Enabling device eno8403 to query link speed
sysinv 2023-11-08 22:05:36.095 3148 WARNING sysinv.agent.pci [-] eno8403 did not become operational after 16 attempts
sysinv 2023-11-08 22:05:36.096 3148 WARNING sysinv.agent.pci [-] Port speed detected as -1 for: eno8403 (link operstate: down)
sysinv 2023-11-08 22:05:36.096 3148 WARNING sysinv.agent.pci [-] Disabling device eno8403 after querying link speed
sysinv 2023-11-08 22:05:36.882 3148 WARNING sysinv.agent.pci [-] Enabling device eno8503 to query link speed
sysinv 2023-11-08 22:05:38.473 3148 WARNING sysinv.agent.pci [-] eno8503 did not become operational after 16 attempts
sysinv 2023-11-08 22:05:38.475 3148 WARNING sysinv.agent.pci [-] Port speed detected as -1 for: eno8503 (link operstate: down)
sysinv 2023-11-08 22:05:38.475 3148 WARNING sysinv.agent.pci [-] Disabling device eno8503 after querying link speed
sysinv 2023-11-08 22:05:39.261 3148 WARNING sysinv.agent.pci [-] Enabling device eno8603 to query link speed
sysinv 2023-11-08 22:05:40.842 3148 WARNING sysinv.agent.pci [-] eno8603 did not become operational after 16 attempts
sysinv 2023-11-08 22:05:40.844 3148 WARNING sysinv.agent.pci [-] Port speed detected as -1 for: eno8603 (link operstate: down)
sysinv 2023-11-08 22:05:40.844 3148 WARNING sysinv.agent.pci [-] Disabling device eno8603 after querying link speed
sysinv 2023-11-08 22:05:40.917 3148 INFO sysinv.common.utils [-] Could not acquire lock(12): [Errno 11] Resource temporarily unavailable (1/5), will retry
sysinv 2023-11-08 22:05:50.297 3148 WARNING sysinv.agent.pci [-] Enabling device enp138s0f1 to query link speed
sysinv 2023-11-08 22:05:50.401 3148 WARNING sysinv.agent.pci [-] Port speed detected as -1 for: enp138s0f1 (link operstate: down)
sysinv 2023-11-08 22:05:50.402 3148 WARNING sysinv.agent.pci [-] Disabling device enp138s0f1 after querying link speed
sysinv 2023-11-08 22:05:51.184 3148 WARNING sysinv.agent.pci [-] Enabling device enp138s0f2 to query link speed
sysinv 2023-11-08 22:05:51.288 3148 WARNING sysinv.agent.pci [-] Port speed detected as -1 for: enp138s0f2 (link operstate: down)
sysinv 2023-11-08 22:05:51.289 3148 WARNING sysinv.agent.pci [-] Disabling device enp138s0f2 after querying link speed
sysinv 2023-11-08 22:05:52.802 3148 WARNING sysinv.agent.pci [-] Enabling device enp138s0f4 to query link speed
sysinv 2023-11-08 22:05:52.894 3148 WARNING sysinv.agent.pci [-] Port speed detected as -1 for: enp138s0f4 (link operstate: down)
sysinv 2023-11-08 22:05:52.895 3148 WARNING sysinv.agent.pci [-] Disabling device enp138s0f4 after querying link speed
sysinv 2023-11-08 22:05:53.691 3148 WARNING sysinv.agent.pci [-] Enabling device enp138s0f5 to query link speed
sysinv 2023-11-08 22:05:53.777 3148 WARNING sysinv.agent.pci [-] Port speed detected as -1 for: enp138s0f5 (link operstate: down)
sysinv 2023-11-08 22:05:53.779 3148 WARNING sysinv.agent.pci [-] Disabling device enp138s0f5 after querying link speed
sysinv 2023-11-08 22:05:54.551 3148 WARNING sysinv.agent.pci [-] Enabling device enp138s0f6 to query link speed
sysinv 2023-11-08 22:05:54.643 3148 WARNING sysinv.agent.pci [-] Port speed detected as -1 for: enp138s0f6 (link operstate: down)
sysinv 2023-11-08 22:05:54.644 3148 WARNING sysinv.agent.pci [-] Disabling device enp138s0f6 after querying link speed
sysinv 2023-11-08 22:05:55.412 3148 WARNING sysinv.agent.pci [-] Enabling device enp138s0f7 to query link speed
sysinv 2023-11-08 22:05:55.538 3148 WARNING sysinv.agent.pci [-] Port speed detected as -1 for: enp138s0f7 (link operstate: down)
sysinv 2023-11-08 22:05:55.539 3148 WARNING sysinv.agent.pci [-] Disabling device enp138s0f7 after querying link speed
sysinv 2023-11-08 22:06:51.742 14066 WARNING oslo_config.cfg [-] Deprecated: Option "auth_uri" from group "keystone_authtoken" is deprecated. Use option "www_authenticate_uri" from group "keystone_authtoken".
sysinv 2023-11-08 22:06:51.743 14066 WARNING oslo_config.cfg [-] Deprecated: Option "auth_uri" from group "keystone_authtoken" is deprecated for removal (The auth_uri option is deprecated in favor of www_authenticate_uri and will be removed in the S release.). Its value may be silently ignored in the future.
sysinv 2023-11-08 22:06:51.745 14066 WARNING keystonemiddleware.auth_token [-] AuthToken middleware is set with keystone_authtoken.service_token_roles_required set to False. This is backwards compatible but deprecated behaviour. Please set this to True.
sysinv 2023-11-08 22:06:51.748 14066 INFO oslo.service.wsgi [-] sysinv_api listening on 192.168.204.1:6385
sysinv 2023-11-08 22:06:51.748 14066 INFO oslo_service.service [-] Starting 1 workers
sysinv 2023-11-08 22:06:51.755 14066 WARNING keystonemiddleware._common.config [-] The option "auth_url" is not known to keystonemiddleware
sysinv 2023-11-08 22:06:51.756 14066 WARNING keystonemiddleware._common.config [-] The option "project_name" is not known to keystonemiddleware
sysinv 2023-11-08 22:06:51.756 14066 WARNING keystonemiddleware._common.config [-] The option "project_domain_name" is not known to keystonemiddleware
sysinv 2023-11-08 22:06:51.756 14066 WARNING keystonemiddleware._common.config [-] The option "username" is not known to keystonemiddleware
sysinv 2023-11-08 22:06:51.756 14066 WARNING keystonemiddleware._common.config [-] The option "user_domain_name" is not known to keystonemiddleware
sysinv 2023-11-08 22:06:51.756 14066 WARNING keystonemiddleware._common.config [-] The option "password" is not known to keystonemiddleware
sysinv 2023-11-08 22:06:51.756 14066 WARNING keystonemiddleware.auth_token [-] AuthToken middleware is set with keystone_authtoken.service_token_roles_required set to False. This is backwards compatible but deprecated behaviour. Please set this to True.
sysinv 2023-11-08 22:06:51.758 14066 ERROR oslo.service.wsgi [-] Could not bind to 192.168.202.1:6385: OSError: [Errno 99] Cannot assign requested address
sysinv 2023-11-08 22:06:51.759 14066 CRITICAL sysinv [-] Unhandled error: OSError: [Errno 99] Cannot assign requested address
2023-11-08 22:06:51.759 14066 ERROR sysinv Traceback (most recent call last):
2023-11-08 22:06:51.759 14066 ERROR sysinv   File "/bin/sysinv-api", line 10, in <module>
2023-11-08 22:06:51.759 14066 ERROR sysinv     sys.exit(main())
2023-11-08 22:06:51.759 14066 ERROR sysinv   File "/usr/lib/python3/dist-packages/sysinv/cmd/api.py", line 81, in main
2023-11-08 22:06:51.759 14066 ERROR sysinv     launcher_pxe = sysinv_pxe()
2023-11-08 22:06:51.759 14066 ERROR sysinv   File "/usr/lib/python3/dist-packages/sysinv/cmd/api.py", line 65, in sysinv_pxe
2023-11-08 22:06:51.759 14066 ERROR sysinv     server = wsgi_service.WSGIService('sysinv_api_pxe',
2023-11-08 22:06:51.759 14066 ERROR sysinv   File "/usr/lib/python3/dist-packages/sysinv/common/wsgi_service.py", line 60, in __init__
2023-11-08 22:06:51.759 14066 ERROR sysinv     self.server = wsgi.Server(CONF, name, self.app,
2023-11-08 22:06:51.759 14066 ERROR sysinv   File "/usr/lib/python3/dist-packages/oslo_service/wsgi.py", line 113, in __init__
2023-11-08 22:06:51.759 14066 ERROR sysinv     self.socket = self._get_socket(host, port, backlog)
2023-11-08 22:06:51.759 14066 ERROR sysinv   File "/usr/lib/python3/dist-packages/oslo_service/wsgi.py", line 141, in _get_socket
2023-11-08 22:06:51.759 14066 ERROR sysinv     sock = eventlet.listen(bind_addr, family, backlog=backlog)
2023-11-08 22:06:51.759 14066 ERROR sysinv   File "/usr/lib/python3/dist-packages/eventlet/convenience.py", line 78, in listen
2023-11-08 22:06:51.759 14066 ERROR sysinv     sock.bind(addr)
2023-11-08 22:06:51.759 14066 ERROR sysinv OSError: [Errno 99] Cannot assign requested address
2023-11-08 22:06:51.759 14066 ERROR sysinv
sysinv 2023-11-08 22:06:51.802 14637 INFO oslo_service.service [-] Parent process has died unexpectedly, exiting
Alarms

+----------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------+----------+----------------------------+
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+----------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------+----------+----------------------------+
| 400.001 | Service group controller-services failure; sysinv-inv(enabling, ) | service_domain=controller.service_group=controller-services.host= | critical | 2023-11-09T01:18:03.786137 |
| | | controller-0 | | |
| | | | | |
| 250.001 | controller-0 Configuration is out-of-date. (applied: install target: 3f912841-f53b-482b-a8ae-e6fb2c065dca) | host=controller-0 | major | 2023-11-08T22:07:06.089775 |
| 400.002 | Service group controller-services has no active members available; expected 1 active member | service_domain=controller.service_group=controller-services | critical | 2023-11-08T22:06:48.640423 |
| 400.002 | Service group vim-services has no active members available; expected 1 active member | service_domain=controller.service_group=vim-services | critical | 2023-11-08T22:06:47.806397 |
+----------+------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------+----------+----------------------------+

Test Activity
-----------------

Regression Testing

Workaround
-----------------

Describe workaround if available

Changed in starlingx:
assignee: nobody → Joshua Kraitberg (jkraitbe-wr)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master)
Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/902736
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/b44094601f205f11d95ae519430de4ac878f5e4c
Submitter: "Zuul (22348)"
Branch: master

commit b44094601f205f11d95ae519430de4ac878f5e4c
Author: Joshua Kraitberg <email address hidden>
Date: Mon Dec 4 15:26:48 2023 -0500

    Improve optimized restore networking robustness

    If the network configuration is not isolated, restore will not work.

    As an example, suppose you have two systems using a MGMT IP of
    192.168.101.2. If system 1 is undergoing a restore and is able to ping
    192.168.101.2 of system 2, its restore will fail because it will assume
    the MGMT IP is already available.

    To fix this, available IPs will be checked internally instead of via
    the ping command.

    TEST PLAN
    PASS: AIO-SX optimized restore on master
    PASS: subcloud AIO-SX optimized upgrade stx6 to stx8
    PASS: subcloud AIO-SX optimized restore on stx8
    PASS: subcloud AIO-SX optimized restore on stx8, after upgrade

    Closes-Bug: 2045691
    Change-Id: I45a18516ce8a193d7854d03028eb469970d074a7
    Signed-off-by: Joshua Kraitberg <email address hidden>
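
The commit message describes replacing a ping-based check with a check of the addresses configured on the host itself. The actual change is in the linked review and is not reproduced here; the sketch below only illustrates the general idea under stated assumptions. It parses the JSON output of the iproute2 command `ip -json addr show`, and the function names `addresses_from_ip_json` and `is_assigned_locally` are hypothetical.

```python
import json
import subprocess


def addresses_from_ip_json(ip_json: str) -> set:
    """Collect every address listed in `ip -json addr show` output.

    Hypothetical helper for illustration; not taken from ansible-playbooks.
    """
    found = set()
    for link in json.loads(ip_json):
        for info in link.get("addr_info", []):
            if "local" in info:
                found.add(info["local"])
    return found


def is_assigned_locally(address: str) -> bool:
    """Return True if address is configured on an interface of this host.

    Unlike ping, this cannot be fooled by another system on a
    non-isolated network that answers for the same IP.
    """
    result = subprocess.run(
        ["ip", "-json", "addr", "show"],
        check=True,
        capture_output=True,
        text=True,
    )
    return address in addresses_from_ip_json(result.stdout)


if __name__ == "__main__":
    # 192.168.101.2 is the example MGMT IP from the commit message.
    print(is_assigned_locally("192.168.101.2"))
```

A ping to 192.168.101.2 succeeds whenever any reachable system holds that address, whereas the local lookup is true only when the restoring host itself has it configured.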

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.9.0 stx.networking stx.update