Node servicing tear down fails

Bug #2069413 reported by Przemyslaw Szczerbik
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Ironic
Fix Released
Medium
Przemyslaw Szczerbik

Bug Description

Description:

Node servicing tear down fails due to missing PXE config file when [pxe]enable_netboot_fallback option is enabled.

Actual result:

Node servicing step is executed sucessfully, but because tear down fails the node remains in 'service failed' state.

Expected result:

Node servicing is executed successfully and node returns to active state.

Steps to reproduce:
1. Provision a node with [pxe]enable_netboot_fallback = true in Ironic configuration
2. Execute node servicing with any step. Example:
[
  {
    "interface": "bios",
    "step": "apply_configuration",
    "args": {
      "settings": [
        {
          "name": "option-name",
          "value": "option-value"
        }
      ]
    }
  }
]
3. Check node status / Ironic logs

Logs:

2024-06-14 09:48:27.094 1 INFO ironic.conductor.utils [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Successfully set node 42ebe99c-7aee-419e-b5d2-b0148d3c28ce power state to power off by power off.ESC[00m
2024-06-14 09:48:27.095 1 DEBUG ironic.common.pxe_utils [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Cleaning up PXE config for node 42ebe99c-7aee-419e-b5d2-b0148d3c28ce clean_up_pxe_config /usr/lib/python3.9/site-packages/ironic/common/pxe_utils.py:416ESC[00m
2024-06-14 09:48:27.095 1 DEBUG ironic.drivers.modules.boot_mode_utils [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Deploy boot mode is bios for 42ebe99c-7aee-419e-b5d2-b0148d3c28ce. get_boot_mode_for_deploy /usr/lib/python3.9/site-packages/ironic/drivers/modules/boot_mode_utils.py:250ESC[00m
2024-06-14 09:48:27.104 1 DEBUG ironic.drivers.modules.image_cache [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Starting clean up for master image cache /shared/tftpboot/master_images clean_up /usr/lib/python3.9/site-packages/ironic/drivers/modules/image_cache.py:208ESC[00m
2024-06-14 09:48:27.106 1 DEBUG sushy.connector [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] HTTP request: GET http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970; headers: {'OData-Version': '4.0', 'Accept-Encoding': 'identity'}; body: None; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/p
ython3.9/site-packages/sushy/connector.py:154ESC[00m
2024-06-14 09:48:27.127 1 DEBUG sushy.connector [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] HTTP response for GET http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970: status code: 200 _op /usr/lib/python3.9/site-packages/sushy/connector.py:303ESC[00m
2024-06-14 09:48:27.128 1 DEBUG sushy.resources.base [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Received representation of System /redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970: {'_actions': {'reset': {'allowed_values': ['On', 'ForceOff', 'GracefulShutdown', 'GracefulRestart', 'ForceRestart', 'Nmi', 'ForceOn'], 'operation_ap
ply_time_support': None, 'target_uri': '/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970/Actions/ComputerSystem.Reset'}}, '_oem_vendors': None, '_settings': None, 'asset_tag': None, 'bios_version': None, 'boot': {'allowed_values': ['Pxe', 'Cd', 'Hdd'], 'enabled': <BootSourceOverrideEnabled.CONTINUOUS: 'Continuous'>, 'http_boot_uri': None, 'mod
e': None, 'target': <BootSource.PXE: 'Pxe'>}, 'boot_progress': None, 'description': None, 'hostname': None, 'identity': 'f6d7090b-71b3-4dbf-b5c6-0615118c1970', 'indicator_led': <IndicatorLED.LIT: 'Lit'>, 'links': {'oem_vendors': None}, 'maintenance_window': None, 'manufacturer': 'Sushy Emulator', 'memory_summary': {'health': None, 'size_gib': 16}, 'na
me': 'gh-node-1', 'part_number': None, 'power_state': <PowerState.OFF: 'Off'>, 'serial_number': None, 'sku': None, 'status': {'health': <Health.OK: 'OK'>, 'health_rollup': None, 'state': <State.ENABLED: 'Enabled'>}, 'system_type': None, 'uuid': 'f6d7090b-71b3-4dbf-b5c6-0615118c1970'} refresh /usr/lib/python3.9/site-packages/sushy/resources/base.py:694
ESC[00m
2024-06-14 09:48:27.128 1 DEBUG ironic.drivers.modules.boot_mode_utils [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Deploy boot mode is bios for 42ebe99c-7aee-419e-b5d2-b0148d3c28ce. get_boot_mode_for_deploy /usr/lib/python3.9/site-packages/ironic/drivers/modules/boot_mode_utils.py:250ESC[00m
2024-06-14 09:48:27.129 1 DEBUG sushy.connector [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] HTTP request: GET http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970; headers: {'OData-Version': '4.0', 'Accept-Encoding': 'identity'}; body: None; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/p
ython3.9/site-packages/sushy/connector.py:154ESC[00m
2024-06-14 09:48:27.151 1 DEBUG sushy.connector [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] HTTP response for GET http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970: status code: 200 _op /usr/lib/python3.9/site-packages/sushy/connector.py:303ESC[00m
2024-06-14 09:48:27.152 1 DEBUG sushy.resources.base [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Received representation of System /redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970: {'_actions': {'reset': {'allowed_values': ['On', 'ForceOff', 'GracefulShutdown', 'GracefulRestart', 'ForceRestart', 'Nmi', 'ForceOn'], 'operation_ap
ply_time_support': None, 'target_uri': '/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970/Actions/ComputerSystem.Reset'}}, '_oem_vendors': None, '_settings': None, 'asset_tag': None, 'bios_version': None, 'boot': {'allowed_values': ['Pxe', 'Cd', 'Hdd'], 'enabled': <BootSourceOverrideEnabled.CONTINUOUS: 'Continuous'>, 'http_boot_uri': None, 'mod
e': None, 'target': <BootSource.PXE: 'Pxe'>}, 'boot_progress': None, 'description': None, 'hostname': None, 'identity': 'f6d7090b-71b3-4dbf-b5c6-0615118c1970', 'indicator_led': <IndicatorLED.LIT: 'Lit'>, 'links': {'oem_vendors': None}, 'maintenance_window': None, 'manufacturer': 'Sushy Emulator', 'memory_summary': {'health': None, 'size_gib': 16}, 'na
me': 'gh-node-1', 'part_number': None, 'power_state': <PowerState.OFF: 'Off'>, 'serial_number': None, 'sku': None, 'status': {'health': <Health.OK: 'OK'>, 'health_rollup': None, 'state': <State.ENABLED: 'Enabled'>}, 'system_type': None, 'uuid': 'f6d7090b-71b3-4dbf-b5c6-0615118c1970'} refresh /usr/lib/python3.9/site-packages/sushy/resources/base.py:694
ESC[00m
2024-06-14 09:48:27.153 1 DEBUG sushy.connector [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] HTTP request: GET http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970; headers: {'OData-Version': '4.0', 'Accept-Encoding': 'identity'}; body: None; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/p
ython3.9/site-packages/sushy/connector.py:154ESC[00m
2024-06-14 09:48:27.176 1 DEBUG sushy.connector [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] HTTP response for GET http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970: status code: 200 _op /usr/lib/python3.9/site-packages/sushy/connector.py:303ESC[00m
2024-06-14 09:48:27.176 1 DEBUG sushy.connector [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] HTTP request: PATCH http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970; headers: {'Content-Type': 'application/json', 'OData-Version': '4.0', 'Accept-Encoding': 'identity'}; body: {'Boot': {'BootSourceOverrideMode':
 'Legacy'}}; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/python3.9/site-packages/sushy/connector.py:154ESC[00m
2024-06-14 09:48:27.374 1 DEBUG sushy.exceptions [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] HTTP response for PATCH http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970: status code: 500, error: Base.1.0.GeneralError: Error changing boot mode at libvirt URI "qemu:///system": no loader path specified and fir
mware auto selection disabled, extended: [{'@odata.type': '/redfish/v1/$metadata#Message.1.0.0.Message', 'MessageId': 'Base.1.0.GeneralError'}] __init__ /usr/lib/python3.9/site-packages/sushy/exceptions.py:122ESC[00m
2024-06-14 09:48:27.374 1 WARNING sushy.connector [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Initial request with eTag failed: HTTP PATCH http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970 returned code 500. Base.1.0.GeneralError: Error changing boot mode at libvirt URI "qemu:///system": no loader path sp
ecified and firmware auto selection disabled Extended information: [{'@odata.type': '/redfish/v1/$metadata#Message.1.0.0.Message', 'MessageId': 'Base.1.0.GeneralError'}]: sushy.exceptions.ServerSideError: HTTP PATCH http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970 returned code 500. Base.1.0.GeneralError: Error changing
 boot mode at libvirt URI "qemu:///system": no loader path specified and firmware auto selection disabled Extended information: [{'@odata.type': '/redfish/v1/$metadata#Message.1.0.0.Message', 'MessageId': 'Base.1.0.GeneralError'}]ESC[00m
2024-06-14 09:48:27.375 1 ERROR ironic.drivers.modules.redfish.management [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Setting boot mode to bios failed for node 42ebe99c-7aee-419e-b5d2-b0148d3c28ce. Error: HTTP PATCH http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4dbf-b5c6-0615118c1970 returned code 500. Base.1.0.GeneralError
: Error changing boot mode at libvirt URI "qemu:///system": no loader path specified and firmware auto selection disabled Extended information: [{'@odata.type': '/redfish/v1/$metadata#Message.1.0.0.Message', 'MessageId': 'Base.1.0.GeneralError'}]: sushy.exceptions.ServerSideError: HTTP PATCH http://<ip>:<port>/redfish/v1/Systems/f6d7090b-71b3-4d
bf-b5c6-0615118c1970 returned code 500. Base.1.0.GeneralError: Error changing boot mode at libvirt URI "qemu:///system": no loader path specified and firmware auto selection disabled Extended information: [{'@odata.type': '/redfish/v1/$metadata#Message.1.0.0.Message', 'MessageId': 'Base.1.0.GeneralError'}]ESC[00m
2024-06-14 09:48:27.375 1 INFO ironic.drivers.modules.redfish.management [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Attempt to set boot mode on node 42ebe99c-7aee-419e-b5d2-b0148d3c28ce failed to set boot mode as the node does not appear to support overriding the boot mode. Possibly partial Redfish implementation?ESC[00m
2024-06-14 09:48:27.376 1 DEBUG ironic.drivers.modules.boot_mode_utils [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Baremetal node 42ebe99c-7aee-419e-b5d2-b0148d3c28ce boot mode is not set to boot mode bios. Assuming baremetal node is already in bios or driver set boot mode via some other mechanism: Driver redfish does not support set_b
oot_mode (disabled or not implemented). _set_boot_mode_on_bm /usr/lib/python3.9/site-packages/ironic/drivers/modules/boot_mode_utils.py:52ESC[00m
2024-06-14 09:48:27.377 1 DEBUG ironic.drivers.modules.boot_mode_utils [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Deploy boot mode is bios for 42ebe99c-7aee-419e-b5d2-b0148d3c28ce. get_boot_mode_for_deploy /usr/lib/python3.9/site-packages/ironic/drivers/modules/boot_mode_utils.py:250ESC[00m
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Failed to tear down from service for node 42ebe99c-7aee-419e-b5d2-b0148d3c28ce, reason: [Errno 2] No such file or directory: '/shared/html/42ebe99c-7aee-419e-b5d2-b0148d3c28ce/config': FileNotFoundError: [Errno 2] No such file or director
y: '/shared/html/42ebe99c-7aee-419e-b5d2-b0148d3c28ce/config'
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils Traceback (most recent call last):
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/conductor/servicing.py", line 223, in _tear_down_node_service
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils task.driver.deploy.tear_down_service(task)
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic_lib/metrics.py", line 60, in wrapped
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils result = f(*args, **kwargs)
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/agent_base.py", line 827, in tear_down_service
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils deploy_utils.tear_down_inband_service(
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/deploy_utils.py", line 830, in tear_down_inband_service
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils task.driver.boot.prepare_instance(task)
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic_lib/metrics.py", line 60, in wrapped
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils result = f(*args, **kwargs)
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/pxe_base.py", line 314, in prepare_instance
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils pxe_utils.build_service_pxe_config(
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/common/pxe_utils.py", line 1033, in build_service_pxe_config
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils deploy_utils.switch_pxe_config(
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/deploy_utils.py", line 181, in switch_pxe_config
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils _replace_boot_line(path, boot_mode, is_whole_disk_image, iscsi_boot,
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/deploy_utils.py", line 145, in _replace_boot_line
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils _replace_lines_in_file(path, pattern, boot_line)
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/deploy_utils.py", line 107, in _replace_lines_in_file
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils with open(path) as f:
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils FileNotFoundError: [Errno 2] No such file or directory: '/shared/html/42ebe99c-7aee-419e-b5d2-b0148d3c28ce/config'
2024-06-14 09:48:27.377 1 ERROR ironic.conductor.utils ESC[00m
2024-06-14 09:48:27.407 1 DEBUG ironic.common.states [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Exiting old state 'servicing' in response to event 'fail' on_exit /usr/lib/python3.9/site-packages/ironic/common/states.py:360ESC[00m
2024-06-14 09:48:27.408 1 DEBUG ironic.common.states [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Entering new state 'service failed' in response to event 'fail' on_enter /usr/lib/python3.9/site-packages/ironic/common/states.py:366ESC[00m
2024-06-14 09:48:27.424 1 ERROR ironic.conductor.task_manager [None req-e0711340-c627-4608-af03-10d9353e2e11 - - - - - -] Node 42ebe99c-7aee-419e-b5d2-b0148d3c28ce moved to provision state "service failed" from state "servicing"; target provision state is "active": FileNotFoundError: [Errno 2] No such file or directory: '/shared/html/42ebe99c-7aee-419
e-b5d2-b0148d3c28ce/config'ESC[00m

Tags: pxe bug triaged
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/ironic/+/922006

Changed in ironic:
status: New → In Progress
Changed in ironic:
assignee: nobody → Przemyslaw Szczerbik (pszczerbik)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/ironic/+/922024

Changed in ironic:
importance: Undecided → Medium
tags: added: bug
tags: added: triaged
tags: added: pxe
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/922006
Committed: https://opendev.org/openstack/ironic/commit/78f54dfefa6e184c8e77f0c42682e6b3a02199fe
Submitter: "Zuul (22348)"
Branch: master

commit 78f54dfefa6e184c8e77f0c42682e6b3a02199fe
Author: Przemyslaw Szczerbik <email address hidden>
Date: Fri May 24 02:23:28 2024 -0700

    Build PXE config for node in SERVICING state

    When [pxe]enable_netboot_fallback option is enabled, it's necessary to
    build PXE config for nodes in SERVICING provisioning state. Otherwise
    node servicing tear down will fail and node will be placed into
    servicing failed state.

    Closes-Bug: #2069413

    Change-Id: Ib00504563f9fa7bed99a0fa1949ac99ea6870875
    Signed-off-by: Przemyslaw Szczerbik <email address hidden>

Changed in ironic:
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.