Xena: Managed inspection fails with error "Data too long for column 'user' at row 1"

Bug #2054594 reported by Michael Sherman
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ironic
Fix Released
High
Afonne-CID

Bug Description

When running managed, in-band inspection with ironic inspector, we observe failures on some, but not all, nodes.

We trigger an inspection from the CLI via, e.g. "openstack baremetal node inspect p3-ssd-007"
After that point, the node stays in state "inspecting" until ironic_conductor is restarted. At that point, the node moves from "inspecting" to "error".

In the logs, we observe:

ironic.common.exception.HardwareInspectionFailure: Failed to inspect hardware. Reason: unable to start inspection: HttpException: 500: Server Error for url: http://10.140.81.2:5050/v1/introspection/45dfcaba-0816-46b7-aa50-c62a32b1c1df?manage_boot=False, Internal server error\n']: oslo_db.exception.DBDataError: (pymysql.err.DataError) (1406, "Data too long for column 'user' at row 1")

My guess, but I cannot confirm, is that this is caused by the 32 character limit for the "user" field in the node history model? https://github.com/openstack/ironic/blob/febf9d895d9d4c62a713e88a3d2af089e8831094/ironic/db/sqlalchemy/models.py#L436

Our keystone is using 64 character UUIDs for users, with OIDC federated login.

We are running Ironic and Ironic-Conductor built with Kolla, and deployed with Kolla-ansible, with the respective codebases matching the "stable/xena" branch.
We are using networking-generic-switch for tenant networking, and the same network is used for cleaning, provisioning, inspecting, and rescuing.

Ironic-conductor logs for the relevant request-id:

```
/var/log/kolla/ironic/ironic-conductor.log:2024-02-21 10:07:27.964 7 INFO ironic.conductor.task_manager [req-5a43efbc-6f16-4baa-8616-856ffdf1a7fc d34b62393cdbd07a628635f223217d27640b8a074d04690bbcc6631151539fee 570aad8999f7499db99eae22fe9b29bb - cc362e54175b4798956a3b3d1fb53384 default] Node 45dfcaba-0816-46b7-aa50-c62a32b1c1df moved to provision state "inspecting" from state "manageable"; target provision state is "manageable"
/var/log/kolla/ironic/ironic-conductor.log:2024-02-21 10:07:28.389 7 WARNING ironic.conductor.utils [req-5a43efbc-6f16-4baa-8616-856ffdf1a7fc d34b62393cdbd07a628635f223217d27640b8a074d04690bbcc6631151539fee 570aad8999f7499db99eae22fe9b29bb - cc362e54175b4798956a3b3d1fb53384 default] Not going to change node 45dfcaba-0816-46b7-aa50-c62a32b1c1df power state because current state = requested state = 'power off'.
/var/log/kolla/ironic/ironic-conductor.log:2024-02-21 10:07:28.585 7 INFO ironic.common.neutron [req-5a43efbc-6f16-4baa-8616-856ffdf1a7fc d34b62393cdbd07a628635f223217d27640b8a074d04690bbcc6631151539fee 570aad8999f7499db99eae22fe9b29bb - cc362e54175b4798956a3b3d1fb53384 default] Successfully removed node 45dfcaba-0816-46b7-aa50-c62a32b1c1df neutron ports.
/var/log/kolla/ironic/ironic-conductor.log:2024-02-21 10:07:28.586 7 INFO ironic.drivers.modules.network.neutron [req-5a43efbc-6f16-4baa-8616-856ffdf1a7fc d34b62393cdbd07a628635f223217d27640b8a074d04690bbcc6631151539fee 570aad8999f7499db99eae22fe9b29bb - cc362e54175b4798956a3b3d1fb53384 default] Adding inspection network to node 45dfcaba-0816-46b7-aa50-c62a32b1c1df
/var/log/kolla/ironic/ironic-conductor.log:2024-02-21 10:07:45.691 7 INFO ironic.common.neutron [req-5a43efbc-6f16-4baa-8616-856ffdf1a7fc d34b62393cdbd07a628635f223217d27640b8a074d04690bbcc6631151539fee 570aad8999f7499db99eae22fe9b29bb - cc362e54175b4798956a3b3d1fb53384 default] For node 45dfcaba-0816-46b7-aa50-c62a32b1c1df in network a6bdcfb7-08d8-4e14-8b60-0fc9d3e8ce6c, successfully created ports (ironic ID: neutron ID): {'7fe76da6-c008-4f31-aa9c-097c7929016f': 'bbb9ceb8-6d68-4a0d-aa68-238720311eec', 'e9203c47-7e62-4ca7-b4a1-b2707f2d05ef': '0eca3c91-2fab-4ba3-ba91-0ed307b4fc5b'}.
/var/log/kolla/ironic/ironic-conductor.log:2024-02-21 10:07:52.826 7 ERROR ironic.drivers.modules.inspector [req-5a43efbc-6f16-4baa-8616-856ffdf1a7fc d34b62393cdbd07a628635f223217d27640b8a074d04690bbcc6631151539fee 570aad8999f7499db99eae22fe9b29bb - cc362e54175b4798956a3b3d1fb53384 default] Unable to start managed inspection for node 45dfcaba-0816-46b7-aa50-c62a32b1c1df: HttpException: 500: Server Error for url: http://10.140.81.2:5050/v1/introspection/45dfcaba-0816-46b7-aa50-c62a32b1c1df?manage_boot=False, Internal server error: openstack.exceptions.HttpException: HttpException: 500: Server Error for url: http://10.140.81.2:5050/v1/introspection/45dfcaba-0816-46b7-aa50-c62a32b1c1df?manage_boot=False, Internal server error
/var/log/kolla/ironic/ironic-conductor.log:2024-02-21 10:07:52.944 7 INFO ironic.drivers.modules.network.neutron [req-5a43efbc-6f16-4baa-8616-856ffdf1a7fc d34b62393cdbd07a628635f223217d27640b8a074d04690bbcc6631151539fee 570aad8999f7499db99eae22fe9b29bb - cc362e54175b4798956a3b3d1fb53384 default] Removing ports from inspection network for node 45dfcaba-0816-46b7-aa50-c62a32b1c1df
/var/log/kolla/ironic/ironic-conductor.log:2024-02-21 10:08:08.076 7 INFO ironic.common.neutron [req-5a43efbc-6f16-4baa-8616-856ffdf1a7fc d34b62393cdbd07a628635f223217d27640b8a074d04690bbcc6631151539fee 570aad8999f7499db99eae22fe9b29bb - cc362e54175b4798956a3b3d1fb53384 default] Successfully removed node 45dfcaba-0816-46b7-aa50-c62a32b1c1df neutron ports.
/var/log/kolla/ironic/ironic-conductor.log:2024-02-21 10:08:08.321 7 WARNING ironic.conductor.utils [req-5a43efbc-6f16-4baa-8616-856ffdf1a7fc d34b62393cdbd07a628635f223217d27640b8a074d04690bbcc6631151539fee 570aad8999f7499db99eae22fe9b29bb - cc362e54175b4798956a3b3d1fb53384 default] Not going to change node 45dfcaba-0816-46b7-aa50-c62a32b1c1df power state because current state = requested state = 'power off'.: openstack.exceptions.HttpException: HttpException: 500: Server Error for url: http://10.140.81.2:5050/v1/introspection/45dfcaba-0816-46b7-aa50-c62a32b1c1df?manage_boot=False, Internal server error
/var/log/kolla/ironic/ironic-conductor.log:2024-02-21 10:08:08.329 7 ERROR root [req-5a43efbc-6f16-4baa-8616-856ffdf1a7fc d34b62393cdbd07a628635f223217d27640b8a074d04690bbcc6631151539fee 570aad8999f7499db99eae22fe9b29bb - cc362e54175b4798956a3b3d1fb53384 default] Original exception being dropped: ['Traceback (most recent call last):\n', ' File "/var/lib/kolla/venv/lib/python3.8/site-packages/ironic/drivers/modules/inspector.py", line 211, in _start_managed_inspection\n client.start_introspection(task.node.uuid, manage_boot=False)\n', ' File "/var/lib/kolla/venv/lib/python3.8/site-packages/openstack/baremetal_introspection/v1/_proxy.py", line 75, in start_introspection\n return res.create(self, **kwargs)\n', ' File "/var/lib/kolla/venv/lib/python3.8/site-packages/openstack/resource.py", line 1364, in create\n self._translate_response(response, has_body=has_body)\n', ' File "/var/lib/kolla/venv/lib/python3.8/site-packages/openstack/resource.py", line 1177, in _translate_response\n exceptions.raise_from_response(response, error_message=error_message)\n', ' File "/var/lib/kolla/venv/lib/python3.8/site-packages/openstack/exceptions.py", line 236, in raise_from_response\n raise cls(\n', 'openstack.exceptions.HttpException: HttpException: 500: Server Error for url: http://10.140.81.2:5050/v1/introspection/45dfcaba-0816-46b7-aa50-c62a32b1c1df?manage_boot=False, Internal server error\n', '\nDuring handling of the above exception, another exception occurred:\n\n', 'Traceback (most recent call last):\n', ' File "/var/lib/kolla/venv/lib/python3.8/site-packages/ironic/conductor/manager.py", line 3797, in _do_inspect_hardware\n new_state = task.driver.inspect.inspect_hardware(task)\n', ' File "/var/lib/kolla/venv/lib/python3.8/site-packages/ironic/drivers/modules/inspector.py", line 283, in inspect_hardware\n _start_managed_inspection(task)\n', ' File "/var/lib/kolla/venv/lib/python3.8/site-packages/ironic/drivers/modules/inspector.py", line 217, in _start_managed_inspection\n _inspection_error_handler(task, error, raise_exc=True)\n', ' File "/var/lib/kolla/venv/lib/python3.8/site-packages/ironic/drivers/modules/inspector.py", line 148, in _inspection_error_handler\n raise exception.HardwareInspectionFailure(error=error)\n', 'ironic.common.exception.HardwareInspectionFailure: Failed to inspect hardware. Reason: unable to start inspection: HttpException: 500: Server Error for url: http://10.140.81.2:5050/v1/introspection/45dfcaba-0816-46b7-aa50-c62a32b1c1df?manage_boot=False, Internal server error\n']: oslo_db.exception.DBDataError: (pymysql.err.DataError) (1406, "Data too long for column 'user' at row 1")
```

Revision history for this message
Michael Sherman (msherman-uchicago) wrote :
Dmitry Tantsur (divius)
Changed in ironic:
status: New → Triaged
importance: Undecided → High
Afonne-CID (cidelight)
Changed in ironic:
assignee: nobody → Afonne-CID (cidelight)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/ironic/+/910973

Changed in ironic:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.opendev.org/c/openstack/ironic/+/910973
Committed: https://opendev.org/openstack/ironic/commit/ef8bca007ddcbd10d58e83b1cc73533515bdb180
Submitter: "Zuul (22348)"
Branch: master

commit ef8bca007ddcbd10d58e83b1cc73533515bdb180
Author: CID <email address hidden>
Date: Mon Mar 4 17:11:31 2024 +0000

    Fix data length exceeding limit error

    This commit increases the length of the 'user' column to
    accommodate longer UUIDs, ensuring that the full user UUIDs are stored
    without exceeding the column limit.

    Closes-Bug: #2054594
    Change-Id: I59b435ca2bb5850bb2338228b64868c2003bfea3

Changed in ironic:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ironic 24.1.0

This issue was fixed in the openstack/ironic 24.1.0 release.

Revision history for this message
Michael Sherman (msherman-uchicago) wrote :

Are there plans to backport this to antelope? I can confirm the error is present in antelope/2023.1 as well.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

Hey Michael,

Since this change included a DB migration it's not possible to backport it. I'm feeling that pain right now too.

You can likely manually migrate the column to get a workaround in the meantime.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.