This has been seen on an OVB deployment. The overcloud-prep-images role runs forever:
TASK [overcloud-prep-images : Create overcloud prep-images script] ************************************************************************************************************************************************
task path: /var/tmp/tripleo_local/usr/local/share/ansible/roles/overcloud-prep-images/tasks/create-scripts.yml:3
Thursday 15 February 2018 12:50:36 +0100 (0:00:00.137) 1:08:59.809 *****
changed: [undercloud] => {"changed": true, "checksum": "dc3c4046c781a587c2f44612ce51b5ec3868c454", "dest": "/home/stack/overcloud-prep-images.sh", "failed": false, "gid": 1001, "group": "stack", "md5sum": "df5b1b0edf886e830819ec8b6b7e833f", "mode": "0755", "owner": "stack", "secontext": "unconfined_u:object_r:user_home_t:s0", "size": 643, "src": "/home/stack/.ansible/tmp/ansible-tmp-1518695436.2-236182804244208/source", "state": "file", "uid": 1001}
TASK [overcloud-prep-images : Prepare the overcloud images for deploy] ********************************************************************************************************************************************
task path: /var/tmp/tripleo_local/usr/local/share/ansible/roles/overcloud-prep-images/tasks/overcloud-prep-images.yml:1
Thursday 15 February 2018 12:50:39 +0100 (0:00:03.191) 1:09:03.001 *****
At the same time, overcloud_prep_images.log contains:
2018-02-15 11:52:44 | Started Mistral Workflow tripleo.baremetal.v1.register_or_update. Execution ID: 43b583d1-aa2f-4184-b81b-25936c33ea8d
2018-02-15 11:52:44 |
2018-02-15 11:52:44 |
2018-02-15 11:52:44 | Nodes set to managed.
2018-02-15 11:52:44 | Successfully registered node UUID 506b96b4-e736-4c02-8229-e0d5efbcb0ee
2018-02-15 11:52:44 | Successfully registered node UUID d5f346a7-af70-40b7-9d7b-9a4390383594
2018-02-15 11:52:44 | + openstack overcloud node introspect --all-manageable
2018-02-15 11:52:47 | Waiting for messages on queue 'tripleo' with no timeout.
2018-02-15 12:17:07 | Waiting for introspection to finish...
2018-02-15 12:17:07 | Started Mistral Workflow tripleo.baremetal.v1.introspect_manageable_nodes. Execution ID: 1c12d0e7-18ed-4878-b0c1-6620d01481d0
2018-02-15 12:17:07 | Introspection of node d5f346a7-af70-40b7-9d7b-9a4390383594 timed out.
2018-02-15 12:17:07 | Introspection of node 506b96b4-e736-4c02-8229-e0d5efbcb0ee timed out.
2018-02-15 12:17:07 | Retrying 2 nodes that failed introspection. Attempt 2 of 3
2018-02-15 12:17:07 | Introspection of node 506b96b4-e736-4c02-8229-e0d5efbcb0ee completed. Status:SUCCESS. Errors:None
2018-02-15 12:17:07 | Introspection of node d5f346a7-af70-40b7-9d7b-9a4390383594 completed. Status:SUCCESS. Errors:None
2018-02-15 12:17:07 | Successfully introspected nodes.
2018-02-15 12:17:07 | Nodes introspected successfully.
2018-02-15 12:17:07 | Introspection completed.
2018-02-15 12:17:07 | + openstack overcloud node provide --all-manageable
2018-02-15 12:17:10 | Waiting for messages on queue 'tripleo' with no timeout.
In the ironic app.log, I found:
2018-02-15 12:17:12.873 29033 DEBUG wsme.api [req-2d3878c7-04e7-4c03-ab0a-58265156b970 d084dc55b42346e6a2b036a088e2be7d 410a77c57a134350ab430e7886ac3256 - default default] Client-side error: Node d5f346a7-af70-40b7-9d7b-9a4390383594 is locked by host localhost.localdomain, please retry after the current operation is completed.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 226, in inner
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/ironic/conductor/manager.py", line 1385, in do_provisioning_action
    % action) as task:
  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 168, in acquire
    driver_name=driver_name, purpose=purpose)
  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 244, in __init__
    self.release_resources()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 227, in __init__
    self._lock()
  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 276, in _lock
    reserve_node()
  File "/usr/lib/python2.7/site-packages/retrying.py", line 68, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)
  File "/usr/lib/python2.7/site-packages/retrying.py", line 229, in call
    raise attempt.get()
  File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py", line 269, in reserve_node
    self.node_id)
  File "/usr/lib/python2.7/site-packages/ironic/objects/node.py", line 290, in reserve
    db_node = cls.dbapi.reserve_node(tag, node_id)
  File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 147, in wrapper
    ectxt.value = e.inner_exc
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 135, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/ironic/db/sqlalchemy/api.py", line 298, in reserve_node
    host=node['reservation'])
NodeLocked: Node d5f346a7-af70-40b7-9d7b-9a4390383594 is locked by host localhost.localdomain, please retry after the current operation is completed.
format_exception /usr/lib/python2.7/site-packages/wsme/api.py:222
Apparently, this results in Mistral errors:
{"error_message": "{\"debuginfo\":null,\"faultcode\":\"Client\",\"faultstring\":\"Node d5f346a7-af70-40b7-9d7b-9a4390383594 is locked by host localhost.localdomain, please retry after the current operation is completed.\"}"}
log_http_response /usr/lib/python2.7/site-packages/ironicclient/common/http.py:292
2018-02-15 12:17:33.584 31770 ERROR ironicclient.common.http [req-df7226ad-d563-4dcd-9e81-dc2a9b65b562 d084dc55b42346e6a2b036a088e2be7d 410a77c57a134350ab430e7886ac3256 - default default] Error contacting Ironic server: Node d5f346a7-af70-40b7-9d7b-9a4390383594 is locked by host localhost.localdomain, please retry after the current operation is completed. (HTTP 409). Attempt 6 of 6: Conflict: Node d5f346a7-af70-40b7-9d7b-9a4390383594 is locked by host localhost.localdomain, please retry after the current operation is completed. (HTTP 409)
2018-02-15 12:17:33.589 31770 WARNING mistral.actions.openstack.base [req-df7226ad-d563-4dcd-9e81-dc2a9b65b562 d084dc55b42346e6a2b036a088e2be7d 410a77c57a134350ab430e7886ac3256 - default default] Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/mistral/actions/openstack/base.py", line 117, in run
    result = method(**self._kwargs_for_run)
  File "/usr/lib/python2.7/site-packages/ironicclient/v1/node.py", line 513, in set_provision_state
    return self.update(path, body, http_method='PUT')
  File "/usr/lib/python2.7/site-packages/ironicclient/v1/node.py", line 336, in update
    method=http_method)
  File "/usr/lib/python2.7/site-packages/ironicclient/common/base.py", line 191, in _update
    resp, body = self.api.json_request(method, url, body=patch)
  File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 395, in json_request
    resp, body_iter = self._http_request(url, method, **kwargs)
  File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 188, in wrapper
    return func(self, url, method, **kwargs)
  File "/usr/lib/python2.7/site-packages/ironicclient/common/http.py", line 376, in _http_request
    error_json.get('debuginfo'), method, url)
Conflict: Node d5f346a7-af70-40b7-9d7b-9a4390383594 is locked by host localhost.localdomain, please retry after the current operation is completed. (HTTP 409)
: Conflict: Node d5f346a7-af70-40b7-9d7b-9a4390383594 is locked by host localhost.localdomain, please retry after the current operation is completed. (HTTP 409)
2018-02-15 12:17:33.589 31770 ERROR mistral.executors.default_executor [req-df7226ad-d563-4dcd-9e81-dc2a9b65b562 d084dc55b42346e6a2b036a088e2be7d 410a77c57a134350ab430e7886ac3256 - default default] Failed to run action [action_ex_id=d8df9bd9-c156-4dfe-a2a2-0e450dc31234, action_cls='<class 'mistral.actions.action_factory.IronicAction'>', attributes='{u'client_method_name': u'node.set_provision_state'}', params='{u'state': u'provide', u'node_uuid': u'd5f346a7-af70-40b7-9d7b-9a4390383594'}']
IronicAction.node.set_provision_state failed: Node d5f346a7-af70-40b7-9d7b-9a4390383594 is locked by host localhost.localdomain, please retry after the current operation is completed. (HTTP 409): ActionException: IronicAction.node.set_provision_state failed: Node d5f346a7-af70-40b7-9d7b-9a4390383594 is locked by host localhost.localdomain, please retry after the current operation is completed. (HTTP 409)
2018-02-15 12:17:33.589 31770 ERROR mistral.executors.default_executor Traceback (most recent call last):
2018-02-15 12:17:33.589 31770 ERROR mistral.executors.default_executor File "/usr/lib/python2.7/site-packages/mistral/executors/default_executor.py", line 110, in run_action
2018-02-15 12:17:33.589 31770 ERROR mistral.executors.default_executor result = action.run(action_ctx)
2018-02-15 12:17:33.589 31770 ERROR mistral.executors.default_executor File "/usr/lib/python2.7/site-packages/mistral/actions/openstack/base.py", line 130, in run
2018-02-15 12:17:33.589 31770 ERROR mistral.executors.default_executor (self.__class__.__name__, self.client_method_name, str(e))
2018-02-15 12:17:33.589 31770 ERROR mistral.executors.default_executor ActionException: IronicAction.node.set_provision_state failed: Node d5f346a7-af70-40b7-9d7b-9a4390383594 is locked by host localhost.localdomain, please retry after the current operation is completed. (HTTP 409)
2018-02-15 12:17:33.589 31770 ERROR mistral.executors.default_executor
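For context, the "Attempt 6 of 6" in the ironicclient log reflects a fixed number of client-side retries on HTTP 409 Conflict before the error is surfaced to Mistral. A minimal sketch of that retry pattern (illustrative only, not ironicclient's actual code; `Conflict`, `call_with_retries`, and the attempt count are assumptions for the example):

```python
import time


class Conflict(Exception):
    """Stand-in for ironicclient's HTTP 409 Conflict exception."""


def call_with_retries(func, attempts=6, delay=0):
    """Retry func() on Conflict up to `attempts` times, then re-raise."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Conflict:
            if attempt == attempts:
                raise  # give up: "Attempt 6 of 6"
            time.sleep(delay)


# A call that keeps hitting NodeLocked never succeeds: all attempts
# are consumed and the final Conflict propagates to the caller.
calls = []

def always_locked():
    calls.append(1)
    raise Conflict("Node is locked by host localhost.localdomain")

try:
    call_with_retries(always_locked)
except Conflict:
    pass

print(len(calls))  # prints 6
```

This explains the roughly 20-second gap in the log before the failure is reported: every attempt waits out the server-side lock retry window before giving up.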
In the ironic-conductor log, I found:
loopingcall [-] Dynamic backoff interval looping call 'ironic.conductor.utils._wait' failed: LoopingCallTimeOut: Looping call timed out after 19.72 seconds
loopingcall Traceback (most recent call last):
loopingcall   File "/usr/lib/python2.7/site-packages/oslo_service/loopingcall.py", line 141, in _run_loop
loopingcall     idle = idle_for_func(result, watch.elapsed())
loopingcall   File "/usr/lib/python2.7/site-packages/oslo_service/loopingcall.py", line 338, in _idle_for
loopingcall     % self._error_time)
loopingcall LoopingCallTimeOut: Looping call timed out after 19.72 seconds
conductor.utils [req-6184f919-5874-4f2c-87a8-0d9b5fde2a1f 652b46cbc4544957bf2909991a20b29b f5025b207a33405a96d9f570742c8a11 - default default] Timed out after 30 secs waiting for power power off on node 128d8c6c-928a-4100-8c44-e59ec6cc306e.: LoopingCallTimeOut: Looping call timed out after 19.72 seconds
conductor.task_manager [req-6184f919-5874-4f2c-87a8-0d9b5fde2a1f 652b46cbc4544957bf2909991a20b29b f5025b207a33405a96d9f570742c8a11 - default default] Successfully released exclusive lock for changing node power state on node 128d8c6c-928a-4100-8c44-e59ec6cc306e (lock was held 39.03 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:352
y.processutils [-] Running cmd (subprocess): ipmitool -I lanplus -H 10.0.1.12 -L ADMINISTRATOR -U admin -R 12 -N 5 -f /tmp/tmpzfNatr power status execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:372
y.processutils [-] CMD "ipmitool -I lanplus -H 10.0.1.12 -L ADMINISTRATOR -U admin -R 12 -N 5 -f /tmp/tmpzfNatr power status" returned: 0 in 0.120s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:409
...
2018-03-07 10:12:42.116 2922 DEBUG ironic.common.utils [-] Command stdout is: "Chassis Power is on
" execute /usr/lib/
2018-03-07 10:12:42.116 2922 DEBUG ironic.common.utils [-] Command stderr is: "" execute /usr/lib/
2018-03-07 10:12:42.117 2922 ERROR oslo.service.
2018-03-07 10:12:42.117 2922 ERROR ironic.
2018-03-07 10:12:42.149 2922 DEBUG ironic.
2018-03-07 10:12:42.273 2922 DEBUG oslo_concurrenc
2018-03-07 10:12:42.393 2922 DEBUG oslo_concurrenc
2018-03-07 10:12:42.394 2922 DEBUG ironic.common.utils [-] Execution completed, command line is "ipmitool -I lanplus -H 10.0.1.12 -L ADMINISTRATOR -U admin -R 12 -N 5 -f /tmp/tmpzfNatr power status" execute /usr/lib/
2018-03-07 10:12:42.394 2922 DEBUG ironic.common.utils [-] Command stdout is: "Chassis Power is on
" execute /usr/lib/
2018-03-07 10:12:42.395 2922 DEBUG ironic.common.utils [-] Command stderr is: "" execute /usr/lib/
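The conductor-side failure above is oslo.service's dynamic backoff looping call giving up while polling power state: ipmitool keeps reporting "Chassis Power is on", the poll intervals grow, and the accumulated wait eventually exceeds the allowed window, so the node stays locked for the whole duration. A rough, self-contained sketch of that wait loop (assumed behaviour for illustration; the real implementation lives in oslo_service.loopingcall and ironic.conductor.utils._wait, and `wait_for_power_state` is a hypothetical name):

```python
import time


class LoopingCallTimeOut(Exception):
    """Stand-in for oslo_service.loopingcall.LoopingCallTimeOut."""


def wait_for_power_state(get_state, target, timeout=30.0,
                         interval=0.01, backoff=2.0):
    """Poll get_state() with growing intervals until it returns `target`,
    raising LoopingCallTimeOut once the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        if get_state() == target:
            return
        if time.monotonic() >= deadline:
            raise LoopingCallTimeOut(
                "Looping call timed out after %.2f seconds" % timeout)
        time.sleep(interval)
        interval *= backoff  # dynamic backoff between polls


# A BMC that reports "power on" forever (as in the log) trips the timeout.
try:
    wait_for_power_state(lambda: "power on", "power off", timeout=0.05)
    timed_out = False
except LoopingCallTimeOut:
    timed_out = True

print(timed_out)  # prints True
```

Note that while this loop runs, the conductor holds the node's exclusive lock, which matches the "lock was held 39.03 sec" line and explains why the concurrent `node provide` calls from Mistral fail with NodeLocked.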