Comment 4 for bug 1612622

Revision history for this message
Sagi (Sergey) Shnaidman (sshnaidm) wrote :

I see there really long tasks, usually related to power status and power on/off set, they take from 25 to 36 seconds:

2016-08-24 06:04:33.419 29914 DEBUG ironic.conductor.task_manager [req-00061a52-3003-467c-8841-d6f85945dd18 - - - - -] Successfully released exclusive lock for changing node power state on node 7e7e6bae-605e-4466-8e34-21f3ba120c2b (lock was held 28.70 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:331

2016-08-24 10:14:58.671 28362 DEBUG ironic.conductor.task_manager [req-95d5865b-8dc9-4680-8363-6c15b43e79a8 - - - - -] Successfully released exclusive lock for provision action manage on node 153f5340-6c69-4c89-84f2-c86186dc0cbf (lock was held 36.23 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:331

But seems like we have a timeout about 20-25 seconds:
from ironic-api.log:

2016-08-23 23:44:08.016 29520 DEBUG wsme.api [req-34c76cc4-ceb0-4d42-842d-06f6a1ffbe27 admin admin - - -] Client-side error: Node 0ed88b39-4eb5-438f-aa27-80bdb47e0eba is locked by host localhost.localdomain, please retry after the current operation is completed.
Traceback (most recent call last):
.......
2016-08-23 23:44:24.779 29520 INFO eventlet.wsgi.server [req-4e4788ce-33f5-41ce-a780-9c220d345a8a admin admin - - -] 192.0.2.1 "PUT /v1/nodes/0ed88b39-4eb5-438f-aa27-80bdb47e0eba/states/provision HTTP/1.1" status: 409 len: 556 time: 2.1693511
2016-08-23 23:44:28.975 29520 INFO eventlet.wsgi.server [req-d00a9cdb-620b-4470-a1ca-64dd68ba4b3a admin admin - - -] 192.0.2.1 "PUT /v1/nodes/0ed88b39-4eb5-438f-aa27-80bdb47e0eba/states/provision HTTP/1.1" status: 409 len: 556 time: 2.1934800

So it's a race, I would go for increasing timeouts (where?) to 40-60 seconds at least. Operations themselves succeeds, it's always a timeout issue.