Currently, when you build a PXC cluster (and possibly clusters for other datastores), the taskmanager waits for every instance in the cluster to become active. If one of them never reaches the active state and instead goes into an ERROR state, the taskmanager keeps polling until it times out. Cluster creation should fail fast as soon as any instance in the cluster errors.
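A minimal sketch of the fail-fast behavior requested above, written as standalone Python rather than against the Trove code base. The names `wait_for_cluster_ready`, `get_status`, and `InstanceBuildError` are hypothetical, and the status strings are parameters chosen for illustration; `PollTimeOut` here is a local stand-in, not the Trove exception.

```python
import time


class InstanceBuildError(Exception):
    """Raised as soon as any cluster member is seen in a failed state."""


class PollTimeOut(Exception):
    """Local stand-in: members did not all become ready in time."""


def wait_for_cluster_ready(instance_ids, get_status, sleep_time=1.0,
                           time_out=60.0, ready_status="ACTIVE",
                           failed_statuses=("ERROR",)):
    """Poll until every instance is ready, failing fast on any error.

    get_status is a hypothetical callable that maps an instance id to a
    status string; it stands in for whatever status lookup is available.
    """
    deadline = time.monotonic() + time_out
    while True:
        statuses = {i: get_status(i) for i in instance_ids}
        failed = sorted(i for i, s in statuses.items()
                        if s in failed_statuses)
        if failed:
            # Stop immediately: more polling cannot repair a failed member.
            raise InstanceBuildError(
                "instances failed: %s" % ", ".join(failed))
        if all(s == ready_status for s in statuses.values()):
            return
        if time.monotonic() >= deadline:
            raise PollTimeOut("polling request timed out")
        time.sleep(sleep_time)
```

The point of the design is that the failure check runs on every poll, before the readiness and timeout checks, so a single ERROR instance ends the wait on the next poll instead of at the timeout.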
taskmanager logs
2015-12-11 07:36:04.116 INFO trove.taskmanager.models [-] Created instance 8f457133-afaf-48a9-b4ce-822ed21438ce successfully.
2015-12-11 07:36:04.440 INFO trove.taskmanager.models [-] Created instance 33385f84-69a3-4e5e-8a5b-5cc946eee203 successfully.
2015-12-11 07:36:14.478 INFO trove.taskmanager.models [-] Created instance 8aeeb130-7158-450b-b416-06fb84c99cfd successfully.
2015-12-11 07:36:14.578 INFO trove.taskmanager.models [-] Created instance 82c8d676-82ec-462c-8ca7-16965a883cd4 successfully.
2015-12-11 07:36:16.634 INFO trove.taskmanager.models [-] Created instance 33167a34-5526-46c1-91f0-c22eae3d2f86 successfully.
2015-12-11 07:36:19.594 DEBUG trove.taskmanager.models [-] Checking service status of instance ids: [u'142e2e3b-d8e8-4fa4-9caf-8743e13d072c', u'33167a34-5526-46c1-91f0-c22eae3d2f86', u'33385f84-69a3-4e5e-8a5b-5cc946eee203', u'82c8d676-82ec-462c-8ca7-16965a883cd4', u'8aeeb130-7158-450b-b416-06fb84c99cfd', u'8f457133-afaf-48a9-b4ce-822ed21438ce', u'd39b7f8b-592b-42cd-8688-791d66e0f3c6'] from (pid=26327) _all_status_ready /opt/stack/trove/trove/taskmanager/models.py:207
...
continues to poll...
ubuntu@devstack2:~$ trove list --in
+--------------------------------------+---------------------------+-----------+-------------------+--------+-----------+------+
| ID | Name | Datastore | Datastore Version | Status | Flavor ID | Size |
+--------------------------------------+---------------------------+-----------+-------------------+--------+-----------+------+
| 142e2e3b-d8e8-4fa4-9caf-8743e13d072c | mongo-cluster-configsvr-3 | mongodb | 3.0 | ERROR | 7 | 2 |
| 33167a34-5526-46c1-91f0-c22eae3d2f86 | mongo-cluster-rs1-1 | mongodb | 3.0 | BUILD | 7 | 2 |
| 33385f84-69a3-4e5e-8a5b-5cc946eee203 | mongo-cluster-configsvr-2 | mongodb | 3.0 | BUILD | 7 | 2 |
| 82c8d676-82ec-462c-8ca7-16965a883cd4 | mongo-cluster-rs1-2 | mongodb | 3.0 | BUILD | 7 | 2 |
| 8aeeb130-7158-450b-b416-06fb84c99cfd | mongo-cluster-configsvr-1 | mongodb | 3.0 | BUILD | 7 | 2 |
| 8f457133-afaf-48a9-b4ce-822ed21438ce | mongo-cluster-rs1-3 | mongodb | 3.0 | BUILD | 7 | 2 |
| d39b7f8b-592b-42cd-8688-791d66e0f3c6 | mongo-cluster-mongos-1 | mongodb | 3.0 | ERROR | 7 | 2 |
+--------------------------------------+---------------------------+-----------+-------------------+--------+-----------+------+
ubuntu@devstack2:~$ trove cluster-list
+--------------------------------------+---------------+-----------+-------------------+-----------+
| ID | Name | Datastore | Datastore Version | Task Name |
+--------------------------------------+---------------+-----------+-------------------+-----------+
| 9d1a64a0-910f-4734-b39b-a861e36d584f | mongo-cluster | mongodb | 3.0 | BUILDING |
+--------------------------------------+---------------+-----------+-------------------+-----------+
Related to this bug: when all of the instances built in the cluster go to the ERROR state, the cluster times out waiting for them and never leaves the BUILDING state.
LOGS:
2015-12-16 18:01:28.506 ERROR oslo.service.loopingcall [-] Fixed interval looping call 'trove.common.utils.poll_and_check' failed
2015-12-16 18:01:28.506 TRACE oslo.service.loopingcall Traceback (most recent call last):
2015-12-16 18:01:28.506 TRACE oslo.service.loopingcall   File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 135, in _run_loop
2015-12-16 18:01:28.506 TRACE oslo.service.loopingcall     result = func(*self.args, **self.kw)
2015-12-16 18:01:28.506 TRACE oslo.service.loopingcall   File "/opt/stack/trove/trove/common/utils.py", line 192, in poll_and_check
2015-12-16 18:01:28.506 TRACE oslo.service.loopingcall     raise exception.PollTimeOut
2015-12-16 18:01:28.506 TRACE oslo.service.loopingcall PollTimeOut: Polling request timed out.
2015-12-16 18:01:28.506 TRACE oslo.service.loopingcall
2015-12-16 18:01:28.508 ERROR trove.taskmanager.models [-] Timeout for all instance service statuses to become ready.
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models Traceback (most recent call last):
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models   File "/opt/stack/trove/trove/taskmanager/models.py", line 244, in _all_instances_ready
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models     time_out=CONF.usage_timeout)
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models   File "/opt/stack/trove/trove/common/utils.py", line 208, in poll_until
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models     sleep_time=sleep_time, time_out=time_out).wait()
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models   File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 121, in wait
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models     return hubs.get_hub().switch()
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 294, in switch
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models     return self.greenlet.switch()
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models   File "/usr/local/lib/python2.7/dist-packages/oslo_service/loopingcall.py", line 135, in _run_loop
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models     result = func(*self.args, **self.kw)
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models   File "/opt/stack/trove/trove/common/utils.py", line 192, in poll_and_check
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models     raise exception.PollTimeOut
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models PollTimeOut: Polling request timed out.
2015-12-16 18:01:28.508 TRACE trove.taskmanager.models
2015-12-16 18:01:28.514 DEBUG trove.db.models [-] Saving DBInstance: {u'cluster_id': u'565a9eea-6d67-471a-b810-4d3b353189ad', u'shard_id': None, u'deleted_at': None, u'id': u'00b91f8e-edc5-4248-8c56-010b95879417', u'datastore_version_id': u'93962cb1-9566-44f8-8187-ef37f351c0ef', 'errors': {}, u'hostname': None, u'server_status': None, u'task_description': 'Build error: Server.', u'volume_size': 1, u'typ...
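For the related case, where the wait ends in PollTimeOut and the cluster stays in BUILDING, one possible shape of a fix is to always move the cluster out of BUILDING once the wait step finishes, whether it succeeded or not. The sketch below is an illustration only: the cluster is a plain dict, `finish_cluster_build` is a hypothetical helper, the task names "NONE" and "BUILD_FAILED" are placeholders rather than confirmed Trove task states, and `PollTimeOut` is a local stand-in for the exception in the traceback.

```python
class PollTimeOut(Exception):
    """Local stand-in for the timeout raised by the polling helper."""


def finish_cluster_build(cluster, wait_until_ready):
    """Run the wait step and always leave the BUILDING task state.

    cluster is a plain dict in this sketch; "NONE" and "BUILD_FAILED"
    are hypothetical task names used only for illustration.
    """
    try:
        wait_until_ready()
    except PollTimeOut:
        # The wait gave up: record the failure instead of staying BUILDING.
        cluster["task"] = "BUILD_FAILED"
        return False
    cluster["task"] = "NONE"
    return True
```

With this structure a timed-out build is reported as failed to the user instead of showing BUILDING indefinitely in `trove cluster-list`.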