Can't delete mongo cluster if an instance fails to build the replica set

Bug #1525347 reported by Craig Vyvial
Affects | Status | Importance | Assigned to | Milestone
OpenStack DBaaS (Trove) | New | Undecided | Unassigned |

Bug Description

I created a MongoDB cluster, but the volume wasn't large enough to build the replica set. As a result, I had to manually set the instances to an error state before I could delete the cluster.

ubuntu@devstack2:~$ trove cluster-list
+--------------------------------------+---------------+-----------+-------------------+-----------+
| ID | Name | Datastore | Datastore Version | Task Name |
+--------------------------------------+---------------+-----------+-------------------+-----------+
| 683116a4-f056-4aac-926f-d05af5ec3af2 | mongo-cluster | mongodb | 3.0 | NONE |
+--------------------------------------+---------------+-----------+-------------------+-----------+
ubuntu@devstack2:~$ trove list --in
+--------------------------------------+---------------------------+-----------+-------------------+--------+-----------+------+
| ID | Name | Datastore | Datastore Version | Status | Flavor ID | Size |
+--------------------------------------+---------------------------+-----------+-------------------+--------+-----------+------+
| 09621303-4413-4923-bd2d-f7212fb72fd2 | mongo-cluster-configsvr-1 | mongodb | 3.0 | BUILD | 7 | 1 |
| 270e5a5f-8412-4eef-8bfa-10ebaaac66de | mongo-cluster-configsvr-2 | mongodb | 3.0 | BUILD | 7 | 1 |
| 6a291131-c55a-4763-93d9-f31263406051 | mongo-cluster-rs1-1 | mongodb | 3.0 | ERROR | 7 | 1 |
| 739d3791-dcff-4c8d-9138-7c8e8c86fca0 | mongo-cluster-mongos-1 | mongodb | 3.0 | BUILD | 7 | 1 |
| 73a12432-ea1f-4057-b23e-85c053a087d6 | mongo-cluster-rs1-2 | mongodb | 3.0 | ERROR | 7 | 1 |
| 9360f194-9078-48b6-8b62-f7baa55b6e71 | mongo-cluster-configsvr-3 | mongodb | 3.0 | BUILD | 7 | 1 |
| ae52d74b-a8fa-40c2-ba12-bedf53090f50 | mongo-cluster-rs1-3 | mongodb | 3.0 | ERROR | 7 | 1 |
+--------------------------------------+---------------------------+-----------+-------------------+--------+-----------+------+
ubuntu@devstack2:~$ trove cluster-delete mongo-cluster
ERROR: Instance 09621303-4413-4923-bd2d-f7212fb72fd2 is not ready. (HTTP 422)

Taskmanager logs:

2015-12-11 09:02:26.900 ERROR trove.guestagent.api [-] Error calling add_members
2015-12-11 09:02:26.900 TRACE trove.guestagent.api Traceback (most recent call last):
2015-12-11 09:02:26.900 TRACE trove.guestagent.api File "/opt/stack/trove/trove/guestagent/api.py", line 62, in _call
2015-12-11 09:02:26.900 TRACE trove.guestagent.api result = cctxt.call(self.context, method_name, **kwargs)
2015-12-11 09:02:26.900 TRACE trove.guestagent.api File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
2015-12-11 09:02:26.900 TRACE trove.guestagent.api retry=self.retry)
2015-12-11 09:02:26.900 TRACE trove.guestagent.api File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2015-12-11 09:02:26.900 TRACE trove.guestagent.api timeout=timeout, retry=retry)
2015-12-11 09:02:26.900 TRACE trove.guestagent.api File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 464, in send
2015-12-11 09:02:26.900 TRACE trove.guestagent.api retry=retry)
2015-12-11 09:02:26.900 TRACE trove.guestagent.api File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 455, in _send
2015-12-11 09:02:26.900 TRACE trove.guestagent.api raise result
2015-12-11 09:02:26.900 TRACE trove.guestagent.api RemoteError: Remote error: OperationFailure command SON([('replSetInitiate', 1)]) on namespace admin.$cmd failed: exception: new file allocation failure
2015-12-11 09:02:26.900 TRACE trove.guestagent.api [u'Traceback (most recent call last):\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply\n executor_callback))\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch\n executor_callback)\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch\n result = func(ctxt, **new_args)\n', u' File "/usr/local/lib/python2.7/dist-packages/osprofiler/profiler.py", line 105, in wrapper\n return f(*args, **kwargs)\n', u' File "/home/ubuntu/trove/trove/guestagent/datastore/experimental/mongodb/manager.py", line 206, in add_members\n self.app.add_members(members)\n', u' File "/home/ubuntu/trove/trove/guestagent/datastore/experimental/mongodb/service.py", line 399, in add_members\n MongoDBAdmin().rs_initiate()\n', u' File "/home/ubuntu/trove/trove/guestagent/datastore/experimental/mongodb/service.py", line 825, in rs_initiate\n return admin_client.admin.command(\'replSetInitiate\')\n', u' File "/usr/local/lib/python2.7/dist-packages/pymongo/database.py", line 454, in command\n codec_options, **kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/pymongo/database.py", line 366, in _command\n allowable_errors)\n', u' File "/usr/local/lib/python2.7/dist-packages/pymongo/pool.py", line 201, in command\n check_keys, self.listeners, self.max_bson_size)\n', u' File "/usr/local/lib/python2.7/dist-packages/pymongo/network.py", line 94, in command\n helpers._check_command_response(response_doc, msg, allowable_errors)\n', u' File "/usr/local/lib/python2.7/dist-packages/pymongo/helpers.py", line 193, in _check_command_response\n raise OperationFailure(msg % errmsg, code, response)\n', u"OperationFailure: command SON([('replSetInitiate', 1)]) on namespace admin.$cmd failed: exception: new file allocation failure\n"].
2015-12-11 09:02:26.900 TRACE trove.guestagent.api
2015-12-11 09:02:26.902 ERROR trove.common.strategies.cluster.experimental.mongodb.taskmanager [-] error initializing replica set
2015-12-11 09:02:26.902 TRACE trove.common.strategies.cluster.experimental.mongodb.taskmanager Traceback (most recent call last):
2015-12-11 09:02:26.902 TRACE trove.common.strategies.cluster.experimental.mongodb.taskmanager File "/opt/stack/trove/trove/common/strategies/cluster/experimental/mongodb/taskmanager.py", line 330, in _init_replica_set
2015-12-11 09:02:26.902 TRACE trove.common.strategies.cluster.experimental.mongodb.taskmanager self.get_guest(primary_member).add_members(other_members_ips)
2015-12-11 09:02:26.902 TRACE trove.common.strategies.cluster.experimental.mongodb.taskmanager File "/opt/stack/trove/trove/common/strategies/cluster/experimental/mongodb/guestagent.py", line 52, in add_members
2015-12-11 09:02:26.902 TRACE trove.common.strategies.cluster.experimental.mongodb.taskmanager self.version_cap, members=members)
2015-12-11 09:02:26.902 TRACE trove.common.strategies.cluster.experimental.mongodb.taskmanager File "/opt/stack/trove/trove/guestagent/api.py", line 68, in _call
2015-12-11 09:02:26.902 TRACE trove.common.strategies.cluster.experimental.mongodb.taskmanager raise exception.GuestError(original_message=r.value)
2015-12-11 09:02:26.902 TRACE trove.common.strategies.cluster.experimental.mongodb.taskmanager GuestError: An error occurred communicating with the guest: command SON([('replSetInitiate', 1)]) on namespace admin.$cmd failed: exception: new file allocation failure.
2015-12-11 09:02:26.902 TRACE trove.common.strategies.cluster.experimental.mongodb.taskmanager

Craig Vyvial (cp16net) wrote :

It looks like only the group of instances the cluster is currently working on gets set to ERROR. So if it is building the replica set and that fails, it sets only those instances to ERROR and leaves the others alone, even though they are still in BUILD.
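
To make the behaviour described here concrete, the following is a minimal, self-contained Python sketch. It is not Trove code: the `Instance` and `Cluster` classes, the `can_delete` rule, and both helper functions are hypothetical names invented for illustration. The only things taken from the report are the BUILD and ERROR status values and the observation that an instance still in BUILD blocks `cluster-delete`.

```python
from dataclasses import dataclass, field

# Status names as they appear in the `trove list` output in this report.
BUILD, ERROR = "BUILD", "ERROR"


@dataclass
class Instance:
    # Hypothetical stand-in for a cluster member.
    name: str
    status: str = BUILD


@dataclass
class Cluster:
    # Hypothetical stand-in for a cluster and its members.
    instances: list = field(default_factory=list)

    def can_delete(self):
        # Assumed rule, inferred from the HTTP 422 "is not ready" error:
        # any instance still in BUILD blocks deletion of the cluster.
        return all(inst.status != BUILD for inst in self.instances)


def fail_group_only(failed_group):
    # Behaviour described in the comment: only the group being worked on
    # (here, the replica-set members) is moved to ERROR.
    for inst in failed_group:
        inst.status = ERROR


def fail_whole_cluster(cluster):
    # One possible alternative: move every instance still in BUILD to ERROR
    # so that the cluster as a whole becomes deletable.
    for inst in cluster.instances:
        if inst.status == BUILD:
            inst.status = ERROR
```

With three replica-set members and four other members all starting in BUILD, calling `fail_group_only` on the replica-set members leaves `can_delete()` returning `False`, which matches the failed `cluster-delete` in the description. Calling `fail_whole_cluster` afterwards makes it return `True`. Whether Trove should behave this way is a design question for the fix, not something this sketch settles.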

Amrith Kumar (amrith)
tags: added: delete-instance-force