Cassandra node is not removed from the cluster after remove-unit action

Bug #1875455 reported by Vladimir Grevtsev
Affects: Cassandra Juju Charm
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned

Bug Description

I had two Cassandra units deployed:

ubuntu@OrangeBox84:~/fce-demo/bootstrap-scripts$ j status
Model Controller Cloud/Region Version SLA Timestamp
cassandra orangebox-cloud-RegionOne orangebox-cloud/RegionOne 2.6.10 unsupported 17:17:28Z

App Version Status Scale Charm Store Rev OS Notes
cassandra active 2 cassandra jujucharms 54 ubuntu

Unit Workload Agent Machine Public address Ports Message
cassandra/0* active idle 0 172.27.86.130 9042/tcp,9160/tcp Live seed
cassandra/2 active idle 2 172.27.86.116 9042/tcp,9160/tcp Live seed

Machine State DNS Inst id Series AZ Message
0 started 172.27.86.130 24010209-c1a5-4666-9967-b874e04cf4e6 bionic nova ACTIVE
2 started 172.27.86.116 7350bc4d-09c5-4358-b209-27e288f8a19d bionic nova ACTIVE

$ nodetool status
Datacenter: juju
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.0.0.28 174.64 KiB 256 100.0% 046c1c82-51f2-408c-8cb8-203a2ca7aae8 cassandra
UN 10.0.0.111 247.91 KiB 256 100.0% b3f44618-d795-489e-8cc3-01ff5e7647ac cassandra

But after I ran "juju remove-unit cassandra/2", "nodetool status" on the remaining node started to look like this:

ubuntu@juju-35e7bb-cassandra-0:~$ nodetool status
Datacenter: juju
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
DN 10.0.0.28 174.64 KiB 256 100.0% 046c1c82-51f2-408c-8cb8-203a2ca7aae8 cassandra
UN 10.0.0.111 237.77 KiB 256 100.0% b3f44618-d795-489e-8cc3-01ff5e7647ac cassandra

According to https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/architecture/archDataDistributeFailDetect.html, this is expected behaviour when a node goes offline for unexpected reasons (e.g. an outage). Here, however, the node removal was triggered explicitly by the operator, so the node should have been removed from the ring - and that didn't happen.

Perhaps the relation-departed hooks could be improved to unregister the node automatically (or, at least, to make the operator aware of the unavailable node so they could take action?)
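In the meantime, the leftover "DN" entry can be cleaned up by hand. A minimal sketch (the helper name is ours, not part of the charm): on the surviving node, pull the Host IDs of down nodes out of "nodetool status" and feed each one to "nodetool removenode". In the status output, column 1 is the Status/State pair ("DN" = down/normal) and column 7 is the Host ID, because the Load value ("174.64 KiB") occupies two columns.

```shell
#!/bin/sh
# Hypothetical cleanup helper: list the Host IDs of nodes that
# "nodetool status" reports as down (DN) but still registered in the ring.
dead_node_ids() {
  # $1 is the Status/State flag pair; $7 is the Host ID column.
  nodetool status | awk '$1 == "DN" { print $7 }'
}

# Usage (run on the remaining live node):
#   for id in $(dead_node_ids); do nodetool removenode "$id"; done
```

With the output shown above, this would print 046c1c82-51f2-408c-8cb8-203a2ca7aae8, the Host ID of the removed cassandra/2 unit.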

Revision history for this message
Stuart Bishop (stub) wrote :

Per https://jaas.ai/cassandra, 'nodes must be manually decommissioned before dropping a unit'. Per https://bugs.launchpad.net/juju-core/+bug/1417874, it is not possible with Juju to cleanly remove the node by destroying the unit. Decommissioning a node cleanly will need to be done via an action (along with most cluster operations, turning uncontrollable magic into explicit operations under user control).
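Until such an action exists, the documented manual decommission can be wrapped in a small helper. A sketch under the assumption that `juju ssh` can reach the unit and that `nodetool decommission` is on its PATH (the function name is ours, not part of the charm):

```shell
#!/bin/sh
# Hypothetical wrapper: drain the Cassandra node (streaming its data to the
# rest of the ring and deregistering it) BEFORE Juju removes the unit.
safe_remove_unit() {
  unit="$1"
  # Step 1: decommission the node so it leaves the ring cleanly.
  juju ssh "$unit" -- nodetool decommission
  # Step 2: only then remove the Juju unit.
  juju remove-unit "$unit"
}

# Usage:
#   safe_remove_unit cassandra/2
```

Running `nodetool decommission` first is what prevents the stale "DN" entry seen in this report, since the node announces its departure instead of simply disappearing.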

Changed in cassandra-charm:
status: New → Won't Fix