replication, eject-replica-source, validate_can_perform_action

Bug #1519278 reported by gang
This bug affects 1 person
Affects: OpenStack DBaaS (Trove)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

In replication, when a master goes down, eject_replica_source is called to eject the old master and promote a slave as the new master.
However, the eject_replica_source function in trove/instances/models.py calls self.validate_can_perform_action(), which checks that the master is not in a "not active" state. That is contradictory.

Amrith Kumar (amrith) wrote :

Please elaborate on why you think this is contradictory, what problem(s) do you face?

Changed in trove:
status: New → Incomplete
Morgan Jones (6-morgan) wrote :

I believe the code is correct - you should only be able to eject a master if it is active, because if there is a detectable issue with the master, it should be corrected rather than ejected. The reason to eject a master is that it is not running or reachable, but that would not cause it to not be marked ACTIVE (since we would have no way to tell if it was indeed active, but simply not reachable from the controller).

gang (zhg0517) wrote :

In my understanding, a master can be ejected when it is out of service. If it is healthy, we can use promote_to_replica_source to swap the master and a slave.

I tested the eject function with the following steps:
1. Stop the mysql service on the master. The VM is active and mysql is down, but I hit this error:
"if last_heartbeat_delta < agent_expiry_interval:
    raise exception.BadRequest(_("Replica Source %s cannot be ejected"
                                 " as it has a current heartbeat")
                               % self.id)
"
2. Shut down the VM. The status of the database instance becomes SHUTDOWN, and I then get an error from self.validate_can_perform_action():
"msg = (_("Instance %(instance_id)s is not currently available for an "
         "action to be performed (status was %(action_status)s).") %
       {'instance_id': self.id, 'action_status': status})
"

I think the eject operation should be usable in both of the above conditions, but I got errors.
Could you please describe one condition under which the eject function can be used correctly? Thanks.
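
For reference, the two checks quoted above can be modelled in isolation. This is only a minimal sketch, not the actual Trove code: check_eject, UnprocessableEntity, the status strings and the 60-second interval are assumptions made for illustration, and the ordering of the two checks is a guess. Under those assumptions, it suggests the only case that passes both checks is a master still recorded as ACTIVE whose last heartbeat is older than the expiry interval.

```python
from datetime import datetime, timedelta


class BadRequest(Exception):
    """Stand-in for exception.BadRequest from the quoted snippet."""


class UnprocessableEntity(Exception):
    """Hypothetical stand-in for the 'not currently available' error."""


def check_eject(status, last_heartbeat, now, expiry=timedelta(seconds=60)):
    """Return True if both checks pass, otherwise raise.

    The 60-second default for `expiry` is an assumed value.
    """
    # Check 1, modelled on validate_can_perform_action(): the recorded
    # status must still be ACTIVE.
    if status != "ACTIVE":
        raise UnprocessableEntity(
            "Instance is not currently available for an action to be "
            "performed (status was %s)." % status)

    # Check 2, modelled on the heartbeat test: the last heartbeat must
    # be older than the expiry interval.
    last_heartbeat_delta = now - last_heartbeat
    if last_heartbeat_delta < expiry:
        raise BadRequest(
            "Replica Source cannot be ejected as it has a current heartbeat")

    return True


if __name__ == "__main__":
    now = datetime(2015, 11, 24, 12, 0, 0)
    # ACTIVE status and a heartbeat two minutes old: both checks pass.
    print(check_eject("ACTIVE", now - timedelta(seconds=120), now))
```

In this model, step 1 above corresponds to an ACTIVE status with a recent heartbeat (check 2 fails), and step 2 corresponds to a SHUTDOWN status (check 1 fails).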

Amrith Kumar (amrith) wrote :

Just verified, and the code behaves as expected. Reduce your heartbeat timeout and try again.
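
To illustrate why the timeout matters, here is a small sketch of how long one would have to wait before the heartbeat check quoted earlier stops rejecting the eject. It assumes the master has stopped sending heartbeats entirely. seconds_until_ejectable and its parameters are hypothetical names, not part of Trove.

```python
from datetime import datetime, timedelta


def seconds_until_ejectable(last_heartbeat, now, expiry_seconds):
    """Seconds left before the last heartbeat counts as expired.

    Returns 0.0 once the heartbeat is already older than the interval.
    """
    remaining = timedelta(seconds=expiry_seconds) - (now - last_heartbeat)
    return max(0.0, remaining.total_seconds())


if __name__ == "__main__":
    now = datetime(2015, 11, 24, 12, 0, 0)
    last = now - timedelta(seconds=45)  # last heartbeat was 45 seconds ago
    print(seconds_until_ejectable(last, now, 60))  # 15.0
    print(seconds_until_ejectable(last, now, 30))  # 0.0
```

With a shorter interval the same heartbeat age already counts as expired, so the eject is no longer blocked by that check.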

Changed in trove:
status: Incomplete → Invalid
Amrith Kumar (amrith) wrote :

Morgan, maybe the thing that makes your description confusing is the sentence, "The reason to eject a master is that it is not running or reachable, but that would not cause it to not be marked ACTIVE (since we would have no way to tell if it was indeed active, but simply not reachable from the controller)."

Three negatives with an interesting nesting make it a little harder to parse than one would expect.
