new-slave.py should refuse to run if cluster is lagged

Bug #491748 reported by Stuart Bishop
This bug affects 1 person
Affects: Launchpad itself
Status: Won't Fix
Importance: High
Assigned to: Stuart Bishop

Bug Description

If the cluster is lagged, something may be wrong. Even if nothing is wrong, the script will block until the lag clears. The script should simply abort when the cluster is lagged, so the lag can be diagnosed and corrected.
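
A minimal sketch of the "abort instead of block" behaviour, not the actual new-slave.py code. It assumes the per-node lag has already been read into a dict (with Slony-I this would plausibly come from the st_lag_time column of the sl_status view, but how the script obtains it is an assumption). The names MAX_LAG, lagged_nodes and abort_if_lagged are hypothetical, and the 60 second threshold is a placeholder since the report does not say how much lag is too much.

```python
from datetime import timedelta

# Assumed threshold; the bug report does not specify one.
MAX_LAG = timedelta(seconds=60)


def lagged_nodes(lag_by_node, max_lag=MAX_LAG):
    """Return the sorted node ids whose lag exceeds max_lag.

    lag_by_node maps a node id to its replication lag (a timedelta).
    """
    return sorted(node for node, lag in lag_by_node.items() if lag > max_lag)


def abort_if_lagged(lag_by_node, max_lag=MAX_LAG):
    """Exit with an error message rather than waiting for lag to clear."""
    lagged = lagged_nodes(lag_by_node, max_lag)
    if lagged:
        raise SystemExit(
            "Cluster is lagged on node(s) %s; aborting so the lag can be "
            "diagnosed." % ", ".join(str(node) for node in lagged))
```

Calling abort_if_lagged() before any replication change is made would let the operator see which nodes are behind instead of watching the script hang.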

Stuart Bishop (stub)
Changed in launchpad-foundations:
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Stuart Bishop (stub)
milestone: none → 3.1.12
Stuart Bishop (stub) wrote :

Also check for events to or from the new node_id in sl_event and sl_confirm. These might be lying around if the cleanup thread hasn't run since a node with the same id was dropped.
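
One way this check might look, as a sketch only. It assumes a DB-API cursor (for example from psycopg2, hence the %s placeholders) and that the Slony-I cluster schema is named _sl; both are assumptions, as is the helper name stale_row_counts. The columns ev_origin, con_origin and con_received are the Slony-I columns that record which node an event or confirmation belongs to.

```python
def stale_row_counts(cur, node_id, schema="_sl"):
    """Return (event_count, confirm_count) of rows mentioning node_id.

    cur is a DB-API cursor. schema is the Slony-I cluster schema; "_sl"
    is an assumed default and must be a trusted identifier, because it is
    interpolated directly into the SQL text.
    """
    # Events that originated from a node with this id.
    cur.execute(
        "SELECT count(*) FROM {0}.sl_event "
        "WHERE ev_origin = %s".format(schema),
        (node_id,))
    events = cur.fetchone()[0]

    # Confirmations sent to or from a node with this id.
    cur.execute(
        "SELECT count(*) FROM {0}.sl_confirm "
        "WHERE con_origin = %s OR con_received = %s".format(schema),
        (node_id, node_id))
    confirms = cur.fetchone()[0]
    return events, confirms
```

If either count is non-zero, the script could refuse to reuse that node_id until the leftover rows have been cleaned up.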

Stuart Bishop (stub) wrote :

Checking for long-running transactions on all nodes would be good too.

Ideally all tools in database/replication get these checks.
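
A rough sketch of the long-running transaction check for a single node, to be repeated against each node's connection. It assumes a DB-API cursor and reads usename and xact_start from PostgreSQL's pg_stat_activity view. The helper name and the 15 minute cutoff are hypothetical; the report does not define "long running".

```python
from datetime import datetime, timedelta, timezone

# Assumed cutoff; the bug report does not specify one.
MAX_TRANSACTION_AGE = timedelta(minutes=15)


def long_running_transactions(cur, now=None, max_age=MAX_TRANSACTION_AGE):
    """Return (usename, age) pairs for transactions older than max_age.

    cur is a DB-API cursor connected to the node being checked. now
    defaults to the current UTC time and must be timezone-aware whenever
    the driver returns timezone-aware xact_start values.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cur.execute(
        "SELECT usename, xact_start FROM pg_stat_activity "
        "WHERE xact_start IS NOT NULL")
    return [(user, now - started)
            for user, started in cur.fetchall()
            if now - started > max_age]
```

A shared helper like this could be called by every tool in database/replication before it touches the cluster, aborting if the returned list is non-empty.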

Gary Poster (gary) wrote :

We would like this to be done within the week, so it can be part of the last 2009 release in two weeks. Moving to "High".

Changed in launchpad-foundations:
importance: Medium → Critical
importance: Critical → High
Stuart Bishop (stub)
Changed in launchpad-foundations:
milestone: 3.1.12 → 10.01
Gary Poster (gary)
Changed in launchpad-foundations:
milestone: 10.01 → 10.03
Gary Poster (gary) wrote :

Stuart reports that this should no longer be a problem once we no longer need to replicate the auth database. Dropping that replication will greatly simplify our scripts.

Changed in launchpad-foundations:
status: Confirmed → Won't Fix