I saw this:

2009/10/27 17:07:22 ERROR Check 'rep_threads' on 'rkdb1' has failed for 10 seconds! Message: ERROR: Replication is broken
2009/10/27 17:07:22 DEBUG Pinging checker 'mysql'...
2009/10/27 17:07:22 DEBUG Checker 'mysql' is OK (OK: Pong!)
2009/10/27 17:07:22 DEBUG Pinging checker 'mysql'...
2009/10/27 17:07:22 DEBUG Checker 'mysql' is OK (OK: Pong!)
2009/10/27 17:07:22 DEBUG Pinging checker 'ping_ip'...
2009/10/27 17:07:22 DEBUG Checker 'ping_ip' is OK (OK: Pong!)
2009/10/27 17:07:23 DEBUG Pinging checker 'ping_ip'...
2009/10/27 17:07:23 DEBUG Checker 'ping_ip' is OK (OK: Pong!)
2009/10/27 17:07:23 DEBUG Pinging checker 'ping'...
2009/10/27 17:07:23 DEBUG Checker 'ping' is OK (OK: Pong!)
2009/10/27 17:07:23 DEBUG Pinging checker 'ping'...
2009/10/27 17:07:23 DEBUG Checker 'ping' is OK (OK: Pong!)
2009/10/27 17:07:24 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.71.183),reader(192.168.71.184),writer(192.168.71.182), rkdb2)' to rkdb2 (192.168.71.181:9989)
2009/10/27 17:07:24 DEBUG Received Answer: OK: Status applied successfully!|UP:338.45
2009/10/27 17:07:24 DEBUG Sending command 'SET_STATUS(AWAITING_RECOVERY, , rkdb2)' to rkdb1 (192.168.71.180:9989)
2009/10/27 17:07:24 DEBUG Received Answer: OK: Status applied successfully!|UP:123.61
2009/10/27 17:07:24 DEBUG Listener: Connect!
2009/10/27 17:07:24 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.71.183),reader(192.168.71.184),writer(192.168.71.182), rkdb2)' to rkdb2 (192.168.71.181:9989)
2009/10/27 17:07:24 DEBUG Listener: Disconnect!
2009/10/27 17:07:24 DEBUG Listener: Waiting for connection...
2009/10/27 17:07:24 DEBUG Received Answer: OK: Status applied successfully!|UP:338.87
2009/10/27 17:07:24 DEBUG Sending command 'SET_STATUS(AWAITING_RECOVERY, , rkdb2)' to rkdb1 (192.168.71.180:9989)
2009/10/27 17:07:24 DEBUG Received Answer: OK: Status applied successfully!|UP:124.02
2009/10/27 17:07:24 DEBUG Pinging checker 'ping_ip'...
2009/10/27 17:07:24 DEBUG Checker 'ping_ip' is OK (OK: Pong!)
2009/10/27 17:07:25 DEBUG Pinging checker 'ping_ip'...
2009/10/27 17:07:25 DEBUG Checker 'ping_ip' is OK (OK: Pong!)
2009/10/27 17:07:26 DEBUG Pinging checker 'ping_ip'...
2009/10/27 17:07:26 DEBUG Checker 'ping_ip' is OK (OK: Pong!)
2009/10/27 17:07:27 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.71.183),reader(192.168.71.184),writer(192.168.71.182), rkdb2)' to rkdb2 (192.168.71.181:9989)
2009/10/27 17:07:27 DEBUG Received Answer: OK: Status applied successfully!|UP:341.45
2009/10/27 17:07:27 DEBUG Sending command 'SET_STATUS(AWAITING_RECOVERY, , rkdb2)' to rkdb1 (192.168.71.180:9989)
2009/10/27 17:07:27 DEBUG Received Answer: OK: Status applied successfully!|UP:126.61
2009/10/27 17:07:27 DEBUG Pinging checker 'rep_backlog'...
2009/10/27 17:07:27 DEBUG Checker 'rep_backlog' is OK (OK: Pong!)
2009/10/27 17:07:27 DEBUG Pinging checker 'rep_backlog'...
2009/10/27 17:07:27 DEBUG Checker 'rep_backlog' is OK (OK: Pong!)
2009/10/27 17:07:27 DEBUG Pinging checker 'rep_threads'...
2009/10/27 17:07:27 DEBUG Checker 'rep_threads' is OK (OK: Pong!)
2009/10/27 17:07:27 DEBUG Pinging checker 'rep_threads'...
2009/10/27 17:07:27 DEBUG Checker 'rep_threads' is OK (OK: Pong!)
2009/10/27 17:07:27 DEBUG Pinging checker 'mysql'...
2009/10/27 17:07:27 DEBUG Checker 'mysql' is OK (OK: Pong!)
2009/10/27 17:07:27 DEBUG Pinging checker 'mysql'...
2009/10/27 17:07:27 DEBUG Checker 'mysql' is OK (OK: Pong!)
2009/10/27 17:07:27 DEBUG Listener: Waiting for connection...
2009/10/27 17:07:27 DEBUG Pinging checker 'ping_ip'...
2009/10/27 17:07:27 DEBUG Checker 'ping_ip' is OK (OK: Pong!)
2009/10/27 17:07:28 DEBUG Pinging checker 'ping_ip'...
2009/10/27 17:07:28 DEBUG Checker 'ping_ip' is OK (OK: Pong!)
2009/10/27 17:07:28 DEBUG Pinging checker 'ping'...
2009/10/27 17:07:28 DEBUG Checker 'ping' is OK (OK: Pong!)
2009/10/27 17:07:28 DEBUG Pinging checker 'ping'...
2009/10/27 17:07:28 DEBUG Checker 'ping' is OK (OK: Pong!)
2009/10/27 17:07:29 DEBUG Pinging checker 'ping_ip'...
2009/10/27 17:07:29 DEBUG Checker 'ping_ip' is OK (OK: Pong!)
2009/10/27 17:07:30 DEBUG Sending command 'SET_STATUS(ONLINE, reader(192.168.71.183),reader(192.168.71.184),writer(192.168.71.182), rkdb2)' to rkdb2 (192.168.71.181:9989)
2009/10/27 17:07:30 DEBUG Received Answer: OK: Status applied successfully!|UP:344.52
2009/10/27 17:07:30 DEBUG Sending command 'SET_STATUS(AWAITING_RECOVERY, , rkdb2)' to rkdb1 (192.168.71.180:9989)
2009/10/27 17:07:30 DEBUG Received Answer: OK: Status applied successfully!|UP:129.67
2009/10/27 17:07:30 DEBUG Listener: Waiting for connection...
2009/10/27 17:07:30 DEBUG Pinging checker 'ping_ip'...
2009/10/27 17:07:30 DEBUG Checker 'ping_ip' is OK (OK: Pong!)
2009/10/27 17:07:31 DEBUG Pinging checker 'ping_ip'...
2009/10/27 17:07:31 DEBUG Checker 'ping_ip' is OK (OK: Pong!)
2009/10/27 17:07:32 DEBUG Pinging checker 'rep_backlog'...
2009/10/27 17:07:32 DEBUG Checker 'rep_backlog' is OK (OK: Pong!)
2009/10/27 17:07:32 DEBUG Pinging checker 'rep_backlog'...
2009/10/27 17:07:32 DEBUG Checker 'rep_backlog' is OK (OK: Pong!)
2009/10/27 17:07:32 DEBUG Pinging checker 'rep_threads'...
2009/10/27 17:07:32 DEBUG Checker 'rep_threads' is OK (OK: Pong!)
2009/10/27 17:07:32 DEBUG Pinging checker 'rep_threads'...
2009/10/27 17:07:32 DEBUG Checker 'rep_threads' is OK (OK: Pong!)

The latest checks are "OK", but now I am unsure, because I don't know whether they are the checks for the online node. Each check appears twice, though, so I suppose there is one per node.

Pascal Hofmann wrote:
> (I'm talking about debug_mmm_mon.txt, there rep_threads does not change to OK for host rkdb1.)
> The monitor does not take care of the checks rep_threads and rep_backlog when switching from HARD_OFFLINE to AWAITING_RECOVERY in version 2.0.9 - but it will in the next release. Then the problem should be gone.
>

-- 
--------------------------
Uxio Faria Giraldez
--------------------------
SISTEMAS
PRISACOM S.A.
+34 913 537 770
==========================
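
P.S. To check the "two per check" observation without reading the log by eye, here is a minimal sketch that counts the "Checker '...' is OK" lines per timestamp and checker name. It assumes the log has one entry per line in the format shown above; the function name count_checker_lines and the SAMPLE data are only illustrative. Note that these DEBUG lines carry no host name, so the script can only count them; it cannot tell which node (if any) each line belongs to.

```python
import re
from collections import Counter

# Matches lines such as:
#   2009/10/27 17:07:27 DEBUG Checker 'rep_threads' is OK (OK: Pong!)
# Groups: timestamp, checker name, status word.
CHECKER_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) DEBUG Checker '([^']+)' is (\w+)"
)


def count_checker_lines(lines):
    """Count checker answer lines per (timestamp, checker, status)."""
    counts = Counter()
    for line in lines:
        match = CHECKER_RE.match(line.strip())
        if match:
            counts[match.groups()] += 1
    return counts


# A few lines copied from the log above, used here as sample input.
SAMPLE = """\
2009/10/27 17:07:27 DEBUG Pinging checker 'rep_threads'...
2009/10/27 17:07:27 DEBUG Checker 'rep_threads' is OK (OK: Pong!)
2009/10/27 17:07:27 DEBUG Pinging checker 'rep_threads'...
2009/10/27 17:07:27 DEBUG Checker 'rep_threads' is OK (OK: Pong!)
2009/10/27 17:07:28 DEBUG Pinging checker 'ping'...
2009/10/27 17:07:28 DEBUG Checker 'ping' is OK (OK: Pong!)
""".splitlines()

counts = count_checker_lines(SAMPLE)
for (timestamp, checker, status), n in sorted(counts.items()):
    print(f"{timestamp} {checker:12s} {status} x{n}")
```

To run it against a real file, pass the open file object instead of SAMPLE, for example count_checker_lines(open("debug_mmm_mon.txt")).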