sync_with_master and monitor restart

Bug #473962 reported by ufaria
This bug affects 1 person
Affects    Status     Importance  Assigned to     Milestone
mysql-mmm  Confirmed  High        Pascal Hofmann

Bug Description

Sometimes, when I switch the monitor to the active state and send a show command within a few seconds, an agent
node stalls in the sync_with_master script, waiting for MySQL to send something. If I try to restart the monitor or
the agent, it can't restart because the socket is still up. If I restart MySQL, the socket closes and the sync_with_master script ends too.

Also the monitor doesn't check well the nodes since don't restart.

ufaria (ufaria-prisacom) wrote :

I'm using MySQL 5.1.39 on Debian Linux 5.0.

Pascal Hofmann (pascalhofmann) wrote :

I think the agent on the new active master stalls? Cause this is the host where sync_with_master should be called on.

Could you attach the output of "show slave status" and "show full processlist" on the new active master and the output of "show master status" on the passive master in this situation?

> Also the monitor doesn't check well the nodes since don't restart.
I don't understand this sentence.

ufaria (ufaria-prisacom) wrote : Re: [Bug 473962] Re: sync_with_master and monitor restart

Pascal Hofmann wrote:
> I think the agent on the new active master stalls? Cause this is the
> host where sync_with_master should be called on.
>
> Could you attach the output of "show slave status" and "show full
> processlist" on the new active master and the output of "show master
> status" on the passive master in this situation?

I checked the nodes by hand and they looked fine.

>
>> Also the monitor doesn't check well the nodes since don't restart.
> I don't understand this sentence.
>

Sorry, I don't speak English well :). If I restart MySQL on the node where the
agent stalled, the monitor can work again, but not properly: it doesn't
see the nodes as online until I restart the monitor daemon.

--
--------------------------
    Uxio Faria Giraldez
--------------------------
    SISTEMAS
    PRISACOM S.A.
    +34 913 537 770
==========================

Changed in mysql-mmm:
importance: Undecided → High
status: New → Incomplete
Pascal Hofmann (pascalhofmann) wrote :

This happens if the replication is messed up: sync_with_master executes SELECT MASTER_POS_WAIT(...) on a host where the replication is not running (e.g. Slave_IO_Running: no). In my situation the slave wanted to continue replication from a binlog that was already deleted on the master host.

Suggested Fix: I don't know whether we should fix this in MMM. When this happens something is wrong with the replication setup and it should be fixed by the admin. Though MMM should detect this situation and it shouldn't stall.
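
A minimal sketch of the kind of guard described above, written in Python for illustration rather than in MMM's own code. It assumes a DB-API cursor from a driver that uses the %s parameter style (for example PyMySQL). The function name, the returned status strings and the 30-second default are hypothetical and not part of MMM. It relies on two documented behaviours: SHOW SLAVE STATUS reports Slave_IO_Running and Slave_SQL_Running, and MASTER_POS_WAIT() accepts an optional timeout, returning -1 when the timeout is exceeded and NULL on error.

```python
def sync_with_master_guarded(cursor, log_file, log_pos, timeout_s=30):
    """Wait for the slave to reach (log_file, log_pos) without blocking forever.

    Hypothetical helper: returns one of "not_a_slave", "replication_stopped",
    "timeout", "error" or "synced".
    """
    # Check the replication threads first, so we never wait on a slave
    # that cannot make progress (e.g. Slave_IO_Running: No).
    cursor.execute("SHOW SLAVE STATUS")
    row = cursor.fetchone()
    if row is None:
        return "not_a_slave"

    # Map column names to values using the DB-API cursor description.
    columns = [col[0] for col in cursor.description]
    status = dict(zip(columns, row))
    if (status.get("Slave_IO_Running") != "Yes"
            or status.get("Slave_SQL_Running") != "Yes"):
        return "replication_stopped"

    # Pass a timeout so the call returns even if replication stalls later.
    cursor.execute(
        "SELECT MASTER_POS_WAIT(%s, %s, %s)",
        (log_file, log_pos, timeout_s),
    )
    (result,) = cursor.fetchone()
    if result is None:
        return "error"      # NULL: SQL thread stopped or bad arguments
    if result == -1:
        return "timeout"    # position not reached within timeout_s
    return "synced"
```

With a guard like this, the caller can report "replication_stopped" or "timeout" to the admin instead of holding the socket open indefinitely.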

Changed in mysql-mmm:
assignee: nobody → Pascal Hofmann (pascalhofmann)
status: Incomplete → Confirmed
myriapod (s-myriapod) wrote :

Hi there,
I think I'm having the same issue.
"mmm_control show" hangs indefinitely, and so does "mmm_control checks".
The only clear evidence of something going wrong is this line in mmm_agentd.log:

2010/05/05 19:51:43 FATAL Couldn't sync with master: undef

However, the "show slave status" output looks good on both masters; I'm attaching it for reference.
By the way, I'm running MMM 2.2.0-1 on Ubuntu Hardy 8.04 LTS with mysql-server-5.1=5.1.42-xtradb-1.0.6-9-50.hardy.3.

cheers!
sergio

Changed in mysql-mmm:
milestone: none → 2.2.2
David Beveridge (dage)
Changed in mysql-mmm:
milestone: 2.2.2 → none