Comment 7 for bug 1372501

Revision history for this message
Jose Cavieres (jcavieres) wrote :

Valerii, we are trying to create the test case, but the incorrect timestamp problem only happens once every few days in a heavy load system.
In the meantime, we have some more information :

1) The affected table is the only one in the schema with a trigger defined for it:

CREATE TABLE `estadisticos_ws` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `tx` char(20) NOT NULL,
  `msisdn` bigint(20) NOT NULL,
  `nodo` smallint(6) NOT NULL,
  `stamp` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `duracion` smallint(6) NOT NULL,
  `error` int(11) NOT NULL,
  `wsid` smallint(6) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`,`stamp`),
  KEY `msisdn` (`msisdn`),
  KEY `stamp` (`stamp`)
) ENGINE=InnoDB AUTO_INCREMENT=4546612238 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP( stamp ))
(PARTITION p1428465600 VALUES LESS THAN (1428465600) ENGINE = InnoDB,
 PARTITION p1429070400 VALUES LESS THAN (1429070400) ENGINE = InnoDB,
 PARTITION p1429675200 VALUES LESS THAN (1429675200) ENGINE = InnoDB,
 PARTITION p1430280000 VALUES LESS THAN (1430280000) ENGINE = InnoDB,
 PARTITION p1430884800 VALUES LESS THAN (1430884800) ENGINE = InnoDB,
 PARTITION p1431489600 VALUES LESS THAN (1431489600) ENGINE = InnoDB,
 PARTITION p1432094400 VALUES LESS THAN (1432094400) ENGINE = InnoDB)

show triggers\G
*************************** 1. row ***************************
             Trigger: estadisticos_ws_trg
               Event: INSERT
               Table: estadisticos_ws
           Statement: BEGIN
    INSERT INTO contadores_ws ( `nodo`, `stamp`, `wsid`, `error`, `contador` ) VALUES ( NEW.nodo, STR_TO_DATE( DATE_FORMAT( NEW.stamp, "%Y/%m/%d %H:%i:00" ), "%Y/%m/%d %H:%i:%s" ) , NEW.wsid, NEW.error, 1 ) ON DUPLICATE KEY UPDATE contador = contador + 1, stamp = VALUES( stamp );
    INSERT INTO estadisticos_ws_sec ( wsid, stamp, valor ) VALUES (new.wsid, new.stamp, 1) ON DUPLICATE KEY UPDATE valor=valor+1;
END
              Timing: AFTER
             Created: NULL
            sql_mode:
             Definer: root@localhost
character_set_client: latin1
collation_connection: latin1_swedish_ci
  Database Collation: latin1_swedish_ci

CREATE TABLE `contadores_ws` (
  `nodo` smallint(6) NOT NULL,
  `stamp` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `error` int(11) NOT NULL,
  `contador` int(11) NOT NULL,
  `wsid` smallint(6) NOT NULL DEFAULT '0',
  PRIMARY KEY (`nodo`,`stamp`,`wsid`,`error`),
  KEY `stamp` (`stamp`,`nodo`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

2) The only querys sent to that table have the structure:

SELECT id into @a FROM servicios where nombre IN ('WS_ValidaEstado', 'unknown') ORDER BY id DESC LIMIT 1;INSERT INTO estadisticos_ws ( tx, msisdn, nodo, stamp, duracion, wsid, error ) VALUES ('0xc70748a05bb2', 56988659040,47, now(), 1, @a, 0);

3) If the certification errors are logged and before the problem happens, the following is logged for EVERY such query in the galera node receiving the query (we are sure the query is received only once):

150513 20:49:31 [Note] WSREP: cluster conflict due to certification failure for threads:
150513 20:49:31 [Note] WSREP: Victim thread:
   THD: 2450210, mode: local, state: executing, conflict: cert failure, seqno: 48653922
   SQL: INSERT INTO estadisticos_ws ( tx, msisdn, nodo, stamp, duracion, wsid, error ) VALUES ('0xdd0702c6b758', 56953876137,47, now(), 48, @a, 0)

*** Priority TRANSACTION:
TRANSACTION 100254405, ACTIVE 0 sec starting index read
mysql tables in use 20, locked 20
2 lock struct(s), heap size 360, 0 row lock(s), undo log entries 1
MySQL thread id 7, OS thread handle 0x7f0cec92b700, query id 108515295 Update_rows_log_event::find_row(48653921)

*** Victim TRANSACTION:
TRANSACTION 100254404, ACTIVE 0 sec rollback
mysql tables in use 20, locked 20
ROLLING BACK 5 lock struct(s), heap size 1184, 2 row lock(s)
MySQL thread id 2450210, OS thread handle 0x7f0b8abb6700, query id 108515290 192.168.111.58 ws_user query end
INSERT INTO estadisticos_ws ( tx, msisdn, nodo, stamp, duracion, wsid, error ) VALUES ('0xdd0702c6b758', 56953876137,47, now(), 48, @a, 0)
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 11 page no 165638 n bits 168 index `PRIMARY` of table `WSDB`.`contadores_ws` trx table locks 3 total table locks 2 trx id 100254404 lock_mode X locks rec but not gap lock hold time 0 wait time before grant 0
150513 20:49:31 [Note] WSREP: cluster conflict due to high priority abort for threads:
150513 20:49:31 [Note] WSREP: Winning thread:
   THD: 7, mode: applier, state: executing, conflict: no conflict, seqno: 48653921
   SQL: (null)
150513 20:49:31 [Note] WSREP: Victim thread:
   THD: 2450210, mode: local, state: executing, conflict: cert failure, seqno: -1
   SQL: INSERT INTO estadisticos_ws ( tx, msisdn, nodo, stamp, duracion, wsid, error ) VALUES ('0xdd0702c6b758', 56953876137,47, now(), 48, @a, 0)

4) After the problem happens nothing else is logged until the client application is restarted

The certification error in 3) shouldn't happen, so we believe is related to the fixed timestamp problem