update record via sp kill cluster node

Bug #1180020 reported by Sladjan Avlijas
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Galera
New
Undecided
Unassigned
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Status tracked in 5.6
5.5
Won't Fix
Medium
Unassigned
5.6
Fix Committed
Medium
Unassigned

Bug Description

HI,

i have a problem with percona mysql server. I have 3 node (VM with 4 core Intel processor, 7 GB RAM, GB Ethernet), ubuntu 12.04LTS with 5.5.30-30.2 Percona Server (GPL), Release 30.2, wsrep_23.7.4.r38. On this i have few DB, and problem is that database killed by my stored procedure. In store procedure i update some collumn in (innodb) table with 700k rows, and this update crash mysql node. SP is simple, like this:

PROCEDURE `proc_replace_characters`()
BEGIN
    UPDATE table1
    SET col1 = func(col1), col2 = func(col2)
END

where func is simple char replace function. By the way, when this procedure update table with 200k rows, procedure work ok, but with 700k rows procedure kill percona and i must restart server (no data change on table).

my.cnf:

#
# The MySQL database server configuration file.
#
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

# Here is entries for some specific programs
# The following values assume you have at least 32M ram
# This was formally known as [safe_mysqld]. Both versions are currently parsed.

[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
[mysqld]
#
# * Basic Settings
#
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /mnt/data/mysql
tmpdir = /tmp/mysqltmp
language = /usr/share/mysql/english
skip-external-locking
#
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
bind-address = node1

binlog_format = ROW

######## Galera ########

# Full path to wsrep provider library or 'none' to disable galera
wsrep_provider = /usr/lib/libgalera_smm.so

# set to "gcom://"" to reinitialise (reset) a node
#wsrep_cluster_address = "gcomm://"

# once the nodes have mounted, we will set this to the floating ip
wsrep_cluster_address = "gcomm://node2:4567,node3:4567"

wsrep_cluster_name=Percona-XtraDB-Cluster
wsrep_node_name = mysql-node1

#### BOF : State Snapshot Transfer method

wsrep_sst_method = rsync

# alternative methods to-do SST
# experimental, wait for RC release
#wsrep_sst_method=xtrabackup
# not recommended, transfers the ENTIRE database to re-sync nodes.
#wsrep_sst_method=mysqldump

# Set to number of cpu cores.

wsrep_slave_threads = 1

#### for MyISAM support

wsrep_replicate_myisam = 1

####END of MyISAM

# to enable debug level logging, set this to 1

wsrep_debug = 1

# how many times to retry deadlocked autocommits
wsrep_retry_autocommit = 1

# convert locking sessions into transactions
wsrep_convert_LOCK_to_trx = 1

# Generate fake primary keys for non-PK tables (required for multi-master and parallel applying operation)
wsrep_certify_nonPK = 1

#### Required for Galera
innodb_locks_unsafe_for_binlog = 1
innodb_autoinc_lock_mode = 2
default_storage_engine = InnoDB
query_cache_size = 0
query_cache_type = 0

######## EOF : Galera ########
#
# * Fine Tuning
#
key_buffer = 32M
max_allowed_packet = 32M
thread_stack = 192K
thread_cache_size = 8
#
# * Query Cache Configuration
#
query_cache_limit = 1M
query_cache_size = 32M
#
# * Logging and Replication
#
# Both location gets rotated by the cronjob.
# Be aware that this log type is a performance killer.
# As of 5.1 you can enable the log at runtime!

general_log_file = /mnt/data/mysql/mysql.log
general_log = 0

#
# Error logging goes to syslog due to /etc/mysql/conf.d/mysqld_safe_syslog.cnf.
#
# Here you can see queries with especially long duration
#log_slow_queries = /var/log/mysql/mysql-slow.log
#long_query_time = 2
#log-queries-not-using-indexes
#

innodb_log_file_size = 64M
innodb_buffer_pool_size = 4000M
innodb_thread_concurrency = 4

federated

event_scheduler=ON

[mysqldump]
quick
quote-names
max_allowed_packet = 16M

[mysql]
#no-auto-rehash # faster start of mysql but no tab completition
#
# * IMPORTANT: Additional settings that can override those from this file!
# The files must end with '.cnf', otherwise they'll be ignored.
#
!includedir /etc/mysql/conf.d/

mysql-node1.err:

130514 11:29:14 [Note] WSREP: TO END: 1572149
130514 11:29:14 [Note] WSREP: TO BEGIN: 1572149, 0 : DROP FUNCTION `tmp_ems_proc_105`
130514 11:29:14 [Note] WSREP: TO BEGIN: 1572150, 2
130514 11:29:14 [Note] WSREP: TO END: 1572150, 2 : DROP FUNCTION `tmp_ems_proc_105`
130514 11:29:14 [Note] WSREP: TO END: 1572150
09:39:50 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=33554432
read_buffer_size=131072
max_used_connections=5
max_threads=153
thread_count=5
connection_count=4
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 367619 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0xa41a9b0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fb424112e70 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7da13e]
/usr/sbin/mysqld(handle_fatal_signal+0x4a4)[0x6b0c74]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7fb428cd3cb0]
/lib/x86_64-linux-gnu/libc.so.6(+0x8c59b)[0x7fb42835859b]
/usr/lib/libgalera_smm.so(_ZN6galera9TrxHandle5flushEm+0x76)[0x7fb424be0996]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM9replicateEPNS_9TrxHandleE+0xc9)[0x7fb424bf6409]
/usr/lib/libgalera_smm.so(galera_pre_commit+0xef)[0x7fb424c0e93f]
/usr/sbin/mysqld(_Z22wsrep_run_wsrep_commitP3THDP10handlertonb+0x756)[0x666b06]
/usr/sbin/mysqld[0x6679d3]
/usr/sbin/mysqld(_Z15ha_commit_transP3THDb+0x228)[0x6b3a38]
/usr/sbin/mysqld(_Z17trans_commit_stmtP3THD+0x25)[0x65c715]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x52c)[0x5a915c]
/usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x1e)[0x77f3fe]
/usr/sbin/mysqld(_ZN13sp_lex_keeper23reset_lex_and_exec_coreEP3THDPjbP8sp_instr+0x97)[0x785257]
/usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x1bc)[0x78578c]
/usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x65c)[0x781ecc]
/usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x6bf)[0x78365f]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x49d0)[0x5ad600]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x314)[0x5b08b4]
/usr/sbin/mysqld[0x5b11d8]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1a80)[0x5b3270]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x167)[0x5b36c7]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x14f)[0x64ecff]
/usr/sbin/mysqld(handle_one_connection+0x51)[0x64eed1]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7fb428ccbe9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fb4283bfccd]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fb2e40213d0): is an invalid pointer
Connection ID (thread ID): 45
Status: NOT_KILLED

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
130514 11:39:51 mysqld_safe Number of processes running now: 0
130514 11:39:51 mysqld_safe WSREP: not restarting wsrep node automatically
130514 11:39:51 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

If you do not see the problem on a non-cluster Percona Server, then this is more a Percona XtraDB Cluster problem it seems.

no longer affects: percona-server
Revision history for this message
Alex Yurchenko (ayurchen) wrote :

4 no-no's in a single config:

wsrep_replicate_myisam = 1
wsrep_convert_LOCK_to_trx = 1
query_cache_size = 32M
event_scheduler=ON

The last one is probably the culprit.

Revision history for this message
Sladjan Avlijas (t-admin-w) wrote :

I disable replicate_myisam, LOCK_to_trx and query_cache_size=0 and Event but problem is not fixed.

Revision history for this message
Alex Yurchenko (ayurchen) wrote :

Can you cause the same crash by running the same statement manually from CLI?

Revision history for this message
Sladjan Avlijas (t-admin-w) wrote :

Yes, this is start procedure via CLI:

mysql> call `proc_replace_characters`();
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql>
mysql> show databases;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run
/mysqld/mysqld.sock' (111)
ERROR:
Can't connect to the server

mysql> exit
Bye

Revision history for this message
Alex Yurchenko (ayurchen) wrote :

What if you just run that UPDATE statement manually?

Revision history for this message
Sladjan Avlijas (t-admin-w) wrote :

Now i try manuely to run UPDATE, and i see result of query (765901 affected rows), but mysql service crash. When i start this node (sudo service mysql start), node start ok but record is not updatet. When i run this query on 200k rows table service work ok, but on 700k is proble.

query:

 UPDATE `pasiva`
    SET `pasiva`.`P007` = `fun_replace_characters`(`pasiva`.`P007`)
    , `pasiva`.`P008` = `fun_replace_characters`(`pasiva`.`P008`)
    , `pasiva`.`P009` = `fun_replace_characters`(`pasiva`.`P009`)
    , `pasiva`.`P095` = `fun_replace_characters`(`pasiva`.`P095`)
    , `pasiva`.`P096` = `fun_replace_characters`(`pasiva`.`P096`);

CREATE FUNCTION `fun_replace_characters`(
        original varchar(256)
    )
    RETURNS varchar(256) CHARSET utf8
    NOT DETERMINISTIC
    CONTAINS SQL
    SQL SECURITY DEFINER
    COMMENT ''
begin
 declare translit varchar(512) default '';
 declare len int(3) default 0;
 declare pos int(3) default 1;
 declare letter char(2);
 set original = trim(original);
 set len = char_length(original);
 while (pos <= len) do
 set letter = substring(original, pos, 1);
 case true
 when letter = '@' then set letter = 'Ž';
 when letter = '`' then set letter = 'ž';
 when letter = '^' then set letter = 'Č';
 when letter = '~' then set letter = 'č';
 when letter = '[' then set letter = 'Š';
 when letter = '{' then set letter = 'š';
 when letter = ']' then set letter = 'Ć';
 when letter = '}' then set letter = 'ć';
 when letter = '\\' then set letter = 'Đ';
 when letter = '|' then set letter = 'đ';
 else
 begin
 set letter = letter;
 end;
 end case;
 set translit = concat(translit, letter);
 set pos = pos + 1;
 end while;
 set translit = replace(translit, '_', ' ');
 return translit;
end;

Revision history for this message
Sladjan Avlijas (t-admin-w) wrote :

I dont understand how transaction flow from one to another node, but i see thet something is wrong when update try to commit (and send) large set of data. Is there any tuning for it to try setup some variables in my.cnf?

By the way (i forget to say), my tmpfs for mysql ( /tmp/mysqltmp) is 2GB (1GB is small for this query).

Revision history for this message
Alex Yurchenko (ayurchen) wrote :

Yes, looks like a segfault in Galera before the actual replication happens. And it is good to know that it is not related to stored procedures. But it looks like it is related to the size of transaction. I guess we need the table definition then.

Revision history for this message
Sladjan Avlijas (t-admin-w) wrote :
Download full text (3.9 KiB)

Table is mirror (imported every day) from dbf table, and mysql structure is:

CREATE TABLE `pasiva` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `P001` int(11) DEFAULT NULL,
  `P002` int(11) DEFAULT NULL,
  `P003` int(11) DEFAULT NULL,
  `P004` char(1) DEFAULT NULL,
  `P005` int(11) DEFAULT NULL,
  `P006` int(11) DEFAULT NULL,
  `P007` varchar(20) DEFAULT NULL,
  `P008` varchar(20) DEFAULT NULL,
  `P009` varchar(20) DEFAULT NULL,
  `P010` varchar(13) DEFAULT NULL,
  `P011` int(11) DEFAULT NULL,
  `P012` int(11) DEFAULT NULL,
  `P013` int(11) DEFAULT NULL,
  `P014` int(11) DEFAULT NULL,
  `P015` int(11) DEFAULT NULL,
  `P016` int(11) DEFAULT NULL,
  `P017` int(11) DEFAULT NULL,
  `P018` int(11) DEFAULT NULL,
  `P019` int(11) DEFAULT NULL,
  `P020` int(11) DEFAULT NULL,
  `P021` int(11) DEFAULT NULL,
  `P022` int(11) DEFAULT NULL,
  `P023` varchar(2) DEFAULT NULL,
  `P024` int(11) DEFAULT NULL,
  `P025` int(11) DEFAULT NULL,
  `P026` int(11) DEFAULT NULL,
  `P027` int(11) DEFAULT NULL,
  `P028` int(11) DEFAULT NULL,
  `P029` int(11) DEFAULT NULL,
  `P030` int(11) DEFAULT NULL,
  `P031` varchar(10) DEFAULT NULL,
  `P032` int(11) DEFAULT NULL,
  `P033` int(11) DEFAULT NULL,
  `P034` int(11) DEFAULT NULL,
  `P035` int(11) DEFAULT NULL,
  `P036` int(11) DEFAULT NULL,
  `P037` int(11) DEFAULT NULL,
  `P038` varchar(10) DEFAULT NULL,
  `P039` int(11) DEFAULT NULL,
  `P040` int(11) DEFAULT NULL,
  `P041` int(11) DEFAULT NULL,
  `P042` int(11) DEFAULT NULL,
  `P043` int(11) DEFAULT NULL,
  `P044` int(11) DEFAULT NULL,
  `P045` int(11) DEFAULT NULL,
  `P046` int(11) DEFAULT NULL,
  `P047` int(11) DEFAULT NULL,
  `P048` int(11) DEFAULT NULL,
  `P049` int(11) DEFAULT NULL,
  `P050` int(11) DEFAULT NULL,
  `P051` int(11) DEFAULT NULL,
  `P052` int(11) DEFAULT NULL,
  `P053` int(11) DEFAULT NULL,
  `P053A` varchar(6) DEFAULT NULL,
  `P054` int(11) DEFAULT NULL,
  `P055` int(11) DEFAULT NULL,
  `P056` int(11) DEFAULT NULL,
  `P057` int(11) DEFAULT NULL,
  `P057A` varchar(10) DEFAULT NULL,
  `P057B` varchar(10) DEFAULT NULL,
  `P058` int(11) DEFAULT NULL,
  `P059` int(11) DEFAULT NULL,
  `P060` int(11) DEFAULT NULL,
  `P061` int(11) DEFAULT NULL,
  `P062` int(11) DEFAULT NULL,
  `P063` int(11) DEFAULT NULL,
  `P064` int(11) DEFAULT NULL,
  `P065` int(11) DEFAULT NULL,
  `P066` int(11) DEFAULT NULL,
  `P067` varchar(10) DEFAULT NULL,
  `P068` varchar(10) DEFAULT NULL,
  `P069` int(11) DEFAULT NULL,
  `P070` int(11) DEFAULT NULL,
  `P071` int(11) DEFAULT NULL,
  `P072` int(11) DEFAULT NULL,
  `P073` int(11) DEFAULT NULL,
  `P074` int(11) DEFAULT NULL,
  `P075` int(11) DEFAULT NULL,
  `P076` int(11) DEFAULT NULL,
  `P077` int(11) DEFAULT NULL,
  `P078` int(11) DEFAULT NULL,
  `P079` varchar(10) DEFAULT NULL,
  `P080` int(11) DEFAULT NULL,
  `P081` int(11) DEFAULT NULL,
  `P082` int(11) DEFAULT NULL,
  `P083` int(11) DEFAULT NULL,
  `P084` int(11) DEFAULT NULL,
  `P085` int(11) DEFAULT NULL,
  `P086` varchar(10) DEFAULT NULL,
  `P087` int(11) DEFAULT NULL,
  `P088` int(11) DEFAULT NULL,
  `P089` int(11) DEFAULT NULL,
  `P090` int(11) DEFAULT NULL,
  `P091` int(11) DEFAULT NULL,
  `P092` int(11) DEFAULT NULL,
  `P093` int(11) DEFAULT NULL,
  `P094` varchar(10) DEFA...

Read more...

Revision history for this message
Sladjan Avlijas (t-admin-w) wrote :

show table status;

Name Engine Version Row_format Rows Avg_row_length Data_length Max_data_length Index_length Data_free Auto_increment Create_time Update_time Check_time Collation Checksum Create_options Comment
pasiva InnoDB 10 Compact 773410 638 493879296 0 113623040 229638144 2343841 2013-05-13 11:06:41 utf8_general_ci "" ""

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Adding galera as per #9.

Revision history for this message
Ramesh Sivaraman (rameshvs02) wrote :

Not able to reproduce this issue with PXC-5.6

mysql> call `proc_replace_characters`();
Query OK, 786432 rows affected (14 min 58.36 sec)

mysql> select count(1) from pasiva;
+----------+
| count(1) |
+----------+
| 786432 |
+----------+
1 row in set (0.31 sec)

mysql> select version();
+-------------------+
| version() |
+-------------------+
| 5.6.25-73.1-25.12 |
+-------------------+
1 row in set (0.08 sec)

mysql>

Revision history for this message
Krunal Bauskar (krunal-bauskar) wrote :

As per #13 issue is not more reproducible so will close the bug as fix_committed. Some of the fix in due-course may have fixed this issue.

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1065

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.