percona crashes when issued "start slave"

Bug #1258016 reported by carlos campos
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Expired
Undecided
Unassigned
5.6
Expired
Undecided
Unassigned

Bug Description

2013-12-04 12:35:22 22359 [Note] Slave I/O thread: connected to master 'replication@x.x.x.x:3306',replication started in log 'mysql-bin.001260' at position 670418057
20:35:23 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

when restarting slave

this shows up in the mysqld.log

key_buffer_size=5368709120
read_buffer_size=2097152
max_used_connections=1
max_threads=502
thread_count=4
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 10390608 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f8824000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f884c4fdd60 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8dee25]
/usr/sbin/mysqld(handle_fatal_signal+0x4c4)[0x6659e4]
/lib64/libpthread.so.0(+0xf710)[0x7fa416a03710]
/usr/sbin/mysqld(hp_rec_hashnr+0x63)[0x8ef0c3]
/usr/sbin/mysqld(hp_write_key+0x106)[0x8f2226]
/usr/sbin/mysqld(heap_write+0x103)[0x8f2783]
/usr/sbin/mysqld(_ZN7ha_heap9write_rowEPh+0x3a)[0x8edaba]
/usr/sbin/mysqld(_ZN7handler12ha_write_rowEPh+0xff)[0x5b123f]
/usr/sbin/mysqld[0x6c21f8]
/usr/sbin/mysqld[0x6c5041]
/usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x1c1)[0x6c5571]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x2aa)[0x6c5d8a]
/usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_P10SQL_I_ListI8st_orderESB_S7_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x250)[0x7097d0]
/usr/sbin/mysqld(_Z13handle_selectP3THDP13select_resultm+0x187)[0x70a057]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x4943)[0x6e97c3]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x608)[0x6ea3c8]
/usr/sbin/mysqld(_ZN15Query_log_event14do_apply_eventEPK14Relay_log_infoPKcj+0xa93)[0x8825e3]
/usr/sbin/mysqld(_ZN9Log_event11apply_eventEP14Relay_log_info+0x68)[0x884a88]
/usr/sbin/mysqld(_Z26apply_event_and_update_posPP9Log_eventP3THDP14Relay_log_info+0x23b)[0x8aebfb]
/usr/sbin/mysqld[0x8b4ea3]
/usr/sbin/mysqld(handle_slave_sql+0xb3e)[0x8b7f4e]
/usr/sbin/mysqld(pfs_spawn_thread+0x12a)[0x91e4ea]
/lib64/libpthread.so.0(+0x79d1)[0x7fa4169fb9d1]
/lib64/libc.so.6(clone+0x6d)[0x7fa41531bb6d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f8824008bd0): is an invalid pointer
Connection ID (thread ID): 6
Status: NOT_KILLED

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: x.x.x.x
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.001260
          Read_Master_Log_Pos: 671232581
               Relay_Log_File: relay-bin-admin1113306.000105
                Relay_Log_Pos: 32023272
        Relay_Master_Log_File: mysql-bin.001260
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table: clothing_com_-_db.usercartdatasync,clothing_com_-_db.shoppingprefdata,clothing_com_-_db.lastvisited2,clothing_com_-_db.usercartdata
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 658931054
              Relay_Log_Space: 44328914
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 0
                  Master_UUID:
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
1 row in set (0.00 sec)

mysql>

then i tried to get log info

[root@khan mysql]# mysqlbinlog --start-position=32023262 --stop-position=32023282 relay-bin-admin1113306.000105
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#131120 12:01:13 server id 1113306 end_log_pos 120 CRC32 0x5c3ba9b7 Start: binlog v 4, server v 5.6.14-56-log created 131120 12:01:13
BINLOG '
CRWNUg/a/BAAdAAAAHgAAABAAAQANS42LjE0LTU2LWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAEzgNAAgAEgAEBAQEEgAAXAAEGggAAAAICAgCAAAACgoKGRkAAbep
O1w=
'/*!*/;
# at 167
#131120 15:00:20 server id 2373306 end_log_pos 0 Start: binlog v 4, server v 5.1.61-log created 131120 15:00:20
BINLOG '
BD+NUg+6NiQAZgAAAAAAAAAAAAQANS4xLjYxLWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAEzgNAAgAEgAEBAQEEgAAUwAEGggAAAAICAgC
'/*!*/;
ERROR: Error in Log_event::read_log_event(): 'read error', data_len: 2372600105, event_type: 55
ERROR: Could not read entry at offset 32023262: Error in log format or read error.
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

Please, send your my.cnf file content from this slave and upload the entire error log (I want to find out what happens during restart following this crash, also I'll see exact server version there).

Based on these:

        Relay_Master_Log_File: mysql-bin.001260
          Exec_Master_Log_Pos: 658931054

I'd say that we need to know the content of the master's binary log starting for the position listed above. Also don't forget to add options to show row-based events to the mysqlbinlog command line.

Changed in percona-server:
status: New → Incomplete
Revision history for this message
carlos campos (carlos-7) wrote :
Download full text (3.2 KiB)

/etc/my.cnf [[contents]]

[mysqld]
event_scheduler=ON
# 20131104 disabled large pages
#large-pages
skip-name-resolve

query_cache_type = 1
query_cache_size = 1024M
#ignore-builtin-innodb

#plugin-load=innodb=ha_innodb_plugin.so
#plugin_dir=/usr/lib64/mysql/plugin

datadir=/var/lib/mysql
#datadir=/media/raid0/mysql
socket=/var/lib/mysql/mysql.sock
#socket=/media/raid0/mysql/mysql.sock
# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).
old_passwords=0
max_connections=500
max_connect_errors=10000
#innodb_buffer_pool_size=60400M
#temp change 60G
#innodb_buffer_pool_size=60G
innodb_buffer_pool_size=100G
max_allowed_packet=32M

#innodb
innodb_log_file_size = 1024M
innodb_log_buffer_size = 64M
#innodb_additional_mem_pool_size = 20M
#faster performance -- this is a slave -- use 2 instead of 0
innodb_flush_log_at_trx_commit = 2
innodb_flush_method=O_DIRECT

#innodb_buffer_pool_instances=8
#innodb_write_io_threads=16
#innodb_read_io_threads=16
innodb_file_per_table=1
#innodb_lock_wait_timeout=1073741824
innodb_lock_wait_timeout=240

#sort_buffer_size and read_buffer_size are per connection
sort_buffer_size = 8M
read_buffer_size = 2M
read_rnd_buffer_size = 32M
join_buffer_size = 8M
#join_cache_level = 4
#tmp_table_size = 512M #20131104
#tmp_table_size = 100G
tmp_table_size = 600G

max_heap_table_size=600G
transaction-isolation=repeatable-read

#myisam
#key_buffer_size = 8589934592
#key_buffer_size = 24884901888
#temp change og 24G
#key_buffer_size = 24G
#key_buffer_size = 6G # 20131104
key_buffer_size = 5G
thread_cache_size = 32
table_open_cache = 1024
#myisam_sort_buffer_size = 4294967295
#temp change 4G
#myisam_sort_buffer_size = 4G
myisam_sort_buffer_size = 2G

server-id=1113306
low_priority_updates=1
#disable log-bin to suppress binary logs
#log-bin=mysql-bin
binlog-do-db=revolveclothing_com_-_db
binlog-ignore-db=mysql
binlog-ignore-db=do_not_replicate
#log-slow-queries=/var/log/mysql/mysql-slow.log
slow_query_log=1
slow_query_log_file=/var/log/mysql/mysql-slow.log
long_query_time=1
skip-slave-start
relay-log=relay-bin-admin1113306
relay-log-index=relay-bin-admin1113306.index
replicate-ignore-table=revolveclothing_com_-_db.shoppingprefdata
replicate-ignore-table=revolveclothing_com_-_db.usercartdata
replicate-ignore-table=revolveclothing_com_-_db.usercartdatasync
#replicate-ignore-table=revolveclothing_com_-_db.lastvisited
replicate-ignore-table=revolveclothing_com_-_db.lastvisited2
# Disabling symbolic-links is recommended to prevent assorted security risks .. might be needed for partition
symbolic-links=1
# disable log warning
log_warnings=0

# Using too big key for internal temp tables
# DONT CHANGE YET -- optimizer_switch=derived_with_keys=off

#temporary -- skip missing replication records
#1062 dup errors
#1146
slave-skip-errors = 1032,1034,1062,1146

[mysql.server]
user=mysql
#basedir=/var/lib
pid-file=/var/run/mysqld/mysqld.pid

[mysqld_safe]
log-error=/var/log/mysql/mysqld.log
#pid-file=/var/run/mysqld/mysqld.pid

[mysql]

port = 3306
#socket = /media/raid0/mysql/mysql.sock
socket ...

Read more...

Revision history for this message
carlos campos (carlos-7) wrote :

this is our entire log file, thank you for taking the time to have a look.

Revision history for this message
carlos campos (carlos-7) wrote :

[[ Relay_Master_Log_File: mysql-bin.001260
          Exec_Master_Log_Pos: 658931054

I'd say that we need to know the content of the master's binary log
starting for the position listed above. Also don't forget to add options
to show row-based events to the mysqlbinlog command line.
]]

>>>> from the master db binlogs

[root@admin237 mysql]# mysqlbinlog -v --start-position=658931054 --stop-position=658931055 mysql-bin.001260
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#131120 15:00:20 server id 2373306 end_log_pos 106 Start: binlog v 4, server v 5.1.61-log created 131120 15:00:20
BINLOG '
BD+NUg+6NiQAZgAAAGoAAAAAAAQANS4xLjYxLWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAEzgNAAgAEgAEBAQEEgAAUwAEGggAAAAICAgC
'/*!*/;
# at 658931054
#131120 18:08:01 server id 2373306 end_log_pos 658931432 Query thread_id=1010612 exec_time=8 error_code=0
use revolveclothing_com_-_db/*!*/;
SET TIMESTAMP=1384999681/*!*/;
SET @@session.pseudo_thread_id=1010612/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=0/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.time_zone='SYSTEM'/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
replace into id_shipcost select mid(productcode, locate('-',productcode)+1,2) as GCat, round(sum(shippingcost)/count(*),2) as ShipCost from shipment where boxtime between date_sub(current_date, interval 180 day) and date_sub(current_date, interval 1 day) and shippingcost>0 group by GCat
/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;

Changed in percona-server:
status: Incomplete → New
Revision history for this message
carlos campos (carlos-7) wrote : Re: [Bug 1258016] Re: percona crashes when issued "start slave"
Download full text (8.6 KiB)

Hi Valerii

I have uploaded/posted all the information you have requested, I believe.

Thank you for taking the time to look into this problem. aside from
replication breaking; we can sometimes "workaround" this by skipping the
relay log or worse case moving to the next one; *however, *this is an
experimental box for us with 1 terabyte of ram to be mainly used as memory
database using the very nice dynamic row options :), and when we get the
signal 11... mysql dies and all the memory that is held in memory goes away
(takes about 1 day to load)

kindly offer a suggestion I you can think of one to prevent the mysql
process from restarting and loosing all the data in memory tables,

thank you again

Carlos.

On Thu, Dec 5, 2013 at 12:30 AM, Valerii Kravchuk <
<email address hidden>> wrote:

> Please, send your my.cnf file content from this slave and upload the
> entire error log (I want to find out what happens during restart
> following this crash, also I'll see exact server version there).
>
> Based on these:
>
> Relay_Master_Log_File: mysql-bin.001260
> Exec_Master_Log_Pos: 658931054
>
> I'd say that we need to know the content of the master's binary log
> starting for the position listed above. Also don't forget to add options
> to show row-based events to the mysqlbinlog command line.
>
> ** Changed in: percona-server
> Status: New => Incomplete
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1258016
>
> Title:
> percona crashes when issued "start slave"
>
> Status in Percona Server with XtraDB:
> Incomplete
>
> Bug description:
> 2013-12-04 12:35:22 22359 [Note] Slave I/O thread: connected to master
> 'replication@x.x.x.x:3306',replication started in log 'mysql-bin.001260'
> at position 670418057
> 20:35:23 UTC - mysqld got signal 11 ;
> This could be because you hit a bug. It is also possible that this binary
> or one of the libraries it was linked against is corrupt, improperly
> built,
> or misconfigured. This error can also be caused by malfunctioning
> hardware.
> We will try our best to scrape up some info that will hopefully help
> diagnose the problem, but since we have already crashed,
> something is definitely wrong and this may fail.
> Please help us make Percona Server better by reporting any
> bugs at http://bugs.percona.com/
>
>
> when restarting slave
>
> this shows up in the mysqld.log
>
>
> key_buffer_size=5368709120
> read_buffer_size=2097152
> max_used_connections=1
> max_threads=502
> thread_count=4
> connection_count=1
> It is possible that mysqld could use up to
> key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads =
> 10390608 K bytes of memory
> Hope that's ok; if not, decrease some variables in the equation.
>
> Thread pointer: 0x7f8824000990
> Attempting backtrace. You can use the following information to find out
> where mysqld died. If you see no messages after this, something went
> terribly wrong...
> stack_bottom = 7f884c4fdd60 thread_stack 0x40000
> /usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8dee25]
> /usr/sbin/mysqld(han...

Read more...

Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

Sorry for the delay with reply, but I really think that trying to keep everything in memory in HEAP/MEMORY tables (up to 600G in size each!, see max_heap_table_size) is a bad idea. You should better use InnoDB table and big buffer pool to keep "hot" part of their data in memory (with limited size).

If these crashes still happen, I think you may hit out of memory conditions. Please, check /var/log/messages for any mysqld-related messages at the moment and send the output of pt-summary script for this box.

Assuming that crash happens when statement like this is processed on slave (and you still use similar setup):

replace into id_shipcost select mid(productcode, locate('-',productcode)+1,2) as GCat, round(sum(shippingcost)/count(*),2) as ShipCost from shipment where boxtime between date_sub(current_date, interval 180 day) and date_sub(current_date, interval 1 day) and shippingcost>0 group by GCat

please, send the output of SHOW CREATE TABLE and SHOW TABLE STATUS for tables used in it, id_shipcost and shipment.

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for Percona Server 5.6 because there has been no activity for 60 days.]

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-3078

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.