Percona/InnoDB crashes "InnoDB: Failing assertion: rec"

Bug #1298937 reported by Artem Sheremet
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Invalid
Undecided
Unassigned

Bug Description

We were using Percona for quite a long time now (2 years). It worked fine until 2 weeks ago when it started crashing (restarting automatically).
Attached is a recent munin graph. Memory usage drops indicate a restart. Each of them causes a delay in service.

We have about 300-400 queries per second, ~45GB sized database, 64 GB RAM (30 dedicated to Percona), mysql directory resides on an SSD. Percona version is 5.6.16-64.2-569.squeeze-log (Debian 6.0.9).

Please let me know if you need any additional details. Looking forward to your reply.

Here's a part of mysql-error.log:

--- quote ---

InnoDB: Page may be an index page where index id is 4586
InnoDB: (index "PRIMARY" of table "ejabberd_syriatalk_org"."last")
InnoDB: Error: page old data size 10071 new data size 10087
InnoDB: Error: page old max ins size 6058 new max ins size 6042
InnoDB: Submit a detailed bug report to http://bugs.mysql.com
2014-03-28 06:22:50 7f8e8043d700 InnoDB: Assertion failure in thread 140250014013184 in file btr0cur.cc line 2440
InnoDB: Failing assertion: rec
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
03:22:50 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=134217728
read_buffer_size=4194304
max_used_connections=43
max_threads=514
thread_count=42
connection_count=42
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6454570 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

--- end quote ---

Revision history for this message
Artem Sheremet (dot-doom+launchpad) wrote :
Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

Please, send your my.cnf file content.

Send also the output of:

show create table ejabberd_syriatalk_org.last\G

It seems this table got corrupted for some reason, and I'd like to check what kind of table is it.

Changed in percona-server:
status: New → Incomplete
Revision history for this message
Artem Sheremet (dot-doom+launchpad) wrote :

Here is output from "SHOW CREATE TABLE":

[Fri Mar 28 18:45:59 2014][localhost:mysqld.sock][5.6.16-64.2-569.squeeze-log]
<root@localhost:(none)> mysql 1 ; show create table ejabberd_syriatalk_org.last\G
*************************** 1. row ***************************
       Table: last
Create Table: CREATE TABLE `last` (
  `username` varchar(250) NOT NULL,
  `seconds` text NOT NULL,
  `state` text NOT NULL,
  PRIMARY KEY (`username`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

Attached is my.cnf. Additionally:

% sudo su -c 'cat /etc/mysql/conf.d/*'
[mysqld_safe]
syslog

Changed in percona-server:
status: Incomplete → New
Revision history for this message
Artem Sheremet (dot-doom+launchpad) wrote :

Additional

Revision history for this message
Artem Sheremet (dot-doom+launchpad) wrote :

Additional information that might appear useful:

It always fails on "last" table.
A stacktrace is seen in the log right after a crash:

stack_bottom = 7f36ae267e38 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x7f3d79554cf5]
/usr/sbin/mysqld(handle_fatal_signal+0x4c4)[0x7f3d792887c4]
/lib/libpthread.so.0(+0xeff0)[0x7f3d78a00ff0]
/lib/libc.so.6(gsignal+0x35)[0x7f3d76cba1b5]
/lib/libc.so.6(abort+0x180)[0x7f3d76cbcfc0]
/usr/sbin/mysqld(+0x88edf5)[0x7f3d796bcdf5]
/usr/sbin/mysqld(+0x837abd)[0x7f3d79665abd]
/usr/sbin/mysqld(+0x838e47)[0x7f3d79666e47]
/usr/sbin/mysqld(+0x83a441)[0x7f3d79668441]
/usr/sbin/mysqld(+0x81e6bb)[0x7f3d7964c6bb]
/usr/sbin/mysqld(+0x78ad3e)[0x7f3d795b8d3e]
/usr/sbin/mysqld(_ZN7handler13ha_update_rowEPKhPh+0xca)[0x7f3d791ba9ba]
/usr/sbin/mysqld(_Z12mysql_updateP3THDP10TABLE_LISTR4ListI4ItemES6_PS4_jP8st_ordery15enum_duplicatesbPySB_+0x1b56)[0x7f3d79394e96]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x31d1)[0x7f3d793210b1]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x65d)[0x7f3d79323bdd]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x2453)[0x7f3d79326843]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x22d)[0x7f3d792eb08d]
/usr/sbin/mysqld(handle_one_connection+0x47)[0x7f3d792eb117]
/usr/sbin/mysqld(pfs_spawn_thread+0x127)[0x7f3d7978b577]
/lib/libpthread.so.0(+0x68ca)[0x7f3d789f88ca]
/lib/libc.so.6(clone+0x6d)[0x7f3d76d57b6d]

Revision history for this message
Artem Sheremet (dot-doom+launchpad) wrote :

We have copied the table using 'CREATE TABLE LIKE' and 'INSERT INTO SELECT FROM', then swapped the tables using 'RENAME TABLE'. After that, mysqld works fine for about 24h now.

Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

It probably means that the table was corrupted for some reason. Do you see anything unusual in /var/log/messages (or syslog) related to hardware?

Please, send the output of SHOW TABLE STATUS for the table after re-creation.

Changed in percona-server:
status: New → Incomplete
Revision history for this message
Artem Sheremet (dot-doom+launchpad) wrote :

Hardware failure was our first guess, but there's nothing unusual (which doesn't exclude the possibility though). S.M.A.R.T. short and long tests were successful on all drives.

Table status is:

+------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
| last | InnoDB | 10 | Compact | 934482 | 74 | 69730304 | 0 | 0 | 6291456 | NULL | 2014-03-29 17:49:09 | NULL | NULL | utf8_general_ci | NULL | | |
+------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+

Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

You have very unsafe settings in my.cnf:

innodb_checksums = 0
innodb_doublewrite = 0

So, corruption could really happen. Had you got any crash or unplanned restart of MySQL server before you've started to get this kind of assertion failures?

Revision history for this message
Artem Sheremet (dot-doom+launchpad) wrote :

No crashes for the last 2 months. Could be possible that some of the table data has not been accessed for a long time, so we had it hidden. I will enable checksums and doublewrite, thanks - those were optimization options for the times we were running on HDD.

Changed in percona-server:
status: Incomplete → New
Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

Had you seen similar assertions in the meantime?

I still can not find any known and confirmed bug affecting 5.6.16 that ends up with this:

Failing assertion: rec

so I have to assume corruption of some kind caused by hardware faults.

Changed in percona-server:
status: New → Incomplete
Revision history for this message
Artem Sheremet (dot-doom+launchpad) wrote :

Hi Valerii,

  After recreating the table there were no more assertion failures in the log.
  I don't like this kind of bugs either, and according to your comment about unsafe options above I find it very likely to be the hardware failure.

  From my point of view, it is better to consider this one to be a hardware issue and assume it's status as resolved.

-Artem

Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

Thank you for status update. So, I'll set Invalid status for this (assuming "Not a bug").

Changed in percona-server:
status: Incomplete → Invalid
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-3117

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.