Percona Server with XtraDB

Percona server crashes on ALTER TABLE on temporary table

Reported by Bart Verwilst on 2012-04-10
This bug affects 8 people
Affects           Importance   Assigned to
Percona Server    High         Laurynas Biveinis
5.1               High         Laurynas Biveinis
5.5               High         Laurynas Biveinis

Bug Description

We have a production server with a 3.2TB dataset running SLES 11.1 and MySQL Server 5.1.54.

I created a backup with xtrabackup (and with cp, rsync, ... on other occasions), and used this data to set up a new server on CentOS 6.2 with Percona Server 5.5.21 (and later 5.1.61, to check whether the problem was specific to 5.5; it isn't, PS 5.1 crashes as well).

Setting up the CentOS server as a slave of the 'old' server works like a charm: 25000 seconds' worth of transactions were applied just fine, and the PS 5.5 instance caught up with the master.

Then I wanted to create a temporary table, fill it with some data, and add a primary key to it. It crashed.
I have consistently been able to reproduce it on this data by doing:

mysql> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> create temporary table t1 (id int);
Query OK, 0 rows affected (1.97 sec)

mysql> alter table t1 engine=myisam;
ERROR 2013 (HY000): Lost connection to MySQL server during query

I tried both engine=myisam and engine=innodb.

I've resolved the stacktrace to this:

    0x3188a0f4a0 _end + -2020000832
    0x9149db fil_space_is_corrupt + 391995
    0x8fdefa fil_space_is_corrupt + 299098
    0x86c48a _ZN21ha_innobase_add_indexD0Ev + 427834
    0x81d241 _ZN21ha_innobase_add_indexD0Ev + 103665
    0x822310 _ZN21ha_innobase_add_indexD0Ev + 124352
    0x823e51 _ZN21ha_innobase_add_indexD0Ev + 131329
    0x9117f8 fil_space_is_corrupt + 379224
    0x911f76 fil_space_is_corrupt + 381142
    0x816cbd _ZN21ha_innobase_add_indexD0Ev + 77677
    0x817ceb _ZN21ha_innobase_add_indexD0Ev + 81819
    0x8e9e92 fil_space_is_corrupt + 217074
    0x834bc6 _ZN21ha_innobase_add_indexD0Ev + 200310
    0x7f6b7b innobase_get_trx + 5419
    0x5910aa _Z16check_valid_pathPKcm + 5322
    0x51654a unireg_abort + 2858
    0x50fa79 _start + 41

Full mysqld.err attached ( disregard the beginning of the file, that was before my backup was restored there ;) ).

Using a clean /var/lib/mysql doesn't trigger the crash, so it must be some kind of corruption in the live data.

Good luck!

Bart Verwilst (verwilst) wrote :

I have just tried the same on a SLES 11.1p1 system running 5.1.54-community-log with the same 3.2TB dataset, and it runs just fine.

mysql> use test
Database changed
mysql> create temporary table t1 (id int);
Query OK, 0 rows affected (0.06 sec)

mysql> alter table t1 engine=myisam;
Query OK, 0 rows affected (0.10 sec)
Records: 0 Duplicates: 0 Warnings: 0

Bart Verwilst (verwilst) wrote :

On CentOS 6.2 (the same system that previously crashed) the create + alter works fine with 5.1.62-community-log and 5.5.23-log. I guess this must be some Percona-added bug. ;)

The issue is that the InnoDB dictionary header points to the XtraDB-specific SYS_STATS table at a tablespace location that is now taken by something else. Since this data was replicated from MySQL and not from Percona Server, the original database must have been "touched" by XtraDB at some point, and thus got a SYS_STATS table and a pointer to it in the dictionary header. When the original database went back to InnoDB, InnoDB ignored the dictionary header fields it does not recognize but overwrote the SYS_STATS table. Once the data was replicated to Percona Server, the server tried to access SYS_STATS again at the overwritten location, causing the crash.

The workaround is to clear XtraDB-specific fields on all instances. For Percona Server:
1) Stop mysqld
2) $ printf '\0\0\0\0' | dd of=ibdata1 bs=1 seek=114778 count=4 conv=notrunc
3) $ printf '\0\0\0\0\0\0\0\0' | dd of=ibdata1 bs=1 seek=114982 count=8 conv=notrunc
4) Start mysqld with --skip-innodb-checksum --innodb_use_sys_stats_table
5) Stop mysqld
6) Start mysqld with regular options.
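
Before overwriting anything, it may help to look at what those two fields currently hold. The following is a minimal read-only sketch in Python, not part of the original workaround. It assumes the same layout as the dd commands above (uncompressed ibdata1, default 16K pages) and assumes the 4-byte field is stored big-endian, as InnoDB normally stores integers on disk. The function name read_xtradb_fields is made up for illustration.

```python
import struct
import sys

# Byte offsets taken from the dd commands above; they are only valid for an
# uncompressed system tablespace with the default 16K page size.
STATS_OFFSET = 114778  # 4 bytes: SYS_STATS pointer in the dictionary header
MARK_OFFSET = 114982   # 8 bytes: XtraDB mark in the dictionary header


def read_xtradb_fields(path):
    """Return (stats_value, mark_bytes) read from ibdata1 without modifying it."""
    with open(path, "rb") as f:  # opened read-only
        f.seek(STATS_OFFSET)
        stats_raw = f.read(4)
        f.seek(MARK_OFFSET)
        mark_raw = f.read(8)
    if len(stats_raw) != 4 or len(mark_raw) != 8:
        raise ValueError("file is too short to contain the dictionary header")
    # Assumption: the 4-byte field is an unsigned big-endian integer.
    (stats_value,) = struct.unpack(">I", stats_raw)
    return stats_value, mark_raw


if __name__ == "__main__" and len(sys.argv) == 2:
    stats, mark = read_xtradb_fields(sys.argv[1])
    print("SYS_STATS field:", stats)
    print("XtraDB mark bytes:", mark.hex())
```

Run it against a copy of ibdata1 while mysqld is stopped. If both fields are already all zeros, the dd steps would have nothing to clear.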

Bart Verwilst (verwilst) wrote :

Hi,

I guess the 114778 offset into ibdata1 in the comment above is strictly for our database, and cannot be applied as-is to other ibdata files by other users who might stumble onto this bug report, right? Just to make sure that nobody totally corrupts their data by copy-pasting from comment #4 :)


Bart -

Yes, good catch, thank you. These offsets are for databases that
do not use compression and use the default 16K page size.

In other cases they will need to be calculated manually for space = 0,
page = 7: 4 bytes at offset 52 (DICT_HDR_STATS) and 8 bytes at
offset 256 (DICT_HDR_XTRADB_MARK). Judging by the offsets given earlier,
both are measured from the start of the dictionary header, which begins
38 bytes into the page (114778 = 7 * 16384 + 38 + 52).
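
The manual calculation can be sketched as follows. This is an illustration in Python, not something from the original comment. It assumes the dictionary header starts 38 bytes into the page (the FIL_PAGE_DATA constant in the InnoDB sources), which is consistent with the 114778 and 114982 offsets given earlier for 16K pages. The function name dict_hdr_field_offset is hypothetical, and the sketch does not cover compressed tablespaces.

```python
# Assumed constants; 52 and 256 come from the comment above, 38 is the
# assumed size of the page header that precedes the dictionary header.
FIL_PAGE_DATA = 38
DICT_HDR_PAGE_NO = 7          # dictionary header page in space 0
DICT_HDR_STATS = 52           # 4-byte field
DICT_HDR_XTRADB_MARK = 256    # 8-byte field


def dict_hdr_field_offset(field_offset, page_size=16384):
    """Byte offset of a dictionary header field from the start of ibdata1."""
    return DICT_HDR_PAGE_NO * page_size + FIL_PAGE_DATA + field_offset


if __name__ == "__main__":
    for name, field in (("DICT_HDR_STATS", DICT_HDR_STATS),
                        ("DICT_HDR_XTRADB_MARK", DICT_HDR_XTRADB_MARK)):
        print(name, dict_hdr_field_offset(field))
```

With the default page_size of 16384 this yields 114778 and 114982, matching the seek values in the dd commands. For another page size, pass it as page_size and use the results as the seek values instead.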

On 25 April 2012 at 14:33, Bart Verwilst <email address hidden> wrote:
> Hi,
>
> I guess the 114778 offset of ibdata1 in the comment above is strictly
> for our database, and cannot be applied with the same offset on other
> ibdata files for other users that might stumble onto this bugreport
> right? Just to make sure that nobody totally corrupts their data by
> copy-pasting from comment #4 :)
>
> --
> You received this bug notification because you are a bug assignee.
> https://bugs.launchpad.net/bugs/978036
>
> Title:
>  Percona server crashes on ALTER TABLE on temporary table
>
> Status in Percona Server with XtraDB:
>  In Progress
> Status in Percona Server 5.1 series:
>  Triaged
> Status in Percona Server 5.5 series:
>  In Progress

Stewart Smith (stewart) wrote :

This is the commit comment, should help in documenting the issue:

    Fix bug 978036: Percona Server crashes if SYS_STATS was corrupted or
    overwritten.

    The fix is to detect this corruption and create new SYS_STATS. On
    server startup with existing database, call new function
    dict_verify_xtradb_sys_stats() to scan the SYS_STATS clustered index.
    For the scan, set temporarily srv_pass_corrupt_table = 1 to avoid
    hitting any fatal asserts.

    Also for this scan adjust btr_validate_index(),
    btr_root_fseg_validate(), btr_root_block_get() not to assert when
    srv_pass_corrupt_table != 0. This might also make the
    --innodb_corrupt_table_action=warn less likely to crash too. It is
    possible that larger SYS_STATS table would hit other asserts, but that
    will be easy to diagnose and fix as necessary.

    If a SYS_STATS corruption is detected, invoke new function
    dict_recreate_xtradb_sys_stats() that creates the new clustered index,
    rewrites the dictionary header, purges the old SYS_STATS info
    from the dictionary cache and creates it again. For its
    implementation split out two XtraDB-specific functions of dict_boot():
    dict_create_xtradb_sys_stats() and
    dict_add_to_cache_xtradb_sys_stats().

    For testing, add a new debug-only server variable
    --innodb-sys-stats-root-page that overrides the SYS_STATS root page id
    found in the header with a specified value.
    dict_add_to_cache_xtradb_sys_stats() uses it on its first invocation.

    Add new test innodb/percona_corrupted_sys_stats and
    sys_vars/innodb_sys_stats_root_page_basic.

    Re-record percona_server_variables_debug tests.
