Comment 23 for bug 1193350

Nickolay Ihalainen (ihanick) wrote :

This bug cannot be reproduced in recent versions of Percona Server 5.6 and 5.5. It was fixed by:
https://bugs.launchpad.net/percona-server/+bug/1342494

Basically, the problem happens if a single bitmap file is removed in the middle of a SELECT query on information_schema.innodb_changed_pages. The deletion can be done with RESET CHANGED_PAGE_BITMAPS; or with a purge command.

I have emulated deletion with:
os_file_delete_if_exists(innodb_file_bmp_key, "ib_modified_log_1_0.xdb");

inside the log_online_setup_bitmap_file_range function.

The line should be added around:
/* 2nd pass: get the file names in the file_seq_num order */

but before
while (!os_file_readdir_next_file(srv_data_home, bitmap_dir,
       &bitmap_dir_file_info)) {

This forces the bitmap iterator's files array to stay all zeros, as set by the memset
around: /* 2nd pass: get the file names in the file_seq_num order */
 memset(bitmap_files->files, 0,
        bitmap_files->count * sizeof(bitmap_files->files[0]));

In debug builds, the inconsistency is caught by the bitmap_files->files[0].seq_num check:

#ifdef UNIV_DEBUG
 if (!bitmap_files->files[0].seq_num) {

  log_online_diagnose_inconsistent_dir(bitmap_files);
  return FALSE;
 }
 ut_ad(bitmap_files->files[0].seq_num == first_file_seq_num);
 {
  size_t i;
  for (i = 1; i < bitmap_files->count; i++) {
   if (!bitmap_files->files[i].seq_num) {
    break;
   }
   ut_ad(bitmap_files->files[i].seq_num
         > bitmap_files->files[i - 1].seq_num);
   ut_ad(bitmap_files->files[i].start_lsn
         >= bitmap_files->files[i - 1].start_lsn);
  }
 }
#endif

In recent Percona Server builds
this check is no longer debug-only:

 if (!bitmap_files->files[0].seq_num
     || bitmap_files->files[0].seq_num != first_file_seq_num) {

  log_online_diagnose_inconsistent_dir(bitmap_files);
  return FALSE;
 }

So it is no longer possible to crash mysqld by deleting a bitmap file at a particular moment.

The change was made in commit 06069d9e342bcb3277fccf586cc6b9181c98536c

-#ifdef UNIV_DEBUG
- if (!bitmap_files->files[0].seq_num) {
+ if (!bitmap_files->files[0].seq_num
+ || bitmap_files->files[0].seq_num != first_file_seq_num) {

                log_online_diagnose_inconsistent_dir(bitmap_files);
                return FALSE;
        }
- ut_ad(bitmap_files->files[0].seq_num == first_file_seq_num);
+

Why are we seeing "system error number 21"?

https://launchpadlibrarian.net/143030765/gdb_87_210613-2239_FULL.txt
#9 0x000000000094b428
in_files = {
            count = 1,
            files = 0x7f711c01c720
          },
          in_i = 0,
          in = {
            name = "./\000

MySQL is trying to read "./", which is a directory (error 21 is EISDIR on Linux), and there is only a single file in our list (count=1).
If there is a single file, it can be opened at the end of the log_online_bitmap_iterator_init function by log_online_open_bitmap_file_read_only.

This function contains a debug-only check: ut_ad(name[0] != '\0');
It also prepends srv_data_home to the file name. In our case srv_data_home = "./", so the file name itself must have been empty ("\0").
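
To illustrate why an empty name ends in error 21, here is a minimal sketch, not the server code. try_read_bitmap_file is a hypothetical helper that joins a data home directory and a file name the same way and then attempts a one-byte read. It assumes a POSIX system, where reading from a directory fails with EISDIR (21 on Linux).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not InnoDB code: prepend the data home directory
   to a file name, open the result read-only and try to read one byte.
   Returns 0 on success, otherwise the errno of the failing call. */
static int try_read_bitmap_file(const char *data_home, const char *name)
{
	char path[4096];
	char byte;
	int  len;
	int  fd;
	int  err = 0;

	assert(data_home != NULL && name != NULL);

	/* With name == "" the path is just the directory, e.g. "./". */
	len = snprintf(path, sizeof path, "%s%s", data_home, name);
	if (len < 0 || (size_t) len >= sizeof path) {
		return ENAMETOOLONG;
	}

	fd = open(path, O_RDONLY);
	if (fd < 0) {
		return errno;
	}

	/* Opening a directory read-only succeeds; reading from it does not. */
	if (read(fd, &byte, 1) < 0) {
		err = errno;
	}

	close(fd);
	return err;
}
```

Calling try_read_bitmap_file("./", "") models the situation in the backtrace: the open succeeds because "./" exists, and the read is what reports the error.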

The file name is filled in by log_online_setup_bitmap_file_range, and the count variable is also set by this function.
The function reads the directory twice. If the file is removed before the second pass reads it, we end up with count = 1 (from the first pass) and an empty files structure (after the memset to 0).
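
The race can be modelled without touching the file system. The sketch below is a simplified simulation, not the InnoDB implementation: the types, setup_range and its delete_idx and always_check parameters are hypothetical stand-ins, and the fake directory is assumed to be already sorted by sequence number. With always_check set to 0 it behaves like a release build before the fix; with 1 it applies a check of the same shape as the one that is no longer debug-only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILES 8
#define NAME_LEN  64

/* Hypothetical stand-ins for the server's structures. */
typedef struct {
	char          name[NAME_LEN];
	unsigned long seq_num;
} bitmap_file_t;

typedef struct {
	size_t        count;
	bitmap_file_t files[MAX_FILES];
} bitmap_range_t;

/* One entry of a fake directory; present == 0 means "deleted". */
typedef struct {
	const char   *name;
	unsigned long seq_num;   /* assumed non-zero for real files */
	int           present;
} fake_dirent_t;

/* Returns 1 if the range is accepted, 0 if it is reported inconsistent.
   delete_idx >= 0 removes that entry between the two passes (the race). */
static int setup_range(fake_dirent_t *dir, size_t n, int delete_idx,
		       int always_check, bitmap_range_t *out)
{
	unsigned long first_seq = 0;
	size_t        next = 0;
	size_t        i;

	assert(n <= MAX_FILES);

	/* 1st pass: count the files and remember the lowest seq_num. */
	out->count = 0;
	for (i = 0; i < n; i++) {
		if (!dir[i].present) {
			continue;
		}
		if (out->count == 0 || dir[i].seq_num < first_seq) {
			first_seq = dir[i].seq_num;
		}
		out->count++;
	}

	/* Emulated deletion between the passes. */
	if (delete_idx >= 0 && (size_t) delete_idx < n) {
		dir[delete_idx].present = 0;
	}

	/* 2nd pass: zero the array, then fill it from what still exists. */
	memset(out->files, 0, sizeof out->files);
	for (i = 0; i < n && next < out->count; i++) {
		if (!dir[i].present) {
			continue;
		}
		strncpy(out->files[next].name, dir[i].name, NAME_LEN - 1);
		out->files[next].seq_num = dir[i].seq_num;
		next++;
	}

	/* Consistency check; skipped when modelling the old release build. */
	if (always_check && out->count > 0
	    && (!out->files[0].seq_num
		|| out->files[0].seq_num != first_seq)) {
		return 0;
	}
	return 1;
}
```

With a single entry and delete_idx = 0, the unchecked variant returns success with count = 1 and an empty name in files[0], which is the state seen in the backtrace. The checked variant rejects the same input.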