Unstable changed page tracking tests due to trying to delete a non-existent file | Valgrind error at log_online_should_overwrite()

Bug #1182793 reported by Laurynas Biveinis
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
Fix Released
Critical
Laurynas Biveinis
5.1
Invalid
Medium
Unassigned
5.5
Invalid
Medium
Unassigned
5.6
Fix Released
Critical
Laurynas Biveinis

Bug Description

A number of bitmap tests fail on certain Jenkins slaves as follows:

http://jenkins.percona.com/job/percona-server-5.6-param/109/BUILD_TYPE=release,Host=centos5-64/consoleFull

innodb.percona_changed_page_bmp w3 [ fail ]
innodb.percona_changed_page_bmp_1105726 w3 [ fail ]
innodb.percona_changed_page_bmp_requests w4 [ fail ]
innodb.percona_changed_page_bmp_flush w4 [ fail ]
innodb.percona_changed_page_bmp_log_block_size w4 [ fail ]

        Test ended at 2013-05-21 10:50:42

Server log is:
2013-05-21 17:50:41 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2013-05-21 17:50:42 25398 [Note] Plugin 'FEDERATED' is disabled.
2013-05-21 17:50:42 2aeb6bfcb170 InnoDB: Warning: Using innodb_additional_mem_pool_size is DEPRECATED. This option may be removed in future releases, together with the option innodb_use_sys_malloc and with the InnoDB's internal memory allocator.
2013-05-21 17:50:42 25398 [Note] InnoDB: The InnoDB memory heap is disabled
2013-05-21 17:50:42 25398 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2013-05-21 17:50:42 25398 [Note] InnoDB: Compressed tables use zlib 1.2.3
2013-05-21 17:50:42 25398 [Note] InnoDB: CPU does not support crc32 instructions
2013-05-21 17:50:42 25398 [Note] InnoDB: Using Linux native AIO
2013-05-21 17:50:42 25398 [Note] InnoDB: Initializing buffer pool, size = 8.0M
2013-05-21 17:50:42 25398 [Note] InnoDB: Completed initialization of buffer pool
2013-05-21 17:50:42 25398 [Note] InnoDB: Highest supported file format is Barracuda.
2013-05-21 17:50:42 2aeb6bfcb170 InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
InnoDB: If you are installing InnoDB, remember that you must create
InnoDB: directories yourself, InnoDB does not create them.
2013-05-21 17:50:42 2aeb6bfcb170 InnoDB: File name ./ib_modified_log_1_0.xdb
2013-05-21 17:50:42 2aeb6bfcb170 InnoDB: File operation call: 'delete' returned OS error 71.
2013-05-21 17:50:42 2aeb6bfcb170 InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
InnoDB: If you are installing InnoDB, remember that you must create
InnoDB: directories yourself, InnoDB does not create them.
2013-05-21 17:50:42 25398 [ERROR] InnoDB: cannot create './ib_modified_log_1_0.xdb'

Tags: bitmap ci

Related branches

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

The code path in question must be this

- a call to log_online_should_overwrite(bitmap file) finding a 0-sized file :

 /* Currently, it's OK to overwrite 0-sized files only */
 success = os_file_get_status(path, &file_info, false);
 return success && file_info.size == 0LL;

- then log_online_start_bitmap_file() trying to delete it and failing because the file is not found, and then aborting the file create.

 /* Check for an old file that should be deleted first */
 if (log_online_should_overwrite(log_bmp_sys->out.name)) {

  success = static_cast<ibool>(
   os_file_delete(innodb_file_bmp_key,
           log_bmp_sys->out.name));
 }

...

 if (UNIV_UNLIKELY(!success)) {

  /* The following call prints an error message */
  os_file_get_last_error(TRUE);
  ib_logf(IB_LOG_LEVEL_ERROR,
   "cannot create \'%s\'\n", log_bmp_sys->out.name);
  log_bmp_sys->out.file = -1;
  return FALSE;
 }

The immediate fix is to ignore file-not-found for the delete attempt. This would be correct, but the fact that the file disappears between checking and deleting is still very suspicious and needs investigating.

tags: added: bitmap ci
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

The reason is that os_file_get_status() is returning dberr_t instead of ibool in 5.6. Which casts to ibool without a warning, but "return success" becomes a wrong check.

Also causes Valgrind errors as in

Conditional jump or move depends on uninitialised value(s)
   at 0xB9CBEB: log_online_should_overwrite(char const*)
          (log0online.cc:511)
   by 0xB9CC41: log_online_start_bitmap_file() (log0online.cc:526)
   by 0xB9CDCB: log_online_rotate_bitmap_file(unsigned long)
          (log0online.cc:573)
   by 0xB9D324: log_online_read_init() (log0online.cc:725)
   by 0xC86881: init_log_online() (srv0start.cc:1456)

Thus the fix is to
- fix os_file_get_status() return code check; and
- the previous partial fix of ignoring file not founds for os_file_delete().

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Split off race between file status check and delete to bug 1184517, leaving this bug to deal with the incorrect os_file_get_status() return value handling.

summary: Unstable changed page tracking tests due to trying to delete a non-
- existent file
+ existent file | Valgrind error at log_online_should_overwrite()
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-361

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.