Failure to start with "InnoDB: Warning: problems renaming '(name not specified)' to ..."

Bug #1356914 reported by Bart
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Percona Server moved to https://jira.percona.com/projects/PS
New
Undecided
Unassigned

Bug Description

It seems our database attempted to restart when it received the following error:

...
140814 5:18:13 InnoDB: Error creating file './something_staging/trucker_check.ibd'.
140814 5:18:13 InnoDB: Operating system error number 17 in a file operation.
InnoDB: Error number 17 means 'File exists'.
...

Details are attached or here: http://pastebin.com/raw.php?i=EzH51eBr

After the restart, it attempts a recovery but will crash again:

...
InnoDB: Doing recovery: scanned up to log sequence number 109613855949217
140814 9:53:38 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 140814 9:54:07 InnoDB: Warning: problems renaming '(name not specified)' to 'something_staging/enChanted_stg', 1000 iterations
140814 9:54:27 InnoDB: Warning: problems renaming '(name not specified)' to 'something_staging/enChanted_stg', 2000 iterations
...
140814 10:02:10 InnoDB: Warning: problems renaming '(name not specified)' to 'something_staging/enChanted_stg', 25000 iterations
140814 10:02:10 InnoDB: Assertion failure in thread 140083498571520 in file fil0fil.c line 2312
InnoDB: We intentionally generate a memory trap.
...

After hitting the assertion it dies. The stack trace is attached or here: http://pastebin.com/raw.php?i=B5zwM67a

Previously we had to recover from backup to get back up and running, but maybe this is a bug that should not restart the database when it finds an existing file and just move it aside? Alternatively, is there a way to identify '(name not specified)' and get recovery to complete?

This is on CentOS release 6.2 (Final) with Percona-Server-server-55-5.5.38-rel35.2.el6.x86_64.

Revision history for this message
Bart (bart-axelabs) wrote :
Bart (bart-axelabs)
description: updated
Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

Thank you for the problem report. Please, upload your my.cnf file content and the entire error log (or at least tail of it since last successful startup before the problem happened) from that instance (if you still have it).

I wonder what happened to that trucker_check table in the something_staging database immediately before the problem.

Changed in percona-server:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for Percona Server because there has been no activity for 60 days.]

Changed in percona-server:
status: Incomplete → Expired
Changed in percona-server:
status: Expired → New
status: New → Incomplete
Revision history for this message
Tabor Wells (twells) wrote :

This Bug Report was from my site although from someone who has since left the company.

Regardless -- we've found the cause for this which I thought I'd add here.

1) Newer kernel set vm.swappiness = 0
2) Heavy RAM usage by MySQL during a large InnoDB table rebuild
3) Lack of tuning of oom_score_adj for mysqld

Basically OOMkiller was shooting mysqld right in the middle of a large fact table rebuild in our db server which resulted in InnoDB being unable to recover on restart.

It would loop errors like:

141125 17:22:57 InnoDB: Warning: problems renaming '(name not specified)' to 'database_name/table_fact_stg', 24000 iterations

until it eventually gave up trying to recover and shut down. Increasing the innodb_force_recovery values and retrying never succeeds (through at least innodb_force_recovery=3)

I'm guessing that OOMkiller was killing mysqld right in the middle of one of the alters being done on this table resulting in the loss of the temp table being used for the alter.

We've implemented something along the lines of the suggestion here: http://lists.mysql.com/mysql/227187

I'm not sure there's an actual bug to be fixed here although I wish there was a way for us to skip the attempted operation on that table so that we could have rebuilt it separately rather than having to rebuild the entire MySQL database from backup (which in our case is almost 0.5 Tb of data)

Changed in percona-server:
status: Incomplete → New
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-3175

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.