Comment 4 for bug 1356914

Tabor Wells (twells) wrote :

This bug report came from my site, although it was filed by someone who has since left the company.

Regardless, we've found the cause, which I thought I'd add here.

1) A newer kernel running with vm.swappiness = 0
2) Heavy RAM usage by MySQL during a large InnoDB table rebuild
3) Lack of tuning of oom_score_adj for mysqld

Basically, the OOM killer was shooting mysqld right in the middle of a large fact table rebuild on our db server, which left InnoDB unable to recover on restart.

It would loop on errors like:

141125 17:22:57 InnoDB: Warning: problems renaming '(name not specified)' to 'database_name/table_fact_stg', 24000 iterations

until it eventually gave up trying to recover and shut down. Increasing innodb_force_recovery and retrying never succeeded (through at least innodb_force_recovery=3).

I'm guessing that the OOM killer was killing mysqld right in the middle of one of the alters being done on this table, resulting in the loss of the temp table being used for the alter.

We've implemented something along the lines of the suggestion here: http://lists.mysql.com/mysql/227187
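
For anyone who wants to see the general idea in code, here is a minimal sketch of checking vm.swappiness and lowering oom_score_adj for a running mysqld. It is not a copy of the linked suggestion or of our exact setup: the function names and the proc_root parameter are made up for illustration, and the only real interfaces it relies on are the /proc/sys/vm/swappiness and /proc/<pid>/oom_score_adj files.

```python
from pathlib import Path

# Range the kernel accepts for /proc/<pid>/oom_score_adj.
OOM_SCORE_ADJ_MIN = -1000  # -1000 exempts the process from the OOM killer
OOM_SCORE_ADJ_MAX = 1000


def read_swappiness(proc_root="/proc"):
    """Return the current vm.swappiness value as an int."""
    path = Path(proc_root, "sys", "vm", "swappiness")
    return int(path.read_text().strip())


def read_oom_score_adj(pid, proc_root="/proc"):
    """Return the current oom_score_adj of the given pid."""
    path = Path(proc_root, str(pid), "oom_score_adj")
    return int(path.read_text().strip())


def set_oom_score_adj(pid, score, proc_root="/proc"):
    """Write a new oom_score_adj for the given pid.

    Lowering the value on a real system normally needs root.
    proc_root is a hypothetical knob so the helper can be pointed
    at a scratch directory instead of the real /proc.
    """
    if not OOM_SCORE_ADJ_MIN <= score <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"score must be in [-1000, 1000], got {score}")
    Path(proc_root, str(pid), "oom_score_adj").write_text(f"{score}\n")
```

Something like set_oom_score_adj(mysqld_pid, -800) run as root would make mysqld a much less attractive target (the -800 is only an example value). The setting belongs to the process, so it is lost when mysqld restarts and has to be reapplied from whatever starts mysqld.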

I'm not sure there's an actual bug to be fixed here, although I wish there was a way for us to skip the attempted operation on that table. That would have let us rebuild it separately rather than having to rebuild the entire MySQL database from backup (which in our case is almost 0.5 TB of data).