Comment 2 for bug 1433432

Laurynas Biveinis (laurynas-biveinis) wrote :

Confirmed 100% CPU usage by the page cleaner thread, the LRU manager thread, or both.

We have a small buffer pool whose useful size is decreasing (until it hits ER_LOCK_TABLE_FULL, as noted in MDEV-7758 for InnoDB). Both the LRU and flush list threads have heuristics to flush furiously if either the checkpoint age is close to the log capacity or the free lists are empty. In this case both conditions become true, and at the same time the flushing attempted by the threads does not actually flush anything. InnoDB is not affected because it sleeps for 1 s if the last iteration did not flush anything.

I am not sure what the best fix is here. We can sleep if the last iteration did not flush anything, but this needs further consideration. We can also sleep if the buffer pool is approaching ER_LOCK_TABLE_FULL.
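
A minimal sketch of the first idea (sleep if the last iteration did not flush anything), not a patch. The function name next_sleep_time and its parameters are hypothetical and do not come from the server source; the 1 s back-off is an assumption that mirrors the InnoDB behaviour described above.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not actual server code: decide how long a flushing
// thread should sleep before starting its next iteration.
std::chrono::milliseconds next_sleep_time(
    unsigned long n_flushed_last,            // pages flushed by the previous iteration
    std::chrono::milliseconds wanted_sleep)  // sleep time requested by the heuristics
{
    assert(wanted_sleep.count() >= 0);

    // Assumed back-off of 1 s when the previous iteration flushed nothing.
    const std::chrono::milliseconds backoff(1000);

    if (n_flushed_last == 0 && wanted_sleep < backoff) {
        // Nothing was flushed, so do not spin: sleep at least the back-off.
        return backoff;
    }
    // Progress was made (or the requested sleep is already long enough).
    return wanted_sleep;
}
```

With this shape, a 0 ms request from the "flush furiously" heuristics is honoured only while the thread is actually making progress.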

There are some immediately fixable things: 1) on an active server n_flushed is only ever added to, and is reset only by the idle server flush, resulting in wrong counter values (typo: += instead of =). 2) The page cleaner heuristics return without any flushing if the LSN did not advance since the last heuristic adjustment. This situation must be made incompatible with a 0 ms cleaner sleep time.
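
To illustrate item 1, here is a small stand-alone model of the counter, under the assumption that n_flushed is meant to hold the number of pages flushed by the most recent iteration. The function last_n_flushed and its arguments are hypothetical and exist only for this illustration.

```cpp
#include <vector>

// Hypothetical model, not actual server code: replay a sequence of
// per-iteration flush counts and return the final value of n_flushed.
unsigned long last_n_flushed(const std::vector<unsigned long>& pages_per_iteration,
                             bool accumulate)
{
    unsigned long n_flushed = 0;
    for (unsigned long pages : pages_per_iteration) {
        if (accumulate) {
            n_flushed += pages;  // the suspected typo: the counter only grows
        } else {
            n_flushed = pages;   // intended: count of the latest iteration only
        }
    }
    return n_flushed;
}
```

For the sequence {3, 0, 0} the accumulating variant still reports 3 after two iterations that flushed nothing, while the assigning variant reports 0, so any "did the last iteration flush anything" check would be misled by the former.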