InnoDB adaptive flushing: Incorrect calculation of the lsn_limit

Bug #1237702 reported by Alexey Stroganov on 2013-10-09
This bug affects 1 person
Affects: Percona Server (status tracked in 5.7)

Series   Importance   Assigned to
5.1      Undecided    Unassigned
5.5      Undecided    Unassigned
5.6      Wishlist     Unassigned
5.7      Wishlist     Unassigned

Bug Description

The InnoDB adaptive flushing routine calculates two values that are used by the flush list flusher:
- the number of pages to flush
- the LSN limit up to which flushing must happen

There are two issues with the calculation and usage of the LSN limit, but first some background information:

The LSN limit is calculated as follows:
          lsn_limit = oldest_lsn + lsn_avg_rate * (age_factor + 1);

oldest_lsn - the LSN of the oldest page in the buffer pool
lsn_avg_rate - the average LSN rate
age_factor - a multiplier that should help to adjust the LSN limit in case of spikes; it is calculated as follows:

        if (last_pages && cur_lsn - last_lsn > lsn_avg_rate / 2) {
                age_factor = prev_pages / last_pages;
        }

where prev_pages is the number of pages requested to flush and last_pages is the number of pages actually flushed.
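
To make the effect of age_factor concrete, here is a minimal standalone sketch of the two formulas above. It is not the buf0flu.cc code: the function names and the plain integer typedefs are illustrative assumptions, and age_factor is assumed to default to 0 when the condition does not hold.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t lsn_t;  // stand-in for InnoDB's LSN type (assumption)

// age_factor as in the snippet above; integer division, 0 by default (assumed).
unsigned long compute_age_factor(unsigned long prev_pages, unsigned long last_pages,
                                 lsn_t cur_lsn, lsn_t last_lsn, lsn_t lsn_avg_rate) {
    assert(cur_lsn >= last_lsn);  // LSNs only grow; guards the unsigned subtraction
    unsigned long age_factor = 0;
    if (last_pages && cur_lsn - last_lsn > lsn_avg_rate / 2) {
        age_factor = prev_pages / last_pages;
    }
    return age_factor;
}

// lsn_limit = oldest_lsn + lsn_avg_rate * (age_factor + 1)
lsn_t compute_lsn_limit(lsn_t oldest_lsn, lsn_t lsn_avg_rate, unsigned long age_factor) {
    return oldest_lsn + lsn_avg_rate * (age_factor + 1);
}
```

For example, with oldest_lsn = 10000 and lsn_avg_rate = 1000, a request for 200 pages of which only 50 were flushed gives age_factor = 4 and lsn_limit = 15000, compared with 11000 when age_factor is 0. A single short flush can therefore move the limit by several multiples of the average rate.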

Issue #1

Currently age_factor is used in the calculation of the LSN limit for every flush request, which causes notable fluctuation in flushing. However, it should be applied only when the flusher reached the LSN limit during flushing and there is a real need to increase the LSN limit.

In the other cases, when the flusher stops before the LSN limit, either because it managed to find a sufficient number of pages to flush or because of a timeout, the LSN limit was not reached, so there is no need to increase it.

The proposal is to modify the above condition as follows:

        /* If last time all pages were flushed up to lsn_limit and
           the LSN increased by more than half of lsn_avg_rate(?),
           we calculate an additional age factor that will help to adjust lsn_limit */
        if (oldest_lsn >= lsn_limit &&
            cur_lsn - last_lsn > lsn_avg_rate / 2) {
                age_factor = prev_pages / last_pages;
        }
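
The proposed condition can be sketched as a standalone function to show when the factor would and would not apply. The function name and parameter list are hypothetical. The sketch also keeps the last_pages check from the original condition to avoid a division by zero; that is an assumption on my part, since the proposal above does not spell it out.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t lsn_t;  // stand-in for InnoDB's LSN type (assumption)

// Returns the age factor under the proposed rule: it is non-zero only if the
// previous flush actually reached the previous lsn_limit.
unsigned long proposed_age_factor(lsn_t oldest_lsn, lsn_t prev_lsn_limit,
                                  unsigned long prev_pages, unsigned long last_pages,
                                  lsn_t cur_lsn, lsn_t last_lsn, lsn_t lsn_avg_rate) {
    assert(cur_lsn >= last_lsn);  // guards the unsigned subtraction below
    unsigned long age_factor = 0;
    const bool reached_limit = oldest_lsn >= prev_lsn_limit;
    // last_pages != 0 is kept from the original condition (assumption).
    if (reached_limit && last_pages != 0 &&
        cur_lsn - last_lsn > lsn_avg_rate / 2) {
        age_factor = prev_pages / last_pages;
    }
    return age_factor;
}
```

With the same inputs that give a factor of 4 under the current rule (200 pages requested, 50 flushed), the sketch returns 4 only when oldest_lsn has caught up with the previous limit, and 0 when the flusher stopped short of it.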

Issue #2

There are two cases when we should omit lsn_limit and flush with LSN_MAX:

- when we are at the async point
- when adaptive flushing (AF) is requesting pages but lsn_avg_rate has not been calculated yet. lsn_avg_rate gets a value only after srv_flushing_avg_loops iterations.

       if (++n_iterations >= srv_flushing_avg_loops) {
       ....
                lsn_avg_rate = (lsn_avg_rate + lsn_rate) / 2;
       ....
       }
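
The warm-up behaviour can be illustrated with a small sketch. The struct below is hypothetical and simplified: it measures the rate per iteration, whereas the elided parts of the real code may compute it differently (for example per unit of time). Only the `++n_iterations >= ...` check and the halving average are taken from the snippet above.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t lsn_t;  // stand-in for InnoDB's LSN type (assumption)

struct LsnRateTracker {
    unsigned long avg_loops;     // plays the role of srv_flushing_avg_loops
    unsigned long n_iterations;
    lsn_t prev_lsn;
    lsn_t lsn_avg_rate;          // stays 0 until the first averaging round

    explicit LsnRateTracker(unsigned long loops)
        : avg_loops(loops), n_iterations(0), prev_lsn(0), lsn_avg_rate(0) {
        assert(loops > 0);
    }

    // Called once per flusher iteration with the current LSN.
    void update(lsn_t cur_lsn) {
        if (++n_iterations >= avg_loops) {
            // Simplified rate: LSN growth per iteration (assumption).
            lsn_t lsn_rate = (cur_lsn - prev_lsn) / n_iterations;
            lsn_avg_rate = (lsn_avg_rate + lsn_rate) / 2;
            prev_lsn = cur_lsn;
            n_iterations = 0;
        }
    }
};
```

With avg_loops = 3, the first two calls to update() leave lsn_avg_rate at 0, and only the third call produces a non-zero value. During that window the lsn_limit formula reduces to oldest_lsn, whatever age_factor is.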

The proposal is to add the following condition:

        if (age >= log_get_max_modified_age_async() || (n_pages && !lsn_avg_rate)) {
            lsn_limit = LSN_MAX;
        }
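
As a sketch of how this override would sit on top of the computed limit, the function below takes the already computed lsn_limit and returns either that value or the maximum LSN. The function name, the parameter names and the LSN_MAX_SKETCH constant are illustrative assumptions; in the real code the async threshold comes from log_get_max_modified_age_async().

```cpp
#include <cstdint>
#include <limits>

typedef std::uint64_t lsn_t;  // stand-in for InnoDB's LSN type (assumption)

// Stand-in for LSN_MAX: the largest representable LSN (assumption).
const lsn_t LSN_MAX_SKETCH = std::numeric_limits<lsn_t>::max();

// Returns the limit the flusher should use for this request.
lsn_t effective_lsn_limit(lsn_t computed_limit, lsn_t age,
                          lsn_t max_modified_age_async,
                          unsigned long n_pages, lsn_t lsn_avg_rate) {
    // Case 1: checkpoint age has reached the async point.
    // Case 2: pages are requested but no average rate is known yet.
    if (age >= max_modified_age_async || (n_pages && !lsn_avg_rate)) {
        return LSN_MAX_SKETCH;
    }
    return computed_limit;
}
```

The second case matters because, with lsn_avg_rate equal to 0, the formula lsn_limit = oldest_lsn + lsn_avg_rate * (age_factor + 1) yields oldest_lsn itself, so a request for n_pages would be capped at the oldest page.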

We still have the same code in the current Percona Server (PS) 5.6 from bzr, see storage/innobase/buf/buf0flu.cc. So it makes perfect sense to consider this as a Wishlist.

This is upstream. Alexey, can you report it there and give a link to it?

tags: added: innodb upstream
tags: added: performance