Periodic stalls in a high-memory, write-intensive tpcc-mysql workload

Bug #1041700 reported by Vadim Tkachenko
Affects          Status     Importance  Assigned to  Milestone
Percona Server   (moved to https://jira.percona.com/projects/PS; status tracked in 5.7)
  5.1            Won't Fix  Undecided   Unassigned
  5.5            Expired    Medium      Unassigned
  5.6            Invalid    Undecided   Unassigned
  5.7            Invalid    Undecided   Unassigned

Bug Description

This bug was created to report a problem I observe and to track the progress of the research.

In tpcc-mysql workloads on boxes with a lot of memory and a big data size, I still see periodic stalls
(throughput drops to 0).

It appears, for example, with a 2000-warehouse (200GB) data size and a 100GB buffer pool.

Attached is a PNG comparing 16 threads (no stalls) vs 64 threads (stalls).
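For context, here is a minimal my.cnf sketch of the kind of setup described; the values are illustrative assumptions, not the exact test configuration. innodb_adaptive_flushing_method is the Percona XtraDB variable selecting the keep_average algorithm discussed below.

    [mysqld]
    # Illustrative values only -- not the exact configuration under test
    innodb_buffer_pool_size         = 100G
    innodb_log_file_size            = 2G           # big redo logs absorb write bursts
    innodb_io_capacity              = 10000        # assumes fast storage (PCIe SSD)
    innodb_flush_method             = O_DIRECT
    innodb_adaptive_flushing_method = keep_average # XtraDB adaptive flushing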

Revision history for this message
Vadim Tkachenko (vadim-tk) wrote :

Here is a stack trace representative of most threads during stalls:

Thread 20 (Thread 0x7f728497c700 (LWP 21781)):
#0 0x000000313a20b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000008f8207 in os_cond_wait (event=0x2c878e0, reset_sig_count=12)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/os/os0sync.c:207
#2 os_event_wait_low (event=0x2c878e0, reset_sig_count=12)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/os/os0sync.c:609
#3 0x0000000000821135 in srv_suspend_mysql_thread (thr=0x7f71dc014d28)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/srv/srv0srv.c:1837
#4 0x0000000000803c6a in row_mysql_handle_errors (new_err=0x7f7284979f18, trx=0x7f723c010e48, thr=0x7f71dc014d28, savept=0x0)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/row/row0mysql.c:598
#5 0x0000000000810b37 in row_search_for_mysql (buf=0x7f71dc013d60 "\377", mode=2, prebuilt=0x7f71dc0145a8, match_mode=1,
    direction=0)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/row/row0sel.c:4745
#6 0x00000000007e6edb in ha_innobase::index_read (this=0x7f71dc012130, buf=0x7f71dc013d60 "\377", key_ptr=<value optimized out>,
    key_len=<value optimized out>, find_flag=HA_READ_KEY_EXACT)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/handler/ha_innodb.cc:6664
#7 0x000000000069709f in handler::read_range_first (this=0x7f71dc012130, start_key=<value optimized out>,
    end_key=<value optimized out>, eq_range_arg=<value optimized out>, sorted=<value optimized out>)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/sql/handler.cc:4857
#8 0x0000000000698ae3 in handler::read_multi_range_first (this=0x7f71dc012130, found_range_p=0x7f728497a010,
    ranges=0x7f723c0d2c40, range_count=1, sorted=<value optimized out>, buffer=<value optimized out>)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/sql/handler.cc:4731
#9 0x0000000000765a14 in QUICK_RANGE_SELECT::get_next (this=0x7f723c0d4920)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/sql/opt_range.cc:8711
#10 0x000000000077a034 in rr_quick (info=0x7f728497a2e0)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/sql/records.cc:344
#11 0x00000000005f2456 in mysql_update (thd=0xac7d3050, table_list=0x7f723c052a20, fields=..., values=..., conds=0x7f723c053d10,
    order_num=<value optimized out>, order=<value optimized out>, limit=18446744073709551615, handle_duplicates=DUP_ERROR,
    ignore=false, found_return=0x7f728497b020, updated_return=0x7f728497b018)
    at /home/jenkins/workspace/percona-server-5.5-...


Revision history for this message
Vadim Tkachenko (vadim-tk) wrote :

The stack trace roughly corresponds to this source code (file row0sel.c):

lock_table_wait:
        mtr_commit(&mtr);
        mtr_has_extra_clust_latch = FALSE;

        trx->error_state = err;

        /* The following is a patch for MySQL */

        que_thr_stop_for_mysql(thr);

        thr->lock_state = QUE_THR_LOCK_ROW;

        if (row_mysql_handle_errors(&err, trx, thr, NULL)) {
                /* It was a lock wait, and it ended */

which points to lock_table_wait.

It is unclear what a table lock is doing there.
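For illustration, a minimal sketch of the suspend/resume pattern behind frames #0-#3 of the trace, using plain pthreads; all names here (my_event, etc.) are mine, not InnoDB's. The user thread blocks on a condition variable inside srv_suspend_mysql_thread until the lock wait ends and another thread signals the event:

    #include <pthread.h>
    #include <stdbool.h>

    /* Simplified analogue of InnoDB's os_event: a flag guarded by a
       mutex plus a condition variable. All names here are mine. */
    struct my_event {
            pthread_mutex_t mutex;
            pthread_cond_t  cond;
            bool            is_set;
    };

    /* Roughly what the waiting side of frames #0-#3 does: the user
       thread blocks until the lock wait is resolved. */
    static void my_event_wait(struct my_event *ev)
    {
            pthread_mutex_lock(&ev->mutex);
            while (!ev->is_set)     /* loop guards against spurious wakeups */
                    pthread_cond_wait(&ev->cond, &ev->mutex);
            pthread_mutex_unlock(&ev->mutex);
    }

    /* What the lock-releasing side does to wake the suspended thread. */
    static void my_event_set(struct my_event *ev)
    {
            pthread_mutex_lock(&ev->mutex);
            ev->is_set = true;
            pthread_cond_broadcast(&ev->cond);
            pthread_mutex_unlock(&ev->mutex);
    }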

Revision history for this message
Vadim Tkachenko (vadim-tk) wrote :

Another stack trace:

Thread 10 (Thread 0x7fe55e796700 (LWP 12896)):
#0 0x0000003aa800b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000008f8207 in os_cond_wait (event=0x100a62a50, reset_sig_count=3975)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/os/os0sync.c:207
#2 os_event_wait_low (event=0x100a62a50, reset_sig_count=3975)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/os/os0sync.c:609
#3 0x000000000087648c in buf_flush_wait_batch_end (buf_pool=<value optimized out>, type=BUF_FLUSH_LIST)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/buf/buf0flu.c:1951
#4 0x00000000008e672c in log_preflush_pool_modified_pages ()
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/log/log0log.c:1697
#5 log_checkpoint_margin ()
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/log/log0log.c:2192
#6 log_check_margins ()
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/log/log0log.c:3087
#7 0x0000000000911c8b in log_free_check (mode=2, index=0x7fbd88030338, entry=0x7fbd980d3f08, n_ext=0, thr=0x7fbdac0549c8)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/include/log0log.ic:443
#8 row_ins_index_entry_low (mode=2, index=0x7fbd88030338, entry=0x7fbd980d3f08, n_ext=0, thr=0x7fbdac0549c8)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/row/row0ins.c:2005
#9 0x0000000000916436 in row_ins (thr=0x7fbdac0549c8)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/row/row0ins.c:2278
#10 row_ins_step (thr=0x7fbdac0549c8)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/row/row0ins.c:2612
#11 0x0000000000804cc6 in row_insert_for_mysql (mysql_rec=<value optimized out>, prebuilt=0x7fbdac054058)
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/row/row0mysql.c:1255
#12 0x00000000007ee38a in ha_innobase::write_row (this=0x7fbdac051be0, record=0x7fbdac053810 "\211\001\f")
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/storage/innobase/handler/ha_innodb.cc:5911
#13 0x000000000069b682 in handler::ha_write_row (this=0x7fbdac051be0, buf=0x7fbdac053810 "\211\001\f")
    at /home/jenkins/workspace/percona-server-5.5-binaries/label_exp/centos5-64/Percona-Server-5.5.27-rel28.0/sql/handler.cc:5334
#14 0x00000000005668d5 in write_record (thd=0x100bc9690, table=0x7fbdac0512e0, info=0x7...


Revision history for this message
Vadim Tkachenko (vadim-tk) wrote :

My general conclusion so far is that the current flushing algorithm (keep_average), even with its improvements, still can't handle
such a workload. Basically, we can perform changes in memory much faster than we can flush them to storage (even fast storage, like a PCIe SSD).
With a big buffer pool (150-200GB) we need to flush a lot of pages, and I see that we often fall into "async" flushing mode.

I think the proper solution will be a new (yeah, one more) flushing algorithm. I need to check what the InnoDB team has in MySQL 5.6.6.
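To make the "async" flushing mode concrete: a simplified sketch, under my own naming, of the watermark logic in log_checkpoint_margin (the real code in log0log.c computes the margins from the log group capacity and is more involved). Past the async watermark, background flushing is requested; past the sync watermark, the user thread itself waits for the flush batch, which is exactly the stall in the second stack trace:

    #include <stdint.h>

    typedef uint64_t lsn_t;

    /* Illustrative margins derived from the redo log capacity; the real
       fractions in log0log.c differ. All names here are mine. */
    #define LOG_CAPACITY     ((lsn_t)4 * 1024 * 1024 * 1024)
    #define ASYNC_WATER_MARK (LOG_CAPACITY * 7 / 8)
    #define SYNC_WATER_MARK  (LOG_CAPACITY * 15 / 16)

    /* Hypothetical helpers standing in for InnoDB internals. */
    extern lsn_t current_lsn(void);
    extern lsn_t last_checkpoint_lsn(void);
    extern void  request_async_flush(lsn_t up_to); /* non-blocking */
    extern void  flush_and_wait(lsn_t up_to);      /* blocks: the stall */

    static void check_checkpoint_margin(void)
    {
            lsn_t lsn = current_lsn();
            lsn_t age = lsn - last_checkpoint_lsn();

            if (age > SYNC_WATER_MARK) {
                    /* "Sync" mode: the user thread itself waits until all
                       pages modified before (lsn - SYNC_WATER_MARK) hit disk;
                       this wait is buf_flush_wait_batch_end in the trace. */
                    flush_and_wait(lsn - SYNC_WATER_MARK);
            } else if (age > ASYNC_WATER_MARK) {
                    /* "Async" mode: kick flushing but do not block. */
                    request_async_flush(lsn - ASYNC_WATER_MARK);
            }
    }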

Revision history for this message
Vadim Tkachenko (vadim-tk) wrote :

Further research suggests that something is still wrong in InnoDB.
When I push it to flush 40000 pages, under high load InnoDB still writes only ~2500 pages/sec.
I guess this is because InnoDB has to walk through the flush list and acquire mutexes several times for each of the 40000 pages.
As a result it is able to process only ~2500 pages/sec.
With a 150+GB buffer pool that is far from enough to prevent stalls.
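Back-of-the-envelope: at ~2500 pages/sec, working off a 40000-page flush demand takes 40000 / 2500 = 16 seconds, which user threads stuck behind the flush experience as a stall. A hypothetical sketch of the per-page locking pattern suspected here (all names are mine; the real flush-list scan lives in buf0flu.c):

    #include <pthread.h>

    /* The point: each flushed page costs several acquisitions of hot,
       contended mutexes, capping throughput regardless of storage speed. */
    struct buf_page {
            struct buf_page *prev;        /* older neighbour in flush list */
            pthread_mutex_t  block_mutex;
    };

    extern pthread_mutex_t flush_list_mutex;
    extern int  page_ready_to_flush(struct buf_page *p); /* stub */
    extern void write_page(struct buf_page *p);          /* stub: issue I/O */

    static void flush_list_batch(struct buf_page *tail, int target)
    {
            struct buf_page *p = tail;
            int flushed = 0;

            while (p != NULL && flushed < target) {
                    pthread_mutex_lock(&flush_list_mutex);  /* round-trip 1 */
                    struct buf_page *prev = p->prev;
                    pthread_mutex_unlock(&flush_list_mutex);

                    pthread_mutex_lock(&p->block_mutex);    /* round-trip 2 */
                    if (page_ready_to_flush(p)) {
                            write_page(p);
                            flushed++;
                    }
                    pthread_mutex_unlock(&p->block_mutex);

                    p = prev;
            }
    }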

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Vadim, is this still relevant?

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for Percona Server 5.5 because there has been no activity for 60 days.]

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PS-2794
