Inadequate background LRU flushing for write workloads with InnoDB compression

Bug #1295268 reported by Jan Lindström
This bug affects 3 people
Affects | Status | Importance | Assigned to | Milestone
Percona Server (moved to https://jira.percona.com/projects/PS) | Fix Released | High | Laurynas Biveinis |
5.6 | Fix Released | High | Laurynas Biveinis |

Bug Description

I have run 6h tests using both Percona Server 5.6.16-64 and MariaDB 10.0.9 using XtraDB 5.6.15-63.0. The benchmark used is LinkBench with a 10x database, i.e. maxid = 100000001, and the database is ~100G. The buffer pool is 50G. I will attach results from the LinkBench measure phase, where the performance decrease is evident. Using ROW_FORMAT=COMPRESSED this degradation is severe; using uncompressed tables it is less significant but still clear. I have also run compressed tables using MariaDB 10.0.9 with Oracle InnoDB, and there I can't see a similar performance decrease. Similarly, I have run the same test using MariaDB 10.0.9 and XtraDB on uncompressed tables, once with multi-threaded flush and atomic writes, and once without multi-threaded flush and without atomic writes. Again, using ROW_FORMAT=COMPRESSED I have seen performance degradation.

Here is what the files in the attached package are from:

stats/percona_xtradb_uncompressed_6h.out :: Results from percona server and uncompressed tables 6h run
stats/my.cnf :: Used my.cnf
stats/percona_xtradb_uncompressed.out :: Another run with uncomp 3h
stats/xmtfluncomp.out :: Run with multi-threaded flush and atomic_writes MariaDB 10.0.9
stats/xnomtuncomp.out :: Run without multi-threaded flush MariaDB 10.0.9
stats/innodb.out :: Run with innodb compressed tables, MariaDB 10.0.9
stats/xtradb_compressed.out :: Run with MariaDB 10.0.9 and XtraDB compressed tables

If requested, we can provide more data on runs with Percona Server using compressed tables; the difference from InnoDB is significant.

Jan Lindström (jplindst) wrote :

Used environment: Linux 3.4.12, Intel Xeon E5-2690 2.90GHz, 32 cores, 132G memory.
Storage: Fusion-io ioDrive2 Duo 2.41TB, Firmware 7.2.5 rev 110646, formatted to nvmfs

Jan Lindström (jplindst) wrote :

Another round of tests, first with Percona Server 5.6.16-64 and ROW_FORMAT=COMPRESSED, using LinkBench with a 10x database and a 6h time limit, then a similar run with MariaDB 10.0.10 (unofficial). End results:

Percona: 1998 ops/sec
MariaDB: 21074 ops/sec

This currently means that we can't compare the performance effects of page compression and multi-threaded flush on XtraDB, because the numbers look too unrealistic. Attached is another package, nstat.tgz, containing the files percona_server_compressed.out and mariadb10010_innodb_compressed.out, to show the current difference between XtraDB and InnoDB on servers where no changes have been made.
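
To put the two end results above in proportion, here is a minimal Python sketch that computes the ratio between them. The helper name throughput_ratio is hypothetical and introduced only for illustration; the two numbers are the ones quoted above. It shows a gap of roughly 10.5x.

```python
def throughput_ratio(fast_ops: float, slow_ops: float) -> float:
    """Return how many times larger fast_ops is than slow_ops."""
    if slow_ops <= 0:
        raise ValueError("slow_ops must be positive")
    return fast_ops / slow_ops


# End results quoted above (ROW_FORMAT=COMPRESSED, 6h LinkBench run).
percona_ops = 1998
mariadb_ops = 21074

ratio = throughput_ratio(mariadb_ops, percona_ops)
print(f"MariaDB result is about {ratio:.1f}x the Percona Server result")
```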

Jan Lindström (jplindst) wrote :
tags: added: performance xtradb
Jan Lindström (jplindst) wrote :

Just for the record, the same test using page-compressed tables, multi-threaded flush, and lz4 gives 31360 requests/second with a 6h time limit. This is better than with uncompressed tables.

Laurynas Biveinis (laurynas-biveinis) wrote :

To begin, buf_free_from_unzip_LRU_list_batch needs updating for the XtraDB flushing.

summary: - Performance of XtraDB slows down significantly on long benchmarks
+ Inadequate background LRU flushing for write workloads with InnoDB
+ compression
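
For readers unfamiliar with the function named above: as far as I understand it, buf_free_from_unzip_LRU_list_batch evicts the uncompressed frames of compressed pages from the cold end of the unzip_LRU list while keeping the compressed copies in the buffer pool. The toy Python model below illustrates only that general idea. It is not the InnoDB or XtraDB source, it does not show what update the comment above has in mind, and every name in it (Block, free_from_unzip_lru_batch, max_to_free, target_free) is hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Block:
    page_id: int
    # In this toy model the compressed copy is always retained; only the
    # uncompressed frame can be given back to the free list.
    has_uncompressed_frame: bool = True


def free_from_unzip_lru_batch(unzip_lru, free_frames, max_to_free, target_free):
    """Free uncompressed frames from the cold (right) end of unzip_lru.

    Stops when the list is empty, when max_to_free frames were freed, or
    when the free list would reach target_free frames.
    Returns the number of frames freed.
    """
    freed = 0
    while unzip_lru and freed < max_to_free and free_frames + freed < target_free:
        block = unzip_lru.pop()  # right end = least recently used
        block.has_uncompressed_frame = False
        freed += 1
    return freed


# Eight blocks, hottest on the left, coldest on the right.
unzip_lru = deque(Block(page_id) for page_id in range(8))
freed = free_from_unzip_lru_batch(
    unzip_lru, free_frames=1, max_to_free=5, target_free=4
)
print(f"freed {freed} uncompressed frames, {len(unzip_lru)} remain on unzip_LRU")
```

The point of the sketch is only that such a batch has two independent stopping limits (a per-batch cap and a free-list target), so a background flusher that calls it with limits tuned for a different flushing design may free too few frames.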
Jan Lindström (jplindst) wrote :

Hi,

The provided fix candidate significantly improved the situation but did not fix it fully. Performance now remains quite stable for the whole 6h LinkBench run. However, the numbers measured using MariaDB 10.0.9 with the fix candidate still show a significant difference between XtraDB and InnoDB.

XtraDB: Requests/second = 15700
InnoDB: Requests/second = 20485

These numbers are without multi-threaded flush or atomic writes, and with doublewrite on. I will attach the full outputs from both LinkBench measure phases, the configuration file used, and the load and run commands used for LinkBench.

R: Jan

Laurynas Biveinis (laurynas-biveinis) wrote :

Jan -

For "the provided fix candidate", did you use the patch by Alexey S sent by e-mail, or the branch linked to this bug?

Jan Lindström (jplindst) wrote :

Hi,

I used the patch from Alexey S.

R: Jan

Jan Lindström (jplindst) wrote :

I see. I will test with this fix candidate; it looks even better than what Alexey proposed. The problem is that each test takes 8h of machine time, with 15min of person time in between.

Jan Lindström (jplindst) wrote :

Hi, I'm now running with the changes on the branch. It started at 17K, which is still not the 20K seen with InnoDB. I will get back in about 5h, when it is finished.

Jan Lindström (jplindst) wrote :

Hi,

After the 6h LinkBench run I can say that performance again improved, to 16-17K ops/sec, but we are still missing about 3-5K ops/sec compared to InnoDB on the same build with the same parameters. I can provide the full output from the measure phase if requested.

The end result was: 16683 ops/sec.
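
The "3-5K ops/sec" shortfall can be checked against the numbers reported in this thread. The sketch below is illustrative only: it assumes the InnoDB figure of 20485 requests/second quoted earlier is the baseline for both XtraDB runs, and the helper name gap is hypothetical.

```python
# Assumption: the InnoDB result quoted earlier in the thread is the
# baseline for both XtraDB runs.
INNODB_OPS = 20485

xtradb_runs = {
    "patch from Alexey S": 15700,
    "changes on the branch": 16683,
}


def gap(baseline: int, measured: int) -> tuple[int, float]:
    """Return the shortfall in ops/sec and as a percentage of the baseline."""
    absolute = baseline - measured
    return absolute, 100.0 * absolute / baseline


gaps = {label: gap(INNODB_OPS, ops) for label, ops in xtradb_runs.items()}
for label, (absolute, percent) in gaps.items():
    print(f"{label}: {absolute} ops/sec below InnoDB ({percent:.1f}%)")
```

Under that assumption the shortfalls are 4785 and 3802 ops/sec, which matches the 3-5K range quoted above.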

Laurynas Biveinis (laurynas-biveinis) wrote :

Thanks Jan. I am submitting the attached branch for code review and I have split off the remaining regression of 3-5K ops/sec to bug 1298812 for further analysis.

Laurynas Biveinis (laurynas-biveinis) wrote :

Fix for this caused bug 1326348.

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-772
