tokudb: thread contention with small block size on cachetable access

Bug #1674267 reported by Rick Pizzi
Percona Server (moved to https://jira.percona.com/projects/PS), status tracked in 5.7

Affects  Status   Importance  Assigned to  Milestone
5.5      Invalid  Undecided   Unassigned
5.6      Opinion  Undecided   Unassigned
5.7      Opinion  Undecided   Unassigned

Bug Description

We have noticed that running an INSERT benchmark with tokudb_block_size=16384 causes the server's load average to skyrocket to extreme values.

Our test hardware is 56 x Intel Xeon E5-2660 v4 @ 2.00GHz and 2 x 800G Intel SSD DC P3608.

Running the benchmark with the default Toku block size shows nothing unusual in the system metrics, but if we create the same table with a 16k block size and run the benchmark, the load average goes over 600.

We ran some experiments and found that reducing the number of threads available for cachetable access mitigates the problem without much impact on the benchmark QPS.

By default, tokudb_cachetable_pool_threads is computed as 2 x NCPU, which on our test machine is 112. Leaving it at the default exhibits the problem. Please note that the other metrics are fine (user CPU, system CPU, and iowait show reasonable values for the workload); only the load average goes crazy, so we suspect there is contention around cachetable access when the block size is small.
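To put the numbers above side by side, here is a minimal sketch of the arithmetic. It assumes the 2 x NCPU rule exactly as stated in this report; the function names are hypothetical helpers for illustration, not part of TokuDB or Percona Server.

```python
def default_cachetable_pool_threads(ncpu: int) -> int:
    """Default pool size as described in this report: 2 x NCPU (assumption)."""
    return 2 * ncpu


def load_per_cpu(load_average: float, ncpu: int) -> float:
    """Load average normalised by CPU count, to judge how oversubscribed a box looks."""
    return load_average / ncpu


if __name__ == "__main__":
    ncpu = 56  # CPU count of the test machine in this report
    print(default_cachetable_pool_threads(ncpu))  # 112
    print(round(load_per_cpu(600, ncpu), 1))      # 10.7
```

On Linux, the load average counts tasks in uninterruptible sleep as well as runnable ones, so a large pool of waiting threads can raise it even while user and system CPU stay reasonable.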

We tried reducing tokudb_cachetable_pool_threads, and with a value of 16 our benchmark ran fine, with identical CPU metrics (and a bit less iowait) and very similar QPS.

FWIW, we tried this with and without a PK and the effect is the same.

This does not seem related to the value of tokudb_cache_size, as we tried both a small (1G) and a large (10G) cache to the same effect. In our tests we used DIRECTIO and no compression.
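For anyone trying to reproduce this, the settings described above could be collected into a config fragment roughly as follows. This is only a sketch: the report does not say how each option was applied (server-wide, per session, or per table), and the choice of tokudb_directio, tokudb_row_format and their values is an assumption about how "DIRECTIO and no compression" was configured. render_mycnf is a hypothetical helper.

```python
def render_mycnf(settings: dict) -> str:
    """Render a [mysqld] section from a mapping of option name to value."""
    lines = ["[mysqld]"]
    for name, value in settings.items():
        lines.append(f"{name} = {value}")
    return "\n".join(lines) + "\n"


# Values taken from the report where stated; the rest are labelled assumptions.
REPORT_SETTINGS = {
    "tokudb_cachetable_pool_threads": 16,        # mitigating value from the report
    "tokudb_cache_size": "1G",                   # report also tried 10G
    "tokudb_block_size": 16384,                  # the 16k block size under test
    "tokudb_directio": "ON",                     # assumption: how DIRECTIO was enabled
    "tokudb_row_format": "tokudb_uncompressed",  # assumption: how compression was disabled
}

if __name__ == "__main__":
    print(render_mycnf(REPORT_SETTINGS))
```

Changing tokudb_cachetable_pool_threads back to the default and rerunning the same benchmark would give the comparison described in this report.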

Thanks
Rick

Tags: tokudb
George Ormond Lorch III (gl-az) wrote :

Hey Rick,
Thanks for the report. I am moving this over to https://jira.percona.com/browse/TDB-37 for future tracking and marking it as Opinion here, as there is no other accurate matching state.

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PS-3665
