Percona Server moved to https://jira.percona.com/projects/PS

tokudb: thread contention with small block size on cachetable access

Bug #1674267 reported by Rick Pizzi on 2017-03-20

This bug affects 1 person

	Status	Importance	Assigned to
Percona Server moved to https://jira.percona.com/projects/PS	Status tracked in 5.7
5.5	Invalid	Undecided	Unassigned
5.6	Opinion	Undecided	Unassigned
5.7	Opinion	Undecided	Unassigned

Bug Description

We have noticed that running an INSERT benchmark with tokudb_block_size=16384 drives the load average of the server up to extremely high values.

Our test hardware is 56 x Intel Xeon E5-2660 v4 @ 2.00GHz and 2 x 800G Intel SSD DC P3608.

Running the benchmark with the default TokuDB block size shows nothing unusual in the system metrics, but if we create the same table with a 16k block size and rerun the benchmark, the load average goes over 600.

We did some experiments and found that reducing the number of threads available for cachetable access mitigates the problem without much impact on the benchmark QPS.

By default, tokudb_cachetable_pool_threads is computed as 2 x NCPU, which on our test machine is 112. Leaving it at the default exhibits the problem. Please note that the other metrics are fine (user CPU, system CPU and iowait show reasonable values for the workload). Only the load average is abnormal, so we suspect contention around cachetable access when the block size is small.
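To put the numbers above in one place, here is a minimal Python sketch of the 2 x NCPU rule and of what a load average of 600 means per CPU on a 56-CPU machine. The function names are hypothetical and are not part of TokuDB; the sketch only restates the arithmetic from this report.

```python
def default_cachetable_pool_threads(ncpu: int) -> int:
    """Default pool size as described in this report: 2 x NCPU."""
    return 2 * ncpu


def load_per_cpu(load_average: float, ncpu: int) -> float:
    """Load average divided by the CPU count, as a rough saturation figure."""
    return load_average / ncpu


ncpu = 56  # CPU count of the test machine in this report

print(default_cachetable_pool_threads(ncpu))   # 112
print(round(load_per_cpu(600, ncpu), 1))       # 10.7
```

A load of roughly 10 tasks per CPU alongside unremarkable user CPU, system CPU and iowait is what makes us suspect waiting threads rather than real work.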

We then tried reducing tokudb_cachetable_pool_threads. With a value of 16, our benchmark ran fine, with identical CPU metrics (and slightly less iowait) and very similar QPS values.
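For anyone wanting to try the same override, the following Python sketch renders it as a configuration fragment. It assumes the variable is set at server startup under the [mysqld] section of the MySQL configuration file; this report does not state how the value was applied, and render_override is a hypothetical helper.

```python
def render_override(pool_threads: int) -> str:
    """Render a config fragment that pins the cachetable pool size.

    Assumption: the option is read from the [mysqld] section at startup.
    """
    if pool_threads < 0:
        raise ValueError("pool_threads must not be negative")
    return "[mysqld]\ntokudb_cachetable_pool_threads=%d\n" % pool_threads


# 16 is the value that ran fine in the benchmark described above.
print(render_override(16), end="")
```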

FWIW, we tried this with and without a PK and the effect is the same.

This does not seem to be related to the value of tokudb_cache_size, as we have tried both a small (1G) and a large (10G) cache with the same result. In our tests, we used DIRECTIO and no compression.

Thanks
Rick

George Ormond Lorch III (gl-az) wrote on 2017-05-25:

Hey Rick,
Thanks for the report. I am moving this over to https://jira.percona.com/browse/TDB-37 for future tracking and marking it as Opinion here, as there is no other accurately matching state.

Shahriyar Rzayev (rzayev-sehriyar) wrote on 2018-01-25:

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PS-3665
