Poor performance on HDD environments (wsrep_slave_threads, tuning-level)
Bug #1822903 reported by James Troup
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Percona Cluster Charm | Fix Released | Critical | Trent Lloyd | 19.07
Bug Description
A default charm install of a percona cluster is unusably slow unless your install involves no spinning rust whatsoever and this is not OK.
Trent describes the problem in detail here (Canonical only, sorry):
https://pastebin.canonical.com/p/nsnjg4qH6h/
In short, I believe we should make the charm implement something like the following logic:
if ubuntu_release <= xenial:
    if tuning_level == default:
        tuning_level = fast
    if wsrep-slave-threads > 1:
        scream bloody murder into log and status... possibly refuse to start?
else:
    if wsrep-slave-threads == default:
        wsrep_
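That pseudocode can be sketched in Python roughly as follows. This is a hedged illustration, not the charm's actual code: the function and variable names are hypothetical, the release-ordering helper is a simplification, and the 48-thread value comes from Trent's write-up below. The "default means 1" reading of wsrep-slave-threads is an assumption.

```python
# Hypothetical sketch of the proposed charm defaults logic.
# Simplified release ordering (assumption; real charms compare codenames
# via a proper comparison helper).
RELEASE_ORDER = ["trusty", "xenial", "bionic"]


def tune_defaults(release, tuning_level, slave_threads):
    """Return (tuning_level, slave_threads, warnings) for the given config.

    On Xenial (Percona 5.6): default the tuning level to 'fast' and warn
    loudly if wsrep-slave-threads > 1. On newer releases (Percona 5.7):
    raise the slave-thread count instead of relaxing durability.
    """
    warnings = []
    if RELEASE_ORDER.index(release) <= RELEASE_ORDER.index("xenial"):
        # Percona 5.6: no group commit on the Galera slave path, and
        # wsrep_slave_threads > 1 can trigger FK-violation crashes.
        if tuning_level == "default":
            tuning_level = "fast"
        if slave_threads > 1:
            warnings.append(
                "wsrep-slave-threads > 1 is unsafe on Percona 5.6 "
                "(occasional foreign key violations force a restart/SST)")
    else:
        # Percona 5.7: parallel apply enables group commit on the slaves.
        if slave_threads == 1:  # treating 1 as "default" (assumption)
            slave_threads = 48  # value Trent found effective on HDD
    return tuning_level, slave_threads, warnings
```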
Changed in charm-percona-cluster:
  status: New → Confirmed
  importance: Undecided → Critical
Changed in charm-percona-cluster:
  assignee: nobody → Trent Lloyd (lathiat)
Changed in charm-percona-cluster:
  status: Confirmed → In Progress
Changed in charm-percona-cluster:
  milestone: none → 19.07
  status: In Progress → Fix Released
Trent's write-up on the issue (from https://pastebin.canonical.com/p/nsnjg4qH6h/)
======
For Bionic (Percona 5.7) I found that increasing wsrep-slave-threads to 48 gives a reasonably good performance boost, even on HDD storage. The reason is that this allows multiple queries to execute on the slave servers at the same time, just as they do on the master (where all the queries from the different clients execute concurrently). When this happens, the server is also able to merge multiple SQL commits that occur within a small timeframe into a single fsync call (an optimization called 'group commit'). This likely greatly reduces the need to set innodb-tuning-level=fast for Bionic.
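For reference, a minimal my.cnf fragment for the Bionic-era setting might look like the following (wsrep_slave_threads is the standard Percona XtraDB Cluster 5.7 variable name; the value 48 is the one from the testing described above, not a universal recommendation):

```
[mysqld]
# Allow up to 48 parallel applier threads so the slave side can merge
# concurrent commits into a single fsync (group commit), as the master does.
wsrep_slave_threads = 48
```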
Unfortunately this is not the case for Xenial: with the Galera backend it seems unable to do any kind of group commit (even though InnoDB itself will do it), and worse, on Xenial the slave threads appear to issue 2 fsync calls for every query (as opposed to 1 on the master). The master can only run as fast as the slaves, so even though this technically only affects the slaves, it holds the master back to the same speed (though the master does not submit quite as many fsyncs to its own underlying storage). Thus for Xenial the best option is likely to set innodb-tuning-level=fast, which removes the fsync calls - definitely on HDD-only environments, but maybe even on SSD environments, as the number of IOPS submitted in a busy cloud could grow quite large given they total double the number of queries per second.
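Assuming the charm's 'fast' tuning level maps to relaxing InnoDB's per-commit flush behaviour (an assumption about the charm internals, not confirmed here), the underlying MySQL setting would look roughly like:

```
[mysqld]
# 2 = write the log buffer to the OS on each commit but fsync only about
# once per second; removes per-query fsyncs at the cost of losing up to
# ~1s of transactions if the OS crashes.
innodb_flush_log_at_trx_commit = 2
```

Note that in a 3-node Galera cluster a transaction is replicated to the other nodes before the commit returns, which limits the practical exposure of this trade-off (see the discussion at the end of this write-up).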
Unfortunately wsrep-slave-threads > 1 also occasionally triggers a bug on Xenial (Percona 5.6) that will likely never be fixed, because Percona 5.6 is long end-of-life upstream. In this case you sometimes (maybe once a day in the environment where we tried it) get a foreign key violation, which causes the slave to exit, restart, and clone fresh from another node. In theory this is not catastrophic, since we don't send any queries to the servers running the slave threads, so production queries shouldn't be impacted. However, during the SST process I think the server generating the SST stops responding to queries while the SST is generated (I think? need to double-check that is still true). If both slaves crashed out at the same time, that might cause an outage? Would need to check this further. I have not tested whether this same foreign key error happens on Bionic (Percona 5.7), but it is much more likely to have been fixed there as it is a much newer code base.
Secondly it seems that for most new cloud deployments, we are deploying bcache at least for /var (which means the Percona containers are included) which somewhat mitigates the need for tuning-level=fast on Xenial. This may depend slightly on whether or not the bcache sequential threshold has been tuned. I have not tested this but if the sequential threshold is not reduced, and the server is busy, it's likely a large write to the innodb log file could skip the SSD and thus still not get 'cached'.
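Lowering bcache's sequential-bypass threshold is a one-line sysfs change; this is a hedged sketch (the bcache0 device name is an assumption, and whether this is appropriate depends on the workload):

```
# Disable bcache's sequential bypass so large sequential writes such as
# the InnoDB log file still land on the SSD cache (0 = never bypass).
# Adjust the device name to match your bcache backing device.
echo 0 > /sys/block/bcache0/bcache/sequential_cutoff
```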
As to your question about losing 1 seconds worth of transactions. With the 3-node cluster, the transaction is committed to all 3 nodes before returning to the client. For this reason, w...