Comment 3 for bug 1840043

Heitor Alves de Siqueira (halves) wrote :

From the comparison graphs, we can see that the impact of the sysfs query is very significant for both read and write workloads. Taking just the "sysfs" test from my previous comment into account:

In the random write tests:
* For 512b block sizes, we see about an 85% reduction in BW (down to ~2MB/s from ~16MB/s)
* For 4k block sizes, the reduction in BW is also about 85% (~30MB/s compared to ~240MB/s)
* For 512k block sizes (== bucket size) and higher, BW is reduced by about 64% (~90MB/s vs ~250MB/s)

In the random read tests:
* For 512b block sizes, BW goes down to ~3MB/s from ~25MB/s
* For 4k block sizes, the BW reduction is around 90% (~10MB/s compared to ~160MB/s)
* For 512k block sizes and higher, BW is reduced by about 90% (~150MB/s vs ~1.6GB/s)

We see similar results for the IOPS measurements, and latency is also much worse in the sysfs test: we observe frequent latency spikes (150ms+) when running fio together with the priority_stats query, and average latencies increase by at least ~50ms.
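
For reference, the sysfs load in these tests can be reproduced with something as simple as re-reading the priority_stats attribute in a loop while fio runs against the bcache device. Below is a minimal userspace sketch; the sysfs path is just an example and needs to be adjusted to the cache set UUID and cache device on the system under test.

/* Minimal reproducer sketch: re-read a bcache priority_stats attribute in a
 * tight loop to generate the sysfs load used in the tests. The default path
 * is an example; pass the real one (the cache set UUID differs per system). */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        const char *path = argc > 1 ? argv[1]
                : "/sys/fs/bcache/<cset-uuid>/cache0/priority_stats";
        char buf[4096];

        for (;;) {
                int fd = open(path, O_RDONLY);

                if (fd < 0) {
                        perror("open");
                        return EXIT_FAILURE;
                }
                /* Each open+read cycle makes the kernel recompute the full
                 * priority statistics for the cache device. */
                while (read(fd, buf, sizeof(buf)) > 0)
                        ;
                close(fd);
        }
}

Compiling this with gcc and running it alongside the fio jobs gives a comparable sysfs load.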

Surprisingly, the "mutex" patch didn't improve the test results much. This was the case for both read and write workloads, which suggests that the bucket locking has much less impact on the system than the sorting itself.

The cond_resched() patch showed great results, even though it causes the sysfs queries to take a bit longer. The write throughput of the bcache device is _much_ better with it, and the system doesn't stall anymore (even when pinning processes to the same CPU as the sysfs query). In some cases, it brings performance back to values close to the "raw" tests (i.e. without any sysfs queries). This patch seems like the best short-term solution for now, as the sysfs query taking a bit longer shouldn't really be a problem in most setups (whereas the IO performance and other issues are much more noticeable).