bcache: performance regression without tuning under bionic

Bug #1806015 reported by James Page on 2018-11-30
This bug affects 1 person
Affects                 Status    Importance  Assigned to  Milestone
linux (Ubuntu)          Triaged   Medium      Unassigned
linux (Ubuntu Bionic)   Triaged   Medium      Unassigned

Bug Description

Whilst diagnosing a disk performance issue on our QA cloud, I did some performance testing of bcache-fronted spindles to compare xenial (4.4 kernel) and bionic (4.15 kernel) installs on the same hardware.

A vanilla install (with no tuning of bcache configuration) resulted in the following performance metrics (using the sysbench fileio rndrw benchmark):

Xenial

4 threads:
  45.77 MiB/sec read
  30.52 MiB/sec write

48 threads (matching core count):
  138.72 MiB/sec read
  92.52 MiB/sec write

Bionic

4 threads:
  29.51 MiB/sec read
  19.67 MiB/sec write

48 threads (matching core count):
  41.35 MiB/sec read
  27.59 MiB/sec write

After tuning (disabling congested_{read|write}_threshold_us and disabling the sequential cutoff):

Xenial

48 threads (matching core count):
  153.60 MiB/sec read
  102.40 MiB/sec write

Bionic

48 threads (matching core count):
  161.49 MiB/sec read
  107.67 MiB/sec write

One suggestion is that this might have something to do with the move from deadline to cfq as the default IO scheduler between 4.4 and 4.13. Whatever the cause, as you can see,
the baseline vanilla performance on bionic is significantly slower.
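
To put a number on "significantly slower", the untuned 48-thread figures above can be compared directly. This is only a small arithmetic sketch; the helper name pct_change is made up for illustration, and it assumes awk is available. For the 48-thread run it works out to a drop of about 70% for both reads and writes.

```shell
#!/bin/sh
# pct_change OLD CUR: print the percentage change from OLD to CUR, one decimal place.
# (Hypothetical helper name, used only for this illustration.)
pct_change() {
    awk -v old="$1" -v cur="$2" 'BEGIN { printf "%.1f\n", (cur - old) / old * 100 }'
}

# Untuned 48-thread results quoted above, in MiB/sec (xenial first, bionic second).
echo "read:  $(pct_change 138.72 41.35)%"
echo "write: $(pct_change 92.52 27.59)%"
```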

As a further reference point, the IO performance on the NVMe device supporting the bcache device is:

4 threads:
  554.56 MiB/sec read
  369.71 MiB/sec write

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-generic 4.15.0.39.41
ProcVersionSignature: Ubuntu 4.15.0-39.42-generic 4.15.18
Uname: Linux 4.15.0-39-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Nov 29 11:38 seq
 crw-rw---- 1 root audio 116, 33 Nov 29 11:38 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.5
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Fri Nov 30 10:11:54 2018
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 002 Device 002: ID 8087:8002 Intel Corp.
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 001 Device 003: ID 413c:a001 Dell Computer Corp. Hub
 Bus 001 Device 002: ID 8087:800a Intel Corp.
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Dell Inc. PowerEdge R630
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB: 0 mgadrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-39-generic root=UUID=a361a524-47eb-46c3-8a04-e5eaa65188c9 ro hugepages=103117 iommu=pt intel_iommu=on
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-39-generic N/A
 linux-backports-modules-4.15.0-39-generic N/A
 linux-firmware 1.173.2
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 11/08/2016
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.3.4
dmi.board.name: 02C2CP
dmi.board.vendor: Dell Inc.
dmi.board.version: A03
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr2.3.4:bd11/08/2016:svnDellInc.:pnPowerEdgeR630:pvr:rvnDellInc.:rn02C2CP:rvrA03:cvnDellInc.:ct23:cvr:
dmi.product.name: PowerEdge R630
dmi.sys.vendor: Dell Inc.

James Page (james-page) wrote :

Script used to tune bcache devices
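
The attachment itself is not reproduced in this report. As a rough sketch of what such tuning might look like (this is not the attached script), the function below writes 0 to the sysfs attributes named in the description: congested_read_threshold_us and congested_write_threshold_us on each cache set, and sequential_cutoff on each bcache device. The function name tune_bcache and its optional root argument are assumptions added so it can be tried against a scratch directory instead of the live /sys.

```shell
#!/bin/sh
# Sketch only: NOT the attached script. Applies the tuning described in the
# bug description by writing 0 to the relevant bcache sysfs attributes.
# tune_bcache [ROOT]: ROOT defaults to /sys (the override is a hypothetical
# convenience for dry runs against a directory tree that mimics sysfs).
tune_bcache() {
    root="${1:-/sys}"

    # Cache-set level: 0 disables the congestion thresholds.
    for cset in "$root"/fs/bcache/*-*-*-*-*; do
        [ -d "$cset" ] || continue
        echo 0 > "$cset/congested_read_threshold_us"
        echo 0 > "$cset/congested_write_threshold_us"
    done

    # bcache device level: 0 disables the sequential cutoff.
    for bdev in "$root"/class/block/bcache*/bcache; do
        [ -d "$bdev" ] || continue
        echo 0 > "$bdev/sequential_cutoff"
    done
}
```

Calling tune_bcache with no arguments as root would apply this to the running system. Values written through sysfs do not survive a reboot, so they need to be reapplied at boot.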

description: updated

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Ryan Harper (raharper) wrote :

Can you collect the following:

Backing device baseline (it's possible the underlying disks regressed rather than the bcache layer): run the same fio randrw test against the underlying backing device with bcache disabled.

And with bcache enabled on both setups before and after tuning:

1) grep -r . /sys/class/block/bcache*/bcache/ > bdev_stats
2) grep -r . /sys/fs/bcache/*-*-*-*-*/ > cdev_stats
3) grep -r . /sys/class/block/sd?/queue/scheduler > bdev_schedulers
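
Since these need to be captured on both releases, before and after tuning, one way to keep the output files apart is to wrap the three commands in a function that prefixes each file with a label. This is just a convenience sketch: the function name collect_bcache_stats, the label prefix and the optional root argument are not part of the original request.

```shell
#!/bin/sh
# collect_bcache_stats LABEL [ROOT]: run the three greps listed above and
# write LABEL-prefixed output files into the current directory.
# ROOT defaults to /sys; errors from unreadable sysfs files are discarded.
collect_bcache_stats() {
    label="$1"
    root="${2:-/sys}"
    grep -r . "$root"/class/block/bcache*/bcache/ > "${label}_bdev_stats" 2>/dev/null
    grep -r . "$root"/fs/bcache/*-*-*-*-*/ > "${label}_cdev_stats" 2>/dev/null
    grep -r . "$root"/class/block/sd?/queue/scheduler > "${label}_bdev_schedulers" 2>/dev/null
}
```

For example, running collect_bcache_stats xenial_untuned before tuning and collect_bcache_stats xenial_tuned afterwards leaves two sets of files that can be diffed.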

Ryan Harper (raharper) wrote :

I confirmed that the underlying block devices (SAS, NVMe) perform the same
on the 4.4 and 4.15 kernels: roughly 170 IOPS direct to the SAS device
and 570 IOPS direct to the bcache in front of the SAS device. The block
scheduler has no effect, due to the use of O_DIRECT.

I can reproduce the sysbench difference in performance on 4.15 versus 4.4.
The tuning helps, though only disabling sequential_cutoff really matters, as
this enables bcache to also cache reads; in general, the faster reads allow
additional writes.

During testing, I believe the core issue we're seeing between 4.4 and 4.15
is around two things:

1) ext4 filesystems on bionic enable metadata_csum by default, which results
in additional latency and IO as the checksum is calculated and then embedded
into the journal

2) fsync performance on 4.15 is measurably slower than on 4.4, even without
metadata_csum enabled
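
For anyone wanting to check the first point on their own systems, the feature list printed by tune2fs -l shows whether metadata_csum is enabled on a given filesystem. A minimal sketch follows; the helper name has_metadata_csum is made up, and the device path in the usage comment is an assumption.

```shell
#!/bin/sh
# has_metadata_csum: read `tune2fs -l` output on stdin and succeed only if the
# "Filesystem features:" line lists metadata_csum. The -w flag keeps the
# separate metadata_csum_seed feature from matching.
has_metadata_csum() {
    grep '^Filesystem features:' | grep -qw metadata_csum
}

# Example (needs root; /dev/bcache0 is an assumed device path):
#   tune2fs -l /dev/bcache0 | has_metadata_csum && echo "metadata_csum enabled"
```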

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
Changed in linux (Ubuntu Bionic):
status: New → Triaged
Changed in linux (Ubuntu):
status: Confirmed → Triaged
Changed in linux (Ubuntu Bionic):
importance: Undecided → Medium