Activity log for bug #1840043

Date Who What changed Old value New value Message
2019-08-13 14:02:30 Heitor Alves de Siqueira bug added bug
2019-08-13 14:03:39 Heitor Alves de Siqueira description

  Old value:

  [Impact]
  Performance degradation for read/write workloads in bcache devices, occasional system stalls

  [Description]
  In the latest bcache drivers, there's a sysfs attribute that calculates bucket priority statistics in /sys/fs/bcache/*/cache0/priority_stats. Querying this file has a big performance impact on tasks that run on the same CPU, and also affects read/write performance of the bcache device itself. This is due to the way the driver calculates the stats: the bcache buckets are locked and iterated through, collecting information about each individual bucket. An array of nbuckets elements is constructed and sorted afterwards, which can cause very high CPU contention on larger bcache setups.

  From our tests, the sorting step of the priority_stats query causes the most significant performance reduction, as it can hinder tasks that are not even doing any bcache IO. If a task is "unlucky" enough to be scheduled on the same CPU as the sysfs query, its performance will be harshly reduced, as both compete for CPU time. We've had users report system stalls of up to ~6s due to this, caused by monitoring tools that query priority_stats periodically (e.g. the Prometheus Node Exporter [0]). These system stalls have triggered several other issues, such as ceph-mon re-elections, problems in percona-cluster, and general network stalls, so the impact is not isolated to bcache IO workloads.

  [0] https://github.com/prometheus/node_exporter

  [Test Case]
  Note: As the sorting step has the most noticeable performance impact, the test case below pins a workload and the sysfs query to the same CPU. CPU contention issues still occur without any pinning; this just removes the scheduling factor of landing on different CPUs and affecting different tasks.
  1) Start a read/write workload on the bcache device with e.g. fio or dd, pinned to a certain CPU:
     # taskset 0x10 dd if=/dev/zero of=/dev/bcache0 bs=4k status=progress
  2) Start a sysfs query loop for the priority_stats attribute, pinned to the same CPU:
     # for i in {1..100000}; do taskset 0x10 cat /sys/fs/bcache/*/cache0/priority_stats
  3) Monitor the read/write workload for any performance impact

  New value:

  Identical to the old value, except that the truncated step 2 command is completed:
     # for i in {1..100000}; do taskset 0x10 cat /sys/fs/bcache/*/cache0/priority_stats > /dev/null; done
2019-08-13 14:21:51 Heitor Alves de Siqueira attachment added bcache-results.tar.gz https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1840043/+attachment/5282348/+files/bcache-results.tar.gz
2019-08-13 14:30:05 Ubuntu Kernel Bot linux (Ubuntu): status New Incomplete
2019-08-13 14:56:23 Heitor Alves de Siqueira linux (Ubuntu): status Incomplete Confirmed
2019-08-13 16:16:17 Guilherme G. Piccoli bug added subscriber Guilherme G. Piccoli
2019-08-16 10:39:43 Heitor Alves de Siqueira bug task added linux
2019-08-16 10:41:30 Heitor Alves de Siqueira linux: status New In Progress
2019-08-20 16:29:46 Peter Sabaini tags sts canonical-bootstack sts
2019-08-20 16:30:00 Peter Sabaini bug added subscriber Canonical IS CREs
2019-08-20 17:11:47 Nivedita Singhvi bug added subscriber Nivedita Singhvi
2019-10-10 19:41:16 Heitor Alves de Siqueira description

  Old value: the description as last set on 2019-08-13 (above).

  New value:

  [Impact]
  Querying bcache's priority_stats attribute in sysfs causes severe performance degradation for read/write workloads and occasional system stalls

  [Test Case]
  Note: As the sorting step has the most noticeable performance impact, the test case below pins a workload and the sysfs query to the same CPU. CPU contention issues still occur without any pinning; this just removes the scheduling factor of landing on different CPUs and affecting different tasks.
  1) Start a read/write workload on the bcache device with e.g. fio or dd, pinned to a certain CPU:
     # taskset 0x10 dd if=/dev/zero of=/dev/bcache0 bs=4k status=progress
  2) Start a sysfs query loop for the priority_stats attribute, pinned to the same CPU:
     # for i in {1..100000}; do taskset 0x10 cat /sys/fs/bcache/*/cache0/priority_stats > /dev/null; done
  3) Monitor the read/write workload for any performance impact

  [Fix]
  To fix the CPU contention and performance impact, a cond_resched() call is introduced in the priority_stats sort comparison.

  [Regression Potential]
  Regression potential is low, as the change is confined to the priority_stats sysfs query. In cases where frequent queries to bcache priority_stats take place (e.g. node_exporter), the impact should be more noticeable, as those queries could now take a bit longer to complete. A regression due to this patch would most likely show up as a performance degradation in bcache-focused workloads.

  --

  [Description]
  In the latest bcache drivers, there's a sysfs attribute that calculates bucket priority statistics in /sys/fs/bcache/*/cache0/priority_stats. Querying this file has a big performance impact on tasks that run on the same CPU, and also affects read/write performance of the bcache device itself. This is due to the way the driver calculates the stats: the bcache buckets are locked and iterated through, collecting information about each individual bucket. An array of nbuckets elements is constructed and sorted afterwards, which can cause very high CPU contention on larger bcache setups.

  From our tests, the sorting step of the priority_stats query causes the most significant performance reduction, as it can hinder tasks that are not even doing any bcache IO. If a task is "unlucky" enough to be scheduled on the same CPU as the sysfs query, its performance will be harshly reduced, as both compete for CPU time. We've had users report system stalls of up to ~6s due to this, caused by monitoring tools that query priority_stats periodically (e.g. the Prometheus Node Exporter [0]). These system stalls have triggered several other issues, such as ceph-mon re-elections, problems in percona-cluster, and general network stalls, so the impact is not isolated to bcache IO workloads.

  An example benchmark can be seen in [1], where read performance on a bcache device suffered quite heavily (going from ~40k IOPS to ~4k IOPS due to priority_stats). Other comparison charts can be found under [2].

  [0] https://github.com/prometheus/node_exporter
  [1] https://people.canonical.com/~halves/priority_stats/read/4k-iops-2Dsmooth.png
  [2] https://people.canonical.com/~halves/priority_stats/
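Editor's note: the test-case steps above can be combined into a single script. This is a sketch under assumptions not stated in the log: a bcache device exists at /dev/bcache0, the 0x10 CPU affinity mask (CPU 4) is valid on the test machine, and the script runs as root. On a host without a bcache device it reports that and does nothing.

```shell
#!/bin/sh
# Sketch of the bug's reproduction steps (assumes /dev/bcache0 and
# that CPU mask 0x10, i.e. CPU 4, exists; requires root to write to
# the device and read the sysfs attribute).

DEV=/dev/bcache0

if [ -e "$DEV" ]; then
    # 1) Start a write workload pinned to CPU 4, in the background.
    taskset 0x10 dd if=/dev/zero of="$DEV" bs=4k status=progress &
    dd_pid=$!

    # 2) Query priority_stats repeatedly from the same CPU; each read
    #    locks the buckets and sorts an nbuckets-sized array, competing
    #    with dd for CPU time.
    for i in $(seq 1 100000); do
        taskset 0x10 cat /sys/fs/bcache/*/cache0/priority_stats > /dev/null
    done

    # 3) Watch dd's reported throughput drop while the loop runs,
    #    then clean up the background workload.
    kill "$dd_pid"
else
    echo "no bcache device at $DEV; nothing to do"
fi
```

Without pinning, the contention still occurs; the taskset calls only make the effect deterministic by forcing both tasks onto one CPU, as the test-case note explains.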
2019-10-10 20:16:42 Heitor Alves de Siqueira linux: status In Progress Fix Committed
2019-10-10 20:20:28 Heitor Alves de Siqueira linux (Ubuntu): status Confirmed In Progress
2019-10-17 14:59:45 Kleber Sacilotto de Souza nominated for series Ubuntu Eoan
2019-10-17 14:59:45 Kleber Sacilotto de Souza bug task added linux (Ubuntu Eoan)
2019-10-17 14:59:45 Kleber Sacilotto de Souza nominated for series Ubuntu Disco
2019-10-17 14:59:45 Kleber Sacilotto de Souza bug task added linux (Ubuntu Disco)
2019-10-17 14:59:45 Kleber Sacilotto de Souza nominated for series Ubuntu Xenial
2019-10-17 14:59:45 Kleber Sacilotto de Souza bug task added linux (Ubuntu Xenial)
2019-10-17 14:59:45 Kleber Sacilotto de Souza nominated for series Ubuntu Bionic
2019-10-17 14:59:45 Kleber Sacilotto de Souza bug task added linux (Ubuntu Bionic)
2019-10-17 15:05:27 Kleber Sacilotto de Souza linux (Ubuntu Disco): assignee Heitor Alves de Siqueira (halves)
2019-10-17 15:05:33 Kleber Sacilotto de Souza linux (Ubuntu Bionic): assignee Heitor Alves de Siqueira (halves)
2019-10-17 15:05:39 Kleber Sacilotto de Souza linux (Ubuntu Xenial): assignee Heitor Alves de Siqueira (halves)
2019-10-17 15:13:14 Kleber Sacilotto de Souza linux (Ubuntu Xenial): status New Fix Committed
2019-10-17 15:13:16 Kleber Sacilotto de Souza linux (Ubuntu Bionic): status New Fix Committed
2019-10-17 15:13:18 Kleber Sacilotto de Souza linux (Ubuntu Disco): status New Fix Committed
2019-10-17 15:13:21 Kleber Sacilotto de Souza linux (Ubuntu Eoan): status In Progress Fix Committed
2019-10-22 15:02:17 Ubuntu Kernel Bot tags canonical-bootstack sts canonical-bootstack sts verification-needed-disco
2019-10-22 15:47:40 Ubuntu Kernel Bot tags canonical-bootstack sts verification-needed-disco canonical-bootstack sts verification-needed-bionic verification-needed-disco
2019-10-22 16:04:32 Ubuntu Kernel Bot tags canonical-bootstack sts verification-needed-bionic verification-needed-disco canonical-bootstack sts verification-needed-bionic verification-needed-disco verification-needed-xenial
2019-10-24 16:03:17 Ubuntu Kernel Bot tags canonical-bootstack sts verification-needed-bionic verification-needed-disco verification-needed-xenial canonical-bootstack sts verification-needed-bionic verification-needed-disco verification-needed-eoan verification-needed-xenial
2019-10-25 15:56:30 Heitor Alves de Siqueira tags canonical-bootstack sts verification-needed-bionic verification-needed-disco verification-needed-eoan verification-needed-xenial canonical-bootstack sts verification-done-xenial verification-needed-bionic verification-needed-disco verification-needed-eoan
2019-10-26 09:21:04 Heitor Alves de Siqueira tags canonical-bootstack sts verification-done-xenial verification-needed-bionic verification-needed-disco verification-needed-eoan canonical-bootstack sts verification-done-bionic verification-done-xenial verification-needed-disco verification-needed-eoan
2019-10-26 11:48:31 Heitor Alves de Siqueira tags canonical-bootstack sts verification-done-bionic verification-done-xenial verification-needed-disco verification-needed-eoan canonical-bootstack sts verification-done-bionic verification-done-disco verification-done-xenial verification-needed-eoan
2019-10-26 12:51:23 Heitor Alves de Siqueira tags canonical-bootstack sts verification-done-bionic verification-done-disco verification-done-xenial verification-needed-eoan canonical-bootstack sts verification-done-bionic verification-done-disco verification-done-eoan verification-done-xenial
2019-11-12 22:18:04 Launchpad Janitor linux (Ubuntu Eoan): status Fix Committed Fix Released
2019-11-12 22:18:04 Launchpad Janitor cve linked 2018-12207
2019-11-12 22:18:04 Launchpad Janitor cve linked 2019-0154
2019-11-12 22:18:04 Launchpad Janitor cve linked 2019-0155
2019-11-12 22:18:04 Launchpad Janitor cve linked 2019-11135
2019-11-12 22:18:04 Launchpad Janitor cve linked 2019-15793
2019-11-12 22:18:04 Launchpad Janitor cve linked 2019-17666
2019-11-12 22:21:40 Launchpad Janitor linux (Ubuntu Disco): status Fix Committed Fix Released
2019-11-12 22:21:40 Launchpad Janitor cve linked 2019-15098
2019-11-12 22:21:40 Launchpad Janitor cve linked 2019-17052
2019-11-12 22:21:40 Launchpad Janitor cve linked 2019-17053
2019-11-12 22:21:40 Launchpad Janitor cve linked 2019-17054
2019-11-12 22:21:40 Launchpad Janitor cve linked 2019-17055
2019-11-12 22:21:40 Launchpad Janitor cve linked 2019-17056
2019-11-12 22:24:59 Launchpad Janitor linux (Ubuntu Bionic): status Fix Committed Fix Released
2019-11-12 22:33:58 Launchpad Janitor linux (Ubuntu Xenial): status Fix Committed Fix Released
2019-12-06 15:57:44 Launchpad Janitor linux (Ubuntu): status Fix Committed Fix Released
2019-12-06 15:57:44 Launchpad Janitor cve linked 2019-15794