Activity log for bug #1319003

Date Who What changed Old value New value Message
2014-05-13 11:00:58 Felipe Franciosi bug added bug
2014-05-13 11:00:58 Felipe Franciosi attachment added Graph 1 https://bugs.launchpad.net/bugs/1319003/+attachment/4111326/+files/read_3_12_9.png
2014-05-13 11:01:56 Felipe Franciosi attachment added Graph 2 https://bugs.launchpad.net/ubuntu/+bug/1319003/+attachment/4111328/+files/read_3_13_7.png
2014-05-13 11:02:24 Felipe Franciosi description edited. The previous text additionally carried a postscript offering to send the performance data directly in case the graphs could not be attached; it is otherwise identical to the new value below. (Simplified code sketches of the data path and of the throughput measurement follow the activity log.)

  Description of problem:

  When used as a Xen guest, Ubuntu 13.10 may be slower than older releases in terms of storage performance. This is due to the persistent-grants feature introduced in xen-blkfront in the Linux kernel 3.8 series. From 3.8 to 3.12 (inclusive), xen-blkfront adds an extra set of memcpy() operations regardless of persistent-grants support in the backend (e.g. xen-blkback, qemu, tapdisk). Many Xen dom0s do not have backend persistent-grants support (such as Citrix XenServer and any Linux distribution with a kernel prior to 3.8). This has been identified and fixed in the 3.13 kernel series [1], but was not backported to previous LTS kernels due to the nature of the bug (performance only).

  While persistent grants reduce the stress on the Xen grant table and allow for much better aggregate throughput (at the cost of an extra set of memcpy() operations), adding the copy overhead when the feature is unsupported on the backend combines the worst of both worlds. This is particularly noticeable when intensive storage workloads are active from many guests.

  The attached graphs show storage throughput for Linux guests using kernel 3.12.9 (Graph 1) and 3.13.7 (Graph 2) running on a Citrix XenServer development build. The server had 4 storage repositories (SRs) with 1 Micron P320 SSD per SR (i.e. 10 VMs per SR means 40 VMs in total). With the 3.12.9 kernel, the regression is clearly visible for more than 2 VMs per SR and block sizes larger than 64 KiB. The workload consisted of sequential reads on pre-allocated raw LVM logical volumes.

  [1] Commits by Roger Pau Monné:
      bfe11d6de1c416cea4f3f0f35f864162063ce3fa
      fbe363c476afe8ec992d3baf682670a4bd1b6ce6

  Version-Release number of selected component (if applicable):
  xen-blkfront of Linux kernel 3.11

  How reproducible:
  Always, when an Ubuntu 13.10 guest runs on Xen and the storage backend (e.g. xen-blkback, qemu, tapdisk) does not support persistent grants.

  Steps to Reproduce:
  1. Install a Xen dom0 running a kernel prior to 3.8 (without persistent-grants support).
  2. Install a set of Ubuntu 13.10 guests (which use kernel 3.11).
  3. Measure aggregate storage throughput from all guests.

  NOTE: The storage infrastructure (e.g. local SSDs, network-attached storage) should not be a bottleneck in itself. If tested on a single SATA disk, for example, the issue will probably go unnoticed, as the infrastructure itself limits response time and throughput.

  Actual results:
  Aggregate storage throughput is lower than with xen-blkfront versions prior to 3.8 or newer than 3.12.

  Expected results:
  Aggregate storage throughput should be at least as good as, or better than, previous versions of RHEL in cases where the backend doesn't support persistent grants.

  Additional info:
  Given that this is fixed in newer kernels, we urge that Red Hat request a backport of the relevant patches to the 3.11 stable branch. According to the rules in https://www.kernel.org/doc/Documentation/stable_kernel_rules.txt, the patches would be accepted on the grounds of:

  - Serious issues as reported by a user of a distribution kernel may also be considered if they fix a notable performance or interactivity issue. As these fixes are not as obvious and have a higher risk of a subtle regression they should only be submitted by a distribution kernel maintainer and include an addendum linking to a bugzilla entry if it exists and additional information on the user-visible impact.
2014-05-13 12:28:42 Ubuntu Foundations Team Bug Bot tags bot-comment
2014-05-13 13:44:26 Felipe Franciosi description edited. Two corrections to the text above: in "Expected results", "previous versions of RHEL" became "previous (or newer) versions of Ubuntu", and in "Additional info" the backport request was reworded so that it no longer addresses Red Hat ("we urge that a backport of the relevant patches to the 3.11 stable branch is requested"). The rest of the description is unchanged.
2014-05-13 13:45:05 Felipe Franciosi affects ubuntu linux-meta (Ubuntu)
2014-05-13 14:00:07 Brad Figg affects linux-meta (Ubuntu) linux (Ubuntu)
2014-05-13 14:30:12 Brad Figg linux (Ubuntu): status New Incomplete
2014-05-13 14:38:21 Felipe Franciosi linux (Ubuntu): status Incomplete Confirmed
2014-05-13 16:14:57 Joseph Salisbury tags bot-comment bot-comment kernel-da-key saucy
2014-05-13 16:15:00 Joseph Salisbury linux (Ubuntu): importance Undecided Medium
2014-05-13 16:15:03 Joseph Salisbury linux (Ubuntu): status Confirmed Triaged
2014-05-13 18:16:17 Joseph Salisbury nominated for series Ubuntu Saucy
2014-05-13 18:16:17 Joseph Salisbury bug task added linux (Ubuntu Saucy)
2014-05-13 18:16:27 Joseph Salisbury linux (Ubuntu Saucy): importance Undecided Medium
2014-05-13 18:16:30 Joseph Salisbury linux (Ubuntu Saucy): status New Triaged
2014-05-13 19:09:12 Joseph Salisbury linux (Ubuntu Saucy): status Triaged In Progress
2014-05-13 19:09:14 Joseph Salisbury linux (Ubuntu): status Triaged In Progress
2014-05-13 19:09:16 Joseph Salisbury linux (Ubuntu): assignee Joseph Salisbury (jsalisbury)
2014-05-13 19:09:18 Joseph Salisbury linux (Ubuntu Saucy): assignee Joseph Salisbury (jsalisbury)
2014-06-01 00:06:59 Felipe Franciosi attachment added Saucy x86_64 without the backports https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1319003/+attachment/4123531/+files/saucy.png
2014-06-01 00:07:22 Felipe Franciosi attachment added Saucy x86_64 with the backports https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1319003/+attachment/4123532/+files/saucy-backports.png
2014-06-01 00:33:11 Felipe Franciosi attachment removed Saucy x86_64 without the backports https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1319003/+attachment/4123531/+files/saucy.png
2014-06-01 00:33:22 Felipe Franciosi attachment removed Saucy x86_64 with the backports https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1319003/+attachment/4123532/+files/saucy-backports.png
2014-06-01 00:33:50 Felipe Franciosi attachment added Saucy x86_64 guests with the backports https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1319003/+attachment/4123554/+files/saucy64-backports.png
2014-06-01 00:34:12 Felipe Franciosi attachment added Saucy x86_64 guests without the backports https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1319003/+attachment/4123555/+files/saucy64.png
2014-06-18 18:35:06 Joseph Salisbury linux (Ubuntu Saucy): status In Progress Won't Fix
2014-06-18 18:35:17 Joseph Salisbury linux (Ubuntu): status In Progress Won't Fix
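
Simplified sketch of the data-path choice described in the bug report. This is not the actual xen-blkfront code; the structure and function names (backend_info, request_seg, queue_segment) are hypothetical stand-ins used only to show the control flow that the 3.13 fix restores: copy into a persistently granted bounce page only when the backend negotiated the feature, and grant the I/O page directly (no memcpy()) when it did not. The 3.8-3.12 frontends affected by this bug take the copy path unconditionally.

/* data_path.c: hypothetical illustration only, not xen-blkfront. */
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096

struct backend_info {
    int feature_persistent;          /* 1 if the backend advertised persistent grants */
};

struct request_seg {
    unsigned char data[PAGE_SIZE];   /* page handed in by the block layer */
    unsigned char bounce[PAGE_SIZE]; /* long-lived, persistently granted page */
    int copied;                      /* did this request pay for a memcpy()? */
};

/* Hand one segment of a write request to the backend. */
static void queue_segment(const struct backend_info *be, struct request_seg *seg)
{
    if (be->feature_persistent) {
        /* Persistent-grant path: the grant stays mapped across requests,
         * so the payload must be copied into the granted bounce page. */
        memcpy(seg->bounce, seg->data, PAGE_SIZE);
        seg->copied = 1;
    } else {
        /* Non-persistent path (what the 3.13 fix restores): grant the
         * original page for this request only; no memcpy() at all. */
        seg->copied = 0;
    }
}

int main(void)
{
    struct backend_info legacy_dom0 = { .feature_persistent = 0 };
    struct backend_info modern_dom0 = { .feature_persistent = 1 };
    static struct request_seg seg;

    queue_segment(&legacy_dom0, &seg);
    printf("backend without persistent grants: memcpy performed = %d\n", seg.copied);

    queue_segment(&modern_dom0, &seg);
    printf("backend with persistent grants:    memcpy performed = %d\n", seg.copied);
    return 0;
}

Built with any C99 compiler, the program simply prints whether a copy was performed for each backend case, mirroring the "worst of both worlds" point: paying the copy cost buys nothing when the backend never negotiated persistent grants.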
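
Step 3 of the reproduction asks for aggregate storage throughput from all guests. The measurements behind the attached graphs were presumably produced with a proper benchmarking tool; purely as an illustration, the standalone program below issues large sequential O_DIRECT reads against a block device and reports the resulting throughput. The device path /dev/xvdb, the 512 KiB block size and the 1 GiB read target are assumptions made for this example; run one instance per guest and sum the per-guest figures to obtain the aggregate.

/* seqread.c: rough sequential-read throughput probe, illustration only. */
#define _GNU_SOURCE            /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/xvdb";   /* hypothetical test volume */
    size_t blk = 512 * 1024;                              /* 512 KiB per read */
    long long target = 1LL << 30;                         /* stop after 1 GiB */
    long long total = 0;
    void *buf;
    struct timespec t0, t1;

    if (posix_memalign(&buf, 4096, blk) != 0)
        return 1;

    int fd = open(dev, O_RDONLY | O_DIRECT);              /* bypass the page cache */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    while (total < target) {
        ssize_t n = read(fd, buf, blk);
        if (n <= 0)                                        /* EOF or error: stop */
            break;
        total += n;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%s: %.1f MiB/s (%lld bytes in %.2f s)\n",
           dev, total / secs / (1024.0 * 1024.0), total, secs);

    close(fd);
    free(buf);
    return 0;
}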