Comment 6 for bug 1662666

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

-- Problem Description --
The following upstream patches are needed for Ubuntu to fix a hang situation reported when executing ppc64_cpu --smt=on that occurs with various disk types. We need whichever ones have not yet been pulled into the base.

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e57690fe009b2ab0cee8a57f53be634540e49c9d
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0e87e58bf60edb6bb28e493c7a143f41b091a5e5
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=135e8c9250dd5c8c9aae5984fde6f230d0cbfeaf
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c02ebfdddbafa9a6a0f52fbd715e6bfa229af9d3
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d1b1cea1e58477dad88ff769f54c0d2dfa56d923
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=36e1f3d107867b25c616c2fd294f5a1c9d4e5d09
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=71f79fb3179e69b0c1448a2101a866d871c66e7f
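To see which of the commits above are already in a given tree, something like the following sketch may help. It is not part of the original report. The function names are hypothetical, and it assumes it is run inside a git checkout of the kernel tree being checked. It also assumes that backported commits cite the upstream commit ID in their message (for example a "cherry picked from commit ..." line), which may not hold for every backport.

```shell
#!/bin/bash
# Sketch only: helper names are illustrative, not from the bug report.

# Extract the commit ID from a cgit URL of the form ...?id=<sha>.
commit_id_from_url() {
    local url=${1}
    printf '%s\n' "${url##*id=}"
}

# Succeed if the commit is reachable from the given ref (default HEAD),
# either directly or as a backport whose message cites the upstream ID.
# Assumption: backports mention the upstream commit ID in their message.
commit_is_present() {
    local sha=${1} ref=${2:-HEAD}
    if git merge-base --is-ancestor "${sha}" "${ref}" 2>/dev/null; then
        return 0
    fi
    [[ -n $(git log --max-count=1 --format=%H --grep="${sha}" "${ref}" 2>/dev/null) ]]
}

# Read commit URLs on stdin and print "present" or "missing" for each.
report_commits() {
    local url sha
    while read -r url; do
        [[ -z ${url} ]] && continue
        sha=$(commit_id_from_url "${url}")
        if commit_is_present "${sha}"; then
            echo "present ${sha}"
        else
            echo "missing ${sha}"
        fi
    done
}
```

With the seven URLs saved one per line in a file (patches.txt here is a made-up name), running `report_commits < patches.txt` from the top of the kernel tree would list the ones still to be pulled as "missing". A "missing" result only means no match was found by these two checks, so it is worth confirming by hand.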

The hang problem can be reproduced with the following shell script using an NVMe device executed on kernel versions 4.4.0-45 and 4.4.0-59 within 30 minutes. It was also reproduced on a 4.8.0-32-generic kernel, although it took over 3 hours to manifest.

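#!/bin/bash

if [[ ${#} -eq 0 ]]; then
    # No arguments: start the SMT toggler in the background,
    # then keep reading from the NVMe device.
    "${0}" breaker &
    while true; do
        dd if=/dev/nvme0n1 bs=1024k of=/dev/null
    done
elif [[ ${1} == "breaker" ]]; then
    # "breaker" mode: switch SMT off and on every 5 seconds.
    while true; do
        ppc64_cpu --smt=off
        sleep 5
        ppc64_cpu --smt=on
        sleep 5
    done
fi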

Steve,

Can Foundations take a look at this request, please?

Michael

On 02/07/2017 12:39 PM, Launchpad Bug Tracker wrote:
> bugproxy (bugproxy) has assigned this bug to you for Ubuntu:
>
> ** Affects: ubuntu
> Importance: Undecided
> Assignee: Taco Screen team (taco-screen-team)
> Status: New
>
>
> ** Tags: architecture-ppc64le bugnameltc-146759 severity-critical targetmilestone-inin16041

--
Michael Hohnbaum
OIL Program Manager
Power (ppc64el) Development Project Manager
Canonical, Ltd.

Correction: the hang reproduced by the previous shell script is actually being fixed separately. These commits fix various other problems with NVMe drives and are required as a prerequisite.

On Tue, Feb 07, 2017 at 12:42:39PM -0800, Michael Hohnbaum wrote:
> Can Foundations take a look at this request, please.

The bug is assigned to the linux package, so the kernel team should probably
be looking at it.

--
Steve Langasek
Debian Developer, Ubuntu Developer
http://www.debian.org/
Give me a lever long enough and a Free OS to set it on, and I can move the world.

Leann,

While the problem shows up in the ppc64_cpu command, it appears the fix is
in a set of kernel patches. Can you have the kernel team take a look at
these? Thanks.

Michael

On 02/07/2017 01:05 PM, Steve Langasek wrote:
> On Tue, Feb 07, 2017 at 12:42:39PM -0800, Michael Hohnbaum wrote:
>> Can Foundations take a look at this request, please.
> The bug is assigned to the linux package, so the kernel team should probably
> be looking at it.
>


https://lists.ubuntu.com/archives/kernel-team/2017-February/082383.html
https://lists.ubuntu.com/archives/kernel-team/2017-February/082386.html