2019-11-05 01:12:07 |
Rod Smith |
bug |
|
|
added bug |
2019-11-05 09:03:07 |
Colin Ian King |
stress-ng: importance |
Undecided |
High |
|
2019-11-05 09:03:09 |
Colin Ian King |
stress-ng: assignee |
|
Colin Ian King (colin-king) |
|
2019-11-05 09:03:13 |
Colin Ian King |
stress-ng: status |
New |
In Progress |
|
2019-11-05 13:32:33 |
Andrew Cloke |
bug |
|
|
added subscriber Andrew Cloke |
2019-11-05 19:17:11 |
Colin Ian King |
stress-ng: status |
In Progress |
Fix Committed |
|
2019-11-06 09:27:16 |
Colin Ian King |
description |
Doing regression testing of Ubuntu 19.10, we've been seeing frequent failures of stress-ng's stack stressor. These systems have stress-ng 0.10.07-1. For instance (running manually, not in the Checkbox test script):
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5051] dispatching hogs: 4 stack
stress-ng: info: [5051] unsuccessful run completed in 32.01s
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5064] dispatching hogs: 4 stack
stress-ng: info: [5064] successful run completed in 32.96s
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5077] dispatching hogs: 4 stack
stress-ng: info: [5077] unsuccessful run completed in 33.09s
This problem seems to affect some systems more than others; in doing my testing, some computers fail a majority of 30-second or greater test runs (our default test run in 300s in length), but others haven't failed once over several such test runs. I haven't yet identified what's causing some systems to fail but not others.
In testing this, I installed Ubuntu 19.04, which comes with stress-ng 0.09.57-0ubuntu3, on one affected system, and encountered no problems. Upon upgrading to stress-ng 0.10.07-1 from Ubuntu 19.04, the problems returned. Thus, this appears to either be a problem with stress-ng 0.10.07-1 or this new version of stress-ng is detecting previously-undetected problems on multiple servers. |
== SRU Justification Eoan ==
Doing regression testing of Ubuntu 19.10, we've been seeing frequent failures of stress-ng's stack stressor. These systems have stress-ng 0.10.07-1. For instance (running manually, not in the Checkbox test script):
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5051] dispatching hogs: 4 stack
stress-ng: info: [5051] unsuccessful run completed in 32.01s
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5064] dispatching hogs: 4 stack
stress-ng: info: [5064] successful run completed in 32.96s
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5077] dispatching hogs: 4 stack
stress-ng: info: [5077] unsuccessful run completed in 33.09s
This problem seems to affect some systems more than others; in doing my testing, some computers fail a majority of 30-second or greater test runs (our default test run in 300s in length), but others haven't failed once over several such test runs. I haven't yet identified what's causing some systems to fail but not others.
In testing this, I installed Ubuntu 19.04, which comes with stress-ng 0.09.57-0ubuntu3, on one affected system, and encountered no problems. Upon upgrading to stress-ng 0.10.07-1 from Ubuntu 19.04, the problems returned. Thus, this appears to either be a problem with stress-ng 0.10.07-1 or this new version of stress-ng is detecting previously-undetected problems on multiple servers.
== Test case ==
Run stress-ng -k --aggressive --verify --timeout 30 --stack 0 multiple times and interrupt it with control-C (SIGINT). This can trigger a segfault. With the fix, the segfault cannot bne triggered.
== Fix ==
Upstream stress-ng commits:
- 10ffe40579c5 stress-stack: return error code in child using
_exit() and not return
- ef18c524df48 stress-stack: don't throw a fatal error when
sigaltstack fails
- 6245e5f62eae stress-stack: check for ENOMEM fork failure and retry
- 8fb67daea592 stress-stack: setup alternative stack in child only
The first 3 fixes are prerequisite fixes, the final fix addresses the main issue.
== Regression Potential ==
This affects just the stress-ng stack stressor. The fixes are already tested upstream fixes found in stress-ng in Ubuntu Focal. The fixes have been regression tested on arm64, amd64, i386, s390x and ppc64el architectures so the test coverage is good on these fixes. The fixes change the stack of the signal handler and also the exit of a child stress process, so the affects of the changes are small in the context of the stress test. |
|
2019-11-06 11:06:11 |
Colin Ian King |
bug task added |
|
stress-ng (Ubuntu) |
|
2019-11-06 11:06:22 |
Colin Ian King |
nominated for series |
|
Ubuntu Focal |
|
2019-11-06 11:06:22 |
Colin Ian King |
bug task added |
|
stress-ng (Ubuntu Focal) |
|
2019-11-06 11:06:22 |
Colin Ian King |
nominated for series |
|
Ubuntu Eoan |
|
2019-11-06 11:06:22 |
Colin Ian King |
bug task added |
|
stress-ng (Ubuntu Eoan) |
|
2019-11-06 11:06:37 |
Colin Ian King |
stress-ng: status |
Fix Committed |
Fix Released |
|
2019-11-06 11:07:01 |
Colin Ian King |
stress-ng (Ubuntu Eoan): status |
New |
In Progress |
|
2019-11-06 11:07:03 |
Colin Ian King |
stress-ng (Ubuntu Focal): status |
New |
In Progress |
|
2019-11-06 13:14:45 |
Launchpad Janitor |
stress-ng (Ubuntu Focal): status |
In Progress |
Fix Released |
|
2019-11-06 13:47:51 |
Colin Ian King |
bug |
|
|
added subscriber Ubuntu Stable Release Updates Team |
2019-11-14 15:11:05 |
Łukasz Zemczak |
stress-ng (Ubuntu Eoan): status |
In Progress |
Fix Committed |
|
2019-11-14 15:11:07 |
Łukasz Zemczak |
bug |
|
|
added subscriber SRU Verification |
2019-11-14 15:11:09 |
Łukasz Zemczak |
tags |
|
verification-needed verification-needed-eoan |
|
2019-11-14 15:33:02 |
Colin Ian King |
description |
== SRU Justification Eoan ==
Doing regression testing of Ubuntu 19.10, we've been seeing frequent failures of stress-ng's stack stressor. These systems have stress-ng 0.10.07-1. For instance (running manually, not in the Checkbox test script):
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5051] dispatching hogs: 4 stack
stress-ng: info: [5051] unsuccessful run completed in 32.01s
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5064] dispatching hogs: 4 stack
stress-ng: info: [5064] successful run completed in 32.96s
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5077] dispatching hogs: 4 stack
stress-ng: info: [5077] unsuccessful run completed in 33.09s
This problem seems to affect some systems more than others; in doing my testing, some computers fail a majority of 30-second or greater test runs (our default test run in 300s in length), but others haven't failed once over several such test runs. I haven't yet identified what's causing some systems to fail but not others.
In testing this, I installed Ubuntu 19.04, which comes with stress-ng 0.09.57-0ubuntu3, on one affected system, and encountered no problems. Upon upgrading to stress-ng 0.10.07-1 from Ubuntu 19.04, the problems returned. Thus, this appears to either be a problem with stress-ng 0.10.07-1 or this new version of stress-ng is detecting previously-undetected problems on multiple servers.
== Test case ==
Run stress-ng -k --aggressive --verify --timeout 30 --stack 0 multiple times and interrupt it with control-C (SIGINT). This can trigger a segfault. With the fix, the segfault cannot bne triggered.
== Fix ==
Upstream stress-ng commits:
- 10ffe40579c5 stress-stack: return error code in child using
_exit() and not return
- ef18c524df48 stress-stack: don't throw a fatal error when
sigaltstack fails
- 6245e5f62eae stress-stack: check for ENOMEM fork failure and retry
- 8fb67daea592 stress-stack: setup alternative stack in child only
The first 3 fixes are prerequisite fixes, the final fix addresses the main issue.
== Regression Potential ==
This affects just the stress-ng stack stressor. The fixes are already tested upstream fixes found in stress-ng in Ubuntu Focal. The fixes have been regression tested on arm64, amd64, i386, s390x and ppc64el architectures so the test coverage is good on these fixes. The fixes change the stack of the signal handler and also the exit of a child stress process, so the affects of the changes are small in the context of the stress test. |
== SRU Justification Eoan ==
Doing regression testing of Ubuntu 19.10, we've been seeing frequent failures of stress-ng's stack stressor. These systems have stress-ng 0.10.07-1. For instance (running manually, not in the Checkbox test script):
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5051] dispatching hogs: 4 stack
stress-ng: info: [5051] unsuccessful run completed in 32.01s
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5064] dispatching hogs: 4 stack
stress-ng: info: [5064] successful run completed in 32.96s
ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0
stress-ng: info: [5077] dispatching hogs: 4 stack
stress-ng: info: [5077] unsuccessful run completed in 33.09s
This problem seems to affect some systems more than others; in doing my testing, some computers fail a majority of 30-second or greater test runs (our default test run in 300s in length), but others haven't failed once over several such test runs. I haven't yet identified what's causing some systems to fail but not others.
In testing this, I installed Ubuntu 19.04, which comes with stress-ng 0.09.57-0ubuntu3, on one affected system, and encountered no problems. Upon upgrading to stress-ng 0.10.07-1 from Ubuntu 19.04, the problems returned. Thus, this appears to either be a problem with stress-ng 0.10.07-1 or this new version of stress-ng is detecting previously-undetected problems on multiple servers.
== Test case ==
Run stress-ng -k --aggressive --verify --timeout 30 --stack 0 multiple times and interrupt it with control-C (SIGINT). This can trigger a segfault. With the fix, the segfault cannot be triggered.
== Fix ==
Upstream stress-ng commits:
- 10ffe40579c5 stress-stack: return error code in child using
_exit() and not return
- ef18c524df48 stress-stack: don't throw a fatal error when
sigaltstack fails
- 6245e5f62eae stress-stack: check for ENOMEM fork failure and retry
- 8fb67daea592 stress-stack: setup alternative stack in child only
The first 3 fixes are prerequisite fixes, the final fix addresses the main issue.
== Regression Potential ==
This affects just the stress-ng stack stressor. The fixes are already tested upstream fixes found in stress-ng in Ubuntu Focal. The fixes have been regression tested on arm64, amd64, i386, s390x and ppc64el architectures so the test coverage is good on these fixes. The fixes change the stack of the signal handler and also the exit of a child stress process, so the affects of the changes are small in the context of the stress test. |
|
2019-11-14 17:46:53 |
Colin Ian King |
tags |
verification-needed verification-needed-eoan |
verification-done verification-done-eoan |
|
2019-11-21 08:08:43 |
Launchpad Janitor |
stress-ng (Ubuntu Eoan): status |
Fix Committed |
Fix Released |
|
2019-11-21 08:08:47 |
Łukasz Zemczak |
removed subscriber Ubuntu Stable Release Updates Team |
|
|
|