Activity log for bug #1851316

Date Who What changed Old value New value Message
2019-11-05 01:12:07 Rod Smith bug added bug
2019-11-05 09:03:07 Colin Ian King stress-ng: importance Undecided High
2019-11-05 09:03:09 Colin Ian King stress-ng: assignee Colin Ian King (colin-king)
2019-11-05 09:03:13 Colin Ian King stress-ng: status New In Progress
2019-11-05 13:32:33 Andrew Cloke bug added subscriber Andrew Cloke
2019-11-05 19:17:11 Colin Ian King stress-ng: status In Progress Fix Committed
2019-11-06 09:27:16 Colin Ian King description Doing regression testing of Ubuntu 19.10, we've been seeing frequent failures of stress-ng's stack stressor. These systems have stress-ng 0.10.07-1. For instance (running manually, not in the Checkbox test script): ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5051] dispatching hogs: 4 stack stress-ng: info: [5051] unsuccessful run completed in 32.01s ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5064] dispatching hogs: 4 stack stress-ng: info: [5064] successful run completed in 32.96s ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5077] dispatching hogs: 4 stack stress-ng: info: [5077] unsuccessful run completed in 33.09s This problem seems to affect some systems more than others; in doing my testing, some computers fail a majority of 30-second or greater test runs (our default test run in 300s in length), but others haven't failed once over several such test runs. I haven't yet identified what's causing some systems to fail but not others. In testing this, I installed Ubuntu 19.04, which comes with stress-ng 0.09.57-0ubuntu3, on one affected system, and encountered no problems. Upon upgrading to stress-ng 0.10.07-1 from Ubuntu 19.04, the problems returned. Thus, this appears to either be a problem with stress-ng 0.10.07-1 or this new version of stress-ng is detecting previously-undetected problems on multiple servers. == SRU Justification Eoan == Doing regression testing of Ubuntu 19.10, we've been seeing frequent failures of stress-ng's stack stressor. These systems have stress-ng 0.10.07-1. For instance (running manually, not in the Checkbox test script): ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5051] dispatching hogs: 4 stack stress-ng: info: [5051] unsuccessful run completed in 32.01s ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5064] dispatching hogs: 4 stack stress-ng: info: [5064] successful run completed in 32.96s ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5077] dispatching hogs: 4 stack stress-ng: info: [5077] unsuccessful run completed in 33.09s This problem seems to affect some systems more than others; in doing my testing, some computers fail a majority of 30-second or greater test runs (our default test run in 300s in length), but others haven't failed once over several such test runs. I haven't yet identified what's causing some systems to fail but not others. In testing this, I installed Ubuntu 19.04, which comes with stress-ng 0.09.57-0ubuntu3, on one affected system, and encountered no problems. Upon upgrading to stress-ng 0.10.07-1 from Ubuntu 19.04, the problems returned. Thus, this appears to either be a problem with stress-ng 0.10.07-1 or this new version of stress-ng is detecting previously-undetected problems on multiple servers. == Test case == Run stress-ng -k --aggressive --verify --timeout 30 --stack 0 multiple times and interrupt it with control-C (SIGINT). This can trigger a segfault. With the fix, the segfault cannot bne triggered. == Fix == Upstream stress-ng commits: - 10ffe40579c5 stress-stack: return error code in child using _exit() and not return - ef18c524df48 stress-stack: don't throw a fatal error when sigaltstack fails - 6245e5f62eae stress-stack: check for ENOMEM fork failure and retry - 8fb67daea592 stress-stack: setup alternative stack in child only The first 3 fixes are prerequisite fixes, the final fix addresses the main issue. == Regression Potential == This affects just the stress-ng stack stressor. The fixes are already tested upstream fixes found in stress-ng in Ubuntu Focal. The fixes have been regression tested on arm64, amd64, i386, s390x and ppc64el architectures so the test coverage is good on these fixes. The fixes change the stack of the signal handler and also the exit of a child stress process, so the affects of the changes are small in the context of the stress test.
2019-11-06 11:06:11 Colin Ian King bug task added stress-ng (Ubuntu)
2019-11-06 11:06:22 Colin Ian King nominated for series Ubuntu Focal
2019-11-06 11:06:22 Colin Ian King bug task added stress-ng (Ubuntu Focal)
2019-11-06 11:06:22 Colin Ian King nominated for series Ubuntu Eoan
2019-11-06 11:06:22 Colin Ian King bug task added stress-ng (Ubuntu Eoan)
2019-11-06 11:06:37 Colin Ian King stress-ng: status Fix Committed Fix Released
2019-11-06 11:07:01 Colin Ian King stress-ng (Ubuntu Eoan): status New In Progress
2019-11-06 11:07:03 Colin Ian King stress-ng (Ubuntu Focal): status New In Progress
2019-11-06 13:14:45 Launchpad Janitor stress-ng (Ubuntu Focal): status In Progress Fix Released
2019-11-06 13:47:51 Colin Ian King bug added subscriber Ubuntu Stable Release Updates Team
2019-11-14 15:11:05 Łukasz Zemczak stress-ng (Ubuntu Eoan): status In Progress Fix Committed
2019-11-14 15:11:07 Łukasz Zemczak bug added subscriber SRU Verification
2019-11-14 15:11:09 Łukasz Zemczak tags verification-needed verification-needed-eoan
2019-11-14 15:33:02 Colin Ian King description == SRU Justification Eoan == Doing regression testing of Ubuntu 19.10, we've been seeing frequent failures of stress-ng's stack stressor. These systems have stress-ng 0.10.07-1. For instance (running manually, not in the Checkbox test script): ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5051] dispatching hogs: 4 stack stress-ng: info: [5051] unsuccessful run completed in 32.01s ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5064] dispatching hogs: 4 stack stress-ng: info: [5064] successful run completed in 32.96s ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5077] dispatching hogs: 4 stack stress-ng: info: [5077] unsuccessful run completed in 33.09s This problem seems to affect some systems more than others; in doing my testing, some computers fail a majority of 30-second or greater test runs (our default test run in 300s in length), but others haven't failed once over several such test runs. I haven't yet identified what's causing some systems to fail but not others. In testing this, I installed Ubuntu 19.04, which comes with stress-ng 0.09.57-0ubuntu3, on one affected system, and encountered no problems. Upon upgrading to stress-ng 0.10.07-1 from Ubuntu 19.04, the problems returned. Thus, this appears to either be a problem with stress-ng 0.10.07-1 or this new version of stress-ng is detecting previously-undetected problems on multiple servers. == Test case == Run stress-ng -k --aggressive --verify --timeout 30 --stack 0 multiple times and interrupt it with control-C (SIGINT). This can trigger a segfault. With the fix, the segfault cannot bne triggered. == Fix == Upstream stress-ng commits: - 10ffe40579c5 stress-stack: return error code in child using _exit() and not return - ef18c524df48 stress-stack: don't throw a fatal error when sigaltstack fails - 6245e5f62eae stress-stack: check for ENOMEM fork failure and retry - 8fb67daea592 stress-stack: setup alternative stack in child only The first 3 fixes are prerequisite fixes, the final fix addresses the main issue. == Regression Potential == This affects just the stress-ng stack stressor. The fixes are already tested upstream fixes found in stress-ng in Ubuntu Focal. The fixes have been regression tested on arm64, amd64, i386, s390x and ppc64el architectures so the test coverage is good on these fixes. The fixes change the stack of the signal handler and also the exit of a child stress process, so the affects of the changes are small in the context of the stress test. == SRU Justification Eoan == Doing regression testing of Ubuntu 19.10, we've been seeing frequent failures of stress-ng's stack stressor. These systems have stress-ng 0.10.07-1. For instance (running manually, not in the Checkbox test script): ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5051] dispatching hogs: 4 stack stress-ng: info: [5051] unsuccessful run completed in 32.01s ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5064] dispatching hogs: 4 stack stress-ng: info: [5064] successful run completed in 32.96s ubuntu@kzanol:~$ stress-ng -k --aggressive --verify --timeout 30 --stack 0 stress-ng: info: [5077] dispatching hogs: 4 stack stress-ng: info: [5077] unsuccessful run completed in 33.09s This problem seems to affect some systems more than others; in doing my testing, some computers fail a majority of 30-second or greater test runs (our default test run in 300s in length), but others haven't failed once over several such test runs. I haven't yet identified what's causing some systems to fail but not others. In testing this, I installed Ubuntu 19.04, which comes with stress-ng 0.09.57-0ubuntu3, on one affected system, and encountered no problems. Upon upgrading to stress-ng 0.10.07-1 from Ubuntu 19.04, the problems returned. Thus, this appears to either be a problem with stress-ng 0.10.07-1 or this new version of stress-ng is detecting previously-undetected problems on multiple servers. == Test case == Run stress-ng -k --aggressive --verify --timeout 30 --stack 0 multiple times and interrupt it with control-C (SIGINT). This can trigger a segfault. With the fix, the segfault cannot be triggered. == Fix == Upstream stress-ng commits:     - 10ffe40579c5 stress-stack: return error code in child using                    _exit() and not return     - ef18c524df48 stress-stack: don't throw a fatal error when                    sigaltstack fails     - 6245e5f62eae stress-stack: check for ENOMEM fork failure and retry     - 8fb67daea592 stress-stack: setup alternative stack in child only The first 3 fixes are prerequisite fixes, the final fix addresses the main issue. == Regression Potential == This affects just the stress-ng stack stressor. The fixes are already tested upstream fixes found in stress-ng in Ubuntu Focal. The fixes have been regression tested on arm64, amd64, i386, s390x and ppc64el architectures so the test coverage is good on these fixes. The fixes change the stack of the signal handler and also the exit of a child stress process, so the affects of the changes are small in the context of the stress test.
2019-11-14 17:46:53 Colin Ian King tags verification-needed verification-needed-eoan verification-done verification-done-eoan
2019-11-21 08:08:43 Launchpad Janitor stress-ng (Ubuntu Eoan): status Fix Committed Fix Released
2019-11-21 08:08:47 Łukasz Zemczak removed subscriber Ubuntu Stable Release Updates Team