Activity log for bug #1799717

Date                | Who                        | What changed        | Old value | New value | Message
2018-10-24 14:22:45 | Bogdan Dobrelya            | bug                 |           |           | added bug
2018-10-24 14:22:53 | Bogdan Dobrelya            | tripleo: importance | Undecided | High      |
2018-10-24 14:22:56 | Bogdan Dobrelya            | tripleo: milestone  |           | stein-1   |
2018-10-24 14:22:59 | Bogdan Dobrelya            | tripleo: status     | New       | Triaged   |
2018-10-24 14:23:09 | Bogdan Dobrelya            | tags                |           | ci        |

2018-10-24 14:28:58 | Bogdan Dobrelya            | description (old and new values below)

  Old value:

    The error message is highly likely a red herring pointing out to some other sort of issues, like system under pressure perhaps.

    Examples:

    Single error:
    http://logs.openstack.org/01/582301/28/check/tripleo-ci-centos-7-scenario003-multinode-oooq-container/7009de4/logs/undercloud/var/log/journal.txt.gz#_Oct_23_19_48_00

    Multiple errors:
    http://logs.openstack.org/01/582301/28/check/tripleo-ci-centos-7-scenario003-multinode-oooq-container/7009de4/logs/undercloud/var/log/journal.txt.gz#_Oct_23_19_15_16

    dstat:
    http://logs.openstack.org/01/582301/28/check/tripleo-ci-centos-7-scenario003-multinode-oooq-container/7009de4/logs/undercloud/var/log/extra/dstat.html.gz

    dstat shows correlation with high CPU wait numbers (>70%)

    See also the elastic-recheck stats for that error pattern:

    total hits: 7
    build_branch
      85% master
      14% stable/rocky
    build_change
      14% 591540 610728 582301
      14% 611447 608354
      14% 582735
      14% 610087
      14% 610491
    build_name
      14% tripleo-ci-centos-7-containers-multinode tripleo-ci-centos-7-scenario001-multinode-oooq-container
      14% tripleo-ci-centos-7-undercloud-containers tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates tripleo-ci-centos-7-scenario003-multinode-oooq-container
      14% tripleo-ci-centos-7-containers-multinode
      14% tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades
      14% tripleo-ci-centos-7-scenario004-multinode-oooq-container
    build_node
      100% centos-7
    build_queue
      85% check
      14% gate
    build_status
      71% FAILURE
      14% FAILURE SUCCESS
      14% SUCCESS FAILURE
    build_zuul_url
      100% N/A
    filename
      71% logs/undercloud/var/log/extra/logstash.txt
      28% logs/undercloud/var/log/extra/errors.txt
    log_url
      14% http://logs.openstack.org/40/591540/69/check/tripleo-ci-centos-7-undercloud-containers/7c3676b/logs/undercloud/var/log/extra/logstash.txt
          http://logs.openstack.org/28/610728/5/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates/a299956/logs/undercloud/var/log/extra/logstash.txt
          http://logs.openstack.org/01/582301/28/check/tripleo-ci-centos-7-scenario003-multinode-oooq-container/7009de4/logs/undercloud/var/log/extra/logstash.txt
      14% http://logs.openstack.org/47/611447/2/check/tripleo-ci-centos-7-containers-multinode/3ede27e/logs/undercloud/var/log/extra/logstash.txt
          http://logs.openstack.org/54/608354/3/check/tripleo-ci-centos-7-scenario001-multinode-oooq-container/9690552/logs/undercloud/var/log/extra/logstash.txt
      14% http://logs.openstack.org/15/610515/1/check/tripleo-ci-centos-7-scenario004-multinode-oooq-container/02cd3e4/logs/undercloud/var/log/extra/logstash.txt
      14% http://logs.openstack.org/35/582735/10/check/tripleo-ci-centos-7-containers-multinode/72c2e19/logs/undercloud/var/log/extra/logstash.txt
      14% http://logs.openstack.org/87/610087/3/check/tripleo-ci-centos-7-scenario007-multinode-oooq-container/6002e49/logs/undercloud/var/log/extra/logstash.txt
    node_provider
      57% inap-mtl01
      14% inap-mtl01 rax-dfw
      14% ovh-gra1 rax-iad inap-mtl01
      14% rax-iad
    port
      14% 35486
      14% 38788
      14% 42428
      14% 42552
      14% 45124
    project
      28% openstack/tripleo-heat-templates
      28% openstack/tripleo-quickstart-extras
      14% openstack/tripleo-heat-templates openstack/congress
      14% openstack/tripleo-quickstart openstack/tripleo-quickstart-extras openstack/tripleo-heat-templates
      14% openstack/tripleo-common
    severity
      71% INFO
      28% ERROR
    tags
      71% logstash.txt console postci multiline _grokparsefailure
      28% errors.txt console errors multiline _grokparsefailure
    voting
      57% 1
      28% 0
      14% 1 0
    zuul_executor
      28% ze09.openstack.org
      14% ze07.openstack.org ze02.openstack.org ze01.openstack.org
      14% ze10.openstack.org ze05.openstack.org
      14% ze03.openstack.org
      14% ze07.openstack.org

    So jobs not always fail with that error. I think it should be CPU pressure related instead.

  New value:

    Identical to the old value above, except that the dstat observation now reads:
    "dstat shows correlation with high CPU wait numbers (>70%) and an increased memory use (See 19h 48m 44s and further on)"

2018-10-24 14:34:15 | Bogdan Dobrelya            | description (old and new values below)

  Old value:

    Identical to the new value of the 2018-10-24 14:28:58 description change above,
    i.e. the full description including the increased-memory-use note.

  New value:

    Identical to the old value, except that the closing conclusion now reads:
    "So jobs not always fail with that error. It should be CPU wait (IO) and memory pressure related instead."
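
The CPU wait correlation called out in the description above can be cross-checked against the job's dstat output. The sketch below scans dstat samples for CPU wait above the 70% threshold mentioned in the description; it assumes a dstat CSV export (for example dstat.csv, which is not referenced in this log) is available next to the dstat.html.gz report, and that the column names ('time', 'wai') match the flags the CI job passed to dstat.

#!/usr/bin/env python3
"""Flag dstat samples whose CPU wait exceeds a threshold.

A minimal sketch for cross-checking the ">70% CPU wait" observation in the
bug description. Assumptions not taken from this log: a dstat CSV export
(e.g. dstat.csv) exists alongside dstat.html.gz and carries 'time' and
'wai' columns; the exact columns depend on the dstat flags the job used.
"""
import csv
import sys

THRESHOLD = 70.0  # percent CPU wait, per the ">70%" note in the description


def high_wait_samples(path, threshold=THRESHOLD):
    with open(path, newline="") as fh:
        rows = list(csv.reader(fh))
    # dstat's CSV starts with a few preamble lines (version, host, cmdline);
    # the real header is the first row that names a 'wai' column.
    header_idx = next((i for i, row in enumerate(rows) if "wai" in row), None)
    if header_idx is None:
        raise ValueError("no 'wai' column found in %s" % path)
    header = rows[header_idx]
    wai_col = header.index("wai")
    time_col = header.index("time") if "time" in header else None
    for row in rows[header_idx + 1:]:
        if len(row) <= wai_col or not row[wai_col]:
            continue
        try:
            wai = float(row[wai_col])
        except ValueError:
            continue  # skip repeated header rows or malformed samples
        if wai > threshold:
            yield (row[time_col] if time_col is not None else "?"), wai


if __name__ == "__main__":
    # Usage: python3 high_cpu_wait.py dstat.csv
    for when, wai in high_wait_samples(sys.argv[1]):
        print("%s: cpu wait %.0f%%" % (when, wai))
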

2018-10-30 16:17:02 | Juan Antonio Osorio Robles | tripleo: milestone  | stein-1   | stein-2   |
2019-01-13 22:51:14 | Emilien Macchi             | tripleo: milestone  | stein-2   | stein-3   |
2019-02-12 14:02:15 | Bogdan Dobrelya            | tripleo: status     | Triaged   | Invalid   |