centos-9-scenario007-standalone-wallaby - os_tempest: Failing on "Ping router ip address"

Bug #1959582 reported by Douglas Viroel
This bug affects 1 person
Affects   Status         Importance   Assigned to    Milestone
tripleo   Fix Released   Critical     Brent Eagles

Bug Description

periodic-tripleo-ci-centos-9-scenario007-standalone-wallaby is failing while creating resources, before the tempest tests run:
Latest runs: https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-scenario007-standalone-wallaby
Error:
TASK [os_tempest : Ping router ip address] *************************************
"From 192.168.24.1 icmp_seq=1 Destination Host Unreachable"

Looking at the services/containers and the errors, we can see a failed container [1] and related errors [2]:
2022-01-28 21:18:45.078 ERROR /var/log/containers/neutron/l3-agent.log: 124955 ERROR neutron.agent.linux.utils [-] Exit code: 127; Cmd: ['ip', 'netns', 'exec', 'qrouter-c345e3b6-d996-4827-a8dd-2e4280e4d67e', 'keepalived', '-P', '-f', '/var/lib/neutron/ha_confs/c345e3b6-d996-4827-a8dd-2e4280e4d67e/keepalived.conf', '-p', '/var/lib/neutron/ha_confs/c345e3b6-d996-4827-a8dd-2e4280e4d67e.pid.keepalived', '-r', '/var/lib/neutron/ha_confs/c345e3b6-d996-4827-a8dd-2e4280e4d67e.pid.keepalived-vrrp', '-D']; Stdin: ; Stdout: Starting a new child container neutron-keepalived-qrouter-c345e3b6-d996-4827-a8dd-2e4280e4d67e
2022-01-28 21:18:45.078 ERROR /var/log/containers/neutron/l3-agent.log: 124955 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: c345e3b6-d996-4827-a8dd-2e4280e4d67e: neutron_lib.exceptions.ProcessExecutionError: Exit code: 127; Cmd: ['ip', 'netns', 'exec', 'qrouter-c345e3b6-d996-4827-a8dd-2e4280e4d67e', 'keepalived', '-P', '-f', '/var/lib/neutron/ha_confs/c345e3b6-d996-4827-a8dd-2e4280e4d67e/keepalived.conf', '-p', '/var/lib/neutron/ha_confs/c345e3b6-d996-4827-a8dd-2e4280e4d67e.pid.keepalived', '-r', '/var/lib/neutron/ha_confs/c345e3b6-d996-4827-a8dd-2e4280e4d67e.pid.keepalived-vrrp', '-D']; Stdin: ; Stdout: Starting a new child container neutron-keepalived-qrouter-c345e3b6-d996-4827-a8dd-2e4280e4d67e

[1] https://logserver.rdoproject.org/74/36374/10/check/periodic-tripleo-ci-centos-9-scenario007-standalone-wallaby/c7821d2/logs/undercloud/var/log/extra/failed_containers.log.txt.gz
[2] https://logserver.rdoproject.org/74/36374/10/check/periodic-tripleo-ci-centos-9-scenario007-standalone-wallaby/c7821d2/logs/undercloud/var/log/extra/errors.txt.gz
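For anyone triaging similar l3-agent errors, here is a minimal Python sketch that pulls the exit code and the command out of a log line like the ones quoted above. The regular expression assumes the "Exit code: N; Cmd: [...]; Stdin:" layout seen in this report; the function names and the shortened SAMPLE line are illustrative and not part of neutron. By shell convention an exit status of 127 means "command not found", and podman run documents 127 for a contained command that cannot be found.

```python
import ast
import re

# Assumed layout, taken from the log lines quoted in this report.
_PATTERN = re.compile(r"Exit code: (?P<code>-?\d+); Cmd: (?P<cmd>\[.*?\]); Stdin:")

# Shortened copy of the reported log line (most keepalived arguments dropped).
SAMPLE = (
    "124955 ERROR neutron.agent.linux.utils [-] Exit code: 127; "
    "Cmd: ['ip', 'netns', 'exec', "
    "'qrouter-c345e3b6-d996-4827-a8dd-2e4280e4d67e', 'keepalived', '-P', '-D']; "
    "Stdin: ; Stdout: "
)


def parse_exec_error(line):
    """Return (exit_code, argv) for a matching log line, or None."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    # The Cmd field is printed as a Python list literal, so literal_eval can read it.
    argv = ast.literal_eval(match.group("cmd"))
    return int(match.group("code")), argv


def netns_of(argv):
    """Return the namespace name if argv starts with 'ip netns exec <ns>'."""
    if argv[:3] == ["ip", "netns", "exec"] and len(argv) > 3:
        return argv[3]
    return None


if __name__ == "__main__":
    code, argv = parse_exec_error(SAMPLE)
    print(code, netns_of(argv), argv[4])
```

Grouping the parsed results by exit code and by the binary being launched (keepalived, haproxy) makes it easier to see that every failure goes through the same wrapper path.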

Revision history for this message
Douglas Viroel (dviroel) wrote :

ci-centos-9-scenario007-multinode-oooq-container-wallaby[1] is also failing on tempest, but one step later, while trying to SSH into the cirros instance. The issue is slightly different, but since both jobs are based on scenario007, it seems to be related (neutron/l3-agent) [2].

[1] https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-scenario007-multinode-oooq-container-wallaby
[2] https://logserver.rdoproject.org/74/36374/10/check/periodic-tripleo-ci-centos-9-scenario007-multinode-oooq-container-wallaby/5ec80b4/logs/subnode-1/var/log/extra/errors.txt.gz

Ronelle Landy (rlandy)
Changed in tripleo:
milestone: yoga-3 → yoga-1
Revision history for this message
Slawek Kaplonski (slaweq) wrote :

I quickly looked at the logs at https://logserver.rdoproject.org/74/36374/10/check/periodic-tripleo-ci-centos-9-scenario007-multinode-oooq-container-wallaby/5ec80b4/logs/subnode-1/var/log/extra/errors.txt.gz and IMO the main problem is that haproxy isn't started, so there is no metadata service for the instances and SSH keys aren't configured. The error is:

ERROR neutron.agent.linux.utils [-] Exit code: 127; Cmd: ['ip', 'netns', 'exec', 'qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c', 'haproxy', '-f', '/var/lib/neutron/ns-metadata-proxy/eb9ccbd6-99fb-4bb8-8753-17596910505c.conf']; Stdin: ; Stdout: Starting a new child container neutron-haproxy-qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c
; Stderr: + export DOCKER_HOST=
+ DOCKER_HOST=
+ ARGS='-f /var/lib/neutron/ns-metadata-proxy/eb9ccbd6-99fb-4bb8-8753-17596910505c.conf'
++ ip netns identify
+ NETNS=qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c
+ NAME=neutron-haproxy-qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c
+ HAPROXY_CMD='$(if [ -f /usr/sbin/haproxy-systemd-wrapper ]; then echo "/usr/sbin/haproxy -Ds"; else echo "/usr/sbin/haproxy -Ws"; fi)'
+ CLI='nsenter --net=/run/netns/qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c --preserve-credentials -m -t 1 podman'
+ LOGGING='--log-driver k8s-file --log-opt path=/var/log/containers/stdouts/neutron-haproxy-qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c.log'
+ CMD='$HAPROXY'
++ nsenter --net=/run/netns/qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c --preserve-credentials -m -t 1 podman ps -a --filter name=neutron-haproxy- --format '{{.ID}}:{{.Names}}:{{.Status}}'
++ awk '{print $1}'
+ LIST=
++ printf '%s\n' ''
++ grep -E ':(Exited|Created)'
+ ORPHANTS=
+ '[' -n '' ']'
+ printf '%s\n' ''
+ grep -q 'neutron-haproxy-qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c$'
+ echo 'Starting a new child container neutron-haproxy-qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c'
+ nsenter --net=/run/netns/qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c --preserve-credentials -m -t 1 podman run --detach --log-driver k8s-file --log-opt path=/var/log/containers/stdouts/neutron-haproxy-qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c.log -v /var/lib/config-data/puppet-generated/neutron/etc/neutron:/etc/neutron:ro -v /run/netns:/run/netns:shared -v /var/lib/neutron:/var/lib/neutron:shared -v /dev/log:/dev/log --net host --pid host --cgroupns host --privileged -u root --name neutron-haproxy-qrouter-eb9ccbd6-99fb-4bb8-8753-17596910505c 192.168.24.1:8787/tripleowallabycentos9/openstack-neutron-l3-agent:cf2d49f44be13a2a4df3c6228451f6b2-updated-20220128155406 /bin/bash -c 'HAPROXY="$(if [ -f /usr/sbin/haproxy-systemd-wrapper ]; then echo "/usr/sbin/haproxy -Ds"; else echo "/usr/sbin/haproxy -Ws"; fi)"; exec $HAPROXY -f /var/lib/neutron/ns-metadata-proxy/eb9ccbd6-99fb-4bb8-8753-17596910505c.conf'
Error: create directory `/sys/fs/cgroup/../../libpod-07d282785cdd79c7c62a030b8a84fbe68bf7ca0b54a49714bd72cff05c9edf63.scope`: No such file or directory: OCI runtime attempted to invoke a command that was not found
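The trace above shows the wrapper listing existing neutron-haproxy containers and filtering for stale ones before it starts a new child. A rough Python sketch of just that filtering step follows, reconstructed from the traced commands (awk '{print $1}' followed by grep -E ':(Exited|Created)'). It is not the wrapper itself; the function name is made up for illustration.

```python
import re

# Same pattern the trace passes to grep -E.
_ORPHAN = re.compile(r":(Exited|Created)")


def find_orphans(listing):
    """Return 'ID:NAME:STATUS' entries whose status is Exited or Created.

    listing is the text produced by a format such as
    '{{.ID}}:{{.Names}}:{{.Status}}', one container per line.
    """
    orphans = []
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Mirror awk '{print $1}': keep only the first whitespace-separated field.
        entry = fields[0]
        if _ORPHAN.search(entry):
            orphans.append(entry)
    return orphans
```

In the failing run the listing was empty, so no orphans were found and the wrapper went straight to podman run, which is where the cgroup error appeared.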

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)
Changed in tripleo:
status: Triaged → In Progress
Revision history for this message
Slawek Kaplonski (slaweq) wrote :

I think the problem may be caused by the missing backport https://review.opendev.org/c/openstack/tripleo-heat-templates/+/827642 in stable/wallaby. I have proposed it now, and it is being tested in https://review.rdoproject.org/r/c/testproject/+/38969

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/827642
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/f962b8e14829d896b732867e2a9b862b4323ecb4
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit f962b8e14829d896b732867e2a9b862b4323ecb4
Author: David Vallee Delisle <email address hidden>
Date: Tue Dec 14 09:58:06 2021 -0500

    Start the l3 agent with cgroupns: host

    Since the l3 agent is spinning containers, it should use the host cgroups
    namespaces just like we did in nova [1]

    [1] https://review.opendev.org/c/openstack/tripleo-heat-templates/+/802489/

    Related-Bug: #1936005
    Closes-Bug: #1953738
    Closes-Bug: #1959582
    Change-Id: Ic83e946e1f3dc912bc4cf8270d66ecc7c2324c96
    (cherry picked from commit 157d0c112bf21139b4d9ca076f1121a941a35114)
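
The commit above is about the cgroup namespace of the agent container that spawns the children; the child podman run command in the earlier trace already passes --cgroupns host. As a hedged aid for checking other command lines, the Python sketch below reports the value given to podman's --cgroupns flag. The function name is hypothetical, and the sketch inspects a command string only, not a running container or the tripleo-heat-templates change itself.

```python
import shlex


def cgroupns_mode(command):
    """Return the value passed to --cgroupns in a command line, or None."""
    argv = shlex.split(command)
    for index, arg in enumerate(argv):
        # Space-separated form: --cgroupns host
        if arg == "--cgroupns" and index + 1 < len(argv):
            return argv[index + 1]
        # Equals form: --cgroupns=host
        if arg.startswith("--cgroupns="):
            return arg.split("=", 1)[1]
    return None


if __name__ == "__main__":
    # Options copied from the traced podman run; IMAGE stands in for the image reference.
    print(cgroupns_mode(
        "podman run --detach --net host --pid host --cgroupns host "
        "--privileged -u root IMAGE"
    ))
```

A result of None means the flag was not given, in which case podman falls back to its default cgroup namespace mode.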

tags: added: in-stable-wallaby
Revision history for this message
Douglas Viroel (dviroel) wrote :
Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

https://review.opendev.org/c/openstack/tripleo-heat-templates/+/827629 and its Wallaby backport are required to close this

Changed in tripleo:
status: Fix Released → In Progress
assignee: nobody → Brent Eagles (beagles)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/827629
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/d3a6e7a99ad090bd7ef06b7bd84cf8558e5fdae5
Submitter: "Zuul (22348)"
Branch: master

commit d3a6e7a99ad090bd7ef06b7bd84cf8558e5fdae5
Author: Brent Eagles <email address hidden>
Date: Thu Feb 3 08:09:46 2022 -0330

    Start the neutron metadata agent with cgroupns host

    This container got missed in the fixups.

    Closes-Bug: #1959582
    Change-Id: I93d3eae0a738782f071a839ba7b41107a597e2fa

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/827969

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/827969
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/354232139417f6db028e773d86187b1bf6e89cf2
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 354232139417f6db028e773d86187b1bf6e89cf2
Author: Brent Eagles <email address hidden>
Date: Thu Feb 3 08:09:46 2022 -0330

    Start the neutron metadata agent with cgroupns host

    This container got missed in the fixups.

    Closes-Bug: #1959582
    Change-Id: I93d3eae0a738782f071a839ba7b41107a597e2fa
    (cherry picked from commit d3a6e7a99ad090bd7ef06b7bd84cf8558e5fdae5)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 16.0.0

This issue was fixed in the openstack/tripleo-heat-templates 16.0.0 release.
