neutron-tempest-plugin-designate-scenario cross project job is failing on OVN

Bug #1970679 reported by Michael Johnson
Affects   Status         Importance   Assigned to   Milestone
neutron   Fix Released   Critical     yatin

Bug Description

The cross-project neutron-tempest-plugin-designate-scenario job is failing during the Designate gate runs due to an OVN failure.

+ lib/neutron_plugins/ovn_agent:start_ovn:698 : wait_for_sock_file /var/run/openvswitch/ovnnb_db.sock
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:173 : local count=0
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:174 : '[' '!' -S /var/run/openvswitch/ovnnb_db.sock ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:175 : sleep 1
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:176 : count=1
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:177 : '[' 1 -gt 5 ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:174 : '[' '!' -S /var/run/openvswitch/ovnnb_db.sock ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:175 : sleep 1
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:176 : count=2
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:177 : '[' 2 -gt 5 ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:174 : '[' '!' -S /var/run/openvswitch/ovnnb_db.sock ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:175 : sleep 1
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:176 : count=3
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:177 : '[' 3 -gt 5 ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:174 : '[' '!' -S /var/run/openvswitch/ovnnb_db.sock ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:175 : sleep 1
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:176 : count=4
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:177 : '[' 4 -gt 5 ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:174 : '[' '!' -S /var/run/openvswitch/ovnnb_db.sock ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:175 : sleep 1
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:176 : count=5
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:177 : '[' 5 -gt 5 ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:174 : '[' '!' -S /var/run/openvswitch/ovnnb_db.sock ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:175 : sleep 1
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:176 : count=6
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:177 : '[' 6 -gt 5 ']'
+ lib/neutron_plugins/ovn_agent:wait_for_sock_file:178 : die 178 'Socket /var/run/openvswitch/ovnnb_db.sock not found'
+ functions-common:die:264 : local exitcode=0
[Call Trace]
./stack.sh:1284:start_ovn_services
/opt/stack/devstack/lib/neutron-legacy:516:start_ovn
/opt/stack/devstack/lib/neutron_plugins/ovn_agent:698:wait_for_sock_file
/opt/stack/devstack/lib/neutron_plugins/ovn_agent:178:die
[ERROR] /opt/stack/devstack/lib/neutron_plugins/ovn_agent:178 Socket /var/run/openvswitch/ovnnb_db.sock not found
exit_trap: cleaning up child processes
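
The trace above corresponds to a polling loop of roughly the following shape. This is only a sketch reconstructed from the xtrace output, not the devstack source: the devstack `die` helper is replaced here (an assumption, so the snippet runs standalone) by a message on stderr and a non-zero return.

```shell
#!/usr/bin/env bash
# Sketch reconstructed from the xtrace output; not the devstack source.
# Assumption: devstack's `die` is replaced by a stderr message and a
# non-zero return so the function can run on its own.
wait_for_sock_file() {
    local sock=$1
    local count=0
    # -S is true only if the path exists and is a socket
    while [ ! -S "$sock" ]; do
        sleep 1
        count=$((count + 1))
        # Give up once the counter exceeds 5, i.e. after six sleeps
        if [ "$count" -gt 5 ]; then
            echo "Socket $sock not found" >&2
            return 1
        fi
    done
}
```

As the trace shows, the loop sleeps six times (count reaches 6 before `6 -gt 5` is true), so the socket has about six seconds to appear.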

An example job run is here:
https://zuul.opendev.org/t/openstack/build/b014e50e018d426b9367fd3219ed489e

Tags: ovn
Lajos Katona (lajos-katona) wrote :
Changed in neutron:
status: New → Confirmed
tags: added: ovn
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-tempest-plugin (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/839763

yatin (yatinkarel) wrote :

I also pushed https://review.opendev.org/c/openstack/devstack/+/839752 to devstack, which should also help in identifying the actual issue.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-tempest-plugin (master)

Reviewed: https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/839763
Committed: https://opendev.org/openstack/neutron-tempest-plugin/commit/ee00742befa537472154286cf48aef43f8e1375d
Submitter: "Zuul (22348)"
Branch: master

commit ee00742befa537472154286cf48aef43f8e1375d
Author: yatinkarel <email address hidden>
Date: Thu Apr 28 18:53:38 2022 +0530

    Collect ovn/ovs logs and ovn dbs

    The jobs running with ovs/ovn distro packages
    not having logs collected, adding these to
    zuul_copy_output so these get's collected.
    These will be helpful in troubleshooting
    issues faced in the jobs related to ovs or ovn.

    Related-Bug: #1970679
    Change-Id: Iba591ca9d6e4d55f6152648a3adb1176a841581d

yatin (yatinkarel)
Changed in neutron:
assignee: nobody → yatin (yatinkarel)
yatin (yatinkarel) wrote :

With the extra logs in place, I caught a failure [1].

According to the service logs [2], the OVN database servers took more than 5 seconds to start, whereas startup is normally very quick:

May 23 08:16:48 nested-virt-ubuntu-focal-ovh-bhs1-0029738346 systemd[1]: Started Open vSwitch database server for OVN Northbound database.
May 23 08:16:48 nested-virt-ubuntu-focal-ovh-bhs1-0029738346 ovn-ctl[75518]: * /var/lib/ovn/ovnnb_db.db does not exist
May 23 08:16:55 nested-virt-ubuntu-focal-ovh-bhs1-0029738346 ovn-ctl[75518]: * Creating empty database /var/lib/ovn/ovnnb_db.db

May 23 08:16:48 nested-virt-ubuntu-focal-ovh-bhs1-0029738346 systemd[1]: Started Open vSwitch database server for OVN Southbound database.
May 23 08:16:48 nested-virt-ubuntu-focal-ovh-bhs1-0029738346 ovn-ctl[75524]: * /var/lib/ovn/ovnsb_db.db does not exist
May 23 08:16:55 nested-virt-ubuntu-focal-ovh-bhs1-0029738346 ovn-ctl[75524]: * Creating empty database /var/lib/ovn/ovnsb_db.db

So we can double the retries to 10 to be on the safe side. While looking into this, I also noticed that https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/836912 unintentionally switched these jobs to deploy ovs/ovn from source, and those jobs are not impacted by this issue.

[1] https://e6d7ddba3e6c9e0ac442-b854b998feabd4bf6926393c8ea9e138.ssl.cf5.rackcdn.com/841810/7/check/neutron-tempest-plugin-designate-scenario/f7e11ad/controller/logs/devstacklog.txt
[2] https://e6d7ddba3e6c9e0ac442-b854b998feabd4bf6926393c8ea9e138.ssl.cf5.rackcdn.com/841810/7/check/neutron-tempest-plugin-designate-scenario/f7e11ad/controller/logs/services.txt
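
For illustration, the log excerpt shows a gap of seven seconds (08:16:48 to 08:16:55) before the databases are created, while the wait loop in the original trace gives up after about six one-second sleeps. A variant with the limit raised to 10 could look like the sketch below. The function name `wait_for_sock_file_retries` and the `max_retries` argument are hypothetical, and the actual devstack change may differ.

```shell
#!/usr/bin/env bash
# Hypothetical variant, not the merged devstack change: the retry limit
# is an optional second argument that defaults to 10.
wait_for_sock_file_retries() {
    local sock=$1
    local max_retries=${2:-10}
    local count=0
    # Poll once per second until the path exists and is a socket
    while [ ! -S "$sock" ]; do
        sleep 1
        count=$((count + 1))
        # Same comparison as in the trace, with a configurable limit
        if [ "$count" -gt "$max_retries" ]; then
            echo "Socket $sock not found after $count attempts" >&2
            return 1
        fi
    done
    return 0
}
```

With the default of 10, the loop sleeps up to eleven times before failing, which would cover the seven-second startup seen in this run.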

yatin (yatinkarel) wrote :
Changed in neutron:
status: Confirmed → Fix Released