<< The search expression 'message:"ovnnb_db.sock: database connection failed"' gives me 1200+ hits in https://opensearch.logs.openstack.org for the last 2 weeks. I added some more filters and it gives 50 such results:- https://opensearch.logs.openstack.org/_dashboards/app/discover/?security_tenant=global#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-30d,to:now))&_a=(columns:!(_source),filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'94869730-aea8-11ec-9e6a-83741af3fdcd',key:build_name,negate:!t,params:(query:tripleo-ci-centos-9-standalone-external-compute-target-host),type:phrase),query:(match_phrase:(build_name:tripleo-ci-centos-9-standalone-external-compute-target-host))),('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'94869730-aea8-11ec-9e6a-83741af3fdcd',key:build_name,negate:!t,params:(query:tobiko-tripleo-minimal),type:phrase),query:(match_phrase:(build_name:tobiko-tripleo-minimal))),('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'94869730-aea8-11ec-9e6a-83741af3fdcd',key:filename,negate:!f,params:(query:job-output.txt),type:phrase),query:(match_phrase:(filename:job-output.txt)))),index:'94869730-aea8-11ec-9e6a-83741af3fdcd',interval:auto,query:(language:kuery,query:'message:%22ovnnb_db.sock:%20database%20connection%20failed%22%20AND%20build_status:FAILURE'),sort:!()) In past we have seen taking much time to start and available of db files for those increasing timeout helped https://review.opendev.org/c/openstack/devstack/+/848548. But now it's little different issue where it takes time to stop and in that window(can be seen if the window is less than a second in below example) wait_for_sock_file returns true and moves forward and later connection to those .sock files fails as service is not restarted by that time. 2023-01-11 09:24:11.273593 | controller | + lib/neutron_plugins/ovn_agent:_start_process:239 : sudo systemctl restart ovn-central.service 2023-01-11 09:24:11.295863 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:711 : wait_for_sock_file /var/run/ovn/ovnnb_db.sock 2023-01-11 09:24:11.298605 | controller | + lib/neutron_plugins/ovn_agent:wait_for_sock_file:186 : local count=0 2023-01-11 09:24:11.300757 | controller | + lib/neutron_plugins/ovn_agent:wait_for_sock_file:187 : '[' '!' -S /var/run/ovn/ovnnb_db.sock ']' 2023-01-11 09:24:11.303155 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:712 : wait_for_sock_file /var/run/ovn/ovnsb_db.sock 2023-01-11 09:24:11.305826 | controller | + lib/neutron_plugins/ovn_agent:wait_for_sock_file:186 : local count=0 2023-01-11 09:24:11.308367 | controller | + lib/neutron_plugins/ovn_agent:wait_for_sock_file:187 : '[' '!' -S /var/run/ovn/ovnsb_db.sock ']' 2023-01-11 09:24:11.310862 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:713 : wait_for_db_file /var/lib/ovn/ovnnb_db.db 2023-01-11 09:24:11.313570 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:175 : local count=0 2023-01-11 09:24:11.316126 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:176 : '[' '!' -f /var/lib/ovn/ovnnb_db.db ']' 2023-01-11 09:24:11.319726 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:177 : sleep 1 2023-01-11 09:24:12.323489 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:178 : count=1 2023-01-11 09:24:12.326545 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:179 : '[' 1 -gt 40 ']' 2023-01-11 09:24:12.328763 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:176 : '[' '!' -f /var/lib/ovn/ovnnb_db.db ']' 2023-01-11 09:24:12.331382 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:714 : wait_for_db_file /var/lib/ovn/ovnsb_db.db 2023-01-11 09:24:12.333636 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:175 : local count=0 2023-01-11 09:24:12.336270 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:176 : '[' '!' -f /var/lib/ovn/ovnsb_db.db ']' 2023-01-11 09:24:12.339188 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:716 : is_service_enabled tls-proxy 2023-01-11 09:24:12.357535 | controller | + functions-common:is_service_enabled:2089 : return 0 2023-01-11 09:24:12.359893 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:717 : sudo ovn-nbctl --db=unix:/var/run/ovn/ovnnb_db.sock set-ssl /opt/stack/data/CA/int-ca/private/devstack-cert.key /opt/stack/data/CA/int-ca/devstack-cert.crt /opt/stack/data/CA/int-ca/ca-chain.pem 2023-01-11 09:24:12.367707 | controller | ovn-nbctl: unix:/var/run/ovn/ovnnb_db.sock: database connection failed (No such file or directory) from service status:- ● ovn-ovsdb-server-nb.service - Open vSwitch database server for OVN Northbound database Loaded: loaded (/lib/systemd/system/ovn-ovsdb-server-nb.service; enabled; vendor preset: enabled) Active: active (running) since Wed 2023-01-11 09:24:11 UTC; 47s ago Main PID: 96196 (ovsdb-server) Tasks: 1 (limit: 9475) Memory: 2.9M CGroup: /system.slice/ovn-ovsdb-server-nb.service └─96196 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-nb.log --remote=punix:/var/run/ovn/ovnnb_db.sock --pidfile=/var/run/ovn/ovnnb_db.pid --unixctl=/var/run/ovn/ovnnb_db.ctl --remote=db:OVN_Northbound,NB_Global,connections --private-key=db:OVN_Northbound,SSL,private_key --certificate=db:OVN_Northbound,SSL,certificate --ca-cert=db:OVN_Northbound,SSL,ca_cert --ssl-protocols=db:OVN_Northbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Northbound,SSL,ssl_ciphers /var/lib/ovn/ovnnb_db.db Jan 11 09:24:11 ubuntu-focal-inmotion-iad3-0032697188 systemd[1]: Started Open vSwitch database server for OVN Northbound database. Jan 11 09:24:11 ubuntu-focal-inmotion-iad3-0032697188 ovn-ctl[96196]: * /var/lib/ovn/ovnnb_db.db does not exist Jan 11 09:24:13 ubuntu-focal-inmotion-iad3-0032697188 ovn-ctl[96196]: * Creating empty database /var/lib/ovn/ovnnb_db.db Jan 11 09:24:13 ubuntu-focal-inmotion-iad3-0032697188 ovsdb-server[96196]: ovs|00001|vlog|INFO|opened log file /var/log/ovn/ovsdb-server-nb.log Jan 11 09:24:13 ubuntu-focal-inmotion-iad3-0032697188 ovsdb-server[96196]: ovs|00002|ovsdb_server|INFO|ovsdb-server (Open vSwitch) 2.13.8 Jan 11 09:24:23 ubuntu-focal-inmotion-iad3-0032697188 ovsdb-server[96196]: ovs|00003|memory|INFO|7340 kB peak resident set size after 10.0 seconds Jan 11 09:24:23 ubuntu-focal-inmotion-iad3-0032697188 ovsdb-server[96196]: ovs|00004|memory|INFO|cells:31 monitors:2 sessions:1 from an another ex https://zuul.openstack.org/build/b8761291279548239343257ae2e02fc7 it's more clear(service restart triggered at 10:45:46.189787, service stops at 10:45:49 and service ready at 10:46:05, wait for sock files succeeds before service stops 10:45:48):- 2023-01-12 10:45:46.189787 | controller | + lib/neutron_plugins/ovn_agent:_start_process:239 : sudo systemctl restart ovn-central.service 2023-01-12 10:45:48.159236 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:711 : wait_for_sock_file /var/run/ovn/ovnnb_db.sock 2023-01-12 10:45:48.161673 | controller | + lib/neutron_plugins/ovn_agent:wait_for_sock_file:186 : local count=0 2023-01-12 10:45:48.166417 | controller | + lib/neutron_plugins/ovn_agent:wait_for_sock_file:187 : '[' '!' -S /var/run/ovn/ovnnb_db.sock ']' 2023-01-12 10:45:48.167425 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:712 : wait_for_sock_file /var/run/ovn/ovnsb_db.sock 2023-01-12 10:45:48.169822 | controller | + lib/neutron_plugins/ovn_agent:wait_for_sock_file:186 : local count=0 2023-01-12 10:45:48.172787 | controller | + lib/neutron_plugins/ovn_agent:wait_for_sock_file:187 : '[' '!' -S /var/run/ovn/ovnsb_db.sock ']' 2023-01-12 10:45:48.175211 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:713 : wait_for_db_file /var/lib/ovn/ovnnb_db.db 2023-01-12 10:45:48.177991 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:175 : local count=0 2023-01-12 10:45:48.181047 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:176 : '[' '!' -f /var/lib/ovn/ovnnb_db.db ']' 2023-01-12 10:45:48.183699 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:177 : sleep 1 2023-01-12 10:45:49.188939 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:178 : count=1 2023-01-12 10:45:49.191707 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:179 : '[' 1 -gt 40 ']' 2023-01-12 10:45:49.194073 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:176 : '[' '!' -f /var/lib/ovn/ovnnb_db.db ']' 2023-01-12 10:45:49.196129 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:177 : sleep 1 2023-01-12 10:45:50.200890 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:178 : count=2 2023-01-12 10:45:50.203234 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:179 : '[' 2 -gt 40 ']' 2023-01-12 10:45:50.205212 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:176 : '[' '!' -f /var/lib/ovn/ovnnb_db.db ']' 2023-01-12 10:45:50.207584 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:714 : wait_for_db_file /var/lib/ovn/ovnsb_db.db 2023-01-12 10:45:50.209988 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:175 : local count=0 2023-01-12 10:45:50.212481 | controller | + lib/neutron_plugins/ovn_agent:wait_for_db_file:176 : '[' '!' -f /var/lib/ovn/ovnsb_db.db ']' 2023-01-12 10:45:50.215127 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:716 : is_service_enabled tls-proxy 2023-01-12 10:45:50.232297 | controller | + functions-common:is_service_enabled:2089 : return 0 2023-01-12 10:45:50.234744 | controller | + lib/neutron_plugins/ovn_agent:start_ovn:717 : sudo ovn-nbctl --db=unix:/var/run/ovn/ovnnb_db.sock set-ssl /opt/stack/data/CA/int-ca/private/devstack-cert.key /opt/stack/data/CA/int-ca/devstack-cert.crt /opt/stack/data/CA/int-ca/ca-chain.pem 2023-01-12 10:45:50.244116 | controller | ovn-nbctl: unix:/var/run/ovn/ovnnb_db.sock: database connection failed (No such file or directory) ● ovn-ovsdb-server-nb.service - Open vSwitch database server for OVN Northbound database Loaded: loaded (/lib/systemd/system/ovn-ovsdb-server-nb.service; enabled; vendor preset: enabled) Active: active (running) since Thu 2023-01-12 10:45:49 UTC; 1min 4s ago Main PID: 77511 (ovsdb-server) Tasks: 1 (limit: 9247) Memory: 2.8M CGroup: /system.slice/ovn-ovsdb-server-nb.service └─77511 ovsdb-server -vconsole:off -vfile:info --log-file=/var/log/ovn/ovsdb-server-nb.log --remote=punix:/var/run/ovn/ovnnb_db.sock --pidfile=/var/run/ovn/ovnnb_db.pid --unixctl=/var/run/ovn/ovnnb_db.ctl --remote=db:OVN_Northbound,NB_Global,connections --private-key=db:OVN_Northbound,SSL,private_key --certificate=db:OVN_Northbound,SSL,certificate --ca-cert=db:OVN_Northbound,SSL,ca_cert --ssl-protocols=db:OVN_Northbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Northbound,SSL,ssl_ciphers /var/lib/ovn/ovnnb_db.db Jan 12 10:45:49 nested-virt-ubuntu-focal-ovh-bhs1-0032709305 systemd[1]: ovn-ovsdb-server-nb.service: Succeeded. Jan 12 10:45:49 nested-virt-ubuntu-focal-ovh-bhs1-0032709305 systemd[1]: Stopped Open vSwitch database server for OVN Northbound database. Jan 12 10:45:49 nested-virt-ubuntu-focal-ovh-bhs1-0032709305 systemd[1]: Started Open vSwitch database server for OVN Northbound database. Jan 12 10:45:50 nested-virt-ubuntu-focal-ovh-bhs1-0032709305 ovn-ctl[77511]: * /var/lib/ovn/ovnnb_db.db does not exist Jan 12 10:46:05 nested-virt-ubuntu-focal-ovh-bhs1-0032709305 ovn-ctl[77511]: * Creating empty database /var/lib/ovn/ovnnb_db.db Jan 12 10:46:05 nested-virt-ubuntu-focal-ovh-bhs1-0032709305 ovsdb-server[77511]: ovs|00001|vlog|INFO|opened log file /var/log/ovn/ovsdb-server-nb.log Jan 12 10:46:05 nested-virt-ubuntu-focal-ovh-bhs1-0032709305 ovsdb-server[77511]: ovs|00002|ovsdb_server|INFO|ovsdb-server (Open vSwitch) 2.13.8 Jan 12 10:46:15 nested-virt-ubuntu-focal-ovh-bhs1-0032709305 ovsdb-server[77511]: ovs|00003|memory|INFO|7272 kB peak resident set size after 10.0 seconds Jan 12 10:46:15 nested-virt-ubuntu-focal-ovh-bhs1-0032709305 ovsdb-server[77511]: ovs|00004|memory|INFO|cells:31 monitors:2 sessions:1 Ensuring sock files are absent just like dbs should be able to handle this issue. Will push a patch to devstack.