Kolla-ansible times out after 120 seconds while stopping/starting services when systemd is used

Bug #2048130 reported by Michal Arbet
This bug affects 1 person
Affects: kolla-ansible (status tracked in Caracal)

    Series     Status         Importance   Assigned to
    Antelope   Fix Committed  High         Michal Arbet
    Bobcat     Fix Committed  High         Michal Arbet
    Caracal    Fix Committed  High         Michal Arbet

Bug Description

Hi,

I just found that kolla-ansible is timing out on start/stop of services (only some of them). The reason is that when the docker containers of some services receive SIGTERM, they return RC 143 instead of RC 0.
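
Background on the 143: it is just the usual 128 + signal number convention for a process terminated by SIGTERM (signal 15), so it normally means "stopped on request" rather than a real failure. A quick Python illustration of the convention (using a plain sleep child as a stand-in for a container's main process):

    import signal
    import subprocess
    import time

    # Start a long-running child process (stand-in for a container's main process).
    proc = subprocess.Popen(["sleep", "60"])
    time.sleep(0.5)

    # Terminate it the same way docker/systemd do on stop.
    proc.send_signal(signal.SIGTERM)
    proc.wait()

    print(proc.returncode)        # -15: Python encodes "killed by signal 15" as a negative code
    print(128 + signal.SIGTERM)   # 143: the shell/docker/systemd style exit code for SIGTERM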

I've already tested it and added some debug messages into the code of kolla_systemd_worker, see below:

Unit kolla-haproxy-container.service - waiting for 5.
Unit kolla-haproxy-container.service - waiting for 5.
Unit kolla-haproxy-container.service - waiting for 5.
[... the same line repeated every 5 seconds, 24 times in total, until the 120 second timeout ...]
Unit kolla-haproxy-container.service timeouted for wait for dead after 120.

Waiting for unit -> wait_for_unit(service=kolla-proxysql-container.service, timeout=120, state=deadUnit kolla-proxysql-container.service - waiting for 5.
Unit kolla-proxysql-container.service dead == dead, return.

Waiting for unit -> wait_for_unit(service=kolla-haproxy-container.service, timeout=120, state=runningUnit kolla-haproxy-container.service running == running, return.

Waiting for unit -> wait_for_unit(service=kolla-proxysql-container.service, timeout=120, state=runningUnit kolla-proxysql-container.service running == running, return.

The code is checking for 'dead', but the unit state is 'failed' (still a validly stopped service - it just exited with 143 instead of 0).
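
To illustrate (just a minimal sketch of the idea, not the actual kolla_systemd_worker code): a wait loop that only accepts a 'dead'/'inactive' state keeps polling for the full 120 seconds on these units, while one that also accepts 'failed' with a main exit status of 143 (128 + SIGTERM) returns right away:

    import subprocess
    import time

    def unit_is_stopped(unit):
        # Query the unit via systemctl; ActiveState and ExecMainStatus are
        # standard systemd service properties.
        out = subprocess.run(
            ["systemctl", "show", unit, "--property=ActiveState,ExecMainStatus"],
            capture_output=True, text=True,
        ).stdout
        props = dict(line.split("=", 1) for line in out.splitlines() if "=" in line)
        if props.get("ActiveState") == "inactive":
            return True  # clean stop, exit code 0
        # Treat 'failed' with exit status 143 (128 + SIGTERM) as stopped as well.
        return props.get("ActiveState") == "failed" and props.get("ExecMainStatus") == "143"

    def wait_for_stop(unit, timeout=120, poll=5):
        # Poll until the unit is stopped or the timeout elapses.
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if unit_is_stopped(unit):
                return True
            time.sleep(poll)
        return False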

Then I tried to stop all services with systemctl stop $service

kolla-cinder_api-container.service > Active: inactive (dead) since Thu 2024-01-04 19:00:43 UTC; 30s ago
kolla-cinder_scheduler-container.service > Active: inactive (dead) since Thu 2024-01-04 19:01:26 UTC; 30s ago
kolla-cron-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:02:06 UTC; 30s ago
kolla-designate_api-container.service > Active: inactive (dead) since Thu 2024-01-04 19:02:47 UTC; 30s ago
kolla-designate_backend_bind9-container.service > Active: inactive (dead) since Thu 2024-01-04 19:03:27 UTC; 30s ago
kolla-designate_central-container.service > Active: inactive (dead) since Thu 2024-01-04 19:04:10 UTC; 30s ago
kolla-designate_mdns-container.service > Active: inactive (dead) since Thu 2024-01-04 19:04:51 UTC; 30s ago
kolla-designate_producer-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:06:32 UTC; 30s ago
kolla-designate_sink-container.service > Active: inactive (dead) since Thu 2024-01-04 19:07:14 UTC; 30s ago
kolla-designate_worker-container.service > Active: inactive (dead) since Thu 2024-01-04 19:07:59 UTC; 30s ago
kolla-fluentd-container.service > Active: inactive (dead) since Thu 2024-01-04 19:08:41 UTC; 30s ago
kolla-glance_api-container.service > Active: inactive (dead) since Thu 2024-01-04 19:09:22 UTC; 30s ago
kolla-haproxy-container.service > Active: inactive (dead) since Thu 2024-01-04 19:10:03 UTC; 30s ago
kolla-haproxy_ssh-container.service > Active: inactive (dead) since Thu 2024-01-04 19:10:43 UTC; 30s ago
kolla-heat_api-container.service > Active: inactive (dead) since Thu 2024-01-04 19:11:25 UTC; 30s ago
kolla-heat_api_cfn-container.service > Active: inactive (dead) since Thu 2024-01-04 19:12:06 UTC; 30s ago
kolla-heat_engine-container.service > Active: inactive (dead) since Thu 2024-01-04 19:12:57 UTC; 30s ago
kolla-horizon-container.service > Active: inactive (dead) since Thu 2024-01-04 19:13:40 UTC; 30s ago
kolla-keepalived-container.service > Active: inactive (dead) since Thu 2024-01-04 19:14:21 UTC; 30s ago
kolla-keystone-container.service > Active: inactive (dead) since Thu 2024-01-04 19:15:03 UTC; 30s ago
kolla-keystone_fernet-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:15:44 UTC; 30s ago
kolla-keystone_ssh-container.service > Active: inactive (dead) since Thu 2024-01-04 19:16:24 UTC; 30s ago
kolla-kolla_toolbox-container.service > Active: inactive (dead) since Thu 2024-01-04 19:17:05 UTC; 30s ago
kolla-letsencrypt_lego-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:17:46 UTC; 30s ago
kolla-letsencrypt_webserver-container.service > Active: inactive (dead) since Thu 2024-01-04 19:18:27 UTC; 30s ago
kolla-magnum_api-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:19:07 UTC; 30s ago
kolla-magnum_conductor-container.service > Active: inactive (dead) since Thu 2024-01-04 19:19:57 UTC; 30s ago
kolla-mariadb-container.service > Active: inactive (dead) since Thu 2024-01-04 19:20:38 UTC; 30s ago
kolla-mariadb_clustercheck-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:21:18 UTC; 30s ago
kolla-memcached-container.service > Active: inactive (dead) since Thu 2024-01-04 19:21:59 UTC; 30s ago
kolla-neutron_bgp_dragent-container.service > Active: inactive (dead) since Thu 2024-01-04 19:22:45 UTC; 30s ago
kolla-neutron_dhcp_agent-container.service > Active: inactive (dead) since Thu 2024-01-04 19:23:27 UTC; 30s ago
kolla-neutron_l3_agent-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:24:08 UTC; 30s ago
kolla-neutron_metadata_agent-container.service > Active: inactive (dead) since Thu 2024-01-04 19:24:49 UTC; 30s ago
kolla-neutron_openvswitch_agent-container.service > Active: inactive (dead) since Thu 2024-01-04 19:25:53 UTC; 30s ago
kolla-neutron_server-container.service > Active: inactive (dead) since Thu 2024-01-04 19:27:31 UTC; 30s ago
kolla-nova_api-container.service > Active: inactive (dead) since Thu 2024-01-04 19:28:15 UTC; 30s ago
kolla-nova_conductor-container.service > Active: inactive (dead) since Thu 2024-01-04 19:29:28 UTC; 30s ago
kolla-nova_scheduler-container.service > Active: inactive (dead) since Thu 2024-01-04 19:30:14 UTC; 30s ago
kolla-nova_spicehtml5proxy-container.service > Active: inactive (dead) since Thu 2024-01-04 19:30:55 UTC; 30s ago
kolla-octavia_api-container.service > Active: inactive (dead) since Thu 2024-01-04 19:31:37 UTC; 30s ago
kolla-octavia_health_manager-container.service > Active: inactive (dead) since Thu 2024-01-04 19:32:18 UTC; 30s ago
kolla-octavia_housekeeping-container.service > Active: inactive (dead) since Thu 2024-01-04 19:33:00 UTC; 30s ago
kolla-octavia_worker-container.service > Active: inactive (dead) since Thu 2024-01-04 19:33:45 UTC; 30s ago
kolla-openvswitch_db-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:34:26 UTC; 30s ago
kolla-openvswitch_vswitchd-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:35:06 UTC; 30s ago
kolla-placement_api-container.service > Active: inactive (dead) since Thu 2024-01-04 19:35:48 UTC; 30s ago
kolla-proxysql-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:36:28 UTC; 30s ago
kolla-rabbitmq-container.service > Active: inactive (dead) since Thu 2024-01-04 19:37:16 UTC; 30s ago
kolla-redis-container.service > Active: inactive (dead) since Thu 2024-01-04 19:37:57 UTC; 30s ago
kolla-redis_sentinel-container.service > Active: inactive (dead) since Thu 2024-01-04 19:38:38 UTC; 30s ago
kolla-skyline_apiserver-container.service > Active: inactive (dead) since Thu 2024-01-04 19:39:19 UTC; 30s ago
kolla-skyline_console-container.service > Active: inactive (dead) since Thu 2024-01-04 19:40:00 UTC; 30s ago

Problem services - they always exited with 143 (a small script to list such units is sketched after this list):

kolla-cron-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:02:06 UTC; 30s ago
kolla-designate_producer-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:06:32 UTC; 30s ago
kolla-keystone_fernet-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:15:44 UTC; 30s ago
kolla-letsencrypt_lego-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:17:46 UTC; 30s ago
kolla-magnum_api-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:19:07 UTC; 30s ago
kolla-mariadb_clustercheck-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:21:18 UTC; 30s ago
kolla-neutron_l3_agent-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:24:08 UTC; 30s ago
kolla-openvswitch_db-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:34:26 UTC; 30s ago
kolla-openvswitch_vswitchd-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:35:06 UTC; 30s ago
kolla-proxysql-container.service > Active: failed (Result: exit-code) since Thu 2024-01-04 19:36:28 UTC; 30s ago
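
A hypothetical helper script (my own, not part of kolla-ansible) to list every kolla container unit that ended up in this 'failed' / exit status 143 state, using only standard systemctl queries:

    #!/usr/bin/env python3
    # List kolla container units whose main process exited with 143 (SIGTERM).
    import subprocess

    def prop(unit, name):
        # Read a single systemd property value for the given unit.
        return subprocess.run(
            ["systemctl", "show", unit, "--property=" + name, "--value"],
            capture_output=True, text=True,
        ).stdout.strip()

    listing = subprocess.run(
        ["systemctl", "list-units", "--all", "--plain", "--no-legend",
         "kolla-*-container.service"],
        capture_output=True, text=True,
    ).stdout

    for line in listing.splitlines():
        if not line.strip():
            continue
        unit = line.split()[0]
        if prop(unit, "ActiveState") == "failed" and prop(unit, "ExecMainStatus") == "143":
            print(unit)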

Some reading about this: https://www.groundcover.com/kubernetes-troubleshooting/exit-code-143 , https://<email address hidden>/msg30473.html

So, I will send a patch to fix this, as 2 minutes per service restart is really too much... 5 controllers means 2 minutes x 5 x 2 (haproxy, proxysql) = 20 minutes instead of a few seconds.

Changed in kolla-ansible:
assignee: nobody → Michal Arbet (michalarbet)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (master)
Changed in kolla-ansible:
status: New → In Progress
Changed in kolla-ansible:
importance: Undecided → High
Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

For containers running cron this seems plausible; cron doesn't expect to be stopped. Maybe we should consider replacing those with something based on systemd timers?

For the OpenStack containers I cannot reproduce the issue, e.g. kolla-designate_producer-container.service stops cleanly for me. Can you check the logs in /var/log/kolla for the affected services to see whether they show some issue during shutdown?

Revision history for this message
Michal Arbet (michalarbet) wrote :
Download full text (21.2 KiB)

Okay, now I know why I reported designate.

I was iterating through the systemd services and called stop against them.
For a specific reason I have designate's DB connection on controller0 pointed at controller0.

BUT as I stopped mariadb before designate, designate didn't have mariadb available, so it failed. On controller1 it should be OK in my env.

2024-01-04 19:47:39.177 7 INFO oslo_service.service [None req-c370c25f-22f5-4cf9-b663-001fa6d7b48c - - - - - -] Caught SIGTERM, stopping children
2024-01-04 19:47:39.183 7 INFO designate.service [None req-c370c25f-22f5-4cf9-b663-001fa6d7b48c - - - - - -] Stopping producer service
2024-01-04 19:47:39.186 7 INFO oslo_service.service [None req-c370c25f-22f5-4cf9-b663-001fa6d7b48c - - - - - -] Waiting on 5 children to exit
2024-01-04 19:47:40.578 20 INFO designate.service [None req-3c99cc92-ed1c-4018-a7e5-fe9834d9e8c5 - - - - - -] Stopping producer service
2024-01-04 19:47:40.580 20 ERROR oslo_service.threadgroup [None req-3c99cc92-ed1c-4018-a7e5-fe9834d9e8c5 - - - - - -] Error waiting on timer.: oslo_messaging.rpc.client.RemoteError: Remote error: DBConnectionError (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.205.10' ([Errno 111] ECONNREFUSED)")
[SQL: SELECT 1]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
['Traceback (most recent call last):\n', ' File "/var/lib/kolla/venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context\n self.dialect.do_execute(\n', ' File "/var/lib/kolla/venv/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute\n cursor.execute(statement, parameters)\n', ' File "/var/lib/kolla/venv/lib/python3.11/site-packages/pymysql/cursors.py", line 158, in execute\n result = self._query(query)\n ^^^^^^^^^^^^^^^^^^\n', ' File "/var/lib/kolla/venv/lib/python3.11/site-packages/pymysql/cursors.py", line 325, in _query\n conn.query(q)\n', ' File "/var/lib/kolla/venv/lib/python3.11/site-packages/pymysql/connections.py", line 549, in query\n self._affected_rows = self._read_query_result(unbuffered=unbuffered)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n', ' File "/var/lib/kolla/venv/lib/python3.11/site-packages/pymysql/connections.py", line 779, in _read_query_result\n result.read()\n', ' File "/var/lib/kolla/venv/lib/python3.11/site-packages/pymysql/connections.py", line 1157, in read\n first_packet = self.connection._read_packet()\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n', ' File "/var/lib/kolla/venv/lib/python3.11/site-packages/pymysql/connections.py", line 696, in _read_packet\n packet_header = self._read_bytes(4)\n ^^^^^^^^^^^^^^^^^^^\n', ' File "/var/lib/kolla/venv/lib/python3.11/site-packages/pymysql/connections.py", line 752, in _read_bytes\n raise err.OperationalError(\n', "pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query')\n", '\nThe above exception was the direct cause of the following exception:\n\n', 'Traceback (most recent call last):\n', ' File "/var/lib/kolla/venv/lib/python3.11/site-packages/oslo_db/sqlalchemy/engines...

Revision history for this message
Michal Arbet (michalarbet) wrote :

But I am afraid that we still can't fix the non-OpenStack services ...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/904805
Committed: https://opendev.org/openstack/kolla-ansible/commit/b1fd2b40f7cd1c6c457bd42b25ca32dc1e5e0f4f
Submitter: "Zuul (22348)"
Branch: master

commit b1fd2b40f7cd1c6c457bd42b25ca32dc1e5e0f4f
Author: Michal Arbet <email address hidden>
Date: Thu Jan 4 22:26:13 2024 +0100

    Fix long service restarts while using systemd

    Some containers exiting with 143 instead of 0, but
    this is still OK. This patch just allows
    ExitCode 143 (SIGTERM) as fix. Details in
    bugreport.

    Services which exited with 143 (SIGTERM):

    kolla-cron-container.service
    kolla-designate_producer-container.service
    kolla-keystone_fernet-container.service
    kolla-letsencrypt_lego-container.service
    kolla-magnum_api-container.service
    kolla-mariadb_clustercheck-container.service
    kolla-neutron_l3_agent-container.service
    kolla-openvswitch_db-container.service
    kolla-openvswitch_vswitchd-container.service
    kolla-proxysql-container.service

    Partial-Bug: #2048130
    Change-Id: Ia8c85d03404cfb368e4013066c67acd2a2f68deb

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/904740

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/904841

no longer affects: kolla-ansible/yoga
no longer affects: kolla-ansible/zed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/904841
Committed: https://opendev.org/openstack/kolla-ansible/commit/09ca4bdcf5962a52edc95bdb015ca7e748443c0a
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 09ca4bdcf5962a52edc95bdb015ca7e748443c0a
Author: Michal Arbet <email address hidden>
Date: Thu Jan 4 22:26:13 2024 +0100

    Fix long service restarts while using systemd

    Some containers exiting with 143 instead of 0, but
    this is still OK. This patch just allows
    ExitCode 143 (SIGTERM) as fix. Details in
    bugreport.

    Services which exited with 143 (SIGTERM):

    kolla-cron-container.service
    kolla-designate_producer-container.service
    kolla-keystone_fernet-container.service
    kolla-letsencrypt_lego-container.service
    kolla-magnum_api-container.service
    kolla-mariadb_clustercheck-container.service
    kolla-neutron_l3_agent-container.service
    kolla-openvswitch_db-container.service
    kolla-openvswitch_vswitchd-container.service
    kolla-proxysql-container.service

    Partial-Bug: #2048130
    Change-Id: Ia8c85d03404cfb368e4013066c67acd2a2f68deb
    (cherry picked from commit b1fd2b40f7cd1c6c457bd42b25ca32dc1e5e0f4f)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/904740
Committed: https://opendev.org/openstack/kolla-ansible/commit/51ca1bc6967d56cd6534a0280c17a9cf6677408d
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit 51ca1bc6967d56cd6534a0280c17a9cf6677408d
Author: Michal Arbet <email address hidden>
Date: Thu Jan 4 22:26:13 2024 +0100

    Fix long service restarts while using systemd

    Some containers exiting with 143 instead of 0, but
    this is still OK. This patch just allows
    ExitCode 143 (SIGTERM) as fix. Details in
    bugreport.

    Services which exited with 143 (SIGTERM):

    kolla-cron-container.service
    kolla-designate_producer-container.service
    kolla-keystone_fernet-container.service
    kolla-letsencrypt_lego-container.service
    kolla-magnum_api-container.service
    kolla-mariadb_clustercheck-container.service
    kolla-neutron_l3_agent-container.service
    kolla-openvswitch_db-container.service
    kolla-openvswitch_vswitchd-container.service
    kolla-proxysql-container.service

    Partial-Bug: #2048130
    Change-Id: Ia8c85d03404cfb368e4013066c67acd2a2f68deb
    (cherry picked from commit b1fd2b40f7cd1c6c457bd42b25ca32dc1e5e0f4f)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (master)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on kolla-ansible (master)

Change abandoned by "Michal Nasiadka <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/905208
Reason: does not help
