New tooz not working with redis sentinel

Bug #2056656 reported by Michal Arbet
This bug affects 1 person
Affects    Status         Importance   Assigned to        Milestone
taskflow   Fix Released   Critical     Takashi Kajinami
tooz       Fix Released   Critical     Takashi Kajinami

Bug Description

Hi,

The new tooz 6.0.1 release broke coordination via Redis. Log from cinder:

2024-03-10 00:50:15.890 23382 ERROR oslo_service.service Traceback (most recent call last):
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/tooz/drivers/redis.py", line 53, in wrapper
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service return func(*args, **kwargs)
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service ^^^^^^^^^^^^^^^^^^^^^
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/tooz/drivers/redis.py", line 501, in _start
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service self._server_info = self._client.info()
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service ^^^^^^^^^^^^^^^^^^^
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/redis/commands/core.py", line 1002, in info
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service return self.execute_command("INFO", **kwargs)
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/redis/client.py", line 533, in execute_command
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service conn = self.connection or pool.get_connection(command_name, **options)
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/redis/connection.py", line 1086, in get_connection
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service connection.connect()
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/redis/sentinel.py", line 55, in connect
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service return self.retry.call_with_retry(self._connect_retry, lambda error: None)
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/redis/retry.py", line 51, in call_with_retry
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service raise error
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/redis/retry.py", line 46, in call_with_retry
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service return do()
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service ^^^^
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/redis/sentinel.py", line 45, in _connect_retry
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service self.connect_to(self.connection_pool.get_master_address())
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/redis/sentinel.py", line 107, in get_master_address
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service master_address = self.sentinel_manager.discover_master(self.service_name)
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/redis/sentinel.py", line 301, in discover_master
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service raise MasterNotFoundError(f"No master found for {service_name!r}{error_info}")
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service redis.sentinel.MasterNotFoundError: No master found for 'kolla' : Redis<ConnectionPool<Connection<host=192.168.205.10,port=26379,db=0>>> - AuthenticationError('invalid username-password pair or user is disabled.'), Redis<ConnectionPool<Connection<host=192.168.205.11,port=26379,db=0>>> - AuthenticationError('invalid username-password pair or user is disabled.'), Redis<ConnectionPool<Connection<host=192.168.205.12,port=26379,db=0>>> - AuthenticationError('invalid username-password pair or user is disabled.')
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service The above exception was the direct cause of the following exception:
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service Traceback (most recent call last):
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/oslo_service/service.py", line 810, in run_service
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service service.start()
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/cinder/service.py", line 227, in start
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service coordination.COORDINATOR.start()
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/cinder/coordination.py", line 87, in start
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service self.coordinator.start(start_heart=True)
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/tooz/coordination.py", line 689, in start
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service super(CoordinationDriverWithExecutor, self).start(start_heart)
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/tooz/coordination.py", line 426, in start
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service self._start()
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/tooz/drivers/redis.py", line 62, in wrapper
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service utils.raise_with_cause(
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/tooz/utils.py", line 223, in raise_with_cause
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service excutils.raise_with_cause(exc_cls, message, *args, **kwargs)
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service File "/var/lib/kolla/venv/lib/python3.11/site-packages/oslo_utils/excutils.py", line 142, in raise_with_cause
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service raise exc_cls(message, *args, **kwargs) from kwargs.get('cause')
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service tooz.coordination.ToozConnectionError: No master found for 'kolla' : Redis<ConnectionPool<Connection<host=192.168.205.10,port=26379,db=0>>> - AuthenticationError('invalid username-password pair or user is disabled.'), Redis<ConnectionPool<Connection<host=192.168.205.11,port=26379,db=0>>> - AuthenticationError('invalid username-password pair or user is disabled.'), Redis<ConnectionPool<Connection<host=192.168.205.12,port=26379,db=0>>> - AuthenticationError('invalid username-password pair or user is disabled.')
2024-03-10 00:50:15.890 23382 ERROR oslo_service.service
2024-03-10 00:50:15.914 6 INFO oslo_service.service [None req-6c60300f-11a4-4823-9f45-454135e22423 - - - - - -] Child 23382 exited with status 1
2024-03-10 00:50:15.939 23383 INFO cinder.service [-] Starting cinder-volume node (version 23.1.0)
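
For readers less used to chained tracebacks, the log above shows one error wrapped in another: the sentinels reject the credentials, redis-py reports that no master was found, and tooz re-raises that as a ToozConnectionError with the original error attached as its cause. A minimal sketch of that pattern follows. The three exception classes are local stand-ins and `translate_failures` is a hypothetical decorator name; this is not the actual tooz or redis-py code.

```python
import functools


class AuthenticationError(Exception):
    """Stand-in for the error a sentinel returns on bad credentials."""


class MasterNotFoundError(Exception):
    """Stand-in for redis.sentinel.MasterNotFoundError."""


class ToozConnectionError(Exception):
    """Stand-in for tooz.coordination.ToozConnectionError."""


def translate_failures(func):
    """Re-raise backend errors as ToozConnectionError, keeping the cause."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except MasterNotFoundError as exc:
            # "from exc" is what produces the "direct cause" line in the log.
            raise ToozConnectionError(str(exc)) from exc
    return wrapper


@translate_failures
def start():
    # Every sentinel rejected the credentials, so no master is discovered.
    errors = [AuthenticationError("invalid username-password pair")]
    raise MasterNotFoundError(f"No master found for 'kolla' : {errors!r}")


try:
    start()
except ToozConnectionError as exc:
    # The underlying error stays reachable through __cause__.
    print(type(exc.__cause__).__name__)
```

The point for debugging is that the last exception in the log is only the wrapper; the AuthenticationError entries inside the message are the actual failure.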

It looks like a bug in the redis library, because if I don't set sentinel_kwargs, everything works as before.

(cinder-volume)[root@ceph1 /]# diff -u /var/lib/kolla/venv/lib/python3.11/site-packages/tooz/drivers/redis.py.orig /var/lib/kolla/venv/lib/python3.11/site-packages/tooz/drivers/redis.py
--- /var/lib/kolla/venv/lib/python3.11/site-packages/tooz/drivers/redis.py.orig 2024-03-10 00:53:41.441965412 +0000
+++ /var/lib/kolla/venv/lib/python3.11/site-packages/tooz/drivers/redis.py 2024-03-10 00:53:54.026198590 +0000
@@ -476,7 +476,6 @@
             sentinel_name = kwargs.pop('sentinel')
             sentinel_server = sentinel.Sentinel(
                 sentinel_hosts,
-                sentinel_kwargs=kwargs,
                 **kwargs)
             master_client = sentinel_server.master_for(sentinel_name)
             # The master_client is a redis.Redis using a
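
To see why dropping `sentinel_kwargs=kwargs` changes the outcome, here is a minimal sketch of how the connection options end up split between the Redis server and the sentinels. It does not import redis. `effective_sentinel_kwargs` is a hypothetical helper that mimics what redis-py's `Sentinel` constructor does, as far as I understand it, when `sentinel_kwargs` is omitted: only the `socket_*` options are reused for sentinel connections. The option values are illustrative.

```python
def effective_sentinel_kwargs(connection_kwargs, sentinel_kwargs=None):
    """Return the options that would be used to talk to the sentinels.

    Hypothetical helper: mirrors redis-py's default of reusing only the
    socket_* options when no explicit sentinel_kwargs is given.
    """
    if sentinel_kwargs is None:
        return {k: v for k, v in connection_kwargs.items()
                if k.startswith("socket_")}
    return dict(sentinel_kwargs)


# Illustrative options, loosely based on the backend_url in this report.
kwargs = {"username": "user", "password": "SECRET", "db": 0,
          "socket_timeout": 60.0, "retry_on_timeout": True}

# tooz 6.0.1 behaviour: the Redis credentials are also sent to the sentinels.
shared = effective_sentinel_kwargs(kwargs, sentinel_kwargs=kwargs)

# Patched behaviour: the sentinels only receive the socket_* options.
partial = effective_sentinel_kwargs(kwargs)

print("password" in shared)   # True
print(partial)                # {'socket_timeout': 60.0}
```

If the sentinels run without authentication, sending them a username and password is presumably what triggers the AuthenticationError seen in the log, which would explain why the patched version works.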

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tooz (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tooz/+/912339

Changed in python-tooz:
status: New → In Progress
Revision history for this message
Michal Arbet (michalarbet) wrote :

[coordination]
backend_url = redis://user:SECRET@192.168.205.10:26379?sentinel=kolla&sentinel_fallback=192.168.205.11:26379&sentinel_fallback=192.168.205.12:26379&db=0&socket_timeout=60&retry_on_timeout=yes
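
As a reading aid for the URL above, the following standard-library sketch pulls out the pieces that matter for this bug: the sentinel service name, the list of sentinel endpoints, and the credentials. It is only an illustration of the URL layout and is not tooz's own parser; the variable names are mine.

```python
from urllib.parse import parse_qs, urlsplit

url = ("redis://user:SECRET@192.168.205.10:26379"
       "?sentinel=kolla"
       "&sentinel_fallback=192.168.205.11:26379"
       "&sentinel_fallback=192.168.205.12:26379"
       "&db=0&socket_timeout=60&retry_on_timeout=yes")

parts = urlsplit(url)
options = parse_qs(parts.query)

# The host in the URL plus every sentinel_fallback entry are sentinels.
sentinel_hosts = [(parts.hostname, parts.port)]
for fallback in options.get("sentinel_fallback", []):
    host, _, port = fallback.rpartition(":")
    sentinel_hosts.append((host, int(port)))

service_name = options["sentinel"][0]
credentials = (parts.username, parts.password)

print(service_name)     # kolla
print(sentinel_hosts)   # three (host, 26379) pairs
print(credentials)      # ('user', 'SECRET')
```

The three endpoints match the three sentinel connections listed in the MasterNotFoundError message, and the single username/password pair in the URL is what ended up being applied to both Redis and the sentinels.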

Revision history for this message
Takashi Kajinami (kajinamit) wrote :

This is a side effect of https://review.opendev.org/c/openstack/tooz/+/912339; tooz now requires auth/SSL for sentinel if you require auth/SSL for redis.
Technically, we could try to support enabling auth only in sentinel, but I'd recommend fixing the incomplete authentication and enabling auth in both redis and sentinel (or disabling auth in both).

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tooz/+/912344

Revision history for this message
Takashi Kajinami (kajinamit) wrote :

We implemented redis sentinel support during this cycle, which enables SSL and authentication consistently for both redis and sentinel.
We updated the other libraries (tooz and taskflow) to follow that behavior. I personally assumed that people would not enable auth when redis sentinel is used (because previously authentication could be enabled only for redis, and redis sentinel had to be deployed with no auth), but kolla was using exactly that deployment architecture.

IMO the ideal goal is to enable authentication for sentinel as well, instead of such "partial" authentication configurations. However, I'm aware that the Redis storage driver in gnocchi currently supports only partial authentication (enabled in redis, disabled in sentinel). I'll work on it later, but we don't know whether that change will be included in the version distros pick up.

So it's probably better to restore the previous behavior which uses partial authentication, to keep the existing usage in kolla. I'll submit patches for it.

Revision history for this message
Michal Arbet (michalarbet) wrote :

We can discuss this on IRC on Monday. My local env seems to be working now; I don't know why CI is not passing yet.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to taskflow (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/taskflow/+/912346

Changed in taskflow:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on taskflow (master)

Change abandoned by "Takashi Kajinami <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/taskflow/+/912345
Reason: replaced by https://review.opendev.org/c/openstack/taskflow/+/912346

Changed in taskflow:
importance: Undecided → Critical
Changed in python-tooz:
importance: Undecided → Critical
Changed in taskflow:
assignee: nobody → Takashi Kajinami (kajinamit)
Changed in python-tooz:
assignee: nobody → Takashi Kajinami (kajinamit)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tooz (master)

Change abandoned by "Michal Arbet <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tooz/+/912339
Reason: fixed in k-a

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to taskflow (stable/2024.1)

Fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/taskflow/+/912069

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to taskflow (master)

Reviewed: https://review.opendev.org/c/openstack/taskflow/+/912346
Committed: https://opendev.org/openstack/taskflow/commit/4adf2790fecbc781561815876916f094ac9afc1c
Submitter: "Zuul (22348)"
Branch: master

commit 4adf2790fecbc781561815876916f094ac9afc1c
Author: Takashi Kajinami <email address hidden>
Date: Mon Mar 11 00:39:03 2024 +0900

    Revert "Use consistent credential for Redis and Redis Sentinel"

    This reverts commit 3fbd05078f84fc5b8190201fc6eeb7d005bf4988.

    Reason for revert:
    Some deployment tools such as kolla already rely on the previous
    behavior which requires authentication for only redis.

    Conflicts:
            taskflow/jobs/backends/impl_redis.py

    Closes-Bug: #2056656
    Change-Id: I24e0272c269c6fd287234fd2d3b2754983911a7f

Changed in taskflow:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tooz (master)

Reviewed: https://review.opendev.org/c/openstack/tooz/+/912344
Committed: https://opendev.org/openstack/tooz/commit/3bce8e1dcaab06175b10fb97d9e681e95c9c6103
Submitter: "Zuul (22348)"
Branch: master

commit 3bce8e1dcaab06175b10fb97d9e681e95c9c6103
Author: Takashi Kajinami <email address hidden>
Date: Mon Mar 11 01:20:01 2024 +0900

    Make authentication/SSL for redis sentinel optional

    Change 4954e284b9616f5e0c2cea77d94bbe18e0b8fd39 updated the redis
    sentinel driver to apply auth/ssl settings for redis sentinel, based
    on ones of redis, but this change broke the existing usage in kolla
    deployments, which require redis with authentication enabled and
    sentinel with authentication DISABLED.

    This restores the old behavior, which do not enable authentication and
    ssl for sentinel even when these for redis is enabled.

    Closes-Bug: #2056656
    Change-Id: I3047c80359df3dad64be041db6f4a3a6180479d6

Changed in python-tooz:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to taskflow (stable/2024.1)

Reviewed: https://review.opendev.org/c/openstack/taskflow/+/912069
Committed: https://opendev.org/openstack/taskflow/commit/f652112423424ff403701bec5ed02d6394321f79
Submitter: "Zuul (22348)"
Branch: stable/2024.1

commit f652112423424ff403701bec5ed02d6394321f79
Author: Takashi Kajinami <email address hidden>
Date: Mon Mar 11 00:39:03 2024 +0900

    Revert "Use consistent credential for Redis and Redis Sentinel"

    This reverts commit 3fbd05078f84fc5b8190201fc6eeb7d005bf4988.

    Reason for revert:
    Some deployment tools such as kolla already rely on the previous
    behavior which requires authentication for only redis.

    Conflicts:
            taskflow/jobs/backends/impl_redis.py

    Closes-Bug: #2056656
    Change-Id: I24e0272c269c6fd287234fd2d3b2754983911a7f
    (cherry picked from commit 4adf2790fecbc781561815876916f094ac9afc1c)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/taskflow 5.6.0

This issue was fixed in the openstack/taskflow 5.6.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tooz 6.1.0

This issue was fixed in the openstack/tooz 6.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/taskflow 5.7.0

This issue was fixed in the openstack/taskflow 5.7.0 release.
