Redis replication with TLS doesn't work, so let's disable it out of the box
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Fix Released | High | Pradeep Kilambi |
Bug Description
Description from bandini's bug:
Seems like “slave” redis servers are not connected to the redis “master” that is being started by pacemaker when you deploy with TLS Everywhere.
On an unencrypted Redis cluster, we can see slaves connected to the master:
# /usr/bin/redis-cli -a xxx -s '/var/run/
verify if there are any slaves. Here is a working example:
# Replication
role:master
connected_slaves:2
slave0:
slave1:
master_
repl_backlog_
repl_backlog_
repl_backlog_
repl_backlog_
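The check described above can be scripted. The following is a minimal sketch, assuming the `# Replication` section of the `redis-cli ... info` output has already been captured as text. The helper names `parse_info` and `replication_is_healthy` and the `SAMPLE` string are hypothetical and only use the `role` and `connected_slaves` fields visible in the working example.

```python
# Hypothetical helpers: parse the "# Replication" section of Redis INFO
# output and check whether the master reports the expected replicas.

# Trimmed sample containing only fields shown in the working example.
SAMPLE = """# Replication
role:master
connected_slaves:2
"""


def parse_info(text):
    """Turn 'key:value' lines into a dict, skipping headers and blanks."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that carry no 'key:value' pair
            fields[key] = value
    return fields


def replication_is_healthy(fields, expected_replicas):
    """True if this node is a master with at least the expected replicas."""
    return (
        fields.get("role") == "master"
        and int(fields.get("connected_slaves", 0)) >= expected_replicas
    )


if __name__ == "__main__":
    info = parse_info(SAMPLE)
    print(replication_is_healthy(info, expected_replicas=2))
```

With TLS everywhere enabled, the same check on the master would be expected to fail, since no slave ever connects.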
We don’t see any slave connection when TLS everywhere is enabled.
On initial deployment, all replicas of the redis resource start correctly in pacemaker, giving 1 Master and 2 Slaves with no errors, but only because no replication has taken place at all.
However, when restarting a Slave, the start operation won’t succeed because the redis resource agent tries to connect to the redis master and fails to do so:
Failed Actions:
* redis_start_0 on redis-bundle-2 'unknown error' (1): call=8, status=Timed Out, exitreason='none',
last-
With following logs from the slave redis server:
96:S 27 Nov 14:24:15.116 # Error condition on socket for SYNC: Connection reset by peer
96:S 27 Nov 14:24:16.116 * Connecting to MASTER overcloud-
96:S 27 Nov 14:24:16.117 * MASTER <-> SLAVE sync started
96:S 27 Nov 14:24:16.117 * Non blocking connect for SYNC fired the event.
96:S 27 Nov 14:24:16.117 # Error condition on socket for SYNC: Connection reset by peer
96:S 27 Nov 14:24:17.120 * Connecting to MASTER overcloud-
This is because on the 6379 port of the remote host there is an stunnel process expecting SSL traffic, whereas redis sends unencrypted traffic to it.
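One way to confirm this mismatch from a slave node is to send a plaintext `PING` to the master's port and see whether a Redis reply comes back. This is only a sketch under stated assumptions: the function names `classify_reply` and `probe` are hypothetical, and the check relies on the fact that every Redis protocol (RESP) reply starts with one of `+`, `-`, `:`, `$` or `*`. A TLS endpoint such as stunnel would typically reset the connection or return non-RESP bytes instead.

```python
import socket

# Every RESP reply begins with one of these type bytes.
RESP_PREFIXES = (b"+", b"-", b":", b"$", b"*")


def classify_reply(reply):
    """Classify the bytes received after sending a plaintext PING."""
    if reply and reply.startswith(RESP_PREFIXES):
        # e.g. b"+PONG" or an authentication error such as b"-NOAUTH ..."
        return "plaintext-redis"
    # Empty read, reset, timeout, or non-RESP bytes (e.g. a TLS alert).
    return "no-plaintext-reply"


def probe(host, port, timeout=3.0):
    """Send an unencrypted inline PING and classify what comes back."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"PING\r\n")
            reply = sock.recv(64)
    except OSError:  # covers connection reset, refusal, and timeouts
        reply = None
    return classify_reply(reply)
```

For example, `probe(master_host, 6379)` (with `master_host` standing in for the master's address) returning `"no-plaintext-reply"` would be consistent with the "Connection reset by peer" errors in the slave log above.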
Fix proposed to branch: master
Review: https://review.openstack.org/523969