Comment 0 for bug 1960656

Alexander Balderson (asbalderson) wrote :

During a deployment of the latest Kubernetes on bare metal, SQA ran into an issue where all 3 etcd units came up and were marked as active/idle, but the syslog shows etcd rejecting every incoming connection, including those from 127.0.0.1. Here is a snippet from the end of the log on etcd_2:

Feb 10 22:25:34 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.168.186:53202" (error "EOF", ServerName "")
Feb 10 22:26:07 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.169.4:56646" (error "EOF", ServerName "")
Feb 10 22:26:43 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.169.209:45360" (error "EOF", ServerName "")
Feb 10 22:27:28 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.168.187:60292" (error "EOF", ServerName "")
Feb 10 22:27:29 juju-da6364-5-lxd-0 systemd[1]: Started snap.etcd.etcdctl.42388d27-d989-4115-8dfd-291f30bb6b6b.scope.
Feb 10 22:27:29 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "127.0.0.1:52626" (error "tls: first record does not look like a TLS handshake", ServerName "")
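To make the pattern in the snippet above easier to see, here is a minimal sketch that groups the rejected connections by source address and error string. The regular expression and the function name summarize_rejections are my own, written against the line format quoted above (IPv4 sources only); they are not part of etcd or the charm. For what it's worth, the "first record does not look like a TLS handshake" text is what Go's TLS stack reports when the first bytes on the connection are not a TLS record, so one plausible reading (an assumption, not confirmed by anything in this report) is that the local client spoke plaintext to a TLS listener.

```python
import re
from collections import Counter

# Matches the etcd "rejected connection" lines quoted in this report.
# Assumption: the source is an IPv4 address followed by ":port".
REJECTED = re.compile(
    r'rejected connection from "(?P<ip>[^":]+):(?P<port>\d+)" '
    r'\(error "(?P<error>[^"]*)", ServerName "(?P<server_name>[^"]*)"\)'
)


def summarize_rejections(lines):
    """Count rejected connections per (source IP, error message) pair."""
    counts = Counter()
    for line in lines:
        match = REJECTED.search(line)
        if match:
            counts[(match["ip"], match["error"])] += 1
    return counts


# The six syslog lines from the snippet above, copied verbatim.
SAMPLE = [
    'Feb 10 22:25:34 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.168.186:53202" (error "EOF", ServerName "")',
    'Feb 10 22:26:07 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.169.4:56646" (error "EOF", ServerName "")',
    'Feb 10 22:26:43 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.169.209:45360" (error "EOF", ServerName "")',
    'Feb 10 22:27:28 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "10.246.168.187:60292" (error "EOF", ServerName "")',
    'Feb 10 22:27:29 juju-da6364-5-lxd-0 systemd[1]: Started snap.etcd.etcdctl.42388d27-d989-4115-8dfd-291f30bb6b6b.scope.',
    'Feb 10 22:27:29 juju-da6364-5-lxd-0 etcd[38396]: rejected connection from "127.0.0.1:52626" (error "tls: first record does not look like a TLS handshake", ServerName "")',
]

if __name__ == "__main__":
    # Print one row per (source IP, error) pair with its count.
    for (ip, error), n in sorted(summarize_rejections(SAMPLE).items()):
        print(f"{n:3d}  {ip:<16} {error}")
```

Run against the full syslog from the crashdump (for example, by passing the lines of the file instead of SAMPLE), this should show whether the remote peers only ever fail with EOF and whether the TLS handshake error is limited to local etcdctl invocations.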

As a result, 2 of the 3 vault units in the deployment were unable to connect to etcd and start the vault service, which blocked the deployment.

The etcd units are running etcd 3.4/stable.

I've attached the crashdump; the test run can also be found at:
https://solutions.qa.canonical.com/testruns/testRun/fb27ca53-2c5c-4ffe-9e59-516242fda696