"hub txn watcher sync error" and one controller is lost
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Canonical Juju | Triaged | Low | Unassigned | |
Bug Description
Solution QA found this error from https:/

In `juju status` we can see:

```
Unit           Workload  Agent  Machine  Public address  Ports  Message
controller/0*  active    idle   0        10.246.64.201
controller/1   active    idle   1        10.246.64.203
controller/2   unknown   lost   2        10.246.64.202          agent lost, see 'juju show-status-log controller/2'
```
In the juju debug log:

```
2023-09-29 14:32:20 ERROR juju.api.watcher watcher.go:95 error trying to stop watcher: hub txn watcher sync error: starting change stream: Closed explicitly
2023-09-29 14:32:20 ERROR juju.api.watcher watcher.go:95 error trying to stop watcher: hub txn watcher sync error: starting change stream: Closed explicitly
2023-09-29 14:32:20 ERROR juju.api.watcher watcher.go:95 error trying to stop watcher: websocket: close sent
2023-09-29 14:32:20 DEBUG juju.api monitor.go:35 RPC connection died
2023-09-29 14:32:20 DEBUG juju.rpc server.go:328 error closing codec: tls: failed to send closeNotify alert (but connection was closed anyway): write tcp 10.246.
```
We found a similar, very old bug, LP#1802067, but decided to open a new one.

More logs can be found at https:/
This appears to be a simple disconnection of the controller from the MongoDB primary.
Was it an intermittent failure, i.e. did the controller re-establish its presence afterwards?
I'll triage this as Low, as it appears to be an intermittent failure caused by the networking environment, and we're working to replace MongoDB. If it occurs frequently, we can raise the priority.