Comment 3 for bug 2037308

Jeffrey Chang (modern911) wrote : Re: [3.1.6] Non leader controller agents lost unexpectedly

We found a case of this on Juju 3.1.5:
testrun https://solutions.qa.canonical.com/testruns/bce89dbd-17d7-4a88-9f0d-5d24647cbba2
crashdump https://oil-jenkins.canonical.com/artifacts/bce89dbd-17d7-4a88-9f0d-5d24647cbba2/generated/generated/juju_maas_controller/juju-crashdump-controller-2023-10-01-04.49.43.tar.gz

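I see connection issues in the controller log: the lease "extend" commands time out, Raft hits its election timeout and restarts the election, and the requestVote RPCs to the other two voters fail with "connection refused":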

2023-10-01 04:51:43 DEBUG juju.mgo server.go:329 Ping for 10.246.64.203:37017 is 1 ms
2023-10-01 04:51:43 DEBUG juju.mgo server.go:329 Ping for 10.246.64.201:37017 is 1 ms
2023-10-01 04:51:43 DEBUG juju.mgo server.go:329 Ping for 10.246.64.202:37017 is 6 ms
2023-10-01 04:51:43 DEBUG juju.apiserver request_notifier.go:134 [24] user-admin API connection terminated after 471.125531ms
2023-10-01 04:51:44 DEBUG juju.apiserver request_notifier.go:134 [26] user-admin API connection terminated after 541.63376ms
2023-10-01 04:51:44 DEBUG juju.apiserver request_notifier.go:189 <- [C] machine-2 {"request-id":101,"type":"ProxyUpdater","version":2,"request":"WatchForProxyConfigAndAPIHostPortChanges","params":"'params redacted'"}
2023-10-01 04:51:44 DEBUG juju.apiserver request_notifier.go:221 -> [C] machine-2 10.433376ms {"request-id":101,"response":"'body redacted'"} ProxyUpdater[""].WatchForProxyConfigAndAPIHostPortChanges
2023-10-01 04:51:44 DEBUG juju.apiserver request_notifier.go:189 <- [C] machine-2 {"request-id":102,"type":"ProxyUpdater","version":2,"request":"ProxyConfig","params":"'params redacted'"}
2023-10-01 04:51:44 DEBUG juju.apiserver request_notifier.go:189 <- [C] machine-2 {"request-id":103,"type":"NotifyWatcher","version":1,"id":"24","request":"Next","params":"'params redacted'"}
2023-10-01 04:51:44 DEBUG juju.apiserver request_notifier.go:221 -> [C] machine-2 10.01048ms {"request-id":102,"response":"'body redacted'"} ProxyUpdater[""].ProxyConfig
2023-10-01 04:51:44 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: extend, ns: singular-controller, model: 5a837b, lease: 5a837ba3-bb3b-4a72-85f0-34afc3c9fa3c, holder: machine-0) to be processed
2023-10-01 04:51:44 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: extend, ns: singular-controller, model: 1072fa, lease: 1072fa17-025d-4f16-8ce8-286b381ad824, holder: machine-0) to be processed
2023-10-01 04:51:44 WARNING juju.core.raftlease client.go:134 response timeout waiting for Command(ver: 1, op: extend, ns: singular-controller, model: 7ab2db, lease: 7ab2db7f-752c-4a51-8271-6a694d91507f, holder: machine-0) to be processed
2023-10-01 04:51:44 DEBUG juju.worker.raft writer.go:29 [raft] 2023-10-01T04:51:44.313Z [WARN] raft: Election timeout reached, restarting election
2023-10-01 04:51:44 DEBUG juju.worker.raft writer.go:29 [raft] 2023-10-01T04:51:44.314Z [INFO] raft: entering candidate state: node="Node at 10.246.64.201:17070 [Candidate]" term=439
2023-10-01 04:51:44 INFO juju.worker.raft.rafttransport dialer.go:53 dialing 10.246.64.202:17070
2023-10-01 04:51:44 INFO juju.worker.raft.rafttransport dialer.go:53 dialing 10.246.64.203:17070
2023-10-01 04:51:44 DEBUG juju.worker.raft writer.go:29 [raft] 2023-10-01T04:51:44.316Z [ERROR] raft: failed to make requestVote RPC: target="{Voter 2 10.246.64.202:17070}" error="dial failed: dial tcp 10.246.64.202:17070: connect: connection refused"
2023-10-01 04:51:44 DEBUG juju.worker.raft writer.go:29 [raft] 2023-10-01T04:51:44.316Z [ERROR] raft: failed to make requestVote RPC: target="{Voter 1 10.246.64.203:17070}" error="dial failed: dial tcp 10.246.64.203:17070: connect: connection refused"
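
For anyone triaging the same crashdump, here is a minimal sketch for pulling the failed requestVote targets out of a controller log. It is not part of Juju: the function name failed_vote_targets and the regex are my own, and the regex assumes the line format matches the excerpt above.

```python
import re

# Assumption: the line format matches the raft log lines pasted above.
VOTE_FAIL = re.compile(
    r'failed to make requestVote RPC: target="\{Voter (?P<id>\S+) (?P<addr>[\d.]+:\d+)\}" '
    r'error="(?P<error>[^"]*)"'
)


def failed_vote_targets(lines):
    """Return (voter id, address, error) for each failed requestVote RPC line."""
    results = []
    for line in lines:
        match = VOTE_FAIL.search(line)
        if match:
            results.append((match.group("id"), match.group("addr"), match.group("error")))
    return results


# Three lines copied from the log excerpt above; the first one should not match.
sample = [
    '2023-10-01 04:51:44 INFO juju.worker.raft.rafttransport dialer.go:53 dialing 10.246.64.202:17070',
    '2023-10-01 04:51:44 DEBUG juju.worker.raft writer.go:29 [raft] 2023-10-01T04:51:44.316Z [ERROR] raft: failed to make requestVote RPC: target="{Voter 2 10.246.64.202:17070}" error="dial failed: dial tcp 10.246.64.202:17070: connect: connection refused"',
    '2023-10-01 04:51:44 DEBUG juju.worker.raft writer.go:29 [raft] 2023-10-01T04:51:44.316Z [ERROR] raft: failed to make requestVote RPC: target="{Voter 1 10.246.64.203:17070}" error="dial failed: dial tcp 10.246.64.203:17070: connect: connection refused"',
]

for voter_id, addr, error in failed_vote_targets(sample):
    print(voter_id, addr, error)
```

To run it against a real log, pass an open file handle (or any iterable of lines) instead of sample.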