missing unit for leader
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Canonical Juju | Triaged | Low | Unassigned |
Bug Description
Our customer was doing a series upgrade. gnocchi and mongodb were in the same container, and the gnocchi upgrade-series kept going into an error state. Once they removed mongodb from the container, the gnocchi upgrade went fine.
They then redeployed mongodb in a different container and hit the issue 'no replset config has been received'.
mongodb/6 maintenance executing 42/lxd/9 10.110.244.146 27017/tcp,
nrpe/162 waiting allocating 10.110.244.146 agent initializing
mongodb/7 maintenance executing 43/lxd/9 10.110.244.147 27017/tcp,
nrpe/161 waiting allocating 10.110.244.147 agent initializing
The root cause is evidently that no unit holds leadership: without a leader, init_replset [1] is never run, so the issue happens.
$ juju run --unit mongodb/20 is-leader
False
$ juju run --unit mongodb/21 is-leader
False
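To check leadership across all units in one pass instead of unit by unit, something like the following sketch could be used. It assumes the parsed output of `juju status --format=json`, in which a leader unit carries `"leader": true` under `applications.<app>.units`. The function name `find_leaders` and the sample data are illustrative only and are not taken from this report's attachments.

```python
import json


def find_leaders(status: dict, application: str) -> list:
    """Return the names of units flagged as leader for one application.

    `status` is assumed to be the parsed output of `juju status --format=json`.
    An empty list means no unit currently holds leadership.
    """
    app = status.get("applications", {}).get(application, {})
    units = app.get("units") or {}
    # The "leader" key is typically absent for non-leader units, so default to False.
    return sorted(name for name, unit in units.items() if unit.get("leader", False))


if __name__ == "__main__":
    # Hypothetical sample mirroring the situation described here: two units, no leader.
    sample = json.loads("""
    {"applications": {"mongodb": {"units": {
        "mongodb/20": {"workload-status": {"current": "maintenance"}},
        "mongodb/21": {"workload-status": {"current": "maintenance"}}
    }}}}
    """)
    leaders = find_leaders(sample, "mongodb")
    print(leaders if leaders else "no leader unit found for mongodb")
```

An empty result for an application that has live units would match the symptom described in this bug.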
But why is the leader missing? The steps above are all that had been done. We then tried:
1. We removed and redeployed the application several times under the original name, and it always failed.
juju remove-application mongodb --force
juju deploy mongodb -n 2 --constraints "spaces=oam-space" --bind "internal-space configsvr=
2. We restarted the juju agent and juju unit on the two hosts, following lp:1810331 [2], and that failed as well.
3. Finally, redeploying under a different name fixed the issue.
I also ran many tests but could not reproduce the problem. I analyzed some data as well:
1. unitstates.json shows leader is false for both mongodb/20 and mongodb/21, see https:/
2. settings.json has no entries for mongodb/20 and mongodb/21, see https:/
The present version is: series=bionic, cs:mongodb-54, mongodb=3.6.3
[1] https:/
[2] https:/
Our customer hit this problem again when they upgraded the juju controller and model from 2.8.9 to 2.8.10 as part of an upgrade procedure. mongodb went into maintenance status again, and there was no leader at that time. See https://paste.ubuntu.com/p/xYMqrQQ5dm/