charm upgrade from 20.03 to 22.03 never completes
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
charm-ovn-central | New | Undecided | Unassigned | | |
Bug Description
I deployed from 20.03/stable, then upgraded to 22.03/stable, and one of the three units is stuck as follows:
$ juju status ovn-central
Model Controller Cloud/Region Version SLA Timestamp
ovntest hopem stsstack/stsstack 2.9.37 unsupported 11:52:17Z
App Version Status Scale Charm Channel Rev Exposed Message
ovn-central 20.03.2 active 3 ovn-central 22.03/stable 57 no Unit is ready
Unit Workload Agent Machine Public address Ports Message
ovn-central/0 active executing 10 10.5.2.11 6641/tcp,6642/tcp (upgrade-charm) Unit is ready
ovn-central/1 active idle 11 10.5.3.76 6641/tcp,6642/tcp Unit is ready (leader: ovnsb_db)
ovn-central/2* active idle 12 10.5.1.42 6641/tcp,6642/tcp Unit is ready (leader: ovnnb_db)
Machine State Address Inst id Series AZ Message
10 started 10.5.2.11 56f71304-
11 started 10.5.3.76 743003bc-
12 started 10.5.1.42 9e1d3966-
ovn-central/0 log shows:
# egrep "ovn-ovsdb-
2023-04-11 17:29:23 WARNING unit.ovn-
2023-04-11 17:29:23 WARNING unit.ovn-
2023-04-11 17:29:25 WARNING unit.ovn-
2023-04-11 17:29:25 WARNING unit.ovn-
2023-04-11 17:29:26 WARNING unit.ovn-
2023-04-11 17:29:26 WARNING unit.ovn-
2023-04-12 11:31:49 WARNING unit.ovn-
2023-04-12 11:31:50 WARNING unit.ovn-
And it appears the sbdb unit is missing:
# systemctl list-units| grep ovn
ovn-central.
ovn-northd.
ovn-ovsdb-
It is enabled though:
# systemctl list-unit-files| grep ovn
ovn-central.service enabled enabled
ovn-nb-
ovn-northd.service static enabled
ovn-ovsdb-
ovn-ovsdb-
ovn-sb-
A manual restart does bring it back, but ovn-sbctl commands hang. Logs show:
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
2023-04-
SB cluster membership seems ok:
# ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound
fb8c
Name: OVN_Southbound
Cluster ID: 9625 (96251ba3-
Server ID: fb8c (fb8cd694-
Address: ssl:10.5.2.11:6644
Status: cluster member
Role: follower
Term: 6
Leader: 1fcc
Vote: unknown
Election timer: 4000
Log: [2, 106]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->3f54 ->1fcc <-3f54 <-1fcc
Servers:
fb8c (fb8c at ssl:10.5.2.11:6644) (self)
3f54 (3f54 at ssl:10.5.1.42:6644)
1fcc (1fcc at ssl:10.5.3.76:6644)
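To compare the three units quickly, the fields of interest in the cluster/status output above can be reduced to one line per unit. The following is only a rough sketch: the function name summarise_cluster_status is made up for illustration, and it assumes the output keeps the "Key: value" layout shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of OVN or the charm): reads cluster/status
# output on stdin and prints the role, leader and pending-entry counts.
# Assumes the "Key: value" layout shown in this report.
summarise_cluster_status() {
    awk -F': *' '
        /^Role:/                      { role = $2 }
        /^Leader:/                    { leader = $2 }
        /^Entries not yet committed:/ { uncommitted = $2 }
        /^Entries not yet applied:/   { unapplied = $2 }
        END {
            printf "role=%s leader=%s uncommitted=%s unapplied=%s\n",
                   role, leader, uncommitted, unapplied
        }
    '
}
```

On a unit, it would be fed from the same command used above, i.e. by piping the output of ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound into summarise_cluster_status. A unit whose leader differs from the others, or whose uncommitted or unapplied counts stay non-zero, would be the one to look at first.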