Snapshot action fails with keys-version=v2
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Etcd Charm | Triaged | Medium | Unassigned |
Bug Description
In a customer cloud, and replicated in a test environment, I've observed problems with running the snapshot action of the etcd charm.
Context:
* Deployed etcd charm: cs:etcd-546
* V3 users of etcd are: cs:~containers/
* V2 users of etcd are: cs:~containers/
* Overall environment is CDK 1.19, although I think etcd may have been upgraded to a CDK 1.20 version; I'm not totally sure.
* Version of etcdctl installed onto k8s workers/masters: 2.3.7 (installed to /usr/local/
* Version of etcdctl installed onto etcd units: 3.4.5 (via etcd snap tracking 3.4/stable)
I am not 100% sure of the logic here, but I can clearly see via querying etcd that the V2 keys all seem related to Flannel, and thus I'm suspecting they're using the /usr/local/
Anyway, in this environment:
* Backing up V3 keys works, e.g.: juju run-action etcd/0 snapshot keys-version=v3
* Backing up V2 keys fails, e.g.: juju run-action etcd/0 snapshot keys-version=v2
Output of such failures looks like this:
$ juju run-action etcd/0 snapshot --wait keys-version=v2
unit-etcd-0:
UnitId: etcd/0
id: "16"
message: exit status 1
results:
ReturnCode: 1
Stderr: |
++ action-get target
+ ETCD_BACKUP_
++ action-get keys-version
+ ETCD_KEYS_
+ UNIT_NAME=etcd
+ UNIT_NUM=0
+ ETCD_DATA_
+ '[' '!' -d /var/snap/
+ ETCD_DATA_
++ date +%Y-%m-%d-%H.%M.%S
+ DATE_STAMP=
+ ARCHIVE=
+ mkdir -p /home/ubuntu/
+ '[' v2 == v2 ']'
+ /snap/bin/
Error: unknown command "backup" for "etcdctl"
Run 'etcdctl --help' for usage.
Error: unknown command "backup" for "etcdctl"
status: failed
timing:
completed: 2021-03-29 22:13:17 +0000 UTC
enqueued: 2021-03-29 22:13:14 +0000 UTC
started: 2021-03-29 22:13:16 +0000 UTC
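The `unknown command "backup"` error above is consistent with etcdctl 3.4 defaulting to the v3 API, where `backup` does not exist; upstream etcdctl 3.4 still exposes the v2 `backup` command when `ETCDCTL_API=2` is set. The following is a minimal sketch of that idea, not the charm's actual fix. The function names and the example paths are hypothetical, and the sketch only prints the command it would run.

```shell
#!/bin/bash
# Hypothetical helper: map the action's keys-version parameter to the
# ETCDCTL_API value that etcdctl should run under.
etcdctl_api_for() {
  case "$1" in
    v2) echo 2 ;;
    v3) echo 3 ;;
    *)  echo "unsupported keys-version: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print (not execute) a v2 backup command line.
# Arguments: etcdctl binary, etcd data directory, backup directory.
build_v2_backup_cmd() {
  local etcdctl_bin="$1" data_dir="$2" backup_dir="$3"
  printf 'ETCDCTL_API=%s %s backup --data-dir %s --backup-dir %s\n' \
    "$(etcdctl_api_for v2)" "$etcdctl_bin" "$data_dir" "$backup_dir"
}
```

For example, `build_v2_backup_cmd etcdctl /data /backup` prints a command line prefixed with `ETCDCTL_API=2`. Whether a backup taken this way is usable in this environment has not been verified here.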
summary:
- Snapshot action appears to not work in mixed v2/v3 environments
+ Snapshot action fails with keys-version=v2
I'm unable to find a clear workaround for the v2 case.
If I copy the etcdctl binary from a unit running one of the flannel charms onto one of the etcd units in order to run "etcdctl backup", it also fails:
$ sudo ./etcdctl backup --data-dir /var/snap/etcd/current/ --backup-dir /home/ubuntu/etcd-snapshots/$(date +%Y%m%d_%H%M%S)
2021-03-29 23:13:41.117268 W | snap: skipped unexpected non snapshot file db
2021-03-29 23:13:41.118151 W | wal: ignored file 1.tmp in wal
panic: runtime error: makeslice: len out of range
goroutine 1 [running]:
github.com/coreos/etcd/wal.(*decoder).decode(0xc0000f5590, 0xc00016f8f0, 0x0, 0x0)
    /etcd/gopath/src/github.com/coreos/etcd/wal/decoder.go:55 +0x14c
github.com/coreos/etcd/wal.(*WAL).ReadAll(0xc0001720d0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    /etcd/gopath/src/github.com/coreos/etcd/wal/wal.go:237 +0x157
github.com/coreos/etcd/etcdctl/command.handleBackup(0xc000122a20)
    /etcd/gopath/src/github.com/coreos/etcd/etcdctl/command/backup_command.go:90 +0x551
github.com/coreos/etcd/Godeps/_workspace/src/github.com/codegangsta/cli.Command.Run(0xaf3d50, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, 0xaff39d, 0x18, 0x0, ...)
    /etcd/gopath/src/github.com/coreos/etcd/Godeps/_workspace/src/github.com/codegangsta/cli/command.go:137 +0x709
github.com/coreos/etcd/Godeps/_workspace/src/github.com/codegangsta/cli.(*App).Run(0xc0001227e0, 0xc00001e1e0, 0x6, 0x6, 0x0, 0x0)
    /etcd/gopath/src/github.com/coreos/etcd/Godeps/_workspace/src/github.com/codegangsta/cli/app.go:175 +0x6e8
main.main()
    /etcd/gopath/src/github.com/coreos/etcd/etcdctl/main.go:69 +0x1d69