Migrations fail (SSH host key verification failure) after setting libvirt-migration-network
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack Nova Cloud Controller Charm | Confirmed | High | Alex Kavanagh | |
| OpenStack Nova Compute Charm | Confirmed | High | Alex Kavanagh | |
Bug Description
Issue encountered on:
- OS: Xenial
- OpenStack version: Pike
- Charm: nova-compute revision 311
I can confirm that this issue arises after setting the "libvirt-migration-network" configuration option.
Prior to setting the option, migrations are successful.
After setting it, the very same migrations fail (I've tested with the same source/destination nodes and the same compute instances).
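For reference, here is a hedged way to see what the option actually changed on a compute unit; it assumes the charm renders the chosen migration address into nova.conf (for example as live_migration_inbound_addr), and it reuses the nova-compute-kvm application name from below:

# Assumption: the charm writes the migration address into nova.conf;
# the exact rendered option name may differ in your deployment.
juju run --unit nova-compute-kvm/0 "grep -i live_migration /etc/nova/nova.conf"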
How I was able to replicate the issue:
1. Set `libvirt-migration-network` on the nova-compute-kvm application:
juju config nova-compute-kvm libvirt-migration-network=<migration network CIDR>
2. Wait until the nova-compute-kvm units finish "executing" and return to an "active/idle" state. While they are in the "executing" state, I can confirm that there does appear to be an attempt to exchange SSH keys: the workload status shows "(config-changed) SSH key exchange".
3. Attempt the migration:
openstack server migrate 3c70bf83-
But this results in the error below (taken from the source node's log under /var/log/):
2020-01-24 01:05:42.317 1214121 ERROR nova.virt.
2020-01-24 01:05:42.399 1214121 ERROR nova.virt.
I can confirm that the source node can reach the destination node on port 22 (confirmed via netcat/telnet); a quick check of both the connectivity and the host-key behaviour is sketched below.
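This is roughly how the check can be done from the source node. The destination hostname is a placeholder, and the use of the nova user is an assumption (the actual migration user depends on the deployment):

# <dest-host> is a placeholder for the destination compute node
nc -zv <dest-host> 22
# A non-interactive SSH attempt fails with "Host key verification failed"
# when the destination's host key is missing from the relevant known_hosts.
sudo -u nova ssh -o BatchMode=yes <dest-host> true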
I can confirm that once the libvirt-migration-network option is unset again, the same migrations succeed.
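For reference, a sketch of returning the option to its charm default (again assuming the nova-compute-kvm application name):

# Resets libvirt-migration-network to the charm default
juju config nova-compute-kvm --reset libvirt-migration-network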
Changed in charm-nova-cloud-controller:
assignee: nobody → Alex Kavanagh (ajkavanagh)

Changed in charm-nova-cloud-controller:
assignee: Alex Kavanagh (ajkavanagh) → nobody
status: In Progress → Confirmed

Changed in charm-nova-compute:
status: New → Confirmed

Changed in charm-nova-cloud-controller:
importance: Undecided → High

Changed in charm-nova-compute:
importance: Undecided → Medium
importance: Medium → High

Changed in charm-nova-cloud-controller:
assignee: nobody → Alex Kavanagh (ajkavanagh)

Changed in charm-nova-compute:
assignee: nobody → Alex Kavanagh (ajkavanagh)
This *might* be due to host key caching that the nova-cloud-controller is now doing. Please could you try to clear the host key cache on the nova-cloud-controller unit(s) using the juju action:
juju run-action nova-cloud-controller/0 clear-unit-knownhost-cache
This needs to be run on all the nova-cloud-controller units.
This will re-find all the SSH keys on all the nova-compute units and re-share them. If this fixes the problem, then the issue is around not clearing the cache and re-seeding the keys when the migration network is changed.
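A sketch of running that action against every nova-cloud-controller unit; the unit names below are illustrative, so substitute the ones shown by `juju status`:

# Illustrative unit names; take the real list from `juju status`.
for unit in nova-cloud-controller/0 nova-cloud-controller/1 nova-cloud-controller/2; do
    juju run-action "$unit" clear-unit-knownhost-cache --wait
done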