live migration is failing with libvirt >= 6.8.0

Bug #1918250 reported by Martin Schuppert
This bug affects 1 person
Affects                   Status         Importance   Assigned to        Milestone
OpenStack Compute (nova)  Invalid        Undecided    Unassigned
tripleo                   Fix Released   Undecided    Martin Schuppert

Bug Description

libvirt 6.8.0 introduced virt-ssh-helper. From its release notes:

+ * remote: ``virt-ssh-helper`` replaces ``nc`` for SSH tunnelling
+
+ Libvirt now provides a ``virt-ssh-helper`` binary on the server
+ side. The libvirt remote client will use this binary for setting
+ up an SSH tunnelled connection to hosts. If not present, it will
+ transparently fallback to the traditional ``nc`` tunnel. The new
+ binary makes it possible for libvirt to transparently connect
+ across hosts even if libvirt is built with a different installation
+ prefix on the client vs server. It also enables remote access to
+ the unprivileged per-user libvirt daemons(eg using a URI such as
+ ``qemu+ssh://hostname/session``. The only requirement is that
+ ``virt-ssh-helper`` is present in $PATH of the remote host.

Libvirt first checks for the `virt-ssh-helper` binary. If it is not present,
libvirt falls back to `nc`.

The code where the 'nova-migration-wrapper' script looks for the
"nc" binary is here [1].

libvirt used to check for `nc` (netcat) first. Two libvirt commits [2][3],
both present in the libvirt build used in this bug, changed it to look for
`virt-ssh-helper` first and to fall back to `nc` only if it is not
available.
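
The lookup order can be sketched in Python. This is only an illustration: libvirt performs the check with a shell snippet sent over SSH, not with Python. The function name `pick_tunnel_command` and its injectable `which` parameter are hypothetical. The two command lines are taken from the log entry in this report, and the sketch leaves out the `nc -q` probing that the real snippet does.

```python
import shutil


def pick_tunnel_command(which=shutil.which):
    """Return the tunnel command in the order libvirt >= 6.8.0 prefers.

    `which` is injectable so the lookup can be simulated without touching
    the real $PATH (hypothetical helper, not part of libvirt or nova).
    """
    if which("virt-ssh-helper") is not None:
        # Preferred since libvirt 6.8.0: the helper gets the driver URI.
        return ["virt-ssh-helper", "qemu:///system"]
    # Traditional fallback: netcat connected to the libvirt UNIX socket.
    return ["nc", "-U", "/var/run/libvirt/libvirt-sock"]


if __name__ == "__main__":
    print(pick_tunnel_command())
```

A wrapper that only recognises the `nc` form rejects the first branch, which is the failure reported here.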

The nova-migration-wrapper doesn't accept this command and denies
the connection.

Mar 08 16:52:39 overcloud-novacompute-1 nova_migration_wrapper[240622]: Denying connection='192.168.24.18 54668 192.168.24.9 2022' command=['sh', '-c', "'which", 'virt-ssh-helper', '1>/dev/null', '2>&1;', 'if', 'test', '$?', '=', '0;', 'then', '', '', '', '', 'virt-ssh-helper', "'qemu:///system';", 'else', '', '', '', 'if', "'nc'", '-q', '2>&1', '|', 'grep', '"requires', 'an', 'argument"', '>/dev/null', '2>&1;', 'then', 'ARG=-q0;else', "ARG=;fi;'nc'", '$ARG', '-U', '/var/run/libvirt/libvirt-sock;', "fi'"]
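
To make the denied command easier to read, the logged token list can be joined back into a single shell command. This is a minimal sketch, assuming the wrapper produced the list by splitting the original SSH command on single spaces (the empty strings in the log suggest this, but the report does not confirm it). The helper names `rebuild_command` and `probes_virt_ssh_helper` are hypothetical and are not part of nova-migration-wrapper.

```python
# Tokens copied verbatim from the "Denying connection" log entry.
LOGGED_COMMAND = [
    'sh', '-c', "'which", 'virt-ssh-helper', '1>/dev/null', '2>&1;',
    'if', 'test', '$?', '=', '0;', 'then', '', '', '', '',
    'virt-ssh-helper', "'qemu:///system';", 'else', '', '', '',
    'if', "'nc'", '-q', '2>&1', '|', 'grep', '"requires', 'an',
    'argument"', '>/dev/null', '2>&1;', 'then', 'ARG=-q0;else',
    "ARG=;fi;'nc'", '$ARG', '-U', '/var/run/libvirt/libvirt-sock;', "fi'",
]


def rebuild_command(tokens):
    """Join tokens with single spaces (assumed inverse of str.split(' '))."""
    return " ".join(tokens)


def probes_virt_ssh_helper(tokens):
    """Hypothetical check: does the command mention virt-ssh-helper at all?"""
    return "virt-ssh-helper" in tokens


if __name__ == "__main__":
    print(rebuild_command(LOGGED_COMMAND))
    print(probes_virt_ssh_helper(LOGGED_COMMAND))
```

Read as one string, the command is the probe-then-fallback script sent by libvirt 6.8.0 and later, not the bare `nc -U ...` invocation that older libvirt versions sent.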

A possible workaround is to force the use of netcat (`nc`) by appending
"&proxy=netcat" to the migration URI. The resulting `diff` of the URI:

  - qemu+ssh://<email address hidden>:2022/system?keyfile=/etc/nova/migration/identity
  + qemu+ssh://<email address hidden>:2022/system?keyfile=/etc/nova/migration/identity&proxy=netcat
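
The URI change can also be made programmatically. The following is a minimal sketch using Python's standard `urllib.parse`. The function name `force_netcat_proxy` is hypothetical, and the user and host in the example (`nova_migration@compute-1`) are placeholders, since the real ones are hidden in this report.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_netcat_proxy(uri):
    """Return `uri` with its query forced to contain proxy=netcat."""
    parts = urlsplit(uri)
    # Keep the existing parameters, dropping any earlier proxy= setting.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "proxy"]
    query.append(("proxy", "netcat"))
    # safe="/" keeps the keyfile path readable instead of %2F-encoded.
    return urlunsplit(parts._replace(query=urlencode(query, safe="/")))


if __name__ == "__main__":
    # Placeholder user and host; only the path and query mirror the report.
    print(force_netcat_proxy(
        "qemu+ssh://nova_migration@compute-1:2022/system"
        "?keyfile=/etc/nova/migration/identity"))
```

Replacing an existing `proxy` value rather than appending a second one keeps the function idempotent, so applying it twice gives the same URI.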

Longer term, however, we want to allow virt-ssh-helper, because it is needed
to work properly with the split daemons, where the socket path has changed.

[1] https://github.com/rdo-packages/nova-distgit/blob/rpm-master/nova-migration-wrapper#L32

[2] https://libvirt.org/git/?p=libvirt.git;a=commit;h=f8ec7c842d (rpc:
    use new virt-ssh-helper binary for remote tunnelling, 2020-07-08)

[3] https://libvirt.org/git/?p=libvirt.git;a=commit;h=7d959c302d (rpc:
    Fix virt-ssh-helper detection, 2020-10-27)

Changed in tripleo:
assignee: nobody → Martin Schuppert (mschuppert)
status: New → In Progress
description: updated
tags: added: train-backport-potential
description: updated
Balazs Gibizer (balazs-gibizer) wrote :

@Martin: You reported this against the upstream nova project, but you are linking to the RDO-specific nova wrapper code. Does the reported problem really affect the upstream nova project?

I'm marking this Invalid from the upstream nova perspective. If you disagree, please set it back to New and help us by pointing to the fault in upstream nova.

Changed in nova:
status: New → Invalid
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 13.6.0

This issue was fixed in the openstack/puppet-tripleo 13.6.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 11.6.0

This issue was fixed in the openstack/puppet-tripleo 11.6.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 12.6.0

This issue was fixed in the openstack/puppet-tripleo 12.6.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 14.1.0

This issue was fixed in the openstack/puppet-tripleo 14.1.0 release.

Changed in tripleo:
status: In Progress → Fix Released