Comment 1 for bug 1627476

Roman Podoliaka (rpodolyaka) wrote :

Live migration fails, because qemu can't bind a socket:

2016-09-24T11:25:15.164333Z qemu-system-x86_64: -incoming tcp:[::]:49152: Failed to bind socket: Address already in use

libvirtd allocates a port number within a configured range (we use the defaults):

root@node-549:~# grep migration_port /etc/libvirt/qemu.conf
#migration_port_min = 49152
#migration_port_max = 49215

The problem is that libvirtd only reserves the port number internally and does not actually *bind* a socket, so there is a race between the allocation done inside libvirtd and the start of the qemu process (which may then fail on bind(), as we see in our case).

libvirtd *must* handle concurrent live migrations to one target host properly by itself, but the default port range seems to intersect with the ephemeral port range (which is used for outgoing connections to remote endpoints):

root@node-549:~# cat /proc/sys/net/ipv4/ip_local_port_range
32768 61000

i.e. the kernel might have already given the very same port number to some other process which called connect() after libvirtd picked it, but before qemu actually started and bound it.
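
The sequence above can be reproduced in miniature. This is only a rough sketch, not libvirtd or qemu code: the `allocated` set stands in for libvirtd's internal bookkeeping, `other` stands in for the unrelated process (here it simply binds a kernel-assigned port, rather than calling connect()), and `try_bind` is a hypothetical helper playing the role of qemu's late bind().

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Try to bind and listen on (host, port).

    Returns (socket, None) on success or (None, errno) on failure.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        s.listen(1)
        return s, None
    except OSError as exc:
        s.close()
        return None, exc.errno


# Stand-in for the unrelated process: the kernel hands it a free port.
other = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
other.bind(("127.0.0.1", 0))
other.listen(1)
taken_port = other.getsockname()[1]

# Bookkeeping-only "allocation": the number is recorded, nothing is bound,
# so the kernel has no idea the port was meant to be reserved.
allocated = {taken_port}

# Stand-in for qemu starting later and binding the port it was told to use.
sock, err = try_bind(taken_port)
print("bind succeeded" if sock else "bind failed: " + errno.errorcode[err])

other.close()
```

On Linux the late bind is expected to fail with EADDRINUSE, which matches the qemu error quoted at the top.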

IMO, we should make sure these two port ranges do not intersect when deploying a compute node.
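
A deployment-time check for this could look roughly like the sketch below. The function names are made up for illustration; the migration range is hard-coded to the libvirt defaults shown above rather than parsed from qemu.conf, and the ephemeral range is the value printed above (on a real node `read_ephemeral_range()` would read it from /proc).

```python
EPHEMERAL_RANGE_PATH = "/proc/sys/net/ipv4/ip_local_port_range"

# libvirt defaults, as shown (commented out) in /etc/libvirt/qemu.conf above.
MIGRATION_RANGE = (49152, 49215)


def parse_port_range(text):
    """Parse 'LOW HIGH' (whitespace separated) into a (low, high) tuple."""
    low, high = text.split()
    return int(low), int(high)


def read_ephemeral_range(path=EPHEMERAL_RANGE_PATH):
    """Read the kernel's ephemeral port range on a Linux host."""
    with open(path) as f:
        return parse_port_range(f.read())


def ranges_overlap(a, b):
    """True if the inclusive ranges a and b share at least one port."""
    return a[0] <= b[1] and b[0] <= a[1]


# Value taken from the output above; use read_ephemeral_range() on a live node.
ephemeral = parse_port_range("32768\t61000\n")
print(ranges_overlap(MIGRATION_RANGE, ephemeral))  # True
```

If the check reports an overlap, either range can be moved: migration_port_min / migration_port_max in qemu.conf, or net.ipv4.ip_local_port_range via sysctl.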