Comment 7 for bug 1580765

Brian Candler (b-candler) wrote : Re: Live migration error: Can't mount at ./dev/.lxd-mounts

Thank you. I have updated to lxd/xenial-proposed on both nodes. Now I am back to the original problem:

root@nuc1:~# lxc move sample nuc2:sample2
error: Error transferring container data: restore failed:
(00.173165) 1: Error (mount.c:2406): mnt: Can't mount at ./dev/.lxd-mounts: No such file or directory
(00.177785) Error (cr-restore.c:1352): 4619 killed by signal 9
(00.214312) Error (cr-restore.c:2182): Restoring FAILED.

On the target node, if I do "ls /var/lib/lxd/containers/sample2/rootfs/dev" repeatedly, then shortly after the migration starts I see the directory appear:

root@nuc2:~# ls /var/lib/lxd/containers/sample2/rootfs/dev
agpgart core kmem loop6 midi02 mixer2 pts ram13 ram5 rmidi1 smpte2 tty0 tty7
audio dsp loop0 loop7 midi03 mixer3 ram ram14 ram6 rmidi2 smpte3 tty1 tty8
audio1 dsp1 loop1 mapper midi1 mpu401data ram0 ram15 ram7 rmidi3 sndstat tty2 tty9
audio2 dsp2 loop2 mem midi2 mpu401stat ram1 ram16 ram8 sequencer stderr tty3 urandom
audio3 dsp3 loop3 midi0 midi3 null ram10 ram2 ram9 shm stdin tty4 zero
audioctl fd loop4 midi00 mixer port ram11 ram3 random smpte0 stdout tty5
console full loop5 midi01 mixer1 ptmx ram12 ram4 rmidi0 smpte1 tty tty6

... and then finally it vanishes again:

root@nuc2:~# ls /var/lib/lxd/containers/sample2/rootfs/dev
ls: cannot access '/var/lib/lxd/containers/sample2/rootfs/dev': No such file or directory

At least the problem is reproducible. Trying again with inotifywait on the target:

root@nuc2:~# inotifywait /var/lib/lxd/shmounts
Setting up watches.
Watches established.
/var/lib/lxd/shmounts/ CREATE,ISDIR sample2
root@nuc2:~#

Then the same error. It seems that the sample2 directory under /var/lib/lxd/shmounts appears and disappears very quickly:

root@nuc2:~# inotifywait /var/lib/lxd/shmounts; while ! ls /var/lib/lxd/shmounts/sample2 >/dev/null; do sleep 0.01; done; while ls /var/lib/lxd/shmounts/sample2; do sleep 0.01; done
Setting up watches.
Watches established.
/var/lib/lxd/shmounts/ CREATE,ISDIR sample2
ls: cannot access '/var/lib/lxd/shmounts/sample2': No such file or directory
root@nuc2:~#
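
For anyone trying to reproduce this, the polling one-liner above could be wrapped in a small helper that also timestamps both events and lists the directory while it exists. This is only a sketch: watch_lifetime is a made-up name, not part of LXD or inotify-tools, and it assumes GNU sleep and date (fractional seconds and %N), as on Ubuntu.

```shell
# Hypothetical helper: poll for PATH, report when it appears and when it vanishes.
# Usage: watch_lifetime PATH [MAX_POLLS]
watch_lifetime() {
    wl_path=$1
    wl_max=${2:-3000}    # total poll budget, roughly 30 s at 0.01 s per poll
    wl_polls=0

    # Phase 1: wait for the path to appear.
    while [ ! -e "$wl_path" ]; do
        wl_polls=$((wl_polls + 1))
        if [ "$wl_polls" -ge "$wl_max" ]; then
            echo "timeout waiting for $wl_path"
            return 1
        fi
        sleep 0.01
    done
    echo "appeared $(date +%s.%N) $wl_path"
    # List the contents once; the path may already be gone, so ignore errors.
    ls -la "$wl_path" 2>/dev/null

    # Phase 2: wait for the path to vanish again.
    while [ -e "$wl_path" ]; do
        wl_polls=$((wl_polls + 1))
        if [ "$wl_polls" -ge "$wl_max" ]; then
            echo "still present $wl_path"
            return 2
        fi
        sleep 0.01
    done
    echo "vanished $(date +%s.%N) $wl_path"
}
```

On the target it would be started just before the migration, for example as: watch_lifetime /var/lib/lxd/shmounts/sample2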

Next I tried running strace on the target:

root@nuc2:~# strace -f -p [pid-of-lxd] 2>/tmp/strace.out

This worked once (that is, the migration failed with the same mount error while strace was attached). Unfortunately I then ran it again with -s 128 to capture longer strings:

root@nuc2:~# strace -f -s 128 -p [pid-of-lxd] 2>/tmp/strace.out

(which of course overwrote my strace file), and now I get a different error:

root@nuc1:~# lxc move sample nuc2:sample2
error: Error transferring container data: restore failed:
(00.173337) Error (cr-restore.c:2012): Can't attach to init: Operation not permitted
(00.194078) Error (cr-restore.c:2182): Restoring FAILED.

I am now getting this error on every migration with strace attached (even without -s 128, even after rebooting the nodes). I wish I had kept the first file :-(
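
To avoid losing a trace this way again, each run could write to a file name that is not yet in use. A minimal sketch, assuming nothing beyond standard shell tools; new_trace_file is a made-up helper name and the strace.STAMP.out naming is arbitrary:

```shell
# Hypothetical helper: print a trace file name under DIR that does not exist yet.
# Usage: new_trace_file [DIR]
new_trace_file() {
    ntf_dir=${1:-/tmp}
    ntf_stamp=$(date +%Y%m%d-%H%M%S)
    ntf_n=0
    ntf_candidate="$ntf_dir/strace.$ntf_stamp.out"
    # If a trace with this timestamp already exists, add a counter.
    while [ -e "$ntf_candidate" ]; do
        ntf_n=$((ntf_n + 1))
        ntf_candidate="$ntf_dir/strace.$ntf_stamp.$ntf_n.out"
    done
    echo "$ntf_candidate"
}

# Example, left commented out; [pid-of-lxd] is the same placeholder as above,
# and -o makes strace write to the named file instead of stderr:
# strace -f -s 128 -o "$(new_trace_file /tmp)" -p [pid-of-lxd]
```

The check and the later file creation are not atomic, so this only guards against overwriting one's own earlier runs, not against concurrent writers.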

Any other suggestions for what to look for? Are you interested in the strace from the "Can't attach to init" problem?