Cold migration fails

Bug #322779 reported by Soren Hansen
This bug affects 1 person
Affects: opennebula (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Binary package hint: opennebula

Cold migration fails because we connect to qemu:///system: the saved state is therefore owned by root, and we can't copy it to the remote host. We can't switch to qemu:///session, because adding VMs to a bridged network is a privileged operation.

Revision history for this message
Soren Hansen (soren) wrote :

A possible solution would be a SUID binary that, given a VID, would chown /usr/lib/one/<VID>/checkpoint to oneadmin.

Soren Hansen (soren) wrote :

hmm... The SUID binary would of course have to go in the -node package, which is a bit of a shame.

Soren Hansen (soren)
description: updated
Revision history for this message
Soren Hansen (soren) wrote :

In summary, the challenge is that libvirtd runs as root, so it saves the memory state as root with mode 600. We need to copy that file to another host somehow while running as oneadmin.
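
To make the failure mode concrete, here is a minimal sketch of the classic Unix read-permission check. It is a simplified model (no ACLs, no capabilities), the method name is made up for illustration, and uid 1001 is only a stand-in for oneadmin.

```ruby
# Simplified model of the classic Unix read-permission check.
# Illustrative only: ignores ACLs, capabilities and other extensions.
def readable_by?(file_uid, file_gid, mode, uid, gids)
  return true if uid.zero?                               # root bypasses mode bits
  return (mode & 0o400) != 0 if uid == file_uid          # owner read bit
  return (mode & 0o040) != 0 if gids.include?(file_gid)  # group read bit
  (mode & 0o004) != 0                                    # read bit for everyone else
end

# Checkpoint owned by root (uid 0) with mode 600, read by an unprivileged
# user. uid 1001 is a hypothetical stand-in for oneadmin.
puts readable_by?(0, 0, 0o600, 1001, [1001])

# The same file if it were owned by that user instead.
puts readable_by?(1001, 1001, 0o600, 1001, [1001])
```

With mode 600 only the owner bits are set, so the outcome depends entirely on who owns the file, which is why the rest of this thread focuses on ownership of the checkpoint.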

Javi Fontan (jfontan) wrote :

Does this thread shed any light on the problem? https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/235386

Anyway, we are about to reengineer the drivers, so it is a good time to take care of these problems. We will take a look at this ticket.

Soren Hansen (soren) wrote :

To be perfectly honest, I consider this a bug in libvirt, but if you can come up with a decent workaround in OpenNebula, that would be lovely.

Florian Kruse (florian-kruse) wrote :

One year later, this bug still affects me in Karmic Server. Is there any workaround yet?

Ruben S, Montero (rubensm-dacya) wrote : Re: [Bug 322779] Re: Cold migration fails

Hi

This should be working for OpenNebula 1.4. Check issue 131[1] in the
development portal, you can safely apply the associated changes to
OpenNebula 1.2. If you are using 1.4, please send us the log files...

[1] http://dev.opennebula.org/issues/131

Cheers

Ruben

On Fri, Mar 26, 2010 at 7:26 AM, Florian Kruse
<email address hidden> wrote:
> One year later, this bug still affects me in Karmic Server. Is there any
> workaround yet?

--
Dr. Ruben Santiago Montero
Associate Professor (Profesor Titular), Complutense University of Madrid

URL: http://dsa-research.org/doku.php?id=people:ruben
Weblog: http://blog.dsa-research.org/?author=7

Florian Kruse (florian-kruse) wrote :

Reading issue 131, it seems to me that you suggest using qemu:///session instead of qemu:///system. However, this is not an option for me, as I use bridged networking. qemu:///session fails on that:

$ virsh -c qemu:///session create deployment.0
Connecting to uri: qemu:///session
error: Failed to create domain from deployment.0
error: internal error Failed to add tap interface 'vnet%d' to bridge 'br0' : Operation not permitted

Ruben S, Montero (rubensm-dacya) wrote :

Hi,

Well, actually it is the opposite: the patch defaults the driver to use
qemu:///system. It also touches the checkpoint file before
saving the image, so the file belongs to oneadmin and not to root.

You can check the commit

http://dev.opennebula.org/projects/opennebula/repository/revisions/f8252cfe8bc49bc0ecec376476b711e5d2f1c5dd

Cheers

Ruben
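
The idea behind the touch can be sketched in a few lines of Ruby. This is not the OpenNebula code: the method name is hypothetical, and it assumes the later writer opens the existing file in place rather than replacing it. Under that assumption, the file keeps the owner it got when it was first created.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: create the file first (like `touch`), then let a
# writer fill it, and report owner and inode before and after the write.
def touch_then_write(path, data)
  FileUtils.touch(path)                       # file is created by the current user
  before = File.stat(path)
  File.open(path, "w") { |f| f.write(data) }  # truncate and rewrite in place
  after = File.stat(path)
  { uid_before: before.uid, uid_after: after.uid,
    same_inode: before.ino == after.ino, size: after.size }
end

Dir.mktmpdir do |dir|
  p touch_then_write(File.join(dir, "checkpoint"), "saved state")
end
```

The sketch runs everything as one user, so it only shows that rewriting an existing file leaves its inode and owner untouched; it does not reproduce a root process writing into a file created by oneadmin.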


Florian Kruse (florian-kruse) wrote :

Hi,

On 26.03.2010, at 10:56, Ruben S, Montero wrote:
> Well actually is the opposite: the patch defaults the driver to use
> qemu:///system. It also does a touch to the checkpoint file before
> saving the image so it belongs to oneadmin and not to root.

Okay, I just assumed the possibility to change the driver was a suggestion to use qemu:///session (in the bug description it is mentioned as well).

I just missed the touch command. It works like a charm.

> You can check the commit
>
> http://dev.opennebula.org/projects/opennebula/repository/revisions/f8252cfe8bc49bc0ecec376476b711e5d2f1c5dd

Unfortunately the changeset cannot be easily integrated into OpenNebula 1.2, since one_vmm_kvm.rb seems to have been completely rewritten in OpenNebula 1.4. However, I made a small, quick and very dirty workaround for the OpenNebula version currently shipped in Karmic. Below you can see the patch that needs to be applied to /usr/lib/one/mads/one_vmm_kvm.rb.

Yet there is still another problem: AppArmor prevents libvirt from writing checkpoints outside of oneadmin's home. Is there an open bug ticket for that, or should I file a new one? There was a similar bug report in an earlier Ubuntu release, but the fix only gave libvirt the ability to write inside the user's home and not in /var/lib/one/...

$ diff -u /usr/lib/one/mads/one_vmm_kvm.ubuntu-orig.rb /usr/lib/one/mads/one_vmm_kvm.rb
--- /usr/lib/one/mads/one_vmm_kvm.ubuntu-orig.rb 2010-03-26 19:30:51.434615520 +0100
+++ /usr/lib/one/mads/one_vmm_kvm.rb 2010-03-26 19:58:07.475803935 +0100
@@ -112,6 +112,7 @@
     end

     def action_save(args)
+        touch_checkpoint_file(args[2], args[4])
         std_action("SAVE", "save #{args[3]} #{args[4]}", args)
     end

@@ -179,6 +180,18 @@
         res[0].close
         res
     end
+
+    def touch_checkpoint_file(host, file)
+        res=Open3.popen3(
+            "ssh -n #{host} touch #{file} ;"+
+            " echo ExitCode: $? 1>&2")
+        res[0].close
+
+        stdout=res[1].read
+        stderr=res[2].read
+
+        write_response("TOUCH", stdout, stderr, file)
+    end

     def write_response(action, stdout, stderr, args)
         exit_code=get_exit_code(stderr)
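
The patch above interpolates the host and the file name straight into a shell string. As a possible refinement, and only as a sketch, the command could be built with Ruby's standard Shellwords module so that unusual characters in either value are not interpreted by the local or the remote shell. The method name and the sample host and path are made up for illustration; this is not part of OpenNebula.

```ruby
require "shellwords"

# Hypothetical helper, not part of OpenNebula: build the same command line as
# the patch, but escape the file for the remote shell and then escape the
# whole remote command (and the host) for the local shell.
def touch_command(host, file)
  remote = "touch #{Shellwords.escape(file)}"
  "ssh -n #{Shellwords.escape(host)} #{Shellwords.escape(remote)} ;" \
    " echo ExitCode: $? 1>&2"
end

# Sample values are illustrative only.
puts touch_command("node01", "/var/lib/one/42/checkpoint")
puts touch_command("node01", "/tmp/check point")
```

The double escaping is deliberate: the local shell strips one layer when it hands the argument to ssh, and the remote shell strips the second when it runs touch.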

Ruben S, Montero (rubensm-dacya) wrote :

Hi Florian

To address the AppArmor issue in Ubuntu 9.10, just add the
$ONE_LOCATION/var directory to
/etc/apparmor.d/abstractions/libvirt-qemu

For example if ONE_LOCATION = /srv/cloud/one then your libvirt-qemu
apparmor file should be:

...
#include <abstractions/private-files-strict>
owner @{HOME}/ r,
owner @{HOME}/** rw,
/srv/cloud/one/var/** rw,

Then you have to restart the daemon. This should be done on all the
worker nodes of the cluster.

Cheers

PS: You are right, the text of the issue is totally misleading.
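
Since the rule has to be present on every worker node, a small check can help. The following is a rough sketch with hypothetical method names: it derives the rule from ONE_LOCATION in the same form as the example above and looks for it with a plain line comparison. It does not parse AppArmor syntax.

```ruby
# Hypothetical helpers, not part of OpenNebula or the AppArmor tooling.

# Build the rule in the same form as the example: <ONE_LOCATION>/var/** rw,
def apparmor_rule(one_location)
  "#{one_location.chomp('/')}/var/** rw,"
end

# Plain line comparison against the profile text; no real AppArmor parsing.
def rule_present?(profile_text, one_location)
  wanted = apparmor_rule(one_location)
  profile_text.each_line.any? { |line| line.strip == wanted }
end

# Sample profile text mirroring the lines quoted in the message.
profile = <<~PROFILE
  owner @{HOME}/ r,
  owner @{HOME}/** rw,
  /srv/cloud/one/var/** rw,
PROFILE

puts apparmor_rule("/srv/cloud/one")
puts rule_present?(profile, "/srv/cloud/one")
```

On a node, the profile text would come from reading /etc/apparmor.d/abstractions/libvirt-qemu and the location from the ONE_LOCATION environment variable.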


Florian Kruse (florian-kruse) wrote :

Hi!

On 26.03.2010, at 23:45, Ruben S, Montero wrote:
> To address the Apparmor issue in Ubuntu 9.10, just add the
> $ONE_LOCATION/var directory to
> /etc/apparmor.d/abstractions/libvirt-qemu

That's what I found out as well. However, since it took me some time to find out and I think newly installed packages should work out of the box, I consider this missing line a bug. Apparently there is no open bug report for that. I'll file one.

Greetings,

Florian

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package opennebula - 2.0.1-5

---------------
opennebula (2.0.1-5) unstable; urgency=low

  * d/patches/ldflags_build.diff: Proposed patch from Jaime Melis to allow
    LDFLAGS setting in OpenNebula build process. Will be included in next
    upstream release.
  * d/rules: Use dpkg-buildflags to get LDFLAGS.
    Set DEB_LDFLAGS_APPEND=-Wl,--no-as-needed to fix build with
    no-add-needed linker changes.
  * d/patches/fortify-source.diff: Fix FTBFS when build with
    -D_FORTIFY_SOURCE=2 flag.
  * d/rules: Remove d/opennebula-node.postinst in clean target.
  * d/opennebula-node.postinst.in: opennebula-node on node without libvirt
    but with xen-utils was failing as group `libvirt' does not exist.
    Thanks to Łukasz Oleś for report and patch.

opennebula (2.0.1-4) unstable; urgency=low

  * d/opennebula{-node,}.{postinst,postrm}: Fix piuparts failure in postrm.
    We cannot rely on adduser being present at package purge time.
    General cleanup of maintainer scripts.
  * d/opennebula-common.postrm: Don't delete user opennebula
    (keep uid/gid permanently) but disable it.

opennebula (2.0.1-3) unstable; urgency=low

  * d/control: move Depends on openssh-client from opennebula to
    opennebula-common (for ssh-keygen). Closes: #605110.
  * Using dpkg-statoverride instead of chown for postinst.

opennebula (2.0.1-2) unstable; urgency=low

  * d/rules: Fix FTBFS (Closes: #605042) by using dh_listpackages to detect if
    arch all packages (ie. opennebula-node) debhelper commands will act on.

opennebula (2.0.1-1) unstable; urgency=low

  * New upstream release.
  * d/rules: Use share/etc/init.d/one.debian as init.d script.
  * Refresh all patches.
  * d/{control, rules}: Allow users of cloud group to launch xm & xmtop
    from xen-utils-common (Closes: #604567):
    - Depends on libvirt-bin | xen-utils-4.0
    - Bump dependencies on sudo to (>= 1.7.2p1) for /etc/sudoers.d feature.
    - Install /etc/sudoers.d/opennebula-node (in opennebula-node package).
  * d/opennebula.install: Install /var/lib/one/remotes
  * d/control: Set Maintainer as Debian OpenNebula Maintainers
    and myself as Uploaders.

opennebula (2.0-1) experimental; urgency=low

  * First upload to Debian (Closes: #500716):
    - Drop d/patches/fix_cppflags.diff: Merged upstream.
    - Drop d/*.examples: Already handled by upstream install.sh
  * New upstream release (2.0).
  * d/control: Add Recommends: lvm2, sudo, wget, genisoimage.
    - d/patches/genisoimage.diff: Use genisoimage instead of mkisofs.
  * d/rules, d/opennebula-node.postinst.in: Handle group assignment for
    oneadmin user, libvirt (Debian) and libvirtd (Ubuntu).
  * d/opennebula.{postinst,dirs}: creation of /var/lib/one/images with
    proper permissions.
  * d/control: Suggests libamazonec2-ruby for Amazon EC2 access.

opennebula (2.0~rc1-1) UNRELEASED; urgency=low

  * New upstream release (2.0 RC1).
  * Add d/opennebula.README and d/opennebula-node.README as simple
    startup how-to.
  * d/rules: Fix perms for non-executables files.
  * d/opennebula.init: Handle creation of /var/lock.
  * d/patches/default_conf.diff: Switch to tm_ssh as default Transport Manager.
  * d/contr...


Changed in opennebula (Ubuntu):
status: New → Fix Released