Interrupting lxc-clone can destroy source container

Bug #1285850 reported by Kapil Thangavelu
This bug affects 4 people
Affects        Status     Importance  Assigned to  Milestone
lxc (Ubuntu)   Expired    High        Unassigned
  Vivid        Won't Fix  Undecided   Unassigned
  Wily         Won't Fix  Undecided   Unassigned
  Xenial       Expired    High        Unassigned

Bug Description

Ubuntu 13.10. I'm currently scripting lxc-clone to create a dozen containers on demand, but if I interrupt the operation with Ctrl-C, I've seen a few cases where it destroys the source container (using snapshots and btrfs).

$ lxc-clone -s precise-subvolume target-1 -- -S ~/.ssh/id_dsa.pub
ctrl-c

The source container's rootfs directory goes missing, and only a rootfs.hold file remains in place.

btrfs subvolume list also shows nothing for the missing container rootfs.

lxc-version -> 1.0.0.alpha1
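
Until a fixed lxc is installed, a script that drives lxc-clone could at least keep a stray Ctrl-C away from the clone process. The following is only a sketch of that idea, not something proposed in this report: `run_shielded` is a hypothetical helper name, and it assumes a POSIX system where the terminal delivers SIGINT to the foreground process group.

```python
import subprocess
import sys


def run_shielded(cmd):
    """Run cmd to completion and return its exit status.

    Hypothetical helper: it keeps a terminal Ctrl-C from interrupting the child.
    """
    # start_new_session=True runs the child in its own session, so the SIGINT
    # that the terminal sends to the foreground process group never reaches it.
    proc = subprocess.Popen(cmd, start_new_session=True)
    while True:
        try:
            return proc.wait()
        except KeyboardInterrupt:
            # The wrapper itself still receives Ctrl-C; keep waiting anyway.
            print("command still running, ignoring Ctrl-C", file=sys.stderr)
```

A script would call it as `run_shielded(["lxc-clone", "-s", "precise-subvolume", "target-1"])`. This only avoids the interruption; it does not repair the underlying bug, and a SIGKILL or a crash can still stop the clone midway.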

Tags: patch oil
Kapil Thangavelu (hazmat) wrote :

Same problem when deleting an aufs snapshot container (lxc daily PPA) left behind by an interrupted clone. This one was a bit odder:

sudo lxc-destroy -n argo-m3
lxc_container: _recursive_rmdir_onedev: failed to delete /var/lib/lxc/precise-base/rootfs
lxc_container: Error destroying rootfs for argo-m3
Destroying argo-m3 failed

Not sure why it's trying to delete a different container's root. hallyn pointed out that the container config's rootfs pointer is probably being updated too late in the clone process, which leads to this issue.
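
Before running lxc-destroy on a half-created clone, it may be worth checking which rootfs its config actually names. The sketch below is an illustration only: `rootfs_paths` and `foreign_rootfs_paths` are hypothetical helpers, the `lxc.rootfs` key is the one LXC 1.x configs use, and the colon-separated layered form (a backend prefix followed by several paths) is an assumption about how snapshot backends record their layers.

```python
def rootfs_paths(config_text):
    """Return the absolute paths named by lxc.rootfs lines in a config."""
    paths = []
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "lxc.rootfs":
            # Assumption: a value may be "backend:/lower:/upper"; keep only
            # the parts that look like absolute paths.
            paths += [p for p in value.strip().split(":") if p.startswith("/")]
    return paths


def foreign_rootfs_paths(name, config_text, lxcpath="/var/lib/lxc"):
    """Return rootfs paths that live outside the container's own directory."""
    own_dir = lxcpath.rstrip("/") + "/" + name + "/"
    return [p for p in rootfs_paths(config_text)
            if not (p + "/").startswith(own_dir)]
```

For a layered snapshot the lower layer legitimately belongs to the source container, so a non-empty result is a prompt to inspect the config by hand, not proof that destroying the clone is unsafe.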

Changed in lxc:
status: New → Triaged
importance: Undecided → High
Serge Hallyn (serge-hallyn) wrote :

Thanks for submitting this bug.

We were very careful to make sure that what you saw cannot happen in any error path. However, the clone wrote a temporary configuration file for the new container to disk, and that file still pointed at the old rootfs. Interrupting the clone does not let us run the cleanup we do in error paths, so a subsequent lxc-destroy (which the API runs automatically when it sees that creation was interrupted) removes the old rootfs.

The patch below should fix it. I haven't had a chance to test it yet though.
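
The failure mode described in this comment can be modelled with a toy example. This is not the liblxc code and not the patch itself: the one-line config format and the function names are made up for illustration, and "destroy" here simply deletes whatever directory the config names.

```python
import os
import shutil
import tempfile


def write_config(container_dir, rootfs):
    """Write a one-line toy config recording where the rootfs lives."""
    with open(os.path.join(container_dir, "config"), "w") as f:
        f.write(f"lxc.rootfs = {rootfs}\n")


def destroy(container_dir):
    """Toy destroy: delete the rootfs the config names, then the container."""
    with open(os.path.join(container_dir, "config")) as f:
        rootfs = f.read().split("=", 1)[1].strip()
    if rootfs:
        shutil.rmtree(rootfs, ignore_errors=True)
    shutil.rmtree(container_dir, ignore_errors=True)


def source_survives_interrupted_clone(clear_rootfs_first):
    """Return True if the source rootfs still exists after the cleanup."""
    base = tempfile.mkdtemp()
    try:
        src = os.path.join(base, "source")
        src_rootfs = os.path.join(src, "rootfs")
        os.makedirs(src_rootfs)
        write_config(src, src_rootfs)

        dst = os.path.join(base, "target")
        os.makedirs(dst)
        # The clone's temporary config starts as a copy of the source config.
        # Without clearing, it still points at the source's rootfs.
        write_config(dst, "" if clear_rootfs_first else src_rootfs)
        # The interruption happens here, before the target has its own rootfs.
        # The automatic destroy of the unfinished target follows.
        destroy(dst)
        return os.path.isdir(src_rootfs)
    finally:
        shutil.rmtree(base, ignore_errors=True)
```

In this model the source rootfs is lost when the temporary config keeps the old path, and it survives when the rootfs entry is cleared before the config is written.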

Serge Hallyn (serge-hallyn) wrote :
Jason Hobbs (jason-hobbs) wrote :

We just lost a very important container due to this bug. We were using lxc-clone to create the backup to be safe...

Larry Michel (lmic) wrote :

We hit this while cloning our maas container. After killing the lxc-clone, I tried to run lxc-destroy on the target container, and the entire system became unresponsive during the lxc-destroy. After rebooting, I retried destroying the target container and it worked, but lxc-clone kept failing and we saw that the filesystem was missing from the source container.

 ubuntu@peuchen:/$ dpkg -l|grep lxc
rc liblxc0 1.0.0~alpha1-0ubuntu11~ctools0 Linux Containers userspace tools (library)
ii liblxc1 1.1.4-0ubuntu1.1~ubuntu12.04.1~ppa1 Linux Containers userspace tools (library)
ii lxc 1.1.4-0ubuntu1.1~ubuntu12.04.1~ppa1 Linux Containers userspace tools
ii lxc-templates 1.1.4-0ubuntu1.1~ubuntu12.04.1~ppa1 Linux Containers userspace tools (templates)
ii lxcfs 0.11-0ubuntu3~ubuntu12.04.1~ppa1 FUSE based filesystem for LXC
ii python3-lxc 1.1.4-0ubuntu1.1~ubuntu12.04.1~ppa1 Linux Containers userspace tools (Python 3.x bindings)
ubuntu@peuchen:/$ uname -a
Linux peuchen.oil 3.13.0-66-generic #108~precise1-Ubuntu SMP Thu Oct 8 10:07:36 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

tags: added: oil
Serge Hallyn (serge-hallyn) wrote :

This bug should have been marked Fix Committed in March 2014 and Fix Released
shortly thereafter. I'm worried you've found a new bug.

Can you reproduce?

Can you post the config for your container?

Jason Hobbs (jason-hobbs) wrote :

Here's the config for the container: http://paste.ubuntu.com/14432998/

Serge Hallyn (serge-hallyn) wrote :

(Thanks, nothing odd there, will try to reproduce with cloud archive)

Serge Hallyn (serge-hallyn) wrote :

This was fixed by commit 5eea90e8505d9f336bb28379d8575be159fdd2e1; it was GitHub issue http://github.com/lxc/lxc/issues/694.

It needs to be SRUd somewhat urgently.

Serge Hallyn (serge-hallyn) wrote :

This should also be fixed in trusty-backports.

The fix will come in 1.1.6.

tags: added: patch
no longer affects: lxc
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in lxc (Ubuntu Vivid):
status: New → Confirmed
Changed in lxc (Ubuntu Wily):
status: New → Confirmed
Changed in lxc (Ubuntu Xenial):
status: New → Confirmed
Changed in lxc (Ubuntu):
status: New → Confirmed
Rolf Leggewie (r0lf)
Changed in lxc (Ubuntu):
importance: Undecided → High
Changed in lxc (Ubuntu Xenial):
importance: Undecided → High
Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1285850] Re: Interrupting lxc-clone can destroy source container

If you have reproduced this recently, please show us exactly how
and on what version of lxc. This should have been fixed as of commits

3b392519: lxcapi_clone: restore the unexpanded config len
and
5eea90e8: clone: clear the rootfs out of unexpanded config

for github issue 694.

 status incomplete

Changed in lxc (Ubuntu):
status: Confirmed → Incomplete
Jon Grimm (jgrimm) wrote :

Marked Vivid and Wily as Won't Fix as EOL releases.

Changed in lxc (Ubuntu Vivid):
status: Confirmed → Won't Fix
Changed in lxc (Ubuntu Wily):
status: Confirmed → Won't Fix
Robie Basak (racb) wrote :

Mirroring the development release's Incomplete status in Xenial - see comment 15.

Changed in lxc (Ubuntu Xenial):
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for lxc (Ubuntu Xenial) because there has been no activity for 60 days.]

Changed in lxc (Ubuntu Xenial):
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for lxc (Ubuntu) because there has been no activity for 60 days.]

Changed in lxc (Ubuntu):
status: Incomplete → Expired