File truncated after lxc snapshot restore

Bug #1646458 reported by Adrian Simmons
This bug affects 1 person
Affects: lxd (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

I'm having a persistent problem with a container-based app breaking after a restore from snapshot: several lines are truncated from the end of a file within the container.

I can stop/start the container, or copy it and copy it back to the same name, and everything is fine. Taking snapshots works fine, but after a restore from a snapshot the file in the container is damaged. Even deleting and recreating the container leads to the same problem.

Snapshot restore was working but started breaking about 2-3 weeks ago. Apologies for not having paid more attention when this bug started occurring, but I was dealing with bugs in the app and it wasn't immediately clear this was a problem with lxd.

Taking and restoring a snapshot in a VirtualBox VM with the same software installed works perfectly. The problem seems specific to restoring an lxc container.

I upgraded the host system from 16.04 to 16.10 in the (mistaken) hope it would fix this issue. So currently the host is/has:
Ubuntu 16.10, Linux 4.8.0-28-generic #30-Ubuntu SMP
lxd:
  Installed: 2.4.1-0ubuntu1
  Candidate: 2.4.1-0ubuntu1
  Version table:
 *** 2.4.1-0ubuntu1 500
        500 http://gb.archive.ubuntu.com/ubuntu yakkety/main amd64 Packages
        100 /var/lib/dpkg/status

The host is an HP N40L microserver with an AMD Turion II Neo N40L CPU. The zpool is on a spinning HD with no hardware or software RAID.

zfs utils show no apparent problems with the zpool:
sudo zpool status -v
  pool: lxdfs
 state: ONLINE
  scan: scrub repaired 0 in 0h2m with 0 errors on Sun Oct 9 00:26:06 2016
config:

 NAME                        STATE   READ  WRITE  CKSUM
 lxdfs                       ONLINE     0      0      0
   /var/lib/lxd/lxd-zfs.img  ONLINE     0      0      0

errors: No known data errors

## What I do
- lxd is set up to use bridged networking, allowing containers to obtain an IP via DHCP from my router
- The guest container is created and the app installed with:
lxc launch ubuntu:14.04 ods -c security.privileged=true
lxc file push ~/.ssh/id_rsa.pub ods/home/ubuntu/.ssh/authorized_keys --mode=0600 --uid=1000

- Then log into the container:
ssh ubuntu@ods
sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get autoremove
sudo shutdown -r now

- Take a snapshot on the host:
lxc snapshot ods upgraded

- Log into the guest and install the web app stack:
ssh ubuntu@ods
wget https://github.com/opendevshop/devshop/releases/download/1.0.0-beta10/install.sh
sudo -H bash install.sh --server-webserver=nginx

- Once the install is complete, log into the web app, set the admin password, etc.
- Take another snapshot on the host:
lxc snapshot ods devshop-installed
- Restore last snapshot:
lxc restore ods devshop-installed
- The previously working app now throws an error. On investigation, it turns out one particular config file is now missing about 4 lines.
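
To pin down exactly which lines go missing, one option is to pull a copy of the affected file before taking the snapshot and again after the restore, then compare the two copies. The sketch below is a hypothetical helper, not part of lxd: the function name `report_truncation` and the file names are assumptions, and the path of the affected config file is not given in this report. The line count is based on newline characters, so a file cut off mid-line is reported only approximately.

```shell
#!/bin/sh
# Hypothetical helper: compare a copy of a file taken before the snapshot
# with a copy taken after the restore, and report lines lost from the end.
report_truncation() {
    before="$1"
    after="$2"
    # Byte-identical copies mean nothing was lost.
    if cmp -s "$before" "$after"; then
        echo "identical"
        return 0
    fi
    before_lines=$(wc -l < "$before")
    after_lines=$(wc -l < "$after")
    missing=$((before_lines - after_lines))
    echo "differs: $missing line(s) missing"
    # Print the trailing lines present only in the pre-snapshot copy.
    if [ "$missing" -gt 0 ]; then
        tail -n "$missing" "$before"
    fi
    return 1
}
```

The copies could be fetched with `lxc file pull ods/<path-to-config> before.conf` before `lxc snapshot` and the same pull into `after.conf` after `lxc restore`, followed by `report_truncation before.conf after.conf`.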

More details to come.

Adrian Simmons (adrian-perlucida) wrote :

Actually, I can't recreate this problem now: lxd/lxc now requests that criu be installed for restore (live migration), but installing criu and trying to restore throws a criu error.

I think this report can be closed for now. Sorry for the noise.

Adrian Simmons (adrian-perlucida) wrote :

Re-installed criu.
Deleted all containers and recreated my problem container again.
Just successfully took and restored a snapshot.
The issue appears to be fixed in a fully up-to-date 16.10 install.

Stéphane Graber (stgraber) wrote :

Glad that things are working again!

Changed in lxd (Ubuntu):
status: New → Invalid

Adrian Simmons (adrinux) wrote :

Spoke too soon. It's still happening.
The current workaround is to avoid snapshots and use only copy.
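
A copy-based workaround might look roughly like the sketch below. The function names and the backup container name are hypothetical. `lxc copy`, `lxc stop`, `lxc delete`, and `lxc start` are existing subcommands, while the `LXC` variable is an assumption added here so the commands can be dry-run by setting `LXC="echo lxc"`.

```shell
#!/bin/sh
# Sketch of using full container copies instead of snapshots.
# LXC can be overridden (e.g. LXC="echo lxc") to print commands instead of running them.
LXC="${LXC:-lxc}"

# Save the current state of a container as a separate full copy.
backup_by_copy() {
    $LXC copy "$1" "$2"
}

# Replace the container with the previously saved copy.
# Assumes the container is currently running; stops at the first failing step.
rollback_from_copy() {
    $LXC stop "$1" &&
    $LXC delete "$1" &&
    $LXC copy "$2" "$1" &&
    $LXC start "$1"
}
```

For example, `backup_by_copy ods ods-devshop-installed` would stand in for `lxc snapshot ods devshop-installed`, and `rollback_from_copy ods ods-devshop-installed` for the restore. A full copy takes more time and disk space than a snapshot.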
