lxc template fails to stop

Bug #1348386 reported by Tim Penhey
This bug affects 6 people
Affects     Status         Importance   Assigned to           Milestone
juju-core   Fix Released   High         Katherine Cox-Buday
1.20        Fix Released   High         Katherine Cox-Buday

Bug Description

The clone template determines when to stop by watching the console.log in the container directory. This used to contain the output from cloud-init.

It seems that this changed sometime recently: we no longer see any output in the console.log file.

This means that we hit the 5 minute timeout while waiting for the machine to stop. The timeout is measured from the last bit of output we saw, and in this case we never saw any.
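
To illustrate the failure mode described above, here is a minimal sketch of an idle-timeout wait loop. It is not the juju implementation: the function name `wait_for_stop`, the `is_stopped` callback, and the polling approach are all assumptions made for illustration. Only the 5 minute timeout and the idea of watching console.log come from the description.

```python
import os
import time


def wait_for_stop(log_path, is_stopped, idle_timeout=300.0, poll_interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Hypothetical helper: wait for a container to stop.

    Returns True once is_stopped() reports True. Returns False if the
    console log has produced no new output for idle_timeout seconds.
    """
    last_size = 0
    last_activity = clock()
    while True:
        if is_stopped():
            return True
        try:
            size = os.path.getsize(log_path)
        except OSError:
            size = 0  # a missing file is treated as "no output yet"
        if size != last_size:
            last_size = size
            last_activity = clock()  # new output resets the idle timer
        elif clock() - last_activity >= idle_timeout:
            return False  # nothing new for idle_timeout seconds
        sleep(poll_interval)
```

If console.log never receives any output, the idle timer is never reset, so a loop of this shape can only end when the container stops or the full timeout expires.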

Tags: lxc oil clone
Revision history for this message
Tim Penhey (thumper) wrote :

It seems that lxc-start changed the meaning of "-c" from the console.log file to a console device. We need to find when, and how to support both old and new versions.

Revision history for this message
Tim Penhey (thumper) wrote :

This changed between versions 0.8 and 0.9 of lxc.

Ian Booth (wallyworld)
Changed in juju-core:
milestone: next-stable → 1.21-alpha1
Changed in lxc (Ubuntu):
importance: Undecided → High
Revision history for this message
Tim Penhey (thumper) wrote :

Tested on a precise lxc container I have:

It seems we could try -L (or, even better, --console-log) and fall back to -c (or --console) if that fails with exit code 1. We could even check the output for "invalid option".

----
ubuntu@clean-precise:~$ lxc-start -n foo -L omg
lxc-start: invalid option -- 'L'
Usage: lxc-start --name=NAME -- COMMAND

lxc-start start COMMAND in specified container NAME

Options :
  -n, --name=NAME NAME for name of the container
  -d, --daemon daemonize the container
  -f, --rcfile=FILE Load configuration file FILE
  -c, --console=FILE Set the file output for the container console
  -C, --close-all-fds If any fds are inherited, close them
                       If not specified, exit with failure instead
         Note: --daemon implies --close-all-fds
  -s, --define KEY=VAL Assign VAL to configuration variable KEY

Common options :
  -o, --logfile=FILE Output log to FILE instead of stderr
  -l, --logpriority=LEVEL Set log priority to LEVEL
  -q, --quiet Don't produce any output
  -?, --help Give this help list
      --usage Give a short usage message

Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.

See the lxc-start man page for further information.

ubuntu@clean-precise:~$ echo $?
1
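
The fallback proposed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the fix that was committed to juju: the function `start_container` and its injectable `run` parameter are hypothetical, and only the -n, -d, -c and -L flags, the exit code 1, and the "invalid option" text are taken from this report. The sketch uses the short -L form because that is what the transcript tested; the error text for the long --console-log form is not shown here and may differ.

```python
import subprocess


def start_container(name, console_file, run=subprocess.run):
    """Hypothetical helper: start a container with its console logged to a file.

    Tries -L (console log file on newer lxc-start) first, and retries
    with -c if lxc-start rejects -L as an invalid option.
    """
    base = ["lxc-start", "-d", "-n", name]
    result = run(base + ["-L", console_file], capture_output=True, text=True)
    if result.returncode == 1 and "invalid option" in result.stderr:
        # Older lxc-start: -c still means "console output file".
        result = run(base + ["-c", console_file],
                     capture_output=True, text=True)
    return result
```

Checking both the exit code and the message keeps the retry narrow: any other failure of the first attempt is returned to the caller unchanged instead of being masked by a second start.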

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Hi - Please let us know if that does not suffice, and we need to do the fallback when lxc detects -c is not a device. I'll remove the lxc task in the meantime.

no longer affects: lxc (Ubuntu)
Revision history for this message
Ian Booth (wallyworld) wrote :

Juju can retry -c if -L fails (and we have scheduled work to do that).
But there will likely be many other scenarios where people would need to change their scripts, etc., to accommodate the lxc change. If it is possible for lxc to gracefully handle being called using the old argument and Do The Right Thing, I think that would be good.

Changed in juju-core:
assignee: nobody → Katherine Cox-Buday (cox-katherine-e)
Changed in juju-core:
status: Triaged → Fix Committed
Revision history for this message
Claude Durocher (claude-d) wrote :

Using Juju 1.20, I'm stuck with "agent-state-info: template container "juju-trusty-template" did not stop". I see you committed changes to 1.21, but in the meantime, how can I resolve this error?

Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released
Revision history for this message
Greg Lutostanski (lutostag) wrote :

Hit this bug again with precise + 1.20.7 from ppa:juju/stable.

Attaching juju-debug log. If there is anything else you need from my end let me know, we might have it or can collect it next time we hit it.

Thanks.

tags: added: oil
Revision history for this message
Katherine Cox-Buday (cox-katherine-e) wrote :

What version of lxc-start is on this machine?

Revision history for this message
Katherine Cox-Buday (cox-katherine-e) wrote :

It might also help to see the cloud-init log from the "juju-trusty-lxc-template" LXC container. It's possible that something hung on this machine when creating the template.

Revision history for this message
Larry Michel (lmic) wrote :

We've hit this with Juju 1.22. Attaching the juju-debug log file.

machine-4[3705]: 2015-04-05 11:09:12 INFO juju.provisioner.lxc lxc-broker.go:59 starting lxc container for machineId: 4/lxc/1
machine-4[3705]: 2015-04-05 11:09:12 INFO juju.container.lxc clonetemplate.go:98 wait for flock on juju-trusty-lxc-template
machine-4[3705]: 2015-04-05 11:09:12 INFO juju.container lock.go:50 acquire lock "juju-trusty-lxc-template", ensure clone exists
machine-4[3705]: 2015-04-05 11:09:12 INFO juju.container.lxc clonetemplate.go:139 template exists, continuing
machine-4[3705]: 2015-04-05 11:09:12 INFO juju.container lock.go:66 release lock "juju-trusty-lxc-template"
machine-4[3705]: 2015-04-05 11:09:12 INFO juju.container.lxc clonetemplate.go:98 wait for flock on juju-trusty-lxc-template
machine-4[3705]: 2015-04-05 11:09:12 INFO juju.container lock.go:50 acquire lock "juju-trusty-lxc-template", clone
machine-4[3705]: 2015-04-05 11:09:12 INFO juju.container lock.go:66 release lock "juju-trusty-lxc-template"
machine-4[3705]: 2015-04-05 11:09:12 ERROR juju.provisioner.lxc lxc-broker.go:110 failed to start container: lxc container cloning failed: cannot clone a running container
machine-4[3705]: 2015-04-05 11:09:12 ERROR juju.provisioner provisioner_task.go:531 cannot start instance for machine "4/lxc/1": lxc container cloning failed: cannot clone a running container
