lxc-stop powered off a server

Bug #1520225 reported by Junien F
This bug affects 1 person
Affects: lxc (Ubuntu)
Status: Expired
Importance: High
Assigned to: Unassigned

Bug Description

Hi,

Earlier today, a server got powered off and I'm 99% sure it's because I ran the following command as root :
# lxc-stop -n <container>

The container was started manually with :
# lxc-start -n <container> -F --share-net 1

and it wasn't fully booted (i.e. I didn't have the console yet).

syslog just showed :
Nov 26 11:13:12 foo kernel: [30192021.012313] init: tty4 main process (1298) killed by TERM signal
Nov 26 11:13:12 foo kernel: [30192021.013083] init: tty5 main process (1301) killed by TERM signal
Nov 26 11:13:12 foo kernel: [30192021.013750] init: jujud-unit-landscape-client-0 main process (1303) terminated with status 2
Nov 26 11:13:12 foo kernel: [30192021.014412] init: jujud-unit-ksplice-0 main process (1306) terminated with status 2
Nov 26 11:13:12 foo kernel: [30192021.015017] init: tty2 main process (1307) killed by TERM signal
Nov 26 11:13:12 foo kernel: [30192021.015592] init: tty3 main process (1309) killed by TERM signal
Nov 26 11:13:12 foo kernel: [30192021.016160] init: tty6 main process (1312) killed by TERM signal
Nov 26 11:13:12 foo kernel: [30192021.016810] init: jujud-machine-0 main process (1313) terminated with status 2
Nov 26 11:13:12 foo kernel: [30192021.017423] init: jujud-unit-ubuntu-basenode-0 main process (1314) terminated with status 2
Nov 26 11:13:12 foo kernel: [30192021.018583] init: tty1 main process (1784) killed by TERM signal
Nov 26 11:13:12 foo kernel: [30192021.019164] init: ttyS1 main process (1791) killed by TERM signal
Nov 26 11:13:12 foo kernel: [30192021.019738] init: cron main process (20196) killed by TERM signal
Nov 26 11:13:12 foo kernel: [30192021.020328] init: irqbalance main process (26014) killed by TERM signal
Nov 26 11:13:12 foo kernel: [30192021.020926] init: cgmanager main process (27367) killed by TERM signal

and that's it. I don't have other relevant logs to offer I'm afraid.

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty

~$ dpkg -l|grep lxc
ii liblxc1 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools (library)
ii lxc 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools
ii lxc-templates 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools (templates)
ii python3-lxc 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools (Python 3.x bindings)

apport information below

Thanks
---
ApportVersion: 2.14.1-0ubuntu3.19
Architecture: amd64
DistroRelease: Ubuntu 14.04
Package: lxc 1.0.7-0ubuntu0.10
PackageArchitecture: amd64
ProcCmdline: BOOT_IMAGE=/boot/vmlinuz-3.13.0-68-generic root=UUID=17a1fdea-b0d0-4f6f-a664-ec532f379c64 ro console=tty0 console=ttyS1,38400 nosplash
ProcEnviron:
 TERM=screen-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 3.13.0-68.111-generic 3.13.11-ckt27
Tags: trusty third-party-packages apparmor
Uname: Linux 3.13.0-68-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
defaults.conf:
 lxc.network.type = veth
 lxc.network.link = lxcbr0
 lxc.network.flags = up
 lxc.network.hwaddr = 00:16:3e:xx:xx:xx

Revision history for this message
Junien F (axino) wrote : apport information (attachments: Dependencies.txt, KernLog.txt, RelatedPackageVersions.txt, lxc-net.default.txt, lxc.default.txt, lxcsyslog.txt)

tags: added: apparmor apport-collected third-party-packages trusty
description: updated

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

I can't quite reproduce this. Could you show the container configuration?

Note that your host is trusty (and presumably upstart-based). If your container is also running upstart and you use --share-net 1, then the abstract unix socket which upstart uses to ask init to shut down lives in the host's network namespace, so the container's tools will be talking to the host's upstart.

--share-net 1 is not recommended. It is similar to 'lxc.network.type = none', of which lxc.container.conf(5) warns:

              none: will cause the container to share the host's network namespace. This means the host network devices are usable in the container. It also means that if both the container and host have upstart as init, 'halt' in a container (for instance) will shut down the host.
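
To make the warning above concrete, here is a minimal shell sketch that flags a container config asking for the host's network namespace. The function name `shares_host_netns` is made up for illustration. It only looks for an active `lxc.network.type = none` line (the LXC 1.x key quoted above) and cannot see a `--share-net` option passed on the command line, so treat it as a rough pre-flight check rather than a guarantee.

```shell
# Hypothetical helper (not part of LXC): inspect a container config file
# and report whether it explicitly requests the host's network namespace.
# Usage: shares_host_netns /path/to/config
shares_host_netns() {
    # Match only active (uncommented) lines of the form
    #   lxc.network.type = none
    if grep -Eq '^[[:space:]]*lxc\.network\.type[[:space:]]*=[[:space:]]*none[[:space:]]*$' "$1"; then
        echo "shared"
    else
        echo "isolated"
    fi
}
```

A config like the defaults.conf shown in the description (`lxc.network.type = veth`) would be reported as "isolated" by this check.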

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Please also show the output of

ls /etc/apt/sources.list.d
dpkg -l | grep lxc

Changed in lxc (Ubuntu):
status: New → Incomplete
importance: Undecided → High
Revision history for this message
James Troup (elmo) wrote :

Serge, why do we offer people something which is such a landmine?

Revision history for this message
Stéphane Graber (stgraber) wrote :

Because it's needed for things like the Android container on Ubuntu Touch, application containers on OpenWRT, ...

There are legitimate use cases for those features, those use cases just never apply when running a full Linux distro inside the container :)

LXC used to share the netns by default when no lxc.network entry was set. That was clearly bad design and we've fixed that (now defaulting to an empty netns), but if someone specifically sets lxc.network.type=none or passes --share-net 1, we give them exactly what they requested.
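
One way to confirm what a running container actually got is to compare the network namespace of its init process with the host's. The sketch below assumes a kernel that exposes `/proc/<pid>/ns/net` and enough privilege to read those links. `same_netns` is a hypothetical helper name, not an LXC tool.

```shell
# Hypothetical helper: succeed (exit 0) if two processes are in the same
# network namespace, fail with 1 if they differ, and with 2 if either
# namespace link cannot be read (no such PID, or not enough privilege).
# Usage: same_netns PID_A PID_B
same_netns() {
    a=$(readlink "/proc/$1/ns/net") || return 2
    b=$(readlink "/proc/$2/ns/net") || return 2
    # The link target looks like "net:[4026531956]"; equal targets mean
    # both processes live in the same network namespace.
    [ "$a" = "$b" ]
}
```

As root on the host, something like `same_netns 1 <container init PID>` (with the PID taken from `lxc-info -n <container> -p`) would succeed for a container started with --share-net 1 and fail for one with its own netns.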

Revision history for this message
Junien F (axino) wrote :

Hi,

container config : http://paste.ubuntu.com/13524088/

/etc/apt/sources.list.d just contains a repo for some homemade backports

"dpkg -l | grep lxc" was given in the initial comment, but here it is again :
$ dpkg -l|grep lxc
ii liblxc1 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools (library)
ii lxc 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools
ii lxc-templates 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools (templates)
ii python3-lxc 1.0.7-0ubuntu0.10 amd64 Linux Containers userspace tools (Python 3.x bindings)

I understand that "--share-net 1" can make the container connect to the host's upstart socket. I still don't think "lxc-stop" should stop the host, ever.

Cheers

Revision history for this message
Stéphane Graber (stgraber) wrote :

lxc-stop didn't, upstart in your container did.

lxc-stop sends SIGPWR to the container's PID 1, in your case, upstart. As Serge pointed out, using --share-net means that the upstart tools in the container interact with upstart on the host.

As a result, the shutdown sequence in the container triggered some of the host's jobs (my bet would be the rc job), which caused it to shut down.

To be clear, lxc-stop doesn't have any code which can cause a host shutdown on its own at all. The tool just did exactly what it usually does: send a pre-determined signal to the container's init process. It's that process which then caused the host to shut down.
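
The division of labour described above can be illustrated with a small shell sketch. This is not lxc-stop's code: a background subshell stands in for the container's init, and `demo_sigpwr` is a made-up name. The point it shows is that the sender only delivers SIGPWR, and whatever happens next is decided by the receiving process's handler.

```shell
# Illustration only: a stand-in "init" that reacts to SIGPWR, and a sender
# that does nothing but deliver the signal.
demo_sigpwr() {
    dir=$(mktemp -d)
    (
        # The receiver decides what SIGPWR means. A real init would start
        # its shutdown sequence here.
        trap 'echo "fake init: received SIGPWR, starting my own shutdown"; exit 0' PWR
        : > "$dir/ready"            # tell the sender the handler is installed
        while :; do sleep 0.1; done
    ) &
    child=$!
    # Wait until the handler is in place, then just send the signal.
    while [ ! -e "$dir/ready" ]; do sleep 0.1; done
    kill -PWR "$child"
    wait "$child"
    rm -rf "$dir"
}
```

Calling `demo_sigpwr` prints the message from the stand-in init's handler; the sending side contains no shutdown logic at all.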

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for lxc (Ubuntu) because there has been no activity for 60 days.]

Changed in lxc (Ubuntu):
status: Incomplete → Expired