lxc-ls lists running containers multiple times

Bug #1043018 reported by Dan Kegel
This bug affects 1 person
Affects        Status         Importance   Assigned to   Milestone
lxc (Ubuntu)   Fix Released   Low          Unassigned
Precise        Fix Released   Low          Unassigned

Bug Description

====================
SRU Justification:
1. Impact: lxc-ls output can have duplicates.
2. Development fix: use 'netstat -xl' in place of 'netstat -xa' to only show listening sockets
3. Stable fix: same as development fix
4. Test case: start a container, open a second console (with lxc-console -n <containername>). Then do 'lxc-ls' and look for duplicate entries in the second line.
5. Regression potential: none.
====================

This seems wrong:

$ lxc-ls
demo_centos5 demo_centos6 demofedora16 demo_ubuntu_1004 demo_ubuntu_1204 demo_ubuntu_1204-temp-NjwI1BQ ubu12-bb-01-ubu12
ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12
ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 ubu12-bb-01-ubu12

Lessee: lxc-ls does
active=$(netstat -xa 2>/dev/null | grep $lxcpath | \
        sed -e 's#.*'"$lxcpath/"'\(.*\)/command#\1#');

which expands to

++ netstat -xa
++ grep /var/lib/lxc
++ sed -e 's#.*/var/lib/lxc/\(.*\)/command#\1#'
+ active='ubu12-bb-01-ubu12
ubu12-bb-01-ubu12
ubu12-bb-01-ubu12
ubu12-bb-01-ubu12
ubu12-bb-01-ubu12
ubu12-bb-01-ubu12
ubu12-bb-01-ubu12
...

netstat -xa | grep /var/lib/lxc shows
$ netstat -xa | grep /var/lib/lxc
unix 2 [ ACC ] STREAM LISTENING 1509155 @/var/lib/lxc/ubu12-bb-01-ubu12/command
unix 2 [ ] STREAM CONNECTING 0 @/var/lib/lxc/ubu12-bb-01-ubu12/command
unix 2 [ ] STREAM CONNECTING 0 @/var/lib/lxc/ubu12-bb-01-ubu12/command
unix 3 [ ] STREAM CONNECTING 0 @/var/lib/lxc/ubu12-bb-01-ubu12/command

So, perhaps that should be grep "LISTENING.*$lxcpath"
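
A minimal sketch of that suggestion, run against the sample netstat lines quoted above rather than a live system. The helper name active_containers is hypothetical, and the real lxc-ls script may be structured differently.

```shell
#!/bin/sh
# Sketch only: filters captured netstat output instead of calling netstat.
lxcpath=/var/lib/lxc

# Hypothetical helper: read netstat-style lines on stdin and print the
# container name from each LISTENING command socket.
active_containers() {
    grep "LISTENING.*$lxcpath" | \
        sed -e 's#.*'"$lxcpath/"'\(.*\)/command#\1#'
}

# Sample lines copied from the report above.
sample='unix 2 [ ACC ] STREAM LISTENING 1509155 @/var/lib/lxc/ubu12-bb-01-ubu12/command
unix 2 [ ] STREAM CONNECTING 0 @/var/lib/lxc/ubu12-bb-01-ubu12/command
unix 2 [ ] STREAM CONNECTING 0 @/var/lib/lxc/ubu12-bb-01-ubu12/command
unix 3 [ ] STREAM CONNECTING 0 @/var/lib/lxc/ubu12-bb-01-ubu12/command'

printf '%s\n' "$sample" | active_containers
```

With the LISTENING filter in place, the three CONNECTING lines are dropped and the container name is printed once.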

Serge Hallyn (serge-hallyn) wrote :

Thanks for reporting this bug.

I actually cannot reproduce this (on Ubuntu quantal). I have 'q1' started with an additional console opened, but lxc-ls shows:

serge@amd1:~$ sudo lxc-ls
b1 b2 bb1 p1 q1
q1

Is there something else I should be doing to reproduce this?

Can you tell us which release you are on, which lxc version you are using, and the container configuration file?

Changed in lxc (Ubuntu):
importance: Undecided → Low
status: New → Incomplete
Dan Kegel (dank) wrote :

This was on ubuntu 12.04 after doing apt-get dist-upgrade but before rebooting,
but I can still get it to happen after reboot, so it's still valid, I think.

I can reliably reproduce it by doing

sudo lxc-create -n demo_ubuntu_1204 -t ubuntu -- -r precise --bindhome $LOGNAME
lxc-start-ephemeral -o demo_ubuntu_1204 echo hi

lxc-start-ephemeral will hang. (At this point, netstat -xa | grep /var/lib/lxc shows a LISTENING entry
for the container.)
Pressing ^C now will fail to terminate lxc-start-ephemeral, but netstat -xa | grep /var/lib/lxc
will show both a LISTENING and a CONNECTING entry, and lxc-ls shows duplicate entries.

At the same time, my home directory goes bonkers and can't be read. It seems the
script's mount magic breaks my NFS home directory?

Dan Kegel (dank) wrote :

I see a problem with just lxc-start, too, not lxc-start-ephemeral.
After doing
  sudo lxc-start -n demo_ubuntu_1204
and then in another window
  sudo lxc-console -n demo_centos6-temp-j4G0FcH
once I log in, the guest hangs, and lxc-ls shows

demo_ubuntu_1204 demo_ubuntu_1204-temp-JjkTzkK demo_ubuntu_1204-temp-oMUDGfY
demo_ubuntu_1204-temp-oMUDGfY demo_ubuntu_1204-temp-oMUDGfY

mount on the host shows

/dev/mapper/ubu12test-root on / type ext4 (rw,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
cgroup on /sys/fs/cgroup type tmpfs (rw,relatime,mode=755)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu)
cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event)
/dev/sda1 on /boot type ext2 (rw)
rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw)
obnas1a:/vol/homes/dank on /mnt/home/dank type nfs (rw,soft,tcp,rsize=32768,wsize=32768,nfsvers=3,sloppy,addr=10.10.1.201)
none on /tmp/lxc-lp-KsohZOA type tmpfs (rw)
none on /var/lib/lxc/demo_ubuntu_1204-temp-oMUDGfY type overlayfs (rw,upperdir=/tmp/lxc-lp-KsohZOA,lowerdir=/var/lib/lxc/demo_ubuntu_1204)
none on /var/lib/lxc/demo_ubuntu_1204-temp-oMUDGfY/ephemeralbind type tmpfs (rw)

I suppose next I should try this on an account with a local home directory.

Dan Kegel (dank) wrote :

Yeah, the hang only happens on users with nfs home directories.
I should file a separate bug for that.

Even on users with a local home directory, though, lxc-ls lists containers multiple times
after starting and logging into a container. For instance,

$ lxc-ls
demo_ubuntu_1204
demo_ubuntu_1204
$ bash -x /usr/bin/lxc-ls
+ lxcpath=/var/lib/lxc
+ '[' '!' -r /var/lib/lxc ']'
+ ls -- /var/lib/lxc
demo_ubuntu_1204
++ netstat -xa
++ grep /var/lib/lxc
++ sed -e 's#.*/var/lib/lxc/\(.*\)/command#\1#'
+ active=demo_ubuntu_1204
+ test -n demo_ubuntu_1204
+ get_cgroup
+ local mount_string
++ mount -t cgroup
++ grep -E -e '^lxc '
+ mount_string=
+ test -n ''
++ grep -m1 -E '^[^ \t]+[ \t]+[^ \t]+[ \t]+cgroup' /proc/self/mounts
+ mount_string='cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset,clone_children 0 0'
+ test -z 'cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset,clone_children 0 0'
++ echo 'cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset,clone_children 0 0'
++ cut '-d ' -f2
+ mount_point=/sys/fs/cgroup/cpuset
+ test -n /sys/fs/cgroup/cpuset
++ cat /proc/1/cgroup
++ awk -F: '{ print $3 }'
++ head -1
+ init_cgroup=/
+ cd /sys/fs/cgroup/cpuset///lxc
+ ls -d -- demo_ubuntu_1204
demo_ubuntu_1204

So one copy is coming from the ls of /var/lib/lxc, the other is coming from netstat -xa.

Why the two paths?

Changed in lxc (Ubuntu):
status: Incomplete → New
Serge Hallyn (serge-hallyn) wrote :

Ah, I see the problem. This is fixed in quantal, but is still broken in precise. The fix is to replace 'netstat -xa' with 'netstat -xl'. We'll get that SRU'd.

Thanks again for the bug report.

Changed in lxc (Ubuntu):
status: New → Fix Released
Changed in lxc (Ubuntu Precise):
status: New → In Progress
importance: Undecided → Low
assignee: nobody → Serge Hallyn (serge-hallyn)
description: updated
Serge Hallyn (serge-hallyn) wrote :

(Apologies, I see that's what you had said earlier :)

Serge Hallyn (serge-hallyn) wrote :
tags: added: needsru
Changed in lxc (Ubuntu Precise):
assignee: Serge Hallyn (serge-hallyn) → nobody
status: In Progress → Triaged
Dan Kegel (dank) wrote :

That's necessary, but not sufficient. There is still duplicate output even with that change.
lxc-ls first lists all containers with "ls -- /var/lib/lxc", and then, inexplicably, also lists all active
containers with netstat. Why the duplication? Should it use sort -u?
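
For illustration, this is what sort -u would do to a list with repeated names. It is a sketch on names taken from this report, not the fix that was eventually shipped, and the helper name unique_names is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: collapse repeated container names read from stdin,
# one name per line.
unique_names() {
    sort -u
}

# Three input lines, two of them identical.
printf '%s\n' ubu12-bb-01-ubu12 ubu12-bb-01-ubu12 demo_ubuntu_1204 | unique_names
```

Note that sort -u only removes repeats within a single list. It would not merge two separately printed lists unless both were fed through the same pipeline.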

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1043018] Re: lxc-ls lists running containers multiple times

Quoting Dan Kegel (<email address hidden>):
> That's necessary, but not sufficient. There is still duplicate output even with that change.
> lxc-ls first lists all containers with "ls -- /var/lib/lxc", and then, inexplicably, also lists all active
> containers with netstat. Why the duplication? Should it use sort -u?

If I understand what you're saying correctly, that's not duplication - the
first line is all existing containers, the second line is all active
containers.
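
A rough sketch of that two-line format, using a throwaway directory in place of /var/lib/lxc and taking the active names as arguments instead of querying netstat. The function name two_line_listing is hypothetical; this is not the lxc-ls source.

```shell
#!/bin/sh
# Hypothetical helper: print existing containers on line one and the
# given active names on line two. Arguments: <lxcpath> [active-name...]
two_line_listing() {
    dir=$1
    shift
    # Line 1: every entry under the container path, joined by spaces.
    ls -- "$dir" | paste -s -d ' ' -
    # Line 2: the active names, joined by spaces.
    echo "$*"
}

# Demo with a temporary directory standing in for /var/lib/lxc.
demo=$(mktemp -d)
mkdir "$demo/b1" "$demo/q1"
two_line_listing "$demo" q1
rm -rf "$demo"
```

A running container such as q1 therefore appears once on each line, which is the behavior described here and not the duplication within the second line that the netstat change addresses.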

Have I misunderstood what you're saying? :) Apologies if so.

Serge Hallyn (serge-hallyn) wrote :

Note also that many people prefer to use 'lxc-list' to 'lxc-ls'. (I am
not one of them)

Dan Kegel (dank) wrote :

Then perhaps the bug is in the manpage, http://manpages.ubuntu.com/manpages/precise/en/man1/lxc-ls.1.html
which doesn't say anything about listing existing containers on one line, and active containers on a second line.

Serge Hallyn (serge-hallyn) wrote :

Quoting Dan Kegel (<email address hidden>):
> Then perhaps the bug is in the manpage, http://manpages.ubuntu.com/manpages/precise/en/man1/lxc-ls.1.html
> which doesn't say anything about listing existing containers on one line, and active containers on a second line.

Hm, yes. Upstream's lxc-ls used to print both, but as of May 4 that
has changed. The man page however never stated the old behavior.

We'll sync the new behavior when 13.04 opens up. For 12.10 I'm not
sure whether we should fix the man page, or change the behavior.
(I personally like the current behavior)

Dan Kegel (dank) wrote :

I find the old behavior confusing, and http://www.greenhills.co.uk/2011/06/10/lxc.html seems to agree. It says
"I’m going to skip lxc-ls because it’s needlessly confusing."

Serge Hallyn (serge-hallyn) wrote :

@Dan,

we'll discuss the preferred 12.10 behavior at UDS. In the meantime, many people prefer to use lxc-list over lxc-ls.

Changed in lxc (Ubuntu Precise):
status: Triaged → In Progress
Clint Byrum (clint-fewbar) wrote : Please test proposed package

Hello Dan, or anyone else affected,

Accepted lxc into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/lxc/0.7.5-3ubuntu64 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in lxc (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Serge Hallyn (serge-hallyn) wrote :

The fix is not in precise-proposed. lxc-ls still uses netstat -xa rather than netstat -xl.
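
One way to make that check repeatable is to grep the installed script for the netstat invocation it contains. This is a sketch under assumptions: the helper name netstat_mode is hypothetical, and the demo runs on a temporary file, not on the real /usr/bin/lxc-ls.

```shell
#!/bin/sh
# Hypothetical helper: report which netstat invocation a script contains.
netstat_mode() {
    if grep -q 'netstat -xl' "$1"; then
        echo fixed
    elif grep -q 'netstat -xa' "$1"; then
        echo unfixed
    else
        echo unknown
    fi
}

# Demo on a throwaway file holding the old invocation; on a real system
# one might pass /usr/bin/lxc-ls instead.
script=$(mktemp)
echo 'active=$(netstat -xa 2>/dev/null | grep $lxcpath)' > "$script"
netstat_mode "$script"
rm -f "$script"
```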

tags: added: verification-failed
removed: verification-needed
Serge Hallyn (serge-hallyn) wrote :

(I will upload a new package with the full fix once I check all other bugs in this SRU)

Clint Byrum (clint-fewbar) wrote :

Hello Dan, or anyone else affected,

Accepted lxc into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/lxc/0.7.5-3ubuntu65 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: removed: verification-failed
tags: added: verification-needed
tags: added: verification-done
removed: needsru verification-needed
Clint Byrum (clint-fewbar) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxc - 0.7.5-3ubuntu65

---------------
lxc (0.7.5-3ubuntu65) precise-proposed; urgency=low

  * Add proper fix (X001-lxc-ls-onelisting) for lxc-ls showing running
    containers multiple times. (LP: #1043018)

lxc (0.7.5-3ubuntu64) precise-proposed; urgency=low

  [ Serge Hallyn ]
  * lxc.lxc-net.upstart: tell iptables not to masquerade packets between
    containers. (LP: #1045947)
  * 0204-ubuntu-cloud-userdata-path: Fix broken behavior when a relative
    path is passed into '--userdata' argument. (LP: #1043582)
  * 0205-lxc-ls-manpage-document-two-lines: Document the default two-line
    output format of lxc-ls. (LP: #1043018)
  * lxc-start-ephemeral: support fedora and centos (LP: #1042431)
  * 0222-debian-dhcp3-package: fix install of debian testing containers.
    (LP: #1052972)
  * 0100-template-cleanup-cache: clean up template cache if interrupted
    during build. (LP: #1037331)

  [ Scott Moser ]
  * 0225-ubuntu-cloud-numeric-owner: use --numeric-owner when extracting root
    filesystems with tar (LP: #1066084)
 -- Serge Hallyn <email address hidden> Wed, 07 Nov 2012 11:03:36 -0600

Changed in lxc (Ubuntu Precise):
status: Fix Committed → Fix Released