mininet created devices are not deleted correctly

Bug #1197078 reported by James Page
This bug affects 3 people
Affects            Status      Importance   Assigned to   Milestone
linux (Ubuntu)     Confirmed   Medium       Unassigned
mininet (Ubuntu)   Confirmed   Medium       Unassigned

Bug Description

Raised upstream:

https://mailman.stanford.edu/pipermail/mininet-discuss/2013-July/002473.html

Also seen in DEP-8 tests for Saucy:

https://jenkins.qa.ubuntu.com/job/saucy-adt-openvswitch/28/ARCH=i386,label=adt/

Syslog extract:

Jul 1 10:57:42 autopkgtest ovs-vsctl: 00001|vsctl|INFO|Called as ovs-vsctl del-br s2
Jul 1 10:57:42 autopkgtest kernel: [ 402.953438] device s2-eth3 left promiscuous mode
Jul 1 10:57:42 autopkgtest kernel: [ 402.953588] device s2-eth4 left promiscuous mode
Jul 1 10:57:42 autopkgtest kernel: [ 402.953724] device s2-eth7 left promiscuous mode
Jul 1 10:57:42 autopkgtest kernel: [ 402.953853] device s2-eth6 left promiscuous mode
Jul 1 10:57:42 autopkgtest kernel: [ 402.953986] device s2-eth2 left promiscuous mode
Jul 1 10:57:42 autopkgtest kernel: [ 402.954049] device s2 left promiscuous mode
Jul 1 10:57:42 autopkgtest ovs-controller: 00366|poll_loop|DBG|wakeup due to [POLLIN] on fd 9 (127.0.0.1:6633<->127.0.0.1:40235) at ../lib/stream-fd.c:142 (0% CPU usage)
Jul 1 10:57:42 autopkgtest ovs-controller: 00367|rconn|DBG|tcp:127.0.0.1:40235: connection closed by peer
Jul 1 10:57:42 autopkgtest ovs-controller: 00368|rconn|DBG|void: entering VOID
Jul 1 10:57:42 autopkgtest ovs-controller: 00369|poll_loop|DBG|wakeup due to [POLLIN] on fd 6 (FIFO pipe:[44662]) at ../lib/fatal-signal.c:173 (0% CPU usage)
Jul 1 10:57:42 autopkgtest ovs-controller: 00370|fatal_signal|WARN|terminating with signal 15 (Terminated)
Jul 1 10:57:42 autopkgtest kernel: [ 403.376840] IPv6: ADDRCONF(NETDEV_UP): s1-eth1: link is not ready
Jul 1 10:57:52 autopkgtest kernel: [ 413.552113] unregister_netdevice: waiting for lo to become free. Usage count = 2
Jul 1 10:58:03 autopkgtest kernel: [ 423.792110] unregister_netdevice: waiting for lo to become free. Usage count = 2
Jul 1 10:58:13 autopkgtest kernel: [ 434.032116] unregister_netdevice: waiting for lo to become free. Usage count = 2
Jul 1 10:58:23 autopkgtest kernel: [ 444.272126] unregister_netdevice: waiting for lo to become free. Usage count = 2
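
To check other logs for the same symptom, the repeating kernel message can be picked out with a short script. This is only a sketch: the regular expression is derived from the message text quoted above, and the function name `find_stuck_devices` is made up for illustration.

```python
import re

# Pattern derived from the kernel message quoted in the syslog extract.
PATTERN = re.compile(
    r"unregister_netdevice: waiting for (?P<dev>\S+) to become free\. "
    r"Usage count = (?P<count>\d+)"
)


def find_stuck_devices(lines):
    """Return (device, usage_count) for every matching log line."""
    hits = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            hits.append((match.group("dev"), int(match.group("count"))))
    return hits


if __name__ == "__main__":
    sample = [
        "Jul 1 10:57:42 autopkgtest kernel: [ 402.954049] device s2 left promiscuous mode",
        "Jul 1 10:57:52 autopkgtest kernel: [ 413.552113] unregister_netdevice: "
        "waiting for lo to become free. Usage count = 2",
    ]
    print(find_stuck_devices(sample))
```

Feeding it the lines of /var/log/syslog (or `dmesg` output) should list each device the kernel is still waiting on, together with the reported usage count.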

ProblemType: Bug
DistroRelease: Ubuntu 13.10
Package: mininet 2.0.0-0ubuntu1
ProcVersionSignature: Ubuntu 3.10.0-1.8-generic 3.10.0-rc7
Uname: Linux 3.10.0-1-generic x86_64
ApportVersion: 2.10.2-0ubuntu3
Architecture: amd64
Date: Tue Jul 2 19:17:36 2013
EcryptfsInUse: Yes
InstallationDate: Installed on 2013-04-23 (70 days ago)
InstallationMedia: Ubuntu 13.04 "Raring Ringtail" - Release amd64 (20130423)
MarkForUpload: True
SourcePackage: mininet
UpgradeStatus: Upgraded to saucy on 2013-06-15 (16 days ago)
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version k3.10.0-1-generic.
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.10.2-0ubuntu3
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info: Error: [Errno 2] No such file or directory
Card0.Amixer.values: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 13.10
HibernationDevice: RESUME=UUID=bef939a0-3cc6-48d6-b35f-c20c153e248e
InstallationDate: Installed on 2013-05-20 (43 days ago)
InstallationMedia: Ubuntu-Server 13.10 "Saucy Salamander" - Alpha amd64 (20130520)
MachineType: Dell Inc. Vostro 1220
MarkForUpload: True
Package: mininet 2.0.0-0ubuntu1
PackageArchitecture: amd64
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.10.0-1-generic root=/dev/mapper/hostname--vg-root ro
ProcVersionSignature: Ubuntu 3.10.0-1.8-generic 3.10.0-rc7
RelatedPackageVersions:
 linux-restricted-modules-3.10.0-1-generic N/A
 linux-backports-modules-3.10.0-1-generic N/A
 linux-firmware 1.110
RfKill: Error: [Errno 2] No such file or directory
Tags: saucy saucy
Uname: Linux 3.10.0-1-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
dmi.bios.date: 10/15/2009
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.3.0
dmi.board.name: 0Y743R
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr1.3.0:bd10/15/2009:svnDellInc.:pnVostro1220:pvr:rvnDellInc.:rn0Y743R:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Vostro 1220
dmi.sys.vendor: Dell Inc.

Revision history for this message
James Page (james-page) wrote :

Note that this has been seen on 12.04.2 systems running the 3.5 kernel and on the 3.10 kernel in Saucy.

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1197078

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
James Page (james-page) wrote : apport information

Attached files: AlsaDevices.txt, BootDmesg.txt, Card0.Codecs.codec.0.txt, CurrentDmesg.txt, Dependencies.txt, IwConfig.txt, Lspci.txt, Lsusb.txt, PciMultimedia.txt, ProcCpuinfo.txt, ProcEnviron.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, UdevLog.txt, WifiSyslog.txt

tags: added: apport-collected
description: updated

James Page (james-page)
Changed in linux (Ubuntu):
status: Incomplete → New
Revision history for this message
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mininet (Ubuntu):
status: New → Confirmed
Revision history for this message
Philip (wette) wrote :

I'm not sure if this information is helpful, but I am seeing the process (mnexec) hanging in a kernel call.

# uname -a
Linux fgcn-of-3 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux

This is a snippet from my syslog:

Jul 9 15:43:25 fgcn-of-3 kernel: [11261.060357] unregister_netdevice: waiting for lo to become free. Usage count = 3
Jul 9 15:43:35 fgcn-of-3 kernel: [11271.291483] unregister_netdevice: waiting for lo to become free. Usage count = 3
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619216] INFO: task mnexec:16236 blocked for more than 120 seconds.
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619220] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619223] mnexec D ffff88063fc93780 0 16236 1 0x00000000
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619228] ffff88060e28d810 0000000000000082 0000000000000000 ffff880626a6e8b0
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619235] 0000000000013780 ffff88060b4b7fd8 ffff88060b4b7fd8 ffff88060e28d810
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619240] 0000000000000000 00000001810ce209 ffff880609c3a0d0 ffffffff81659710
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619245] Call Trace:
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619255] [<ffffffff8134e11c>] ? __mutex_lock_common.isra.5+0xff/0x164
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619259] [<ffffffff8134e00a>] ? mutex_lock+0x1a/0x2d
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619265] [<ffffffff8128a9f5>] ? copy_net_ns+0x5c/0xcb
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619271] [<ffffffff81063065>] ? create_new_namespaces+0xd2/0x156
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619275] [<ffffffff81063255>] ? unshare_nsproxy_namespaces+0x57/0x6f
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619281] [<ffffffff810463ff>] ? sys_unshare+0xa7/0x1ff
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619285] [<ffffffff81353b52>] ? system_call_fastpath+0x16/0x1b
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619289] INFO: task mnexec:16237 blocked for more than 120 seconds.
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619292] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619294] mnexec D ffff88063fcf3780 0 16237 1 0x00000000
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619299] ffff88060d7287b0 0000000000000086 ffff880600000000 ffff880626aef0a0
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619305] 0000000000013780 ffff88060b68dfd8 ffff88060b68dfd8 ffff88060d7287b0
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619310] ffff880626adcd80 00000001000412d0 ffff880609ea0300 ffffffff81659710
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619315] Call Trace:
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619319] [<ffffffff8134e11c>] ? __mutex_lock_common.isra.5+0xff/0x164
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619323] [<ffffffff8134e00a>] ? mutex_lock+0x1a/0x2d
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619327] [<ffffffff8128a9f5>] ? copy_net_ns+0x5c/0xcb
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619332] [<ffffffff81063065>] ? create_new_namespaces+0xd2/0x156
Jul 9 15:43:36 fgcn-of-3 kernel: [11271.619336] [<ffffffff81063255>] ? unshare_nsproxy_names...

Revision history for this message
James Page (james-page) wrote :

This looks very much like bug 1181315

Revision history for this message
Philip (wette) wrote :

I am really not a kernel expert, but I found that a kernel helper process hangs at line 283 of net/core/net_namespace.c (kernel version 3.2):

281 /* Run all of the network namespace exit methods */
282 list_for_each_entry_reverse(ops, &pernet_list, list)
283 ops_exit_list(ops, &net_exit_list);

At that point it still holds the mutex net_mutex, which is bad...
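
The effect described here can be illustrated with a user-space toy model. This is not kernel code and makes no claim about the actual implementation: a `threading.Lock` stands in for net_mutex, `namespace_cleanup` is a hypothetical stand-in for the helper that hangs while holding it, and `try_copy_net_ns` borrows its name from the `copy_net_ns` frame in the mnexec trace above.

```python
import threading


def namespace_cleanup(net_mutex, holding, refs_dropped):
    # Toy stand-in for the hanging helper: take the lock, then wait
    # (here: for an event that models "usage count reached zero").
    with net_mutex:
        holding.set()
        refs_dropped.wait()


def try_copy_net_ns(net_mutex, timeout):
    # Toy stand-in for creating a new namespace: needs the same lock.
    acquired = net_mutex.acquire(timeout=timeout)
    if acquired:
        net_mutex.release()
    return acquired


def demo():
    net_mutex = threading.Lock()
    holding = threading.Event()
    refs_dropped = threading.Event()
    worker = threading.Thread(
        target=namespace_cleanup, args=(net_mutex, holding, refs_dropped)
    )
    worker.start()
    holding.wait()  # the cleanup thread now owns the lock
    blocked = not try_copy_net_ns(net_mutex, timeout=0.2)
    refs_dropped.set()  # model the reference finally being dropped
    worker.join()
    free_again = try_copy_net_ns(net_mutex, timeout=0.2)
    return blocked, free_again


if __name__ == "__main__":
    print(demo())
```

While the cleanup thread waits with the lock held, every attempt to take the lock times out. In this model, that corresponds to the mnexec tasks sitting in mutex_lock until the hung-task detector reports them.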

Changed in linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Comment #35 in bug 1181315 suggests that bug may be resolved in 3.10.0.2. Can you test that kernel and see if it still exhibits the bug?

Revision history for this message
James Page (james-page) wrote :

@Joseph

I'm unable to reproduce this issue on the latest saucy kernel. Did you manage to identify which commit fixed it? I'm concerned that it still exists in the other kernel versions shipped in 12.04 through 13.04.

Revision history for this message
Philip (wette) wrote :

I am able to reproduce this bug with both of these kernel versions:
Linux fgcn-of-4 3.11.0-5-generic #11-Ubuntu SMP Fri Sep 6 19:01:31 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
Linux fgcn-of-3 3.11.0-031100-generic #201309021735 SMP Mon Sep 2 21:36:21 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

I am using the following procedure:

wget https://www.dropbox.com/s/h8i4mqfw3xel1ij/clos.py
sudo mn --custom clos.py --topo=clos --switch=user --controller=remote,192.168.1.25 --link=tc

and then I just quit Mininet again.
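
For anyone who wants to script repeated runs, the command line above can be assembled programmatically. This is a minimal sketch: the helper names `build_mn_command` and `as_shell` are invented for illustration, and the default values are simply the ones from the command above (the controller address is specific to that lab setup).

```python
import shlex


def build_mn_command(custom="clos.py", topo="clos", switch="user",
                     controller="remote,192.168.1.25", link="tc"):
    """Build the argv list for the mn invocation used in this comment."""
    return [
        "sudo", "mn",
        "--custom", custom,
        f"--topo={topo}",
        f"--switch={switch}",
        f"--controller={controller}",
        f"--link={link}",
    ]


def as_shell(argv):
    """Render an argv list as a single shell-quoted command line."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    print(as_shell(build_mn_command()))
```

The resulting list can be handed to `subprocess.run`. Actually starting Mininet this way still requires root privileges and the clos.py topology file downloaded above.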

Revision history for this message
Philip (wette) wrote :

One small (and maybe weird) addition to my last post:

Whenever a system is configured with a bridge, the error seems NOT to occur. In our lab we have 4 systems, two with configured bridges and two without. Only the systems without bridges behave erroneously.
If, after a reboot, a bridge is added to the two systems without bridges, they NO LONGER show the bug.

Revision history for this message
Philip (wette) wrote :

I have to retract my previous statement: the bridge only makes the bug appear less frequently.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote : Please test latest development kernel (3.11.0-7.14)

Given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

We are approaching release and would like to confirm if this bug is still present. Please test again with the latest development kernel and indicate in the bug if this issue still exists or not.

You can update to the latest development kernel by simply running the following commands in a terminal window:

    sudo apt-get update
    sudo apt-get dist-upgrade

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

Thank you for your help, we really do appreciate it.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.11.0-7.14
Revision history for this message
Bob Lantz (rlantz) wrote :

I'm still seeing this in 12.04.3.

Revision history for this message
Bob Lantz (rlantz) wrote :

I'm still seeing this in 13.10 beta.

Revision history for this message
Bob Lantz (rlantz) wrote :

[ 2091.229234] unregister_netdevice: waiting for lo to become free. Usage count = 3
[ 2101.338376] unregister_netdevice: waiting for lo to become free. Usage count = 3

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

@James Page, are you seeing this bug again, or are you still unable to reproduce it?

Revision history for this message
Bob Lantz (rlantz) wrote : Re: [Bug 1197078] Re: mininet created devices are not deleted correctly

It has shown up multiple times in our (mininet) nightly tests on 13.10 beta, but I don't know if it is still being seen on the Ubuntu testing side.

On Oct 10, 2013, at 1:12 PM, Joseph Salisbury <email address hidden> wrote:

> @James Page, are you seeing this bug again, or are you still unable to
> reproduce it?

Changed in mininet (Ubuntu):
importance: Undecided → Medium