unregister_netdevice: waiting for lo to become free. Usage count

Bug #1403152 reported by menoabels
This bug affects 23 people
Affects                     Status        Importance  Assigned to     Milestone
Linux                       Incomplete    Medium
linux (Ubuntu)              Fix Released  Medium      Unassigned
  Trusty                    Fix Released  Medium      Chris J Arges
  Vivid                     Fix Released  Medium      Chris J Arges
linux-lts-utopic (Ubuntu)   Won't Fix     Medium      Unassigned
  Trusty                    Fix Released  Undecided   Unassigned
linux-lts-xenial (Ubuntu)   Won't Fix     Undecided   Unassigned
  Trusty                    Won't Fix     Undecided   Unassigned

Bug Description

SRU Justification:

[Impact]

Users of kernels on systems that utilize NFS may see the following message repeated when shutting down and starting containers:

    unregister_netdevice: waiting for lo to become free. Usage count = 1

This can cause issues when trying to create a new network namespace, and thus block a user from creating new containers.

[Test Case]

Set up multiple containers in parallel that mount an NFS share, generate some traffic, and shut them down. Eventually you will see the kernel message.

Dave's script here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1403152/comments/24
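
(A minimal sketch of that test case, assuming pre-created trusty containers named u1..u5 that have nfs-common installed and an apparmor profile allowing NFS mounts, with the NFS export passed as the first argument; the names and the loop are illustrative, not Dave's actual script.)

#!/bin/bash
# Repeatedly start containers that write to an NFS share, then stop them,
# until the unregister_netdevice message shows up in the kernel log.
NFS=${1:?usage: $0 <server:/export>}
for iter in $(seq 1 50); do
  for n in 1 2 3 4 5; do
    (
      sudo lxc-start -n u$n -d
      sleep 5
      sudo lxc-attach -n u$n -- mount -t nfs "$NFS" /mnt
      sudo lxc-attach -n u$n -- dd if=/dev/zero of=/mnt/foo-$n bs=1024 count=100000
      sudo lxc-attach -n u$n -- umount /mnt
      sudo lxc-stop -n u$n
    ) &
  done
  wait
  dmesg | grep -q 'unregister_netdevice: waiting for lo' && { echo "hit the bug"; break; }
done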

[Fix]
commit de84d89030fa4efa44c02c96c8b4a8176042c4ff upstream

--

I am currently running trusty with the latest patches, and I get this on the following hardware and software:

Ubuntu 3.13.0-43.72-generic 3.13.11.11

processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 77
model name : Intel(R) Atom(TM) CPU C2758 @ 2.40GHz
stepping : 8
microcode : 0x11d
cpu MHz : 2400.000
cache size : 1024 KB
physical id : 0
siblings : 8
core id : 7
cpu cores : 8
apicid : 14
initial apicid : 14
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch arat epb dtherm tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms
bogomips : 4799.48
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

The error in the subject is somewhat reproducible; lxc keeps working but is no longer manageable until a reboot.

"Manageable" means every lxc command hangs.

I saw there are a lot of bugs, but they seem to relate to older versions and are closed, so I decided to file a new one.

I run a lot of machines with trusty and lxc containers, but only this kind of machine produces these errors; all the others don't show this odd behavior.

thx in advance

meno

Revision history for this message
In , fallenblood (fallenblood-linux-kernel-bugs) wrote :

Hey guys, sorry I don't have the time to put together a proper bug report, but the following GitHub thread has plenty of info about it.

https://github.com/docker/docker/issues/5618#issuecomment-50189996

Revision history for this message
In , alan (alan-linux-kernel-bugs) wrote :

This does not appear to be a bug, the kernel is merely waiting for all the users to drop the device. Something has a reference to it so it remains.

Revision history for this message
In , fallenblood (fallenblood-linux-kernel-bugs) wrote :

(In reply to Alan from comment #1)
> This does not appear to be a bug, the kernel is merely waiting for all the
> users to drop the device. Something has a reference to it so it remains.

This occurs with both Docker and VirtualBox. Both applications rely on a virtual network interface, and it happens once some amount of data (estimated at less than a few hundred megabytes) has been transferred over that interface. It never happens with kernel 3.13.0-29-generic, but happens all the time with 3.14.3-rt4, so the kernel is obviously the decisive factor here; how is it not a regression?

Revision history for this message
In , alan (alan-linux-kernel-bugs) wrote :

VirtualBox is out-of-tree modules, so we don't care 8) - but it's useful data.

First things to clear up, given you've not filed much information yet:

What is the reproducer ?
Does the message continue forever ?
Does it only occur on 3.14-rt not 3.14 standard ?
If it only happens on 3.14.3-rt, then does it happen on older -rt kernels ?

It's normal for this to occur if you drop a device that is in use for something, but it should very rapidly get cleaned up and go away. It's quite possible the 3.13/3.14 timings make a difference and it's just chance that it worked before. Equally, if it's getting stuck forever, that implies a race or refcount leak of some kind.

Revision history for this message
In , fallenblood (fallenblood-linux-kernel-bugs) wrote :

To reproduce the issue, install Docker (I'm alternatively using v0.11.1 or v1.0.0), spin up a container, and download a few hundred megabytes in it. On attempted system shutdown, you will see the message, and it ostensibly continues forever. You can see other people encountering this in the GitHub thread linked above. I haven't tried it with either 3.14 standard or other -rt kernels yet, but I'll try it out as soon as I get the time.

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1403152

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Did this issue occur in a previous version of Ubuntu, or is this a new issue?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.18 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.18-vivid/

Changed in linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
menoabels (meno-abels) wrote :

I tried to make it reproducible and figured out that the problem is related to the use of this kind of interface in an lxc container. The tunnels work in the running lxc container, but if you stop or reboot the lxc container the kernel reports this:

unregister_netdevice: waiting for lo to become free.....

Here is the ip link output in the container:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ip6tnl0: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default
    link/tunnel6 :: brd ::
3: ip6gre0: <NOARP> mtu 1448 qdisc noop state DOWN mode DEFAULT group default
    link/gre6 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 brd 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
4: gt6nactr01: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/gre6 2a:4:f:89:00:05:17:09:00:00:00:00:00:02:00:01 peer 2a:4:f:8:00:0f:09:97:02:17:01:10:00:59:00:05

and here are the commands which create this kind of tunnel:

ip -6 tunnel add gt6nactr01 mode ip6gre local 2a4:4483:5:1709::2:1 remote 2a4:4494:f:997:217:110:59:5
ip link set mtu 1500 dev gt6nactr01 up
ip addr add 2a04:4454:f:f003::1/64 dev gt6nactr01
ip addr add 169.254.193.1/30 dev gt6nactr01

I attach a dmesg output and hope that you can reproduce the problem.

thx

meno

Revision history for this message
menoabels (meno-abels) wrote :

I tested with

 3.18.0-031800-generic #201412071935 SMP Mon Dec 8 00:36:34 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

and the problem is still there.

meno

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

Hi,

We're hitting this bug on the latest trusty kernel in a context similar to this docker issue. We also had this problem on lucid with a custom 3.8.11 kernel, where it seems to be more aggressive than on trusty, but it still happens:

https://github.com/docker/docker/issues/5618

In this issue an upstream kernel bug and a redhat bug are mentioned:

https://bugzilla.kernel.org/show_bug.cgi?id=81211
https://bugzilla.redhat.com/show_bug.cgi?id=880394

In the Red Hat bug there is an upstream commit that supposedly fixes the problem:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=dcdfdf56b4a6c9437fc37dbc9cee94a788f9b0c4

Any chance to have this patch backported to the trusty kernel?

Thanks,

Rodrigo.

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

Kernel stack trace when lxc-start process hang:

[27211131.602770] INFO: task lxc-start:25977 blocked for more than 120 seconds.
[27211131.602785] Not tainted 3.13.0-40-generic #69-Ubuntu
[27211131.602789] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27211131.602795] lxc-start D 0000000000000000 0 25977 1 0x00000080
[27211131.602800] ffff8806ee73dd40 0000000000000286 ffff880474bdb000 ffff8806ee73dfd8
[27211131.602803] 0000000000014480 0000000000014480 ffff880474bdb000 ffffffff81cdb760
[27211131.602806] ffffffff81cdb764 ffff880474bdb000 00000000ffffffff ffffffff81cdb768
[27211131.602809] Call Trace:
[27211131.602821] [<ffffffff81723b69>] schedule_preempt_disabled+0x29/0x70
[27211131.602825] [<ffffffff817259d5>] __mutex_lock_slowpath+0x135/0x1b0
[27211131.602832] [<ffffffff811a2679>] ? __kmalloc+0x1e9/0x230
[27211131.602835] [<ffffffff81725a6f>] mutex_lock+0x1f/0x2f
[27211131.602840] [<ffffffff8161c2c1>] copy_net_ns+0x71/0x130
[27211131.602845] [<ffffffff8108f889>] create_new_namespaces+0xf9/0x180
[27211131.602848] [<ffffffff8108f983>] copy_namespaces+0x73/0xa0
[27211131.602851] [<ffffffff81065b16>] copy_process.part.26+0x9a6/0x16b0
[27211131.602854] [<ffffffff810669f5>] do_fork+0xd5/0x340
[27211131.602857] [<ffffffff810c8e8d>] ? call_rcu_sched+0x1d/0x20
[27211131.602859] [<ffffffff81066ce6>] SyS_clone+0x16/0x20
[27211131.602863] [<ffffffff81730089>] stub_clone+0x69/0x90
[27211131.602865] [<ffffffff8172fd2d>] ? system_call_fastpath+0x1a/0x1f
[27211131.602869] INFO: task lxc-start:26342 blocked for more than 120 seconds.
[27211131.602874] Not tainted 3.13.0-40-generic #69-Ubuntu

tags: added: kernel-da-key trusty
Changed in linux (Ubuntu):
importance: Medium → High
status: Incomplete → Triaged
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

The patch mentioned in comment #5 was added to the mainline kernel as of 3.13-rc1, so it should already be in Trusty.

git describe --contains dcdfdf5
v3.13-rc1~7^2~16

Can you again test the latest mainline, which is now 3.19-rc4:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-rc4-vivid/

Also, is this a regression? Was there a prior kernel version that did not have this bug?

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

From the docker issue it seems that someone couldn't reproduce the bug when downgrading to kernel 3.13.0-32-generic. I can't validate this statement because kernels prior to 3.13.0-35-generic have a regression that crashes my EC2 instance.

Also people testing kernel 3.14.0 couldn't reproduce the bug.

I will try to reproduce with the latest mainline kernel in next few hours.

Thanks,
Rodrigo.

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

Just got an instance with kernel 3.16.0-29-generic #39-Ubuntu (linux-lts-utopic) hitting this bug in production. We don't have a reliable reproducer, so the only way for me to validate is to boot this kernel in production and wait for the bug to happen.

Is there anything I can get from an instance showing the bug that may help?

Revision history for this message
In , rodrigo.vaz (rodrigo.vaz-linux-kernel-bugs) wrote :

FWIW, we got an instance with kernel 3.16.0-29-generic #39-Ubuntu hitting this bug. We don't use Docker, but we run lxc. We also don't have a reliable reproducer, and the only way to notice that this kernel is also affected was to boot it in production and wait for the bug to happen (usually when we have containers stopping and starting).

Revision history for this message
Chris J Arges (arges) wrote :

At this point a proper reproducer would help the most. This way we could get crashdumps and other useful information that may not be possible in production environments.

Reading through the bug the best description I can see is:
1) Start LXC container
2) Download > 100MB of data
3) Stop LXC container
4) Repeat until you see the kernel message

For example, I can script this a bit:
#!/bin/bash
# setup
sudo lxc-create -t download -n u1 -- -d ubuntu -r trusty -a amd64
fallocate -l 1000000 file
IP=$(sudo lxc-attach -n u1 -- hostname -I)
# test
while true; do
  sudo lxc-start --name u1 -d
  ssh ubuntu@$IP rm file
  scp file ubuntu@$IP:
  sudo lxc-stop --name u1
done

Running this on my machine doesn't show a failure. So I suspect there are other variables involved. Please let us know other relevant info that would help isolate a reproducer for this.
Thanks,

Changed in linux (Ubuntu):
assignee: nobody → Chris J Arges (arges)
Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

Some additional info:

- The stack trace is always the same posted above and the break point seems to be copy_net_ns every time.
- The process that hangs is always lxc-start in every occurrence that I was able to check.

Rodrigo.

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

I left a couple of instances running with the mainline kernel (http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-rc7-vivid/) over the weekend. It took more time to see the bug on the mainline kernel, but this morning one out of ten instances had the same problem, so I'm assuming mainline is also affected.

Kernel version 3.19.0-031900rc7-generic

Revision history for this message
menoabels (meno-abels) wrote : Re: [Bug 1403152] Re: unregister_netdevice: waiting for lo to become free. Usage count

Hey,

Here is a discussion about the reproducibility. I wrote very early in
this thread that if I set the following network config in a container

ip -6 tunnel add gt6nactr01 mode ip6gre local 2a4:4483:5:1709::2:1 remote 2a4:4494:f:997:217:110:59:5
ip link set mtu 1500 dev gt6nactr01 up
ip addr add 2a04:4454:f:f003::1/64 dev gt6nactr01
ip addr add 169.254.193.1/30 dev gt6nactr01

it will hang on reboot; I tried it again just now and it hangs/halts! There
is also another possibility to stop an lxc container with the same result,
"waiting for lo to become free": mount an NFS server from within an lxc
container. You have to change something in apparmor, but then it works
perfectly until you try to reboot the lxc container.

Is there any reason not to try to fix these reproducible problems first?

cheers

meno


Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

Meno,

I just tried your test case where you described adding an ip6gre tunnel to the container and rebooting it, but I couldn't reproduce the netdev hang so far. Do you mind sharing specific details or even a script that will reproduce the problem?

Mounting an NFS share in containers is not a common activity for our users, and still we see the problem happening on multiple instances.

Regards,
Rodrigo.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Trusty):
status: New → Confirmed
Changed in linux (Ubuntu Utopic):
status: New → Confirmed
Chris J Arges (arges)
Changed in linux (Ubuntu):
assignee: Chris J Arges (arges) → Stefan Bader (smb)
Revision history for this message
In , fxposter (fxposter-linux-kernel-bugs) wrote :

I've got the same problem with the latest docker on both Debian 7 and 8 with 3.16(.7) kernels: https://github.com/docker/docker/issues/9605#issuecomment-85025729. This happens once every couple of days, and the only way to continue running docker containers on this server is to force a reboot.

Revision history for this message
Jordan Curzon (curzonj) wrote :

I'm working with Rodrigo Vaz and we've found some details about our occurrence of this issue using systemtap.

rt_cache_route places a dst_entry struct into a fib_nh struct as the nh_rth_input. Occasionally the reference counter on that dst is not decremented by the time free_fib_info_rcu is called on the fib during container teardown. In that case free_fib_info_rcu doesn't call dst_destroy and dev_put is not called on the lo interface of the container. The only situation where we've seen this is when A) the fib_nh->nh_dev points to the eth0 interface of the container, B) the dst is part of an rtable struct where rt->rt_is_input == 1, and C) the dst points to the lo interface of the container. The dst is cached in the fib only once, never replaced, and then thousands of dst_hold/dst_release calls are made on the dst for connections and socket buffers.

We have only seen this so far on containers making lots of outbound connections. It doesn't appear to depend on the lifetime of the container, some are only alive for 30min and others are alive for 24hrs. The issue occurs when you try to destroy the container because that is when the fib is freed. We don't know when or where the dst ref_cnt becomes incorrect.

We don't know how to reproduce the issue.
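
(A rough way to watch the release side of this without systemtap - a sketch, not the instrumentation used above. It assumes ftrace is available and debugfs is mounted; dst_hold() is a static inline and normally cannot be hooked by name, so only dst_release() is visible here.)

cd /sys/kernel/debug/tracing
echo 0 > tracing_on
echo dst_release > set_ftrace_filter       # trace only dst_release()
echo function > current_tracer
echo 1 > options/func_stack_trace          # record a kernel stack for each hit
echo 1 > tracing_on
# ...tear down a container here, once in a good run and once in a bad one...
echo 0 > tracing_on
cp trace /tmp/dst_release.trace            # compare the callers between runs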

Revision history for this message
Jordan Curzon (curzonj) wrote :

To add to my comment just above, we have found a workload that is not under our own control (which limits what we can do with it, including sharing it) but which exacerbates the issue on our systems. This workload causes the issue less than 5% of the time, and that gives us the chance to look at which callers are using dst_release on the dst in question (the one that meets conditions A, B, and C; the "ABC dst"). To clarify, failure at this point for us is when dst->ref_cnt = 1 during free_fib_info_rcu; success is when dst->ref_cnt = 0 and free_fib_info_rcu calls dst_destroy on its nh_rth_input member.

In both failure and non-failure scenarios, the only callers we see are a single call to ipv4_pktinfo_prepare and many calls to inet_sock_destruct and tcp_data_queue. That likely rules out a unique dst_release call site that could be identified as missing in failure scenarios. My interpretation is that a single call to inet_sock_destruct or tcp_data_queue is failing to occur in the failure scenario.

Revision history for this message
Stefan Bader (smb) wrote :

Jordan, which kernel version are you using to make your observations? (Thanks a lot for those, btw.)

Revision history for this message
Steve Conklin (sconklin) wrote :

Answering for Jordan

# uname -a
Linux ip-10-30-153-210 3.13.0-40-generic #69-Ubuntu SMP Thu Nov 13 17:53:56 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
# dpkg -l | grep linux
ii libselinux1:amd64 2.2.2-1ubuntu0.1 amd64 SELinux runtime shared libraries
ii linux-headers-3.13.0-40 3.13.0-40.69 all Header files related to Linux kernel version 3.13.0
ii linux-headers-3.13.0-40-generic 3.13.0-40.69 amd64 Linux kernel headers for version 3.13.0 on 64 bit x86 SMP
ii linux-headers-generic 3.13.0.40.47 amd64 Generic Linux kernel headers
ii linux-headers-virtual 3.13.0.40.47 amd64 Transitional package.
ii linux-image-3.13.0-40-generic 3.13.0-40.69 amd64 Linux kernel image for version 3.13.0 on 64 bit x86 SMP
ii linux-image-virtual 3.13.0.40.47 amd64 This package will always depend on the latest minimal generic kernel image.
ii linux-libc-dev:amd64 3.13.0-43.72 amd64 Linux Kernel Headers for development
ii linux-virtual 3.13.0.40.47 amd64 Minimal Generic Linux kernel and headers
ii util-linux 2.20.1-5.1ubuntu20.3 amd64 Miscellaneous system utilities

Revision history for this message
Jordan Curzon (curzonj) wrote :

An addition to my comment #18 above: I forgot that I had to remove logging of calls from skb_release_head_state to dst_release due to the performance impact of the number of calls. In both failure and success scenarios, calls to dst_release by skb_release_head_state are also present.

Revision history for this message
Dave Richardson (pudnik019) wrote :

I am able to reliably recreate this on both Ubuntu 14.04 with kernel 3.13.0-37-generic and Ubuntu 14.10 with kernel 3.16.0-23-generic. I have test code that can hit the bug, but it's not cleaned up enough to share. The steps to repro are:

Create a container (a COW clone using overlayfs from a base/parent Ubuntu 14.04 container). Start the clone container, mount an NFS share (remote from the machine hosting the container) at /mnt in the container, create a file on the NFS share from inside the container (dd if=/dev/zero of=/mnt/foo bs=1024 count=100000), stop the container, destroy the container.

I have a script that performs the above sequence over and over again, running 5 such clone containers in parallel at any given time. It only takes about 30s for the kernel to get wedged with:

unregister_netdevice: waiting for lo to become free. Usage count = 1

This same script is fine if you do scp instead of writing over NFS. It seems like NFS inside the container is somehow involved here.

Revision history for this message
Steve Conklin (sconklin) wrote :

David, would you like help getting the reproducer code ready to share?

Revision history for this message
Dave Richardson (pudnik019) wrote :

Ok, I've got a minimal reproducer ready. It causes the bug when I run it on Ubuntu 14.04 with 3.13.0-24-generic and LXC version 1.0.7-0ubuntu0.1 0. It's python, and the script has some documentation describing its usage as well as the prerequisites. You should run it as root via sudo.

Prereqs require you to install a base ubuntu trusty container, install nfs-common in that container, and modify lxc-default to allow containers to perform NFS mounts. You will also need an NFS share that your containers can mount to scribble data to. The address of this share is passed as the third parameter to the script in the form of IP_ADDRESS:/path/to/nfs/share.

My typical usage looks like:

sudo ./reproducer.py 5 10 IPADDRESS:/path/to/nfs/share

Which starts 5 threads that in each of 10 iterations, creates a container, mounts the nfs share to /mnt, then dd's over some zeros, umounts /mnt, stops, and destroys the container. I can reliably hit the unregister_netdevice hang within 1 minute of running this.

~Dave

Chris J Arges (arges)
Changed in linux (Ubuntu):
assignee: Stefan Bader (smb) → Chris J Arges (arges)
Chris J Arges (arges)
Changed in linux (Ubuntu):
status: Triaged → In Progress
Revision history for this message
Chris J Arges (arges) wrote :

So far I haven't been able to reproduce this with either 3.13.0-49 or 3.13.0-37.

My setup is as follows:
Two vms, one with nfs-server and another for reproducer.

nfs-server setup with these commands:
  apt-get install nfs-kernel-server
  mkdir /export
  chown 777 /export
  echo "/export *(rw,no_root_squash,fsid=0,no_subtree_check)" > /etc/exports
  service nfs-kernel-server restart

Reproducer VM has LXC container that is bridged to the libvirt bridge.
  apt-get update
  apt-get install lxc
  lxc-create -t download -n base -- -d ubuntu -r trusty -a amd64
Then edit /var/lib/lxc/base/config to use:
  lxc.aa_profile = unconfined
  lxc.network.link = br0

Then run:
 sudo ./reproducer.py 5 10 192.168.122.241:/export

I tried running this many times with no luck. Any other variables I need to consider?

Revision history for this message
Dave Richardson (pudnik019) wrote :

Here's the setup I use: the reproducer runs in a 14.04 server VM (Linux ubuntu 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux), and the VM is bridged to the host machine's en0. I am using VMware Fusion on an OS X 10.10 host.

Containers inside the VM use NAT via lxcbr0 (lxc.network.link = lxcbr0), getting 10. addresses via dhcp. My NFS server is on a different physical machine than the one running the reproducer VM. I am also not using unconfined profile, but rather modifying the default to allow NFS mounting.

I'll see if I can get it reproducing using the two-VM setup that you describe. If so, worst case I could send you the VMs.

Revision history for this message
Chris J Arges (arges) wrote :

Ok, moved to two real machines and I can repro there:
[ 3797.901145] unregister_netdevice: waiting for lo to become free. Usage count = 1

Will continue to look at this.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Great, thanks Chris! Let me know if I can help further.

Revision history for this message
Chris J Arges (arges) wrote :

This is fixed in v3.18 and broken in v3.17. Doing a reverse bisect on this to determine which patch(es) fix this issue.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Ok. For what it's worth, we still see the problem on Ubuntu 15.04 running v3.19. However, the problem is much harder to trigger and the attached reproducer script does not trigger the problem there, so I don't have code to reliably repro it on 3.19.

Revision history for this message
Chris J Arges (arges) wrote :

Dave,
Good to know. Hopefully the bisect will implicate the right lines of code in order to find a v3.19+ fix.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Agreed. Thanks for pushing this forward Chris.

Revision history for this message
Federico Alves (h-sales) wrote :

It happens to me on Centos 7, with kernel 4.0.1-1.el7.elrepo.x86_64
Every time I reboot a container, without any particularly heavy data load, I get:
[630362.513630] hrtimer: interrupt took 5912836 ns
[727851.858312] device eth0 left promiscuous mode
[727853.622084] eth0: renamed from mcCDGW73
[727853.632989] device eth0 entered promiscuous mode
[728077.625688] unregister_netdevice: waiting for lo to become free. Usage count = 1
[728087.655832] unregister_netdevice: waiting for lo to become free. Usage count = 1
[728097.686066] unregister_netdevice: waiting for lo to become free. Usage count = 1

So this is not fixed in 4.X

Revision history for this message
Chris J Arges (arges) wrote :

The result of the reverse bisect between v3.17..v3.18 with the reproducer was:
# first bad commit: [34666d467cbf1e2e3c7bb15a63eccfb582cdd71f] netfilter: bridge: move br_netfilter out of the core
If I backport this patch plus 7276ca3f on top of v3.17 I no longer get the hang with the simple reproducer (although I suspect a more elaborate reproducer would still trigger the issue).

This isn't a fix because we obviously have issues in later kernels. The 'unregister_netdevice' message could occur for different code paths, since there could potentially be many paths that modify the refcnt for the net_device. I'm going to track who is calling 'dev_put' and 'dev_hold' and figure out which code path is not freeing before we start unregistering the device. I'll focus on the reproducer I have for now, since this seems similar to the 'shutdown containers with network connections' case.

In addition, I'm doing a bisect between earlier versions to see where the 'regression' occurred. 3.11 seems to pass without regression for 5 threads / 50 iterations.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Great progress, thanks Chris. I did some initial instrumentation of dev_put and dev_hold calls earlier, dumping out the call stack and comparing a good run to a bad one. Nothing definitive yet, and it's a bit tricky to match up on an SMP machine, but one thing stood out: many of the bad traces had UDP in the mix (for example, calls to __udp4_lib_mcast_deliver and __udp4_lib_rcv). Maybe useful information, maybe not.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Any updates?

Revision history for this message
Chris J Arges (arges) wrote :

Dave,
Working on this now. I tried a bisect between v3.12..v3.13, where it seemed that 3.12 worked better, but it wasn't conclusive. I tried looking at dev_put/dev_hold via systemtap, but that seems to modify the timing enough to get in the way. I'm looking at the comments from Jordan Curzon to see if I can start instrumenting just those sections of the kernel to track down where we're failing to decrement the reference. I'm tracing the NETDEV_UNREGISTER notifier calls now to see if that leads to something conclusive.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Sounds good, thanks Chris.

Revision history for this message
Chris J Arges (arges) wrote :

When running the reproducer, if I mount the NFS share using -o vers=3,proto=udp (UDP instead of TCP), I cannot get any failures when running 10 threads / 150 iterations multiple times.
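
(For reference, the UDP mount looks like this when done by hand; the server address and mount point are just the values from the two-VM setup described earlier.)

sudo mount -t nfs -o vers=3,proto=udp 192.168.122.241:/export /mnt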

Revision history for this message
Chris J Arges (arges) wrote :

Additional Experiments:
- No NAT, lxc bridges to VM eth0, which is bridged to host machine br0
 Still fails
- Let timeout on unregister_notifier keep going
 Still running after an hour, can't create new containers (original issue)
 sudo conntrack -{L,F} hangs...
- Purge conntrack flowtable on host right before unmount
 Still fails
- Look at conntrack table on host when it's stuck
 Had issues with this because of the hang

Next steps:
- Track down TCP connection that's being held open

Revision history for this message
Chris J Arges (arges) wrote :

A few more results:
- I can track the dst_entry holding the dev causing the issue.
- I can reproduce on v3.19 kernels by first using 'modprobe br_netfilter'.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Really great progress! Sounds like you're close to finding a fix. Thanks Chris.

Revision history for this message
Chris J Arges (arges) wrote :

Steve saw this patch: e53376bef2cd97d3e3f61fdc677fb8da7d03d0da as a potential fix. For completeness I tested a backport to 3.13 (since it's in 3.14+). The same failure mode occurs with this patch.

Disabling sysctl_ip_early_demux still causes a failure.

To work around this issue, if you don't need CONFIG_BRIDGE_NETFILTER you can disable it, and the error does not seem to occur anymore. (Which follows from earlier comments.)
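
(A sketch of that workaround on a running system, without rebuilding the kernel: the sysctl works wherever bridge netfilter is built, and the blacklist only applies on kernels where it is the separate br_netfilter module.)

# Stop iptables from seeing bridged traffic (takes effect immediately):
sudo sysctl -w net.bridge.bridge-nf-call-iptables=0
# On kernels where it is a separate module, keep br_netfilter from loading:
echo "blacklist br_netfilter" | sudo tee /etc/modprobe.d/blacklist-br-netfilter.conf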

Revision history for this message
Joe Stringer (joestringer) wrote :

Just chiming in here, I contacted Rodrigo off-list and was verging towards that same patch. More below.

I suspect there are two issues here with very similar symptoms, in
particular post #8, which mentions people reporting that 3.14 improves
the situation.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1403152/comments/8

I've been chasing a bug in 3.13 with docker containers and connection
tracking which is fixed in 3.14, by this patch:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e53376bef2cd97d3e3f61fdc677fb8da7d03d0da

Note that the commit message for the above commit describes a different
issue, but I've been able to produce issues of the nature in this thread
(hung docker / ip netns add commands, as in post #6) before applying
this patch, and cannot reproduce them after.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1403152/comments/6

In the issue that I face, I can find a kworker thread using up an entire
core, and when I cat /proc/$pid/stack I see this:

[<ffffffffbe01e9b6>] ___preempt_schedule+0x56/0xb0
[<ffffffffc02223e4>] nf_ct_iterate_cleanup+0x134/0x160 [nf_conntrack]
[<ffffffffc0223dae>] nf_conntrack_cleanup_net_list+0x4e/0x170 [nf_conntrack]
[<ffffffffc022436d>] nf_conntrack_pernet_exit+0x4d/0x60 [nf_conntrack]
[<ffffffffbe6040d3>] ops_exit_list.isra.1+0x53/0x60
[<ffffffffbe6048d0>] cleanup_net+0x100/0x1d0
[<ffffffffbe084991>] process_one_work+0x171/0x470
[<ffffffffbe08563b>] worker_thread+0x11b/0x3a0
[<ffffffffbe08bb82>] kthread+0xd2/0xf0
[<ffffffffbe71757c>] ret_from_fork+0x7c/0xb0
[<ffffffffffffffff>] 0xffffffffffffffff

The kworker is looping forever and failing to clean up conntrack state.
All the while, it holds the global netns lock. Given that I've bisected
to the commit linked above which is to do with refcounting, I suspect
that borked refcounting on conntrack entries makes them impossible to
properly free/destroy, which prevents this worker from cleaning up the
namespace, which then goes on to prevent anything else from interacting
with namespaces (add/delete/etc).

Revision history for this message
Chris J Arges (arges) wrote :

Joe,
I've created bug 1466135 to handle updating our 3.13 to fix your issue. Please add any information to that bug that would be useful in verifying the fix once it goes through our stable updates process. Thanks!

Revision history for this message
Chris J Arges (arges) wrote :

A summary of the bug so far:

Occasionally starting and stopping many containers with network traffic may
result in new containers being unable to start due to the inability to create
new network namespaces.

The following message repeats in the kernel log until reboot.

unregister_netdevice: waiting for lo to become free. Usage count = 1

Eventually when creating a new container this hung task backtrace occurs:

  schedule_preempt_disabled+0x29/0x70
  __mutex_lock_slowpath+0x135/0x1b0
  ? __kmalloc+0x1e9/0x230
  mutex_lock+0x1f/0x2f
  copy_net_ns+0x71/0x130
  create_new_namespaces+0xf9/0x180
  copy_namespaces+0x73/0xa0
  copy_process.part.26+0x9a6/0x16b0
  do_fork+0xd5/0x340
  ? call_rcu_sched+0x1d/0x20
  SyS_clone+0x16/0x20
  stub_clone+0x69/0x90
  ? system_call_fastpath+0x1a/0x1f

The following conditions I've been able to test:
- If CONFIG_BRIDGE_NETFILTER is disabled this problem does not occur.
- If net.bridge.bridge-nf-call-iptables is disabled, this problem does not occur.
- This problem can happen on single processor machines
- This problem can happen with IPv6 disabled
- This problem can happen with xt_conntrack enabled.

The unregister_netdevice warning always waits on lo. It always has reg_state
set to NETREG_UNREGISTERING. It follows that the device has been through the
unregister_netdevice_many path and is being unregistered. This path is ultimately
where net_mutex is locked, which prevents copy_net_ns from executing.

In addition when the unregister netdevice warning happens, a crashdump reveals
the dst_busy_list always contains a dst_entry that references the device above.
This dst_entry has already been through ___dst_free since it has already been
marked DST_OBSOLETE_DEAD. 'dst->ops' is always set to ipv4_dst_ops.
dst->callback_head.next is NULL, and the next pointer is NULL. Use is also zero.

We can trace where the dst_entry is trying to be freed. When free_fib_info_rcu
is called, if nh_rth_input is set, it eventually calls dst_free. Because there
is still a refcnt held, it does not get immediately destroyed and continues on
to __dst_free. This puts the dst into the dst_garbage list, which is then
examined periodically by the dst_gc_work worker thread. Each time it tries to
clean it up it fails because it still has a non-zero refcnt.

The faulty dst_entry is being allocated via ip_rcv..ip_route_input_noref. In
addition this dst is most likely being held in response to a new packet via the
ip_rcv..inet_sk_rx_dst_set path.

Revision history for this message
Dave Richardson (pudnik019) wrote :

I can confirm that disabling CONFIG_BRIDGE_NETFILTER and/or disabling net.bridge.bridge-nf-call-iptables makes me unable to repro on Ubuntu 14.04 running 3.13.0-53-generic.

Revision history for this message
Chris J Arges (arges) wrote :

Email sent to netdev about what we've found so far with this bug:
http://www.spinics.net/lists/netdev/msg333868.html

Revision history for this message
Dave Richardson (pudnik019) wrote :

Looks like no progress on the netdev side yet?

Revision history for this message
Chris J Arges (arges) wrote :

So it seems like disabling CONFIG_BRIDGE_NETFILTER is a red herring; I've been able to reproduce with it off as well on 3.19.

Chris J Arges (arges)
Changed in linux (Ubuntu Trusty):
assignee: nobody → Chris J Arges (arges)
Changed in linux (Ubuntu Utopic):
assignee: nobody → Chris J Arges (arges)
Changed in linux (Ubuntu Vivid):
assignee: nobody → Chris J Arges (arges)
Changed in linux (Ubuntu Trusty):
importance: Undecided → Critical
importance: Critical → Medium
Changed in linux (Ubuntu Utopic):
importance: Undecided → Medium
Changed in linux (Ubuntu Vivid):
importance: Undecided → Medium
status: New → In Progress
Changed in linux (Ubuntu Utopic):
status: Confirmed → In Progress
Changed in linux (Ubuntu Trusty):
status: Confirmed → In Progress
Changed in linux (Ubuntu):
importance: High → Medium
description: updated
Revision history for this message
Chris J Arges (arges) wrote :

OK, so I have reports that NFS may also be a red herring; however, I've found a patch that seems to greatly mitigate the issue when testing with the reproducer I was provided. So I'm going to provide test builds with it for 3.13, 3.16, and 3.19, and users can give me feedback on whether it helps.

In addition, if you can still reproduce the issue with this fix, it would be helpful to know more about your test case. I'd like to handle any of those issues as a separate bug.

Revision history for this message
Chris J Arges (arges) wrote :

Build is here:
http://people.canonical.com/~arges/lp1403152/

Please test and give feedback. Thanks!

Chris J Arges (arges)
Changed in linux (Ubuntu):
assignee: Chris J Arges (arges) → nobody
status: In Progress → Fix Released
Brad Figg (brad-figg)
Changed in linux (Ubuntu Trusty):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Utopic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Vivid):
status: In Progress → Fix Committed
Revision history for this message
Dave Richardson (pudnik019) wrote :

Chris, I will start testing this build and let you know how it goes. Thanks.

Revision history for this message
Dave Richardson (pudnik019) wrote :

What is the patch?

Revision history for this message
Chris J Arges (arges) wrote :

Dave,
I added it to the description of the bug:
commit de84d89030fa4efa44c02c96c8b4a8176042c4ff
--chris

Brad Figg (brad-figg)
no longer affects: linux (Ubuntu Utopic)
no longer affects: linux-lts-utopic (Ubuntu Vivid)
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-vivid' to 'verification-done-vivid'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-vivid
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-lts-utopic - 3.16.0-46.62~14.04.1

---------------
linux-lts-utopic (3.16.0-46.62~14.04.1) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1483790
  * SAUCE: REBASE-FIXUP: debian/scripts/misc/getabis: fix abi directory

  [ Upstream Kernel Changes ]

  * Revert "Bluetooth: ath3k: Add support of 04ca:300d AR3012 device"

linux-lts-utopic (3.16.0-46.61~14.04.1) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1478986

  [ Brad Figg ]

  * SAUCE: REBASE-FIXUP: debian/rules.d/1-maintainer.mk: Need '~14.04.1' as
    part of the version when looking for the baseCommit for printchanges

  [ Upstream Kernel Changes ]

  * Revert "crypto: talitos - convert to use be16_add_cpu()"
    - LP: #1478852
  * storvsc: use cmd_size to allocate per-command data
    - LP: #1445195
  * storvsc: fix a bug in storvsc limits
    - LP: #1445195
  * Drivers: hv: vmbus: Support a vmbus API for efficiently sending page
    arrays
    - LP: #1445195
  * scsi: storvsc: Increase the ring buffer size
    - LP: #1445195
  * scsi: storvsc: Size the queue depth based on the ringbuffer size
    - LP: #1445195
  * scsi: storvsc: Always send on the selected outgoing channel
    - LP: #1445195
  * scsi: storvsc: Retrieve information about the capability of the target
    - LP: #1445195
  * scsi: storvsc: Don't assume that the scatterlist is not chained
    - LP: #1445195
  * scsi: storvsc: Set the tablesize based on the information given by the
    host
    - LP: #1445195
  * SUNRPC: TCP/UDP always close the old socket before reconnecting
    - LP: #1403152
  * ALSA: hda - Fix noisy outputs on Dell XPS13 (2015 model)
    - LP: #1468582
  * Fix kmalloc slab creation sequence
    - LP: #1475204
  * ARM: clk-imx6q: refine sata's parent
    - LP: #1478852
  * KVM: nSVM: Check for NRIPS support before updating control field
    - LP: #1478852
  * nfs: take extra reference to fl->fl_file when running a setlk
    - LP: #1478852
  * bridge: fix multicast router rlist endless loop
    - LP: #1478852
  * net: don't wait for order-3 page allocation
    - LP: #1478852
  * sctp: fix ASCONF list handling
    - LP: #1478852
  * bridge: fix br_stp_set_bridge_priority race conditions
    - LP: #1478852
  * packet: read num_members once in packet_rcv_fanout()
    - LP: #1478852
  * packet: avoid out of bounds read in round robin fanout
    - LP: #1478852
  * neigh: do not modify unlinked entries
    - LP: #1478852
  * tcp: Do not call tcp_fastopen_reset_cipher from interrupt context
    - LP: #1478852
  * net: phy: fix phy link up when limiting speed via device tree
    - LP: #1478852
  * sctp: Fix race between OOTB responce and route removal
    - LP: #1478852
  * x86/mce: Fix MCE severity messages
    - LP: #1478852
  * s5h1420: fix a buffer overflow when checking userspace params
    - LP: #1478852
  * cx24116: fix a buffer overflow when checking userspace params
    - LP: #1478852
  * af9013: Don't accept invalid bandwidth
    - LP: #1478852
  * cx24117: fix a buffer overflow when checking userspace params
    - LP: #1478852
  * spi: fix race freeing dummy_tx/rx before it is unmapped
    - LP: #...

Changed in linux-lts-utopic (Ubuntu Trusty):
status: New → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.19.0-26.28

---------------
linux (3.19.0-26.28) vivid; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1483630

  [ Upstream Kernel Changes ]

  * Revert "Bluetooth: ath3k: Add support of 04ca:300d AR3012 device"

linux (3.19.0-26.27) vivid; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1479055
  * [Config] updateconfigs for 3.19.8-ckt4 stable update

  [ Chris J Arges ]

  * [Config] Add MTD_POWERNV_FLASH and OPAL_PRD
    - LP: #1464560

  [ Mika Kuoppala ]

  * SAUCE: i915_bpo: drm/i915: Fix divide by zero on watermark update
    - LP: #1473175

  [ Tim Gardner ]

  * [Config] ACORN_PARTITION=n
    - LP: #1453117
  * [Config] Add i40e[vf] to d-i
    - LP: #1476393

  [ Timo Aaltonen ]

  * SAUCE: i915_bpo: Rebase to v4.2-rc3
    - LP: #1473175
  * SAUCE: i915_bpo: Revert "mm/fault, drm/i915: Use pagefault_disabled()
    to check for disabled pagefaults"
    - LP: #1473175
  * SAUCE: i915_bpo: Revert "drm: i915: Port to new backlight interface
    selection API"
    - LP: #1473175

  [ Upstream Kernel Changes ]

  * Revert "tools/vm: fix page-flags build"
    - LP: #1473547
  * Revert "ALSA: hda - Add mute-LED mode control to Thinkpad"
    - LP: #1473547
  * Revert "drm/radeon: adjust pll when audio is not enabled"
    - LP: #1473547
  * Revert "crypto: talitos - convert to use be16_add_cpu()"
    - LP: #1479048
  * module: Call module notifier on failure after complete_formation()
    - LP: #1473547
  * gpio: gpio-kempld: Fix get_direction return value
    - LP: #1473547
  * ARM: dts: imx27: only map 4 Kbyte for fec registers
    - LP: #1473547
  * ARM: 8356/1: mm: handle non-pmd-aligned end of RAM
    - LP: #1473547
  * x86/mce: Fix MCE severity messages
    - LP: #1473547
  * mac80211: don't use napi_gro_receive() outside NAPI context
    - LP: #1473547
  * iwlwifi: mvm: Free fw_status after use to avoid memory leak
    - LP: #1473547
  * iwlwifi: mvm: clean net-detect info if device was reset during suspend
    - LP: #1473547
  * drm/plane-helper: Adapt cursor hack to transitional helpers
    - LP: #1473547
  * ARM: dts: set display clock correctly for exynos4412-trats2
    - LP: #1473547
  * hwmon: (ntc_thermistor) Ensure iio channel is of type IIO_VOLTAGE
    - LP: #1473547
  * mfd: da9052: Fix broken regulator probe
    - LP: #1473547
  * ALSA: hda - Fix noise on AMD radeon 290x controller
    - LP: #1473547
  * lguest: fix out-by-one error in address checking.
    - LP: #1473547
  * xfs: xfs_attr_inactive leaves inconsistent attr fork state behind
    - LP: #1473547
  * xfs: xfs_iozero can return positive errno
    - LP: #1473547
  * fs, omfs: add NULL terminator in the end up the token list
    - LP: #1473547
  * omfs: fix sign confusion for bitmap loop counter
    - LP: #1473547
  * d_walk() might skip too much
    - LP: #1473547
  * dm: fix casting bug in dm_merge_bvec()
    - LP: #1473547
  * hwmon: (nct6775) Add missing sysfs attribute initialization
    - LP: #1473547
  * hwmon: (nct6683) Add missing sysfs attribute initialization
    - LP: #1473547
  * target/pscsi: Don't leak scsi_host if hba is VIRTUAL_HOST
    - LP: #1473547
  * net...

Changed in linux (Ubuntu Vivid):
status: Fix Committed → Fix Released
Revision history for this message
In , justin (justin-linux-kernel-bugs) wrote :

I'm also seeing this problem on Debian 7 hosts with a 3.16 kernel and Debian 8 with 4.2.0; it happens every 1-2 days for us (across a few dozen machines). We haven't found a way to reliably reproduce the bug yet, but we are unable to shut down the system cleanly; we have to force power off to be able to continue to run containers.

Brad Figg (brad-figg)
Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-lts-utopic (Ubuntu):
status: New → Confirmed
Revision history for this message
Andrew Ruthven (andrew-etc) wrote :

We're still seeing the same issue on Ubuntu Trusty running the linux-image-3.19.0-30-generic kernel.

Looking at this thread[0] on the Docker site, which is referenced in the kernel bugzilla for this issue, there is a reference[1] to a patch[2] on the netdev mailing list from 2015-11-05 by Francesco Ruggeri; later that day David Miller accepted it and queued it for -stable. The patch is currently in the master branch of Linus's tree, but hasn't made it into the 4.2 branch yet.

The patch doesn't apply cleanly to the Canonical 3.19 branch, but it looks like it is mostly line skew. I haven't tried hacking it in yet.

[0] https://github.com/docker/docker/issues/5618
[1] https://github.com/docker/docker/issues/5618#issuecomment-154334933
[2] http://www.spinics.net/lists/netdev/msg351337.html

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

FWIW, here is an update on what I've tried in the last couple of months to fix this problem (unsuccessfully):

- We tried to deny packets to the container's network before we destroy the namespace
- Backported the patch mentioned in the previous comment to the Ubuntu 3.19 and 4.2 kernels (linux-lts-vivid and linux-lts-wily)
- Applied the patches listed in this thread on the trusty kernel (the vivid kernel already has them): http://www.spinics.net/lists/netdev/msg345205.html
- Tried the 4.4-rc3 net-next branch

All experiments above failed to fix the problem and the bug was triggered in production machines within 24h using these experimental kernels.

We still can't reliably reproduce this issue outside production but it is easy to validate any proposed solution with a few hours of production load.

Rodrigo.

Revision history for this message
James Dempsey (jamespd) wrote :

We also backported [1] to 4.2 (linux-lts-wily) and deployed it to our production OpenStack cloud. We just installed it yesterday and our MTBF is between two and twenty days, so we won't know if this has made any difference for a while now.

Some details about our configuration / failure mode:

Three OpenStack "Layer 3" hosts (running 3.19.0-30-generic #34~14.04.1-Ubuntu) providing virtual routers/VPNs/Metadata via network namespaces.

Our most recent failures occurred on hosts B and C (within 30 minutes of each other, after having been fine for weeks) while removing routers from A and re-creating them on B and C.

Our stack traces are slightly different from the ones posted above...

Dec 14 15:37:05 hostname kernel: [961050.119727] INFO: task ip:9865 blocked for more than 120 seconds.
Dec 14 15:37:05 hostname kernel: [961050.126707] Tainted: G C 3.19.0-30-generic #34~14.04.1-Ubuntu
Dec 14 15:37:05 hostname kernel: [961050.135073] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 14 15:37:05 hostname kernel: [961050.144094] ip D ffff88097e3e3de8 0 9865 9864 0x00000000
Dec 14 15:37:05 hostname kernel: [961050.144098] ffff88097e3e3de8 ffff880e982693a0 0000000000013e80 ffff88097e3e3fd8
Dec 14 15:37:05 hostname kernel: [961050.144100] 0000000000013e80 ffff88101a8993a0 ffff880e982693a0 0000000000000000
Dec 14 15:37:05 hostname kernel: [961050.144102] ffffffff81cdb2a0 ffffffff81cdb2a4 ffff880e982693a0 00000000ffffffff
Dec 14 15:37:05 hostname kernel: [961050.144104] Call Trace:
Dec 14 15:37:05 hostname kernel: [961050.144109] [<ffffffff817b2fa9>] schedule_preempt_disabled+0x29/0x70
Dec 14 15:37:05 hostname kernel: [961050.144111] [<ffffffff817b4c95>] __mutex_lock_slowpath+0x95/0x100
Dec 14 15:37:05 hostname kernel: [961050.144115] [<ffffffff811cfd66>] ? __kmalloc+0x226/0x280
Dec 14 15:37:05 hostname kernel: [961050.144117] [<ffffffff816a14a1>] ? net_alloc_generic+0x21/0x30
Dec 14 15:37:05 hostname kernel: [961050.144120] [<ffffffff817b4d23>] mutex_lock+0x23/0x37
Dec 14 15:37:05 hostname kernel: [961050.144122] [<ffffffff816a1c75>] copy_net_ns+0x75/0x150
Dec 14 15:37:05 hostname kernel: [961050.144125] [<ffffffff810943ad>] create_new_namespaces+0xfd/0x180
Dec 14 15:37:05 hostname kernel: [961050.144127] [<ffffffff810945ba>] unshare_nsproxy_namespaces+0x5a/0xc0
Dec 14 15:37:05 hostname kernel: [961050.144130] [<ffffffff8107439b>] SyS_unshare+0x15b/0x2e0
Dec 14 15:37:05 hostname kernel: [961050.144133] [<ffffffff817b6e4d>] system_call_fastpath+0x16/0x1b
Dec 14 15:37:05 hostname kernel: [961050.144135] INFO: task ip:9896 blocked for more than 120 seconds.
Dec 14 15:37:05 hostname kernel: [961050.151109] Tainted: G C 3.19.0-30-generic #34~14.04.1-Ubuntu
Dec 14 15:37:05 hostname kernel: [961050.159558] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 14 15:37:05 hostname kernel: [961050.168551] ip D ffff8804591cfde8 0 9896 9895 0x00000000
Dec 14 15:37:05 hostname kernel: [961050.168556] ffff8804591cfde8 ffff880814031d70 0000000000013e80 ffff8804591cffd8
Dec 14 15:37:05 hostname kernel: [961050.168558] 0000...


Revision history for this message
Dan Streetman (ddstreet) wrote :

Andrew, Rodrigo, James, do any of you have a crash dump from a system where the problem has happened? or any way to more easily reproduce it?
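
(For anyone who can reproduce it: on Ubuntu a crash dump is usually captured by setting up kdump ahead of time; this is the stock procedure, not instructions specific to this bug.)

sudo apt-get install linux-crashdump   # pulls in kdump-tools; answer yes to enable it
sudo reboot                            # so the crash kernel gets loaded
kdump-config show                      # should report it is ready to kdump
# Once the machine is in the bad state, force a panic so a dump is written
# to /var/crash (this reboots the box):
echo 1 | sudo tee /proc/sys/kernel/sysrq
echo c | sudo tee /proc/sysrq-trigger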

penalvch (penalvch)
Changed in linux-lts-utopic (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Dan Streetman (ddstreet) wrote :

I should note that I'm pretty sure this new problem is almost certainly a different bug (different cause) than the original bug fixed in comment 59 and earlier; just the symptoms appear the same.

Revision history for this message
Dan Streetman (ddstreet) wrote :

James Dempsey, the upstream commit you referenced is in trusty lts-vivid at Ubuntu-lts-3.19.0-49.55_14.04.1

Revision history for this message
Dan Streetman (ddstreet) wrote :

> James Dempsey, the upstream commit you referenced is in trusty lts-vivid at Ubuntu-lts-3.19.0-49.55_14.04.1

specifically, upstream it's commit 30f7ea1c2b5f5fb7462c5ae44fe2e40cb2d6a474 and in the trusty lts branch it's commit a8412804321b96aac2665527535ac28a934ed0e2

Revision history for this message
Dan Streetman (ddstreet) wrote :

Andrew Ruthven that's your patch also ^

Revision history for this message
Dan Streetman (ddstreet) wrote :

Rodrigo, just in case the problem you're seeing is fixed by ^ commit, please try the latest trusty lts-vivid (at version I listed above, or later).

Revision history for this message
In , mitsuto (mitsuto-linux-kernel-bugs) wrote :

I'm also seeing this behaviour on docker 1.9.1 and kernel 4.4.1 from kernel-ml package on centos7. It's intermittent and not reliably reproducible.

Revision history for this message
Cristian Calin (cristi-calin) wrote :

@ddstreet, the Ubuntu-lts-3.19.0-49.55_14.04.1 doesn't seem to fix this issue.
We just upgraded a Juno openstack environment to this kernel version and we are still seeing this on a network node.

[controller] root@node-28:~# uname -a
Linux node-28.domain.tld 3.19.0-49-generic #55~14.04.1-Ubuntu SMP Fri Jan 22 11:24:31 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[controller] root@node-28:~# dmesg -T | tail -n 3
[Thu Feb 11 07:14:15 2016] unregister_netdevice: waiting for lo to become free. Usage count = 4
[Thu Feb 11 07:14:26 2016] unregister_netdevice: waiting for lo to become free. Usage count = 4
[Thu Feb 11 07:14:36 2016] unregister_netdevice: waiting for lo to become free. Usage count = 4
[controller] root@node-28:~# dmesg -T | grep -c unregister_netdevice..waiting
6401

Revision history for this message
Dan Streetman (ddstreet) wrote :

Cristian ok thnx. I'll keep investigating.

Revision history for this message
In , gopinath.taget (gopinath.taget-linux-kernel-bugs) wrote :

Guys,

I am seeing this issue on CoreOS 970.1.0 with Docker 1.10.1. This issue has been festering for more than 1.5 years. What can I do to help fix it? What would be the direction of investigation? Any pointers would help.

Thanks
Gopinath

Revision history for this message
Dan Streetman (ddstreet) wrote :

If anyone is able to reproduce this, I have a debug kernel at this ppa:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1403152

It prints a debug line whenever any loopback interface is dev_put/dev_hold, which may help show where the extra loopback reference is coming from when this problem happens. If anyone can reproduce the problem using the debug kernel, please attach the logs (and/or an sosreport).
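For readers unfamiliar with this kind of instrumentation, a minimal sketch of what such a patch could look like follows. This is not the actual patch from the PPA; it is an illustration modeled on the dev_hold()/dev_put() helpers as they appeared in 3.19/4.4-era include/linux/netdevice.h, with the loopback check and log line added purely for illustration.

```c
/* Illustrative sketch only -- NOT the patch in the PPA above. The idea is to
 * log every reference-count change on loopback devices together with the
 * caller, so a leaked hold can be attributed to the code path that took it. */
static inline void dev_hold(struct net_device *dev)
{
	this_cpu_inc(*dev->pcpu_refcnt);
	if (dev->flags & IFF_LOOPBACK)
		pr_info("%s: dev_hold %d from %pS\n", dev->name,
			netdev_refcnt_read(dev), __builtin_return_address(0));
}

static inline void dev_put(struct net_device *dev)
{
	if (dev->flags & IFF_LOOPBACK)
		pr_info("%s: dev_put %d from %pS\n", dev->name,
			netdev_refcnt_read(dev), __builtin_return_address(0));
	this_cpu_dec(*dev->pcpu_refcnt);
}
```

The refcount log Kevin posts later in this thread ("lodebug: dev_hold 9 dst_init", and so on) is the kind of output this style of instrumentation produces.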

Revision history for this message
In , schnitzelbub (schnitzelbub-linux-kernel-bugs) wrote :

I can confirm that network IO will trigger this issue.
On a Debian 8.4 host running Docker 1.11.0 containers on top of kernel 4.4.6 I too face this problem.
My repro is to build an image (based on the default Debian image debian:jessie) that first dist-upgrades system packages and then installs some more packages.
I haven't experienced this running a similar setup on VirtualBox/Vagrant so far, though.
Any hint on how to tackle this would be highly appreciated.

Cheers,
Hannes

Revision history for this message
Michael Richardson (mjrichardson) wrote :

Credit where credit is due: we've slowly rolled the patch above (per #63) into production over the last six months, initially through custom-compiled kernels and more recently via mainline LTS wily packages, and have seen no further issues with hung namespaces. Kudos to ddstreet.

Revision history for this message
In , philporada (philporada-linux-kernel-bugs) wrote :

I'm running on CentOS Linux release 7.2.1511 (Core) with the Linux 4.6.3-1.el7.elrepo.x86_64 kernel. My docker version is 1.11.2. I am running on Amazon EC2. This does NOT happen to me in VirtualBox with the same kernel/docker version.

There's quite a bit of discussion on https://github.com/docker/docker/issues/5618

Revision history for this message
In , bastienphilbert (bastienphilbert-linux-kernel-bugs) wrote :

Can you git bisect it, since you know which kernel version it seemed to stop working on, and report back with the offending commit ID when the bisection is complete?

Revision history for this message
In , pegerto (pegerto-linux-kernel-bugs) wrote :

Hi,

According to the Docker issue, it is raised by creating and destroying multiple containers; from my understanding it is a raise condition, as this is not reproducible with a single-CPU VM.

I reproduced the issue with CoreOS 1068.8.0 on VirtualBox; I can take a kernel dump from the hypervisor if that helps.

Regards.

Revision history for this message
In , bastienphilbert (bastienphilbert-linux-kernel-bugs) wrote :

Please do. And I think you mean a race condition, not a raise condition.

Revision history for this message
Dan Streetman (ddstreet) wrote :

There are upstream commits that may fix this problem; I've updated the same PPA with a new test build (with only those commits, and without any debug). If anyone can still reproduce this problem, please try the latest kernel in the PPA and report whether it fixes the problem:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1403152

the upstream commits are:
751eb6b6042a596b0080967c1a529a9fe98dac1d ("ipv6: addrconf: fix dev refcont leak when DAD failed")
a5d0dc810abf3d6b241777467ee1d6efb02575fc ("vti: flush x-netns xfrm cache when vti interface is removed")

Revision history for this message
Alexandre (totalworlddomination) wrote :

This kernel bug is also affecting all of the container solutions on Xenial / 16.04.1, up to 4.4.0-38.

Lots of people are hitting this on many distributions / container solutions:
https://github.com/docker/docker/issues/5618 (mentions Docker, LXC, kubernetes, etc.)
https://github.com/coreos/bugs/issues/254
...

Apparently this was fixed in 12.04 and has now been reintroduced in 16.04.

Once you get the following dmesg message:
unregister_netdevice: waiting for lo to become free. Usage count = 1
... The only fix is to reboot the host.

From what I gathered in the Docker issue tracking this, it seems a fix was merged into net-next very recently:
http://www.spinics.net/lists/netdev/msg393688.html

... and is in mainline now:
https://github.com/torvalds/linux/blob/master/net/ipv6/addrconf.c

As I wasn't sure this thread was tracking Xenial (I just added it to the list), I'm trying to figure out whether Ubuntu is tracking this issue upstream and can backport the fix into 16.04's 4.4.0-? branch, as a lot of container use cases have started to break down because of this bug over the past few weeks.

Thanks!

Revision history for this message
Alexandre (totalworlddomination) wrote :

The bug was reintroduced in Ubuntu Xenial's kernel.

Revision history for this message
Dan Streetman (ddstreet) wrote :

Alexandre, please see my last comment, which links to a test kernel PPA with EXACTLY that upstream commit, 751eb6b6042a596b0080967c1a529a9fe98dac1d ("ipv6: addrconf: fix dev refcont leak when DAD failed").

If you can reproduce this, please test with the PPA and report if it fixes the problem.

Revision history for this message
Dan Streetman (ddstreet) wrote :

Ah, you need a patched test kernel on xenial...I'll upload one to the PPA for testing.

Revision history for this message
Dan Streetman (ddstreet) wrote :

PPA updated with patched test kernels for xenial as well as trusty lts-xenial.
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1403152

Revision history for this message
Paul van Schayck (paulvanschayck) wrote :

Dan, thanks for your work. I've tried your PPA with the included patch (4.4.0-36-generic #55hf1403152v20160916b1-Ubuntu). As I've also reported on the Github docker issue [1], this does not fix the issue for me.

Would it help if I tested again with this patch and with your debug patches included?

[1] https://github.com/docker/docker/issues/5618#issuecomment-249149641

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-lts-xenial (Ubuntu Trusty):
status: New → Confirmed
Changed in linux-lts-xenial (Ubuntu):
status: New → Confirmed
Revision history for this message
Dan Streetman (ddstreet) wrote :

> I've tried your PPA with the included patch (4.4.0-36-generic #55hf1403152v20160916b1-Ubuntu).
> As I've also reported on the Github docker issue [1], this does not fix the issue for me.

that's unfortunate.

> Would it help you test again with this patch, and with your debug patches included?

I suspect adding the debug would change the timing enough to work around the bug, as it did before. I'll take a deeper look to see if there is a different way to add debug.

tags: added: verification-done-vivid
removed: verification-needed-vivid
tags: added: verification-done-trusty
removed: verification-needed-trusty
Revision history for this message
Alexandre (totalworlddomination) wrote :
Revision history for this message
Dan Streetman (ddstreet) wrote :

> Apparently the fix is in:
>
> https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.22

Hopefully the commit does fix things, although Paul's comment above indicated it may not, at least not for everyone.

Revision history for this message
Alexandre (totalworlddomination) wrote :

Oh oops, I didn't realize this was the patch with the 751eb6b6042a596b0080967c1a529a9fe98dac1d commit Paul mentioned.

Well, if that made it into Ubuntu's kernels up to and including 4.4.0-42, it didn't fix it (I still get the bug on many machines & VMs).

Revision history for this message
Dan Streetman (ddstreet) wrote :

> Well, if that made it into Ubuntu's kernel before and including 4.4.0-42, it didn't fix it
> (I still get the bug on many machines & VMs).

No, it's not in the released xenial kernel yet; the upstream 4.4.22 stable commits are still in the xenial master-next branch, so it'll be a bit longer before xenial's normal kernel is updated with the patch.

Revision history for this message
Xav Paice (xavpaice) wrote :

From the logs it looks like the patch is now a part of https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1627730 which hit 4.4.0-46.67~14.04.1 (proposed) on 22nd Oct?

Revision history for this message
Dan Streetman (ddstreet) wrote :

> From the logs it looks like the patch is now a part of
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1627730 which hit 4.4.0-46.67~14.04.1
> (proposed) on 22nd Oct?

Yes, the patch is included in the 4.4.0-46.67 kernel (both linux-lts-xenial on trusty, and regular linux on xenial). Is anyone able to reproduce the problem with that kernel level?

Revision history for this message
Alexandre (totalworlddomination) wrote :

I was running 4.4.0-46 (not sure if that differs from the latest .67?) on October 27th and yesterday had to reboot for the "... count = 1" problem.

Were there earlier 4.4.0-46 releases? If not, the bug is still present...

Otherwise, I'll confirm within a few weeks whether or not it happens again.

Revision history for this message
costinel (costinel) wrote :

Seeing this on xenial after this sequence of operations:

lxc stop run2
lxc delete run2
lxc copy run1 run2
lxc start run2 <- hangs here, with later dmesg

[337766.146479] unregister_netdevice: waiting for lo to become free. Usage count = 1
[337772.435786] INFO: task lxd:20665 blocked for more than 120 seconds.
[337772.435856] Tainted: P OE 4.4.0-47-generic #68-Ubuntu
[337772.435922] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[337772.436002] lxd D ffff88006b6cbcb8 0 20665 1 0x00000004
[337772.436006] ffff88006b6cbcb8 ffffffff821d0560 ffff880235aa8000 ffff8801b57d2580
[337772.436009] ffff88006b6cc000 ffffffff81ef5f24 ffff8801b57d2580 00000000ffffffff
[337772.436010] ffffffff81ef5f28 ffff88006b6cbcd0 ffffffff81830f15 ffffffff81ef5f20
[337772.436012] Call Trace:
[337772.436020] [<ffffffff81830f15>] schedule+0x35/0x80
[337772.436022] [<ffffffff818311be>] schedule_preempt_disabled+0xe/0x10
[337772.436024] [<ffffffff81832df9>] __mutex_lock_slowpath+0xb9/0x130
[337772.436026] [<ffffffff81832e8f>] mutex_lock+0x1f/0x30
[337772.436029] [<ffffffff8171f98e>] copy_net_ns+0x6e/0x120
[337772.436033] [<ffffffff810a148b>] create_new_namespaces+0x11b/0x1d0
[337772.436035] [<ffffffff810a15ad>] copy_namespaces+0x6d/0xa0
[337772.436038] [<ffffffff8107f135>] copy_process+0x905/0x1b70
[337772.436040] [<ffffffff81080530>] _do_fork+0x80/0x360
[337772.436044] [<ffffffff8120bf28>] ? SyS_access+0x1e8/0x230
[337772.436046] [<ffffffff810808b9>] SyS_clone+0x19/0x20
[337772.436048] [<ffffffff81834ff2>] entry_SYSCALL_64_fastpath+0x16/0x71

FWIW, at the time of the stop, run2 had an additional IP address on dev lo, and at the time of the copy, run1 was also running and had an additional IP address on dev lo.
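As an aside on why everything hangs: the trace above shows lxd blocked on a mutex inside copy_net_ns(). In kernels of this era the network-namespace cleanup worker holds net_mutex while it waits for the leaked loopback reference to drop, so any process that tries to create a new network namespace queues up behind it. A minimal, hypothetical standalone test (not taken from this report) that exercises the same path:

```c
/* Hypothetical sketch: creating a network namespace goes through
 * copy_net_ns(), which takes the same net_mutex the stuck cleanup worker
 * holds, so on an affected host this call simply never returns. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	/* Requires CAP_SYS_ADMIN; on a healthy host this returns immediately. */
	if (unshare(CLONE_NEWNET) != 0) {
		perror("unshare(CLONE_NEWNET)");
		return 1;
	}
	puts("new network namespace created");
	return 0;
}
```

This is consistent with the `lxc start run2` hang shown above: once one namespace teardown is stuck, anything that needs a new network namespace blocks.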

Revision history for this message
Paul van Schayck (paulvanschayck) wrote :

Dear Dan, user reshen on the GitHub Docker issue thread [1] did some extensive testing using kernel 4.8.8 and has not been able to reproduce the issue with that version. I've also tested with the Ubuntu mainline kernel builds and have not been able to reproduce the issue anymore either.

He has also pointed to two possible kernel patches responsible for the fix. Would it be possible for you to create a backport build of those again?

[1] https://github.com/docker/docker/issues/5618#issuecomment-262081488

Revision history for this message
Alexandre (totalworlddomination) wrote :

Update:
I've hit the issue with 4.4.0-47... now testing 4.4.0-49 for the next few weeks.

Revision history for this message
Alexandre (totalworlddomination) wrote :

It seems the fix isn't in, or isn't working for this problem, as I've just hit this issue again on 4.4.0-49, at least twice:
[434074.636075] unregister_netdevice: waiting for lo to become free. Usage count = 1

Since I see a lot of people saying that 4.8+ kernels fix this issue, is there a generic package (or future plans for one, say when 4.9 LTS comes out in a week or two) that would let anyone upgrade but also keep up with updates without tracking versions by hand (like the linux-image-generic package)?

cheers!

Revision history for this message
Alessandro Polverini (polve) wrote :

I can reproduce the problem with LXC 2.0.6-1 and kernel 4.6.4.

Revision history for this message
Dan Streetman (ddstreet) wrote :

I added the 2 commits referenced in the docker thread:
https://github.com/docker/docker/issues/5618#issuecomment-262081488

These are building right now in the test PPA:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1403152

kernel version 4.4.0-53.74+hf1403152v20161214b1 is the latest xenial kernel, plus those 2 patches.

As far as testing with the 4.8 kernel on xenial, you can use the kernel team's build ppa:
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa

Obviously, kernels there are completely unsupported and may break everything, etc., etc., and should be used for testing/debugging ONLY.

To install the yakkety 4.8 kernel on a xenial system, after adding the PPA to your system:

# sudo apt install linux-generic-hwe-16.04-edge

Revision history for this message
In , kevin (kevin-linux-kernel-bugs) wrote :

Hi,

I reproduced the issue using a kernel with this patch:

(Log each refcnt increase and decrease for netdev with its function name)

https://lkml.org/lkml/2008/7/20/5

It shows the following logs (for the loopback device inside the container namespace) when the issue occurred:

Jan 11 16:17:43 <hostname> kernel: [ 2196.943640] lo: dev_hold 1 rx_queue_add_kobject
Jan 11 16:17:43 <hostname> kernel: [ 2196.943652] lo: dev_hold 2 netdev_queue_add_kobject
Jan 11 16:17:43 <hostname> kernel: [ 2196.943654] lo: dev_hold 3 register_netdevice
Jan 11 16:17:43 <hostname> kernel: [ 2196.943658] lo: dev_hold 4 neigh_parms_alloc
Jan 11 16:17:43 <hostname> kernel: [ 2196.943660] lo: dev_hold 5 inetdev_init

(I patched Docker to rename lo inside container to lodebug for ease of tracking)
Jan 11 16:17:43 <hostname> kernel: [ 2196.981741] lodebug: renamed from lo

Jan 11 16:17:43 <hostname> kernel: [ 2197.001771] lodebug: dev_hold 6 qdisc_alloc
Jan 11 16:17:43 <hostname> kernel: [ 2197.001815] lodebug: dev_hold 7 dev_get_by_index
Jan 11 16:17:43 <hostname> kernel: [ 2197.001824] lodebug: dev_hold 8 dev_get_by_index
Jan 11 16:17:43 <hostname> kernel: [ 2197.001831] lodebug: dev_hold 9 fib_check_nh
Jan 11 16:17:43 <hostname> kernel: [ 2197.001836] lodebug: dev_hold 10 fib_check_nh
Jan 11 16:17:43 <hostname> kernel: [ 2197.001843] lodebug: dev_hold 11 dev_get_by_index
Jan 11 16:17:43 <hostname> kernel: [ 2197.001849] lodebug: dev_hold 12 dev_get_by_index
Jan 11 16:17:43 <hostname> kernel: [ 2197.001852] lodebug: dev_hold 13 fib_check_nh
Jan 11 16:17:43 <hostname> kernel: [ 2197.001856] lodebug: dev_hold 14 fib_check_nh
Jan 11 16:17:43 <hostname> kernel: [ 2197.025451] lodebug: dev_put 14 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.025473] lodebug: dev_put 13 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.025475] lodebug: dev_put 12 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.025477] lodebug: dev_put 11 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.025480] lodebug: dev_put 10 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.025482] lodebug: dev_put 9 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.414900] lodebug: dev_hold 9 dst_init
Jan 11 16:17:43 <hostname> kernel: [ 2197.418634] lodebug: dev_hold 10 dst_init
Jan 11 16:17:43 <hostname> kernel: [ 2197.418638] lodebug: dev_hold 11 dst_init
Jan 11 16:17:43 <hostname> kernel: [ 2197.445496] lodebug: dev_put 11 dst_destroy
Jan 11 16:17:44 <hostname> kernel: [ 2197.614240] lodebug: dev_hold 11 dst_init

(after the __neigh_create function call, the reference count for the loopback device is incorrectly increased to 13)

Jan 11 16:20:41 <hostname> kernel: [ 2374.898917] lodebug: dev_hold 13 __neigh_create
Jan 11 16:23:42 <hostname> kernel: [ 2555.900160] lodebug: dev_hold 14 dst_init
Jan 11 16:24:10 <hostname> kernel: [ 2583.573193] lodebug: dev_hold 15 dst_init
Jan 11 16:26:09 <hostname> kernel: [ 2702.992174] lodebug: dev_hold 16 dst_init
Jan 11 16:27:15 <hostname> kernel: [ 2769.188044] lodebug: dev_hold 17 dst_init
Jan 11 16:31:57 <hostname> kernel: [ 3050.821593] lodebug: dev_hold 18 dst_init
Jan 11 16:37:41 <hostname> kernel: [ 3394.962714] lodebug:...

Revision history for this message
Timo Furrer (tuxtimo) wrote :

I've hit the same issue today with 4.4.0-87-generic on a xenial.
Is there a confirmed fix for a particular released kernel for Ubuntu?

Revision history for this message
Timo Furrer (tuxtimo) wrote :

I've hit the same issue today with 4.4.0-87-generic on a xenial.
Is there a confirmed fix for a particular released kernel for Ubuntu?

And same for 4.10.0-28-generic

Revision history for this message
Dan Streetman (ddstreet) wrote :

Since this bug has long since been marked 'Fix Released', I opened a new bug, 1711407. Let's all please use that one for continued discussion of this problem.

Revision history for this message
In , stephen (stephen-linux-kernel-bugs) wrote :

(In reply to pegerto from comment #13)
> Hi,
>
> According to the docker issue is raise by creating destroying multiple
> containers, form my undertanding is a raise condition as this is not
> replicable with a single cpu vm.
>
> I reproduced the issue with CoreOS 1068.8.0 over VirtualBox, I can take a
> kernel dump from the hypervisor if this helps.
>
> Regards.

Remember that this is the upstream kernel bugzilla.
If any out-of-tree modules such as VirtualBox are present, then the bug report is suspect and will be marked as INVALID.

Revision history for this message
In , de.techno (de.techno-linux-kernel-bugs) wrote :

Happens with 4.9 Debian kernel too.

Changed in linux:
importance: Unknown → Medium
status: Unknown → Confirmed
Revision history for this message
In , maitoor (maitoor-linux-kernel-bugs) wrote :

The system is Ubuntu 16.04 and it is not fixed there either:
$ uname -a
Linux hostname-test 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ docker version
Client:
 Version: 18.06.1-ce
 API version: 1.38
 Go version: go1.10.3
 Git commit: e68fc7a
 Built: Tue Aug 21 17:24:56 2018
 OS/Arch: linux/amd64
 Experimental: false

Server:
 Engine:
  Version: 18.06.1-ce
  API version: 1.38 (minimum version 1.12)
  Go version: go1.10.3
  Git commit: e68fc7a
  Built: Tue Aug 21 17:23:21 2018
  OS/Arch: linux/amd64
  Experimental: false
$ sudo dmesg |tail -20
[268985.837007] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[268995.856958] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269005.868864] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269015.876471] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269026.128173] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269036.143902] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269046.387621] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269056.411322] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269066.431065] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269076.438783] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269086.446489] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269096.490222] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269106.505820] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269116.545378] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269126.564839] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269136.568241] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269146.571673] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269156.827151] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269167.090594] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269177.330099] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1

Revision history for this message
In , jdelvare (jdelvare-linux-kernel-bugs) wrote :

(In reply to Kevin from comment #15)
> Jan 11 16:17:44 <hostname> kernel: [ 2197.614240] lodebug: dev_hold 11
> dst_init
>
> (after __neigh_create function call, the reference count for loopback device
> is incorrectly increased to 13)
>
> Jan 11 16:20:41 <hostname> kernel: [ 2374.898917] lodebug: dev_hold 13
> __neigh_create

This is hard to believe or explain. The only way to increase the counter is to call dev_hold(), and whenever dev_hold() is called, a message saying so would be logged. The only explanation I can think of would be if the memory area holding the per-CPU counters had been accidentally overwritten (say, by an overrun of an array in nearby memory).

But a more likely explanation is that there is a missing line in the log, either due to a copy-and-paste error, or because the message somehow never made it to the log buffer (are there still contexts in which printk doesn't work?).

Question: do you see the same apparent +2 counter increase whenever you hit the "waiting for <dev> to become free" error, or was it a one-time thing?
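For reference, the "Usage count" printed by the unregister_netdevice message is the sum of those per-CPU counters, read via netdev_refcnt_read(); the function looked roughly like this in kernels of this era (net/core/dev.c):

```c
/* net/core/dev.c (kernels of this era): the logged usage count is the sum of
 * the per-CPU reference counters, i.e. the memory area referred to above. A
 * corrupted per-CPU slot would skew this total without a matching dev_hold
 * line ever appearing in the instrumented log. */
int netdev_refcnt_read(const struct net_device *dev)
{
	int i, refcnt = 0;

	for_each_possible_cpu(i)
		refcnt += *per_cpu_ptr(dev->pcpu_refcnt, i);
	return refcnt;
}
```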

Changed in linux:
status: Confirmed → Incomplete
Dan Streetman (ddstreet)
Changed in linux-lts-xenial (Ubuntu Trusty):
status: Confirmed → Won't Fix
Changed in linux-lts-utopic (Ubuntu):
status: Confirmed → Won't Fix
Changed in linux-lts-xenial (Ubuntu):
status: Confirmed → Won't Fix
Brad Figg (brad-figg)
tags: added: cscc
Revision history for this message
Andreas Schildbach (schildbach) wrote :

I've got this issue on
```
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal
# uname -a
Linux ultima 5.4.0-80-generic #90-Ubuntu SMP Fri Jul 9 22:49:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
```

The message is

```
kernel:[864004.039343] unregister_netdevice: waiting for lo to become free. Usage count = 1
```

tags: added: sts