unregister_netdevice: waiting for lo to become free. Usage count

Bug #1403152 reported by menoabels
This bug affects 23 people
Affects                     Status        Importance  Assigned to     Milestone
Linux                       Incomplete    Medium
linux (Ubuntu)              Fix Released  Medium      Unassigned
  Trusty                    Fix Released  Medium      Chris J Arges
  Vivid                     Fix Released  Medium      Chris J Arges
linux-lts-utopic (Ubuntu)   Won't Fix     Medium      Unassigned
  Trusty                    Fix Released  Undecided   Unassigned
linux-lts-xenial (Ubuntu)   Won't Fix     Undecided   Unassigned
  Trusty                    Won't Fix     Undecided   Unassigned

Bug Description

SRU Justification:

[Impact]

Users of kernels on systems that utilize NFS may see the following message repeated when shutting down and starting containers:

    unregister_netdevice: waiting for lo to become free. Usage count = 1

This can cause issues when trying to create a new network namespace, and thus block a user from creating new containers.

[Test Case]

Set up multiple containers in parallel that mount an NFS share, generate some traffic, and shut them down. Eventually you will see the kernel message.

Dave's script here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1403152/comments/24
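
(A minimal sketch of that test case, assuming pre-created trusty containers named u1..u5 that have nfs-common installed and an apparmor profile allowing NFS mounts, with the NFS export passed as the first argument; the names and the loop are illustrative, not Dave's actual script.)

#!/bin/bash
# Repeatedly start containers that write to an NFS share, then stop them,
# until the unregister_netdevice message shows up in the kernel log.
NFS=${1:?usage: $0 <server:/export>}
for iter in $(seq 1 50); do
  for n in 1 2 3 4 5; do
    (
      sudo lxc-start -n u$n -d
      sleep 5
      sudo lxc-attach -n u$n -- mount -t nfs "$NFS" /mnt
      sudo lxc-attach -n u$n -- dd if=/dev/zero of=/mnt/foo-$n bs=1024 count=100000
      sudo lxc-attach -n u$n -- umount /mnt
      sudo lxc-stop -n u$n
    ) &
  done
  wait
  dmesg | grep -q 'unregister_netdevice: waiting for lo' && { echo "hit the bug"; break; }
done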

[Fix]
commit de84d89030fa4efa44c02c96c8b4a8176042c4ff upstream

--

I am currently running trusty with the latest patches, and I get this on the following hardware and software:

Ubuntu 3.13.0-43.72-generic 3.13.11.11

processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 77
model name : Intel(R) Atom(TM) CPU C2758 @ 2.40GHz
stepping : 8
microcode : 0x11d
cpu MHz : 2400.000
cache size : 1024 KB
physical id : 0
siblings : 8
core id : 7
cpu cores : 8
apicid : 14
initial apicid : 14
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch arat epb dtherm tpr_shadow vnmi flexpriority ept vpid tsc_adjust smep erms
bogomips : 4799.48
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

The error in the subject is somewhat reproducible; lxc keeps working but is no longer manageable until a reboot.

"Manageable" means every lxc command hangs.

I saw there are a lot of bugs, but they seem to relate to older versions and are closed, so I decided to file a new one.

I run a lot of machines with trusty and lxc containers, but only this kind of machine produces these errors; all the others don't show this odd behavior.

thx in advance

meno

Revision history for this message
In , fallenblood (fallenblood-linux-kernel-bugs) wrote :

Hey guys, sorry I don't have the time to put together a proper bug report, but the following GitHub thread has plenty of info about it.

https://github.com/docker/docker/issues/5618#issuecomment-50189996

Revision history for this message
In , alan (alan-linux-kernel-bugs) wrote :

This does not appear to be a bug, the kernel is merely waiting for all the users to drop the device. Something has a reference to it so it remains.

Revision history for this message
In , fallenblood (fallenblood-linux-kernel-bugs) wrote :

(In reply to Alan from comment #1)
> This does not appear to be a bug, the kernel is merely waiting for all the
> users to drop the device. Something has a reference to it so it remains.

This occurs with both Docker and VirtualBox. Both applications rely on a virtual network interface, and it happens once some amount of data (estimated at less than a few hundred megabytes) has been transferred over that interface. It never happens with kernel 3.13.0-29-generic, but happens all the time with 3.14.3-rt4, so the kernel is obviously the decisive factor here; how is it not a regression?

Revision history for this message
In , alan (alan-linux-kernel-bugs) wrote :

VirtualBox is out-of-tree modules, so we don't care 8) - but it's useful data.

First things to clear up, given you've not filed much information yet:

What is the reproducer ?
Does the message continue forever ?
Does it only occur on 3.14-rt not 3.14 standard ?
If it only happens on 3.14.3-rt, then does it happen on older -rt kernels ?

It's normal for this to occur if you drop a device that is in use for something, but it should very rapidly get cleaned up and go away. It's quite possible the 3.13/3.14 timings make a difference and it's just chance that it worked before. Equally, if it's getting stuck forever, that implies a race or refcount leak of some kind.

Revision history for this message
In , fallenblood (fallenblood-linux-kernel-bugs) wrote :

To reproduce the issue, install Docker (I'm alternatively using v0.11.1 or v1.0.0), spin up a container, and download a few hundred megabytes in it. On attempted system shutdown, you will see the message, and it ostensibly continues forever. You can see other people encountering this in the GitHub thread linked above. I haven't tried it with either 3.14 standard or other -rt kernels yet, but I'll try it out as soon as I get the time.

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1403152

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Did this issue occur in a previous version of Ubuntu, or is this a new issue?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.18 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.18-vivid/

Changed in linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
menoabels (meno-abels) wrote :

I tried to make it reproducible and figured out that the problem is related to the use of this kind of interface in an lxc container. The tunnels work in the running lxc container, but if you stop or reboot the lxc container the kernel reports this:

unregister_netdevice: waiting for lo to become free.....

Here is the ip link output in the container:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ip6tnl0: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default
    link/tunnel6 :: brd ::
3: ip6gre0: <NOARP> mtu 1448 qdisc noop state DOWN mode DEFAULT group default
    link/gre6 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 brd 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
4: gt6nactr01: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/gre6 2a:4:f:89:00:05:17:09:00:00:00:00:00:02:00:01 peer 2a:4:f:8:00:0f:09:97:02:17:01:10:00:59:00:05

and here are the commands which create this kind of tunnel:

ip -6 tunnel add gt6nactr01 mode ip6gre local 2a4:4483:5:1709::2:1 remote 2a4:4494:f:997:217:110:59:5
ip link set mtu 1500 dev gt6nactr01 up
ip addr add 2a04:4454:f:f003::1/64 dev gt6nactr01
ip addr add 169.254.193.1/30 dev gt6nactr01

I attach a dmesg output and hope that you can reproduce the problem.

thx

meno

Revision history for this message
menoabels (meno-abels) wrote :

I tested with

 3.18.0-031800-generic #201412071935 SMP Mon Dec 8 00:36:34 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

and the problem is still there.

meno

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

Hi,

We're hitting this bug on the latest trusty kernel in a context similar to this docker issue. We also had this problem on lucid with a custom 3.8.11 kernel, where it seems to be more aggressive than on trusty, but it still happens:

https://github.com/docker/docker/issues/5618

In this issue an upstream kernel bug and a redhat bug are mentioned:

https://bugzilla.kernel.org/show_bug.cgi?id=81211
https://bugzilla.redhat.com/show_bug.cgi?id=880394

In the Red Hat bug there is an upstream commit that supposedly fixes the problem:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=dcdfdf56b4a6c9437fc37dbc9cee94a788f9b0c4

Any chance to have this patch backported to the trusty kernel?

Thanks,

Rodrigo.

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

Kernel stack trace when lxc-start process hang:

[27211131.602770] INFO: task lxc-start:25977 blocked for more than 120 seconds.
[27211131.602785] Not tainted 3.13.0-40-generic #69-Ubuntu
[27211131.602789] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27211131.602795] lxc-start D 0000000000000000 0 25977 1 0x00000080
[27211131.602800] ffff8806ee73dd40 0000000000000286 ffff880474bdb000 ffff8806ee73dfd8
[27211131.602803] 0000000000014480 0000000000014480 ffff880474bdb000 ffffffff81cdb760
[27211131.602806] ffffffff81cdb764 ffff880474bdb000 00000000ffffffff ffffffff81cdb768
[27211131.602809] Call Trace:
[27211131.602821] [<ffffffff81723b69>] schedule_preempt_disabled+0x29/0x70
[27211131.602825] [<ffffffff817259d5>] __mutex_lock_slowpath+0x135/0x1b0
[27211131.602832] [<ffffffff811a2679>] ? __kmalloc+0x1e9/0x230
[27211131.602835] [<ffffffff81725a6f>] mutex_lock+0x1f/0x2f
[27211131.602840] [<ffffffff8161c2c1>] copy_net_ns+0x71/0x130
[27211131.602845] [<ffffffff8108f889>] create_new_namespaces+0xf9/0x180
[27211131.602848] [<ffffffff8108f983>] copy_namespaces+0x73/0xa0
[27211131.602851] [<ffffffff81065b16>] copy_process.part.26+0x9a6/0x16b0
[27211131.602854] [<ffffffff810669f5>] do_fork+0xd5/0x340
[27211131.602857] [<ffffffff810c8e8d>] ? call_rcu_sched+0x1d/0x20
[27211131.602859] [<ffffffff81066ce6>] SyS_clone+0x16/0x20
[27211131.602863] [<ffffffff81730089>] stub_clone+0x69/0x90
[27211131.602865] [<ffffffff8172fd2d>] ? system_call_fastpath+0x1a/0x1f
[27211131.602869] INFO: task lxc-start:26342 blocked for more than 120 seconds.
[27211131.602874] Not tainted 3.13.0-40-generic #69-Ubuntu

tags: added: kernel-da-key trusty
Changed in linux (Ubuntu):
importance: Medium → High
status: Incomplete → Triaged
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

The patch mentioned in comment #5 was added to the mainline kernel as of 3.13-rc1, so it should already be in Trusty.

git describe --contains dcdfdf5
v3.13-rc1~7^2~16

Can you again test the latest mainline, which is now 3.19-rc4:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-rc4-vivid/

Also, is this a regression? Was there a prior kernel version that did not have this bug?

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

From the docker issue it seems that someone couldn't reproduce the bug when downgrading to kernel 3.13.0-32-generic. I can't validate this statement because kernels prior to 3.13.0-35-generic have a regression that crashes my EC2 instance.

Also people testing kernel 3.14.0 couldn't reproduce the bug.

I will try to reproduce with the latest mainline kernel in next few hours.

Thanks,
Rodrigo.

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

Just got an instance with kernel 3.16.0-29-generic #39-Ubuntu (linux-lts-utopic) hitting this bug in production. We don't have a reliable reproducer, so the only way for me to validate is to boot this kernel in production and wait for the bug to happen.

Is there anything I can get from an instance showing the bug that may help?

Revision history for this message
In , rodrigo.vaz (rodrigo.vaz-linux-kernel-bugs) wrote :

FWIW, we got an instance with kernel 3.16.0-29-generic #39-Ubuntu hitting this bug. We don't use Docker, but we run lxc. We also don't have a reliable reproducer, and the only way to notice that this kernel is also affected was to boot it in production and wait for the bug to happen (usually when we have containers stopping and starting).

Revision history for this message
Chris J Arges (arges) wrote :

At this point a proper reproducer would help the most. This way we could get crashdumps and other useful information that may not be possible in production environments.

Reading through the bug the best description I can see is:
1) Start LXC container
2) Download > 100MB of data
3) Stop LXC container
4) Repeat until you see the kernel message

For example, I can script this a bit:
#!/bin/bash
# setup
sudo lxc-create -t download -n u1 -- -d ubuntu -r trusty -a amd64
fallocate -l 1000000 file
IP=$(sudo lxc-attach -n u1 -- hostname -I)
# test
while true; do
  sudo lxc-start --name u1 -d
  ssh ubuntu@$IP rm file
  scp file ubuntu@$IP:
  sudo lxc-stop --name u1
done

Running this on my machine doesn't show a failure. So I suspect there are other variables involved. Please let us know other relevant info that would help isolate a reproducer for this.
Thanks,

Changed in linux (Ubuntu):
assignee: nobody → Chris J Arges (arges)
Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

Some additional info:

- The stack trace is always the same posted above and the break point seems to be copy_net_ns every time.
- The process that hangs is always lxc-start in every occurrence that I was able to check.

Rodrigo.

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

I left a couple of instances running with the mainline kernel (http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-rc7-vivid/) over the weekend. It took more time to see the bug on the mainline kernel, but this morning one out of ten instances had the same problem, so I'm assuming mainline is also affected.

Kernel version 3.19.0-031900rc7-generic

Revision history for this message
menoabels (meno-abels) wrote : Re: [Bug 1403152] Re: unregister_netdevice: waiting for lo to become free. Usage count

Hey,

Here is a discussion about the reproducibility. I wrote very early in
this thread that if I set the following network config in a container

ip -6 tunnel add gt6nactr01 mode ip6gre local 2a4:4483:5:1709::2:1 remote 2a4:4494:f:997:217:110:59:5
ip link set mtu 1500 dev gt6nactr01 up
ip addr add 2a04:4454:f:f003::1/64 dev gt6nactr01
ip addr add 169.254.193.1/30 dev gt6nactr01

it will hang on reboot; I tried it again just now and it hangs/halts! There
is also another possibility to stop an lxc container with the same result,
"waiting for lo to become free": mount an NFS server from within an lxc
container. You have to change something in apparmor, but then it works
perfectly until you try to reboot the lxc container.

Is there any reason not to try to fix these reproducible problems first?

cheers

meno


Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

Meno,

I just tried your test case where you described adding an ip6gre tunnel to the container and rebooting it, but I couldn't reproduce the netdev hang so far. Do you mind sharing specific details or even a script that will reproduce the problem?

Mounting an NFS share in containers is not a common activity for our users, and still we see the problem happening on multiple instances.

Regards,
Rodrigo.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Trusty):
status: New → Confirmed
Changed in linux (Ubuntu Utopic):
status: New → Confirmed
Chris J Arges (arges)
Changed in linux (Ubuntu):
assignee: Chris J Arges (arges) → Stefan Bader (smb)
Revision history for this message
In , fxposter (fxposter-linux-kernel-bugs) wrote :

I've got the same problem with the latest docker on both Debian 7 and 8 with 3.16(.7) kernels: https://github.com/docker/docker/issues/9605#issuecomment-85025729. This happens once every couple of days, and the only way to continue running docker containers on this server is to force a reboot.

Revision history for this message
Jordan Curzon (curzonj) wrote :

I'm working with Rodrigo Vaz and we've found some details about our occurrence of this issue using systemtap.

rt_cache_route places a dst_entry struct into a fib_nh struct as the nh_rth_input. Occasionally the reference counter on that dst is not decremented by the time free_fib_info_rcu is called on the fib during container teardown. In that case free_fib_info_rcu doesn't call dst_destroy and dev_put is not called on the lo interface of the container. The only situation where we've seen this is when A) the fib_nh->nh_dev points to the eth0 interface of the container, B) the dst is part of an rtable struct where rt->rt_is_input == 1, and C) the dst points to the lo interface of the container. The dst is cached in the fib only once, never replaced, and then thousands of dst_hold/dst_release calls are made on the dst for connections and socket buffers.

We have only seen this so far on containers making lots of outbound connections. It doesn't appear to depend on the lifetime of the container, some are only alive for 30min and others are alive for 24hrs. The issue occurs when you try to destroy the container because that is when the fib is freed. We don't know when or where the dst ref_cnt becomes incorrect.

We don't know how to reproduce the issue.
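
(A rough way to watch the release side of this without systemtap - a sketch, not the instrumentation used above. It assumes ftrace is available and debugfs is mounted; dst_hold() is a static inline and normally cannot be hooked by name, so only dst_release() is visible here.)

cd /sys/kernel/debug/tracing
echo 0 > tracing_on
echo dst_release > set_ftrace_filter       # trace only dst_release()
echo function > current_tracer
echo 1 > options/func_stack_trace          # record a kernel stack for each hit
echo 1 > tracing_on
# ...tear down a container here, once in a good run and once in a bad one...
echo 0 > tracing_on
cp trace /tmp/dst_release.trace            # compare the callers between runs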

Revision history for this message
Jordan Curzon (curzonj) wrote :

To add to my comment just above, we have found a workload that is not under our own control (which limits what we can do with it, including sharing it) but which exacerbates the issue on our systems. This workload causes the issue less than 5% of the time, and that gives us the chance to look at which callers are using dst_release on the dst in question (the one that meets conditions A, B, and C; the "ABC dst"). To clarify, failure at this point for us is when dst->ref_cnt = 1 during free_fib_info_rcu; success is when dst->ref_cnt = 0 and free_fib_info_rcu calls dst_destroy on its nh_rth_input member.

In both failure and non-failure scenarios, the only callers we see are a single call to ipv4_pktinfo_prepare and many calls to inet_sock_destruct and tcp_data_queue. That likely rules out a unique dst_release call site that could be identified as missing in failure scenarios. My interpretation is that a single call to inet_sock_destruct or tcp_data_queue is failing to occur in the failure scenario.

Revision history for this message
Stefan Bader (smb) wrote :

Jordan, which kernel version are you using to make your observations? (Thanks a lot for those, btw.)

Revision history for this message
Steve Conklin (sconklin) wrote :

Answering for Jordan

# uname -a
Linux ip-10-30-153-210 3.13.0-40-generic #69-Ubuntu SMP Thu Nov 13 17:53:56 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
# dpkg -l | grep linux
ii libselinux1:amd64 2.2.2-1ubuntu0.1 amd64 SELinux runtime shared libraries
ii linux-headers-3.13.0-40 3.13.0-40.69 all Header files related to Linux kernel version 3.13.0
ii linux-headers-3.13.0-40-generic 3.13.0-40.69 amd64 Linux kernel headers for version 3.13.0 on 64 bit x86 SMP
ii linux-headers-generic 3.13.0.40.47 amd64 Generic Linux kernel headers
ii linux-headers-virtual 3.13.0.40.47 amd64 Transitional package.
ii linux-image-3.13.0-40-generic 3.13.0-40.69 amd64 Linux kernel image for version 3.13.0 on 64 bit x86 SMP
ii linux-image-virtual 3.13.0.40.47 amd64 This package will always depend on the latest minimal generic kernel image.
ii linux-libc-dev:amd64 3.13.0-43.72 amd64 Linux Kernel Headers for development
ii linux-virtual 3.13.0.40.47 amd64 Minimal Generic Linux kernel and headers
ii util-linux 2.20.1-5.1ubuntu20.3 amd64 Miscellaneous system utilities

Revision history for this message
Jordan Curzon (curzonj) wrote :

An addition to my comment #18 above: I forgot that I had to remove logging of calls from skb_release_head_state to dst_release due to the performance impact of the number of calls. In both failure and success scenarios, calls to dst_release by skb_release_head_state are also present.

Revision history for this message
Dave Richardson (pudnik019) wrote :

I am able to reliably recreate this on both Ubuntu 14.04 with kernel 3.13.0-37-generic and Ubuntu 14.10 with kernel 3.16.0-23-generic. I have test code that can hit the bug, but it's not cleaned up enough to share. The steps to repro are:

Create a container (a COW clone using overlayfs from a base/parent Ubuntu 14.04 container). Start the clone container, mount an NFS share (remote from the machine hosting the container) at /mnt in the container, create a file on the NFS share from inside the container (dd if=/dev/zero of=/mnt/foo bs=1024 count=100000), stop the container, destroy the container.

I have a script that performs the above sequence over and over again, running 5 such clone containers in parallel at any given time. It only takes about 30s for the kernel to get wedged with:

unregister_netdevice: waiting for lo to become free. Usage count = 1

This same script is fine if you do scp instead of writing over NFS. It seems like NFS inside the container is somehow involved here.

Revision history for this message
Steve Conklin (sconklin) wrote :

David, would you like help getting the reproducer code ready to share?

Revision history for this message
Dave Richardson (pudnik019) wrote :

Ok, I've got a minimal reproducer ready. It causes the bug when I run it on Ubuntu 14.04 with 3.13.0-24-generic and LXC version 1.0.7-0ubuntu0.1 0. It's python, and the script has some documentation describing its usage as well as the prerequisites. You should run it as root via sudo.

Prereqs require you to install a base ubuntu trusty container, install nfs-common in that container, and modify lxc-default to allow containers to perform NFS mounts. You will also need an NFS share that your containers can mount to scribble data to. The address of this share is passed as the third parameter to the script in the form of IP_ADDRESS:/path/to/nfs/share.

My typical usage looks like:

sudo ./reproducer.py 5 10 IPADDRESS:/path/to/nfs/share

Which starts 5 threads that in each of 10 iterations, creates a container, mounts the nfs share to /mnt, then dd's over some zeros, umounts /mnt, stops, and destroys the container. I can reliably hit the unregister_netdevice hang within 1 minute of running this.

~Dave

Chris J Arges (arges)
Changed in linux (Ubuntu):
assignee: Stefan Bader (smb) → Chris J Arges (arges)
Chris J Arges (arges)
Changed in linux (Ubuntu):
status: Triaged → In Progress
Revision history for this message
Chris J Arges (arges) wrote :

So far I haven't been able to reproduce this with either 3.13.0-49 or 3.13.0-37.

My setup is as follows:
Two vms, one with nfs-server and another for reproducer.

nfs-server setup with these commands:
  apt-get install nfs-kernel-server
  mkdir /export
  chown 777 /export
  echo "/export *(rw,no_root_squash,fsid=0,no_subtree_check)" > /etc/exports
  service nfs-kernel-server restart

Reproducer VM has LXC container that is bridged to the libvirt bridge.
  apt-get update
  apt-get install lxc
  lxc-create -t download -n base -- -d ubuntu -r trusty -a amd64
Then edit /var/lib/lxc/base/config to use:
  lxc.aa_profile = unconfined
  lxc.network.link = br0

Then run:
 sudo ./reproducer.py 5 10 192.168.122.241:/export

I tried running this many times with no luck. Any other variables I need to consider?

Revision history for this message
Dave Richardson (pudnik019) wrote :

Here's the setup I use: the reproducer runs in a 14.04 server VM (Linux ubuntu 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux), and the VM is bridged to the host machine's en0. I am using VMware Fusion on an OS X 10.10 host.

Containers inside the VM use NAT via lxcbr0 (lxc.network.link = lxcbr0), getting 10. addresses via dhcp. My NFS server is on a different physical machine than the one running the reproducer VM. I am also not using unconfined profile, but rather modifying the default to allow NFS mounting.

I'll see if I can get it reproducing using the two-VM setup that you describe. If so, worst case I could send you the VMs.

Revision history for this message
Chris J Arges (arges) wrote :

Ok, moved to two real machines and I can repro there:
[ 3797.901145] unregister_netdevice: waiting for lo to become free. Usage count = 1

Will continue to look at this.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Great, thanks Chris! Let me know if I can help further.

Revision history for this message
Chris J Arges (arges) wrote :

This is fixed in v3.18 and broken in v3.17. Doing a reverse bisect on this to determine which patch(es) fix this issue.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Ok. For what it's worth, we still see the problem on Ubuntu 15.04 running v3.19. However, the problem is much harder to trigger and the attached reproducer script does not trigger the problem there, so I don't have code to reliably repro it on 3.19.

Revision history for this message
Chris J Arges (arges) wrote :

Dave,
Good to know. Hopefully the bisect will implicate the right lines of code in order to find a v3.19+ fix.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Agreed. Thanks for pushing this forward Chris.

Revision history for this message
Federico Alves (h-sales) wrote :

It happens to me on Centos 7, with kernel 4.0.1-1.el7.elrepo.x86_64
Every time I reboot a container, without any particularly heavy data load, I get:
[630362.513630] hrtimer: interrupt took 5912836 ns
[727851.858312] device eth0 left promiscuous mode
[727853.622084] eth0: renamed from mcCDGW73
[727853.632989] device eth0 entered promiscuous mode
[728077.625688] unregister_netdevice: waiting for lo to become free. Usage count = 1
[728087.655832] unregister_netdevice: waiting for lo to become free. Usage count = 1
[728097.686066] unregister_netdevice: waiting for lo to become free. Usage count = 1

So this is not fixed in 4.X

Revision history for this message
Chris J Arges (arges) wrote :

The result of the reverse bisect between v3.17..v3.18 with the reproducer was:
# first bad commit: [34666d467cbf1e2e3c7bb15a63eccfb582cdd71f] netfilter: bridge: move br_netfilter out of the core
If I backport this patch plus 7276ca3f on top of v3.17 I no longer get the hang with the simple reproducer (although I suspect a more elaborate reproducer would still trigger the issue).

This isn't a fix because we obviously have issues in later kernels. The 'unregister_netdevice' message could occur for different code paths, since there could potentially be many paths that modify the refcnt for the net_device. I'm going to track who is calling 'dev_put' and 'dev_hold' and figure out which code path is not freeing before we start unregistering the device. I'll focus on the reproducer I have for now, since this seems similar to the 'shutdown containers with network connections' case.

In addition, I'm doing a bisect between earlier versions to see where the 'regression' occurred. 3.11 seems to pass without regression for 5 threads / 50 iterations.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Great progress, thanks Chris. I did some initial instrumentation of dev_put and dev_hold calls earlier, dumping out the call stack and comparing a good run to a bad one. Nothing definitive yet, and it's a bit tricky to match up on an SMP machine, but one thing stood out: many of the bad traces had UDP in the mix (for example, calls to __udp4_lib_mcast_deliver and __udp4_lib_rcv). Maybe useful information, maybe not.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Any updates?

Revision history for this message
Chris J Arges (arges) wrote :

Dave,
Working on this now. I tried a bisect between v3.12..v3.13, where it seemed that 3.12 worked better, but it wasn't conclusive. I tried looking at dev_put/dev_hold via systemtap, but that seems to modify the timing enough to get in the way. I'm looking at the comments from Jordan Curzon to see if I can start instrumenting just those sections of the kernel to track down where we're failing to decrement the reference. I'm tracing the NETDEV_UNREGISTER notifier calls now to see if that leads to something conclusive.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Sounds good, thanks Chris.

Revision history for this message
Chris J Arges (arges) wrote :

When running the reproducer, if I mount the NFS share using -o vers=3,proto=udp (UDP instead of TCP), I cannot get any failures when running 10 threads / 150 iterations multiple times.
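
(For reference, the UDP mount looks like this when done by hand; the server address and mount point are just the values from the two-VM setup described earlier.)

sudo mount -t nfs -o vers=3,proto=udp 192.168.122.241:/export /mnt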

Revision history for this message
Chris J Arges (arges) wrote :

Additional Experiments:
- No NAT, lxc bridges to VM eth0, which is bridged to host machine br0
 Still fails
- Let timeout on unregister_notifier keep going
 Still running after an hour, can't create new containers (original issue)
 sudo conntrack -{L,F} hangs...
- Purge conntrack flowtable on host right before unmount
 Still fails
- Look at conntrack table on host when it's stuck
 Had issues with this because of the hang

Next steps:
- Track down TCP connection that's being held open

Revision history for this message
Chris J Arges (arges) wrote :

A few more results:
- I can track the dst_entry holding the dev causing the issue.
- I can reproduce on v3.19 kernels by first using 'modprobe br_netfilter'.

Revision history for this message
Dave Richardson (pudnik019) wrote :

Really great progress! Sounds like you're close to finding a fix. Thanks Chris.

Revision history for this message
Chris J Arges (arges) wrote :

Steve saw this patch: e53376bef2cd97d3e3f61fdc677fb8da7d03d0da as a potential fix. For completeness I tested a backport to 3.13 (since it's in 3.14+). The same failure mode occurs with this patch.

Disabling sysctl_ip_early_demux still causes a failure.

To work around this issue, if you don't need CONFIG_BRIDGE_NETFILTER you can disable it, and the error does not seem to occur anymore. (Which follows from earlier comments.)
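
(A sketch of that workaround on a running system, without rebuilding the kernel: the sysctl works wherever bridge netfilter is built, and the blacklist only applies on kernels where it is the separate br_netfilter module.)

# Stop iptables from seeing bridged traffic (takes effect immediately):
sudo sysctl -w net.bridge.bridge-nf-call-iptables=0
# On kernels where it is a separate module, keep br_netfilter from loading:
echo "blacklist br_netfilter" | sudo tee /etc/modprobe.d/blacklist-br-netfilter.conf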

Revision history for this message
Joe Stringer (joestringer) wrote :

Just chiming in here, I contacted Rodrigo off-list and was verging towards that same patch. More below.

I suspect there are two issues here with very similar symptoms, in
particular post #8, which mentions people reporting that 3.14 improves
the situation.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1403152/comments/8

I've been chasing a bug in 3.13 with docker containers and connection
tracking which is fixed in 3.14, by this patch:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e53376bef2cd97d3e3f61fdc677fb8da7d03d0da

Note that the commit message for the above commit describes a different
issue, but I've been able to produce issues of the nature in this thread
(hung docker / ip netns add commands, as in post #6) before applying
this patch, and cannot reproduce them after.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1403152/comments/6

In the issue that I face, I can find a kworker thread using up an entire
core, and when I cat /proc/$pid/stack I see this:

[<ffffffffbe01e9b6>] ___preempt_schedule+0x56/0xb0
[<ffffffffc02223e4>] nf_ct_iterate_cleanup+0x134/0x160 [nf_conntrack]
[<ffffffffc0223dae>] nf_conntrack_cleanup_net_list+0x4e/0x170 [nf_conntrack]
[<ffffffffc022436d>] nf_conntrack_pernet_exit+0x4d/0x60 [nf_conntrack]
[<ffffffffbe6040d3>] ops_exit_list.isra.1+0x53/0x60
[<ffffffffbe6048d0>] cleanup_net+0x100/0x1d0
[<ffffffffbe084991>] process_one_work+0x171/0x470
[<ffffffffbe08563b>] worker_thread+0x11b/0x3a0
[<ffffffffbe08bb82>] kthread+0xd2/0xf0
[<ffffffffbe71757c>] ret_from_fork+0x7c/0xb0
[<ffffffffffffffff>] 0xffffffffffffffff

The kworker is looping forever and failing to clean up conntrack state.
All the while, it holds the global netns lock. Given that I've bisected
to the commit linked above which is to do with refcounting, I suspect
that borked refcounting on conntrack entries makes them impossible to
properly free/destroy, which prevents this worker from cleaning up the
namespace, which then goes on to prevent anything else from interacting
with namespaces (add/delete/etc).

Revision history for this message
Chris J Arges (arges) wrote :

Joe,
I've created bug 1466135 to handle updating our 3.13 to fix your issue. Please add any information to that bug that would be useful in verifying the fix once it goes through our stable updates process. Thanks!

Revision history for this message
Chris J Arges (arges) wrote :

A summary of the bug so far:

Occasionally starting and stopping many containers with network traffic may
result in new containers being unable to start due to the inability to create
new network namespaces.

The following message repeats in the kernel log until reboot.

unregister_netdevice: waiting for lo to become free. Usage count = 1

Eventually when creating a new container this hung task backtrace occurs:

  schedule_preempt_disabled+0x29/0x70
  __mutex_lock_slowpath+0x135/0x1b0
  ? __kmalloc+0x1e9/0x230
  mutex_lock+0x1f/0x2f
  copy_net_ns+0x71/0x130
  create_new_namespaces+0xf9/0x180
  copy_namespaces+0x73/0xa0
  copy_process.part.26+0x9a6/0x16b0
  do_fork+0xd5/0x340
  ? call_rcu_sched+0x1d/0x20
  SyS_clone+0x16/0x20
  stub_clone+0x69/0x90
  ? system_call_fastpath+0x1a/0x1f

The following conditions I've been able to test:
- If CONFIG_BRIDGE_NETFILTER is disabled this problem does not occur.
- If net.bridge.bridge-nf-call-iptables is disabled, this problem does not occur.
- This problem can happen on single processor machines
- This problem can happen with IPv6 disabled
- This problem can happen with xt_conntrack enabled.

The unregister_netdevice warning always waits on lo. It always has reg_state
set to NETREG_UNREGISTERING. It follows that the device has been through the
unregister_netdevice_many path and is being unregistered. This path is ultimately
where net_mutex is locked, which prevents copy_net_ns from executing.

In addition when the unregister netdevice warning happens, a crashdump reveals
the dst_busy_list always contains a dst_entry that references the device above.
This dst_entry has already been through ___dst_free since it has already been
marked DST_OBSOLETE_DEAD. 'dst->ops' is always set to ipv4_dst_ops.
dst->callback_head.next is NULL, and the next pointer is NULL. Use is also zero.

We can trace where the dst_entry is trying to be freed. When free_fib_info_rcu
is called, if nh_rth_input is set, it eventually calls dst_free. Because there
is still a refcnt held, it does not get immediately destroyed and continues on
to __dst_free. This puts the dst into the dst_garbage list, which is then
examined periodically by the dst_gc_work worker thread. Each time it tries to
clean it up it fails because it still has a non-zero refcnt.

The faulty dst_entry is being allocated via ip_rcv..ip_route_input_noref. In
addition this dst is most likely being held in response to a new packet via the
ip_rcv..inet_sk_rx_dst_set path.

Revision history for this message
Dave Richardson (pudnik019) wrote :

I can confirm that disabling CONFIG_BRIDGE_NETFILTER and/or disabling net.bridge.bridge-nf-call-iptables makes me unable to repro on Ubuntu 14.04 running 3.13.0-53-generic.

Revision history for this message
Chris J Arges (arges) wrote :

Email sent to netdev about what we've found so far with this bug:
http://www.spinics.net/lists/netdev/msg333868.html

Revision history for this message
Dave Richardson (pudnik019) wrote :

Looks like no progress on the netdev side yet?

Revision history for this message
Chris J Arges (arges) wrote :

So it seems like disabling CONFIG_BRIDGE_NETFILTER is a red herring; I've been able to reproduce with it off as well on 3.19.

Chris J Arges (arges)
Changed in linux (Ubuntu Trusty):
assignee: nobody → Chris J Arges (arges)
Changed in linux (Ubuntu Utopic):
assignee: nobody → Chris J Arges (arges)
Changed in linux (Ubuntu Vivid):
assignee: nobody → Chris J Arges (arges)
Changed in linux (Ubuntu Trusty):
importance: Undecided → Critical
importance: Critical → Medium
Changed in linux (Ubuntu Utopic):
importance: Undecided → Medium
Changed in linux (Ubuntu Vivid):
importance: Undecided → Medium
status: New → In Progress
Changed in linux (Ubuntu Utopic):
status: Confirmed → In Progress
Changed in linux (Ubuntu Trusty):
status: Confirmed → In Progress
Changed in linux (Ubuntu):
importance: High → Medium
description: updated
Revision history for this message
Chris J Arges (arges) wrote :

OK, so I have reports that NFS may also be a red herring; however, I've found a patch that seems to greatly mitigate the issue when testing with the reproducer I was provided. So I'm going to provide test builds with it for 3.13, 3.16, and 3.19, and users can give me feedback on whether it helps.

In addition, if you can still reproduce the issue with this fix, it would be helpful to know more about your test case. I'd like to handle any of those issues as a separate bug.

Revision history for this message
Chris J Arges (arges) wrote :

Build is here:
http://people.canonical.com/~arges/lp1403152/

Please test and give feedback. Thanks!

Chris J Arges (arges)
Changed in linux (Ubuntu):
assignee: Chris J Arges (arges) → nobody
status: In Progress → Fix Released
Brad Figg (brad-figg)
Changed in linux (Ubuntu Trusty):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Utopic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Vivid):
status: In Progress → Fix Committed
Revision history for this message
Dave Richardson (pudnik019) wrote :

Chris, I will start testing this build and let you know how it goes. Thanks.

Revision history for this message
Dave Richardson (pudnik019) wrote :

What is the patch?

Revision history for this message
Chris J Arges (arges) wrote :

Dave,
I added it to the description of the bug:
commit de84d89030fa4efa44c02c96c8b4a8176042c4ff
--chris

Brad Figg (brad-figg)
no longer affects: linux (Ubuntu Utopic)
no longer affects: linux-lts-utopic (Ubuntu Vivid)
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-vivid' to 'verification-done-vivid'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-vivid
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-lts-utopic - 3.16.0-46.62~14.04.1

---------------
linux-lts-utopic (3.16.0-46.62~14.04.1) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1483790
  * SAUCE: REBASE-FIXUP: debian/scripts/misc/getabis: fix abi directory

  [ Upstream Kernel Changes ]

  * Revert "Bluetooth: ath3k: Add support of 04ca:300d AR3012 device"

linux-lts-utopic (3.16.0-46.61~14.04.1) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1478986

  [ Brad Figg ]

  * SAUCE: REBASE-FIXUP: debian/rules.d/1-maintainer.mk: Need '~14.04.1' as
    part of the version when looking for the baseCommit for printchanges

  [ Upstream Kernel Changes ]

  * Revert "crypto: talitos - convert to use be16_add_cpu()"
    - LP: #1478852
  * storvsc: use cmd_size to allocate per-command data
    - LP: #1445195
  * storvsc: fix a bug in storvsc limits
    - LP: #1445195
  * Drivers: hv: vmbus: Support a vmbus API for efficiently sending page
    arrays
    - LP: #1445195
  * scsi: storvsc: Increase the ring buffer size
    - LP: #1445195
  * scsi: storvsc: Size the queue depth based on the ringbuffer size
    - LP: #1445195
  * scsi: storvsc: Always send on the selected outgoing channel
    - LP: #1445195
  * scsi: storvsc: Retrieve information about the capability of the target
    - LP: #1445195
  * scsi: storvsc: Don't assume that the scatterlist is not chained
    - LP: #1445195
  * scsi: storvsc: Set the tablesize based on the information given by the
    host
    - LP: #1445195
  * SUNRPC: TCP/UDP always close the old socket before reconnecting
    - LP: #1403152
  * ALSA: hda - Fix noisy outputs on Dell XPS13 (2015 model)
    - LP: #1468582
  * Fix kmalloc slab creation sequence
    - LP: #1475204
  * ARM: clk-imx6q: refine sata's parent
    - LP: #1478852
  * KVM: nSVM: Check for NRIPS support before updating control field
    - LP: #1478852
  * nfs: take extra reference to fl->fl_file when running a setlk
    - LP: #1478852
  * bridge: fix multicast router rlist endless loop
    - LP: #1478852
  * net: don't wait for order-3 page allocation
    - LP: #1478852
  * sctp: fix ASCONF list handling
    - LP: #1478852
  * bridge: fix br_stp_set_bridge_priority race conditions
    - LP: #1478852
  * packet: read num_members once in packet_rcv_fanout()
    - LP: #1478852
  * packet: avoid out of bounds read in round robin fanout
    - LP: #1478852
  * neigh: do not modify unlinked entries
    - LP: #1478852
  * tcp: Do not call tcp_fastopen_reset_cipher from interrupt context
    - LP: #1478852
  * net: phy: fix phy link up when limiting speed via device tree
    - LP: #1478852
  * sctp: Fix race between OOTB responce and route removal
    - LP: #1478852
  * x86/mce: Fix MCE severity messages
    - LP: #1478852
  * s5h1420: fix a buffer overflow when checking userspace params
    - LP: #1478852
  * cx24116: fix a buffer overflow when checking userspace params
    - LP: #1478852
  * af9013: Don't accept invalid bandwidth
    - LP: #1478852
  * cx24117: fix a buffer overflow when checking userspace params
    - LP: #1478852
  * spi: fix race freeing dummy_tx/rx before it is unmapped
    - LP: #...

Changed in linux-lts-utopic (Ubuntu Trusty):
status: New → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.19.0-26.28

---------------
linux (3.19.0-26.28) vivid; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1483630

  [ Upstream Kernel Changes ]

  * Revert "Bluetooth: ath3k: Add support of 04ca:300d AR3012 device"

linux (3.19.0-26.27) vivid; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1479055
  * [Config] updateconfigs for 3.19.8-ckt4 stable update

  [ Chris J Arges ]

  * [Config] Add MTD_POWERNV_FLASH and OPAL_PRD
    - LP: #1464560

  [ Mika Kuoppala ]

  * SAUCE: i915_bpo: drm/i915: Fix divide by zero on watermark update
    - LP: #1473175

  [ Tim Gardner ]

  * [Config] ACORN_PARTITION=n
    - LP: #1453117
  * [Config] Add i40e[vf] to d-i
    - LP: #1476393

  [ Timo Aaltonen ]

  * SAUCE: i915_bpo: Rebase to v4.2-rc3
    - LP: #1473175
  * SAUCE: i915_bpo: Revert "mm/fault, drm/i915: Use pagefault_disabled()
    to check for disabled pagefaults"
    - LP: #1473175
  * SAUCE: i915_bpo: Revert "drm: i915: Port to new backlight interface
    selection API"
    - LP: #1473175

  [ Upstream Kernel Changes ]

  * Revert "tools/vm: fix page-flags build"
    - LP: #1473547
  * Revert "ALSA: hda - Add mute-LED mode control to Thinkpad"
    - LP: #1473547
  * Revert "drm/radeon: adjust pll when audio is not enabled"
    - LP: #1473547
  * Revert "crypto: talitos - convert to use be16_add_cpu()"
    - LP: #1479048
  * module: Call module notifier on failure after complete_formation()
    - LP: #1473547
  * gpio: gpio-kempld: Fix get_direction return value
    - LP: #1473547
  * ARM: dts: imx27: only map 4 Kbyte for fec registers
    - LP: #1473547
  * ARM: 8356/1: mm: handle non-pmd-aligned end of RAM
    - LP: #1473547
  * x86/mce: Fix MCE severity messages
    - LP: #1473547
  * mac80211: don't use napi_gro_receive() outside NAPI context
    - LP: #1473547
  * iwlwifi: mvm: Free fw_status after use to avoid memory leak
    - LP: #1473547
  * iwlwifi: mvm: clean net-detect info if device was reset during suspend
    - LP: #1473547
  * drm/plane-helper: Adapt cursor hack to transitional helpers
    - LP: #1473547
  * ARM: dts: set display clock correctly for exynos4412-trats2
    - LP: #1473547
  * hwmon: (ntc_thermistor) Ensure iio channel is of type IIO_VOLTAGE
    - LP: #1473547
  * mfd: da9052: Fix broken regulator probe
    - LP: #1473547
  * ALSA: hda - Fix noise on AMD radeon 290x controller
    - LP: #1473547
  * lguest: fix out-by-one error in address checking.
    - LP: #1473547
  * xfs: xfs_attr_inactive leaves inconsistent attr fork state behind
    - LP: #1473547
  * xfs: xfs_iozero can return positive errno
    - LP: #1473547
  * fs, omfs: add NULL terminator in the end up the token list
    - LP: #1473547
  * omfs: fix sign confusion for bitmap loop counter
    - LP: #1473547
  * d_walk() might skip too much
    - LP: #1473547
  * dm: fix casting bug in dm_merge_bvec()
    - LP: #1473547
  * hwmon: (nct6775) Add missing sysfs attribute initialization
    - LP: #1473547
  * hwmon: (nct6683) Add missing sysfs attribute initialization
    - LP: #1473547
  * target/pscsi: Don't leak scsi_host if hba is VIRTUAL_HOST
    - LP: #1473547
  * net...

Changed in linux (Ubuntu Vivid):
status: Fix Committed → Fix Released
Revision history for this message
In , justin (justin-linux-kernel-bugs) wrote :

I'm also seeing this problem on Debian 7 hosts with a 3.16 kernel and Debian 8 with 4.2.0; it happens every 1-2 days for us (across a few dozen machines). We haven't found a way to reliably reproduce the bug yet, but we are unable to shut down the system cleanly; we have to force power off to be able to continue to run containers.

Brad Figg (brad-figg)
Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-lts-utopic (Ubuntu):
status: New → Confirmed
Revision history for this message
Andrew Ruthven (andrew-etc) wrote :

We're still seeing the same issue on Ubuntu Trusty running the linux-image-3.19.0-30-generic kernel.

Looking at this thread[0] on the Docker site, which is referenced in the kernel bugzilla for this issue, there is a reference[1] to a patch[2] on the netdev mailing list from 2015-11-05 by Francesco Ruggeri; later that day David Miller accepted it and queued it for -stable. The patch is currently in the master branch of Linus's tree, but hasn't made it into the 4.2 branch yet.

The patch doesn't apply cleanly to the Canonical 3.19 branch, but it looks like it is mostly line skew. I haven't tried hacking it in yet.

[0] https://github.com/docker/docker/issues/5618
[1] https://github.com/docker/docker/issues/5618#issuecomment-154334933
[2] http://www.spinics.net/lists/netdev/msg351337.html

Revision history for this message
Rodrigo Vaz (rodrigo-vaz) wrote :

FWIW, here is an update on what I've tried in the last couple of months to fix this problem (unsuccessfully):

- We tried to deny packets to the container's network before we destroy the namespace
- Backported the patch mentioned in the previous comment to the Ubuntu 3.19 and 4.2 kernels (linux-lts-vivid and linux-lts-wily)
- Applied the patches listed in this thread on the trusty kernel (the vivid kernel already has them): http://www.spinics.net/lists/netdev/msg345205.html
- Tried the 4.4-rc3 net-next branch

All experiments above failed to fix the problem and the bug was triggered in production machines within 24h using these experimental kernels.

We still can't reliably reproduce this issue outside production but it is easy to validate any proposed solution with a few hours of production load.

Rodrigo.

Revision history for this message
James Dempsey (jamespd) wrote :

We also backported [1] to 4.2 (linux-lts-wily) and deployed it to our production OpenStack cloud. We just installed it yesterday and our MTBF is between two and twenty days, so we won't know if this has made any difference for a while now.

Some details about our configuration / failure mode:

Three OpenStack "Layer 3" hosts (running 3.19.0-30-generic #34~14.04.1-Ubuntu) providing virtual routers/VPNs/Metadata via network namespaces.

Our most recent failures occurred on hosts B and C (within 30 minutes of each other, after having been fine for weeks) while removing routers from A and re-creating them on B and C.

Our stack traces are slightly different from the ones posted above...

Dec 14 15:37:05 hostname kernel: [961050.119727] INFO: task ip:9865 blocked for more than 120 seconds.
Dec 14 15:37:05 hostname kernel: [961050.126707] Tainted: G C 3.19.0-30-generic #34~14.04.1-Ubuntu
Dec 14 15:37:05 hostname kernel: [961050.135073] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 14 15:37:05 hostname kernel: [961050.144094] ip D ffff88097e3e3de8 0 9865 9864 0x00000000
Dec 14 15:37:05 hostname kernel: [961050.144098] ffff88097e3e3de8 ffff880e982693a0 0000000000013e80 ffff88097e3e3fd8
Dec 14 15:37:05 hostname kernel: [961050.144100] 0000000000013e80 ffff88101a8993a0 ffff880e982693a0 0000000000000000
Dec 14 15:37:05 hostname kernel: [961050.144102] ffffffff81cdb2a0 ffffffff81cdb2a4 ffff880e982693a0 00000000ffffffff
Dec 14 15:37:05 hostname kernel: [961050.144104] Call Trace:
Dec 14 15:37:05 hostname kernel: [961050.144109] [<ffffffff817b2fa9>] schedule_preempt_disabled+0x29/0x70
Dec 14 15:37:05 hostname kernel: [961050.144111] [<ffffffff817b4c95>] __mutex_lock_slowpath+0x95/0x100
Dec 14 15:37:05 hostname kernel: [961050.144115] [<ffffffff811cfd66>] ? __kmalloc+0x226/0x280
Dec 14 15:37:05 hostname kernel: [961050.144117] [<ffffffff816a14a1>] ? net_alloc_generic+0x21/0x30
Dec 14 15:37:05 hostname kernel: [961050.144120] [<ffffffff817b4d23>] mutex_lock+0x23/0x37
Dec 14 15:37:05 hostname kernel: [961050.144122] [<ffffffff816a1c75>] copy_net_ns+0x75/0x150
Dec 14 15:37:05 hostname kernel: [961050.144125] [<ffffffff810943ad>] create_new_namespaces+0xfd/0x180
Dec 14 15:37:05 hostname kernel: [961050.144127] [<ffffffff810945ba>] unshare_nsproxy_namespaces+0x5a/0xc0
Dec 14 15:37:05 hostname kernel: [961050.144130] [<ffffffff8107439b>] SyS_unshare+0x15b/0x2e0
Dec 14 15:37:05 hostname kernel: [961050.144133] [<ffffffff817b6e4d>] system_call_fastpath+0x16/0x1b
Dec 14 15:37:05 hostname kernel: [961050.144135] INFO: task ip:9896 blocked for more than 120 seconds.
Dec 14 15:37:05 hostname kernel: [961050.151109] Tainted: G C 3.19.0-30-generic #34~14.04.1-Ubuntu
Dec 14 15:37:05 hostname kernel: [961050.159558] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 14 15:37:05 hostname kernel: [961050.168551] ip D ffff8804591cfde8 0 9896 9895 0x00000000
Dec 14 15:37:05 hostname kernel: [961050.168556] ffff8804591cfde8 ffff880814031d70 0000000000013e80 ffff8804591cffd8
Dec 14 15:37:05 hostname kernel: [961050.168558] 0000...


Revision history for this message
Dan Streetman (ddstreet) wrote :

Andrew, Rodrigo, James, do any of you have a crash dump from a system where the problem has happened? or any way to more easily reproduce it?
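
(For anyone who can reproduce it: on Ubuntu a crash dump is usually captured by setting up kdump ahead of time; this is the stock procedure, not instructions specific to this bug.)

sudo apt-get install linux-crashdump   # pulls in kdump-tools; answer yes to enable it
sudo reboot                            # so the crash kernel gets loaded
kdump-config show                      # should report it is ready to kdump
# Once the machine is in the bad state, force a panic so a dump is written
# to /var/crash (this reboots the box):
echo 1 | sudo tee /proc/sys/kernel/sysrq
echo c | sudo tee /proc/sysrq-trigger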

penalvch (penalvch)
Changed in linux-lts-utopic (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Dan Streetman (ddstreet) wrote :

I should note that I'm pretty sure this new problem is almost certainly a different bug (different cause) than the original bug fixed in comment 59 and earlier; just the symptoms appear the same.

Revision history for this message
Dan Streetman (ddstreet) wrote :

James Dempsey, the upstream commit you referenced is in trusty lts-vivid at Ubuntu-lts-3.19.0-49.55_14.04.1

Revision history for this message
Dan Streetman (ddstreet) wrote :

> James Dempsey, the upstream commit you referenced is in trusty lts-vivid at Ubuntu-lts-3.19.0-49.55_14.04.1

specifically, upstream it's commit 30f7ea1c2b5f5fb7462c5ae44fe2e40cb2d6a474 and in the trusty lts branch it's commit a8412804321b96aac2665527535ac28a934ed0e2

Revision history for this message
Dan Streetman (ddstreet) wrote :

Andrew Ruthven that's your patch also ^

Revision history for this message
Dan Streetman (ddstreet) wrote :

Rodrigo, just in case the problem you're seeing is fixed by ^ commit, please try the latest trusty lts-vivid (at version I listed above, or later).

Revision history for this message
In , mitsuto (mitsuto-linux-kernel-bugs) wrote :

I'm also seeing this behaviour on docker 1.9.1 and kernel 4.4.1 from kernel-ml package on centos7. It's intermittent and not reliably reproducible.

Revision history for this message
Cristian Calin (cristi-calin) wrote :

@ddstreet, the Ubuntu-lts-3.19.0-49.55_14.04.1 doesn't seem to fix this issue.
We just upgraded a Juno openstack environment to this kernel version and we are still seeing this on a network node.

[controller] root@node-28:~# uname -a
Linux node-28.domain.tld 3.19.0-49-generic #55~14.04.1-Ubuntu SMP Fri Jan 22 11:24:31 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[controller] root@node-28:~# dmesg -T | tail -n 3
[Thu Feb 11 07:14:15 2016] unregister_netdevice: waiting for lo to become free. Usage count = 4
[Thu Feb 11 07:14:26 2016] unregister_netdevice: waiting for lo to become free. Usage count = 4
[Thu Feb 11 07:14:36 2016] unregister_netdevice: waiting for lo to become free. Usage count = 4
[controller] root@node-28:~# dmesg -T | grep -c unregister_netdevice..waiting
6401

Revision history for this message
Dan Streetman (ddstreet) wrote :

Cristian ok thnx. I'll keep investigating.

Revision history for this message
In , gopinath.taget (gopinath.taget-linux-kernel-bugs) wrote :

Guys,

I am seeing this issue on CoreOS 970.1.0 with Docker 1.10.1. This issue has been festering for more than 1.5 years. What can I do to help fix it? What would be the direction of investigation? Any pointers would help.

Thanks
Gopinath

Revision history for this message
Dan Streetman (ddstreet) wrote :

If anyone is able to reproduce this, I have a debug kernel at this ppa:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1403152

It prints a debug line whenever any loopback interface is dev_put/dev_hold, which may help show where the extra loopback reference is coming from when this problem happens. If anyone can reproduce the problem using the debug kernel, please attach the logs (and/or an sosreport).
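For readers unfamiliar with this kind of instrumentation, a minimal sketch of what such a patch could look like follows. This is not the actual patch from the PPA; it is an illustration modeled on the dev_hold()/dev_put() helpers as they appeared in 3.19/4.4-era include/linux/netdevice.h, with the loopback check and log line added purely for illustration.

```c
/* Illustrative sketch only -- NOT the patch in the PPA above. The idea is to
 * log every reference-count change on loopback devices together with the
 * caller, so a leaked hold can be attributed to the code path that took it. */
static inline void dev_hold(struct net_device *dev)
{
	this_cpu_inc(*dev->pcpu_refcnt);
	if (dev->flags & IFF_LOOPBACK)
		pr_info("%s: dev_hold %d from %pS\n", dev->name,
			netdev_refcnt_read(dev), __builtin_return_address(0));
}

static inline void dev_put(struct net_device *dev)
{
	if (dev->flags & IFF_LOOPBACK)
		pr_info("%s: dev_put %d from %pS\n", dev->name,
			netdev_refcnt_read(dev), __builtin_return_address(0));
	this_cpu_dec(*dev->pcpu_refcnt);
}
```

The refcount log Kevin posts later in this thread ("lodebug: dev_hold 9 dst_init", and so on) is the kind of output this style of instrumentation produces.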

Revision history for this message
In , schnitzelbub (schnitzelbub-linux-kernel-bugs) wrote :

I can confirm that network IO will trigger this issue.
On a Debian 8.4 host running Docker 1.11.0 containers on top of kernel 4.4.6 I too face this problem.
My repro is to build an image (based on the default Debian image debian:jessie) that first dist-upgrades system packages and then installs some more packages.
I haven't experienced this running a similar setup on VirtualBox/Vagrant so far, though.
Any hint on how to tackle this would be highly appreciated.

Cheers,
Hannes

Revision history for this message
Michael Richardson (mjrichardson) wrote :

Credit where credit is due: we've slowly rolled the patch above (per #63) into production over the last six months, initially through custom-compiled kernels and more recently via mainline LTS wily packages, and have seen no further issues with hung namespaces. Kudos to ddstreet.

Revision history for this message
In , philporada (philporada-linux-kernel-bugs) wrote :

I'm running on CentOS Linux release 7.2.1511 (Core) with the Linux 4.6.3-1.el7.elrepo.x86_64 kernel. My docker version is 1.11.2. I am running on Amazon EC2. This does NOT happen to me in VirtualBox with the same kernel/docker version.

There's quite a bit of discussion on https://github.com/docker/docker/issues/5618

Revision history for this message
In , bastienphilbert (bastienphilbert-linux-kernel-bugs) wrote :

Can you git bisect it, since you know which kernel version it seemed to stop working on, and report back with the offending commit ID when the bisection is complete?

Revision history for this message
In , pegerto (pegerto-linux-kernel-bugs) wrote :

Hi,

According to the Docker issue, it is raised by creating and destroying multiple containers; from my understanding it is a raise condition, as this is not reproducible with a single-CPU VM.

I reproduced the issue with CoreOS 1068.8.0 on VirtualBox; I can take a kernel dump from the hypervisor if that helps.

Regards.

Revision history for this message
In , bastienphilbert (bastienphilbert-linux-kernel-bugs) wrote :

Please do. And I think you mean a race condition, not a raise condition.

Revision history for this message
Dan Streetman (ddstreet) wrote :

There are upstream commits that may fix this problem; I've updated the same PPA with a new test build (with only those commits, and without any debug). If anyone can still reproduce this problem, please try the latest kernel in the PPA and report whether it fixes the problem:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1403152

the upstream commits are:
751eb6b6042a596b0080967c1a529a9fe98dac1d ("ipv6: addrconf: fix dev refcont leak when DAD failed")
a5d0dc810abf3d6b241777467ee1d6efb02575fc ("vti: flush x-netns xfrm cache when vti interface is removed")

Revision history for this message
Alexandre (totalworlddomination) wrote :

This kernel bug is also affecting all of the container solutions on Xenial / 16.04.1, up to 4.4.0-38.

Lots of people are hitting this on many distributions / container solutions:
https://github.com/docker/docker/issues/5618 (mentions Docker, LXC, kubernetes, etc.)
https://github.com/coreos/bugs/issues/254
...

Apparently this was fixed in 12.04 and has now been reintroduced in 16.04.

Once you get the following dmesg message:
unregister_netdevice: waiting for lo to become free. Usage count = 1
... The only fix is to reboot the host.

From what I gathered in the Docker issue tracking this, it seems a fix was merged into net-next very recently:
http://www.spinics.net/lists/netdev/msg393688.html

... and is in mainline now:
https://github.com/torvalds/linux/blob/master/net/ipv6/addrconf.c

As I wasn't sure this thread was tracking Xenial (I just added it to the list), I'm trying to figure out whether Ubuntu is tracking this issue upstream and can backport the fix into 16.04's 4.4.0-? branch, as a lot of container use cases have started to break down because of this bug over the past few weeks.

Thanks!

Revision history for this message
Alexandre (totalworlddomination) wrote :

The bug was reintroduced in Ubuntu Xenial's kernel.

Revision history for this message
Dan Streetman (ddstreet) wrote :

Alexandre, please see my last comment, which links to a test kernel PPA with EXACTLY that upstream commit, 751eb6b6042a596b0080967c1a529a9fe98dac1d ("ipv6: addrconf: fix dev refcont leak when DAD failed").

If you can reproduce this, please test with the PPA and report if it fixes the problem.

Revision history for this message
Dan Streetman (ddstreet) wrote :

Ah, you need a patched test kernel on xenial...I'll upload one to the PPA for testing.

Revision history for this message
Dan Streetman (ddstreet) wrote :

PPA updated with patched test kernels for xenial as well as trusty lts-xenial.
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1403152

Revision history for this message
Paul van Schayck (paulvanschayck) wrote :

Dan, thanks for your work. I've tried your PPA with the included patch (4.4.0-36-generic #55hf1403152v20160916b1-Ubuntu). As I've also reported on the Github docker issue [1], this does not fix the issue for me.

Would it help if I tested again with this patch and with your debug patches included?

[1] https://github.com/docker/docker/issues/5618#issuecomment-249149641

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-lts-xenial (Ubuntu Trusty):
status: New → Confirmed
Changed in linux-lts-xenial (Ubuntu):
status: New → Confirmed
Revision history for this message
Dan Streetman (ddstreet) wrote :

> I've tried your PPA with the included patch (4.4.0-36-generic #55hf1403152v20160916b1-Ubuntu).
> As I've also reported on the Github docker issue [1], this does not fix the issue for me.

that's unfortunate.

> Would it help you test again with this patch, and with your debug patches included?

I suspect adding the debug would change the timing enough to work around the bug, as it did before. I'll take a deeper look to see if there is a different way to add debug.

tags: added: verification-done-vivid
removed: verification-needed-vivid
tags: added: verification-done-trusty
removed: verification-needed-trusty
Revision history for this message
Alexandre (totalworlddomination) wrote :
Revision history for this message
Dan Streetman (ddstreet) wrote :

> Apparently the fix is in:
>
> https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.22

Hopefully the commit does fix things, although Paul's comment above indicated it may not, at least not for everyone.

Revision history for this message
Alexandre (totalworlddomination) wrote :

Oh oops, I didn't realize this was the patch with the 751eb6b6042a596b0080967c1a529a9fe98dac1d commit Paul mentioned.

Well, if that made it into Ubuntu's kernels up to and including 4.4.0-42, it didn't fix it (I still get the bug on many machines & VMs).

Revision history for this message
Dan Streetman (ddstreet) wrote :

> Well, if that made it into Ubuntu's kernel before and including 4.4.0-42, it didn't fix it
> (I still get the bug on many machines & VMs).

No, it's not in the released xenial kernel yet; the upstream 4.4.22 stable commits are still in the xenial master-next branch, so it'll be a bit longer before xenial's normal kernel is updated with the patch.

Revision history for this message
Xav Paice (xavpaice) wrote :

From the logs it looks like the patch is now a part of https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1627730 which hit 4.4.0-46.67~14.04.1 (proposed) on 22nd Oct?

Revision history for this message
Dan Streetman (ddstreet) wrote :

> From the logs it looks like the patch is now a part of
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1627730 which hit 4.4.0-46.67~14.04.1
> (proposed) on 22nd Oct?

Yes, the patch is included in the 4.4.0-46.67 kernel (both linux-lts-xenial on trusty, and regular linux on xenial). Is anyone able to reproduce the problem with that kernel level?

Revision history for this message
Alexandre (totalworlddomination) wrote :

I was running 4.4.0-46 (not sure if that differs from the latest .67?) on October 27th and yesterday had to reboot for the "... count = 1" problem.

Were there earlier 4.4.0-46 releases? If not, the bug is still present...

Otherwise, I'll confirm within a few weeks whether or not it happens again.

Revision history for this message
costinel (costinel) wrote :

Seeing this on xenial after this sequence of operations:

lxc stop run2
lxc delete run2
lxc copy run1 run2
lxc start run2 <- hangs here, with later dmesg

[337766.146479] unregister_netdevice: waiting for lo to become free. Usage count = 1
[337772.435786] INFO: task lxd:20665 blocked for more than 120 seconds.
[337772.435856] Tainted: P OE 4.4.0-47-generic #68-Ubuntu
[337772.435922] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[337772.436002] lxd D ffff88006b6cbcb8 0 20665 1 0x00000004
[337772.436006] ffff88006b6cbcb8 ffffffff821d0560 ffff880235aa8000 ffff8801b57d2580
[337772.436009] ffff88006b6cc000 ffffffff81ef5f24 ffff8801b57d2580 00000000ffffffff
[337772.436010] ffffffff81ef5f28 ffff88006b6cbcd0 ffffffff81830f15 ffffffff81ef5f20
[337772.436012] Call Trace:
[337772.436020] [<ffffffff81830f15>] schedule+0x35/0x80
[337772.436022] [<ffffffff818311be>] schedule_preempt_disabled+0xe/0x10
[337772.436024] [<ffffffff81832df9>] __mutex_lock_slowpath+0xb9/0x130
[337772.436026] [<ffffffff81832e8f>] mutex_lock+0x1f/0x30
[337772.436029] [<ffffffff8171f98e>] copy_net_ns+0x6e/0x120
[337772.436033] [<ffffffff810a148b>] create_new_namespaces+0x11b/0x1d0
[337772.436035] [<ffffffff810a15ad>] copy_namespaces+0x6d/0xa0
[337772.436038] [<ffffffff8107f135>] copy_process+0x905/0x1b70
[337772.436040] [<ffffffff81080530>] _do_fork+0x80/0x360
[337772.436044] [<ffffffff8120bf28>] ? SyS_access+0x1e8/0x230
[337772.436046] [<ffffffff810808b9>] SyS_clone+0x19/0x20
[337772.436048] [<ffffffff81834ff2>] entry_SYSCALL_64_fastpath+0x16/0x71

FWIW, at the time of the stop, run2 had an additional IP address on dev lo, and at the time of the copy, run1 was also running and had an additional IP address on dev lo.
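As an aside on why everything hangs: the trace above shows lxd blocked on a mutex inside copy_net_ns(). In kernels of this era the network-namespace cleanup worker holds net_mutex while it waits for the leaked loopback reference to drop, so any process that tries to create a new network namespace queues up behind it. A minimal, hypothetical standalone test (not taken from this report) that exercises the same path:

```c
/* Hypothetical sketch: creating a network namespace goes through
 * copy_net_ns(), which takes the same net_mutex the stuck cleanup worker
 * holds, so on an affected host this call simply never returns. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	/* Requires CAP_SYS_ADMIN; on a healthy host this returns immediately. */
	if (unshare(CLONE_NEWNET) != 0) {
		perror("unshare(CLONE_NEWNET)");
		return 1;
	}
	puts("new network namespace created");
	return 0;
}
```

This is consistent with the `lxc start run2` hang shown above: once one namespace teardown is stuck, anything that needs a new network namespace blocks.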

Revision history for this message
Paul van Schayck (paulvanschayck) wrote :

Dear Dan, user reshen on the GitHub Docker issue thread [1] did some extensive testing using kernel 4.8.8 and has not been able to reproduce the issue with that version. I've also tested with the Ubuntu mainline kernel builds and have not been able to reproduce the issue anymore either.

He has also pointed to two possible kernel patches responsible for the fix. Would it be possible for you to create a backport build of those again?

[1] https://github.com/docker/docker/issues/5618#issuecomment-262081488

Revision history for this message
Alexandre (totalworlddomination) wrote :

Update:
I've hit the issue with 4.4.0-47... now testing 4.4.0-49 for the next few weeks.

Revision history for this message
Alexandre (totalworlddomination) wrote :

It seems the fix isn't in, or isn't working for this problem, as I've just hit this issue again on 4.4.0-49, at least twice:
[434074.636075] unregister_netdevice: waiting for lo to become free. Usage count = 1

Since I see a lot of people saying that 4.8+ kernels fix this issue, is there a generic package (or future plans for one, say when 4.9 LTS comes out in a week or two) that would let anyone upgrade but also keep up with updates without tracking versions by hand (like the linux-image-generic package)?

cheers!

Revision history for this message
Alessandro Polverini (polve) wrote :

I can reproduce the problem with LXC 2.0.6-1 and kernel 4.6.4.

Revision history for this message
Dan Streetman (ddstreet) wrote :

I added the 2 commits referenced in the docker thread:
https://github.com/docker/docker/issues/5618#issuecomment-262081488

These are building right now in the test PPA:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1403152

kernel version 4.4.0-53.74+hf1403152v20161214b1 is the latest xenial kernel, plus those 2 patches.

As far as testing with the 4.8 kernel on xenial, you can use the kernel team's build ppa:
https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa

Obviously, kernels there are completely unsupported and may break everything, etc., etc., and should be used for testing/debugging ONLY.

To install the yakkety 4.8 kernel on a xenial system, after adding the PPA to your system:

# sudo apt install linux-generic-hwe-16.04-edge

Revision history for this message
In , kevin (kevin-linux-kernel-bugs) wrote :

Hi,

I reproduced the issue using a kernel with this patch:

(Log each refcnt increase and decrease for netdev with its function name)

https://lkml.org/lkml/2008/7/20/5

It shows the following logs (for the loopback device inside the container namespace) when the issue occurred:

Jan 11 16:17:43 <hostname> kernel: [ 2196.943640] lo: dev_hold 1 rx_queue_add_kobject
Jan 11 16:17:43 <hostname> kernel: [ 2196.943652] lo: dev_hold 2 netdev_queue_add_kobject
Jan 11 16:17:43 <hostname> kernel: [ 2196.943654] lo: dev_hold 3 register_netdevice
Jan 11 16:17:43 <hostname> kernel: [ 2196.943658] lo: dev_hold 4 neigh_parms_alloc
Jan 11 16:17:43 <hostname> kernel: [ 2196.943660] lo: dev_hold 5 inetdev_init

(I patched Docker to rename lo inside container to lodebug for ease of tracking)
Jan 11 16:17:43 <hostname> kernel: [ 2196.981741] lodebug: renamed from lo

Jan 11 16:17:43 <hostname> kernel: [ 2197.001771] lodebug: dev_hold 6 qdisc_alloc
Jan 11 16:17:43 <hostname> kernel: [ 2197.001815] lodebug: dev_hold 7 dev_get_by_index
Jan 11 16:17:43 <hostname> kernel: [ 2197.001824] lodebug: dev_hold 8 dev_get_by_index
Jan 11 16:17:43 <hostname> kernel: [ 2197.001831] lodebug: dev_hold 9 fib_check_nh
Jan 11 16:17:43 <hostname> kernel: [ 2197.001836] lodebug: dev_hold 10 fib_check_nh
Jan 11 16:17:43 <hostname> kernel: [ 2197.001843] lodebug: dev_hold 11 dev_get_by_index
Jan 11 16:17:43 <hostname> kernel: [ 2197.001849] lodebug: dev_hold 12 dev_get_by_index
Jan 11 16:17:43 <hostname> kernel: [ 2197.001852] lodebug: dev_hold 13 fib_check_nh
Jan 11 16:17:43 <hostname> kernel: [ 2197.001856] lodebug: dev_hold 14 fib_check_nh
Jan 11 16:17:43 <hostname> kernel: [ 2197.025451] lodebug: dev_put 14 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.025473] lodebug: dev_put 13 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.025475] lodebug: dev_put 12 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.025477] lodebug: dev_put 11 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.025480] lodebug: dev_put 10 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.025482] lodebug: dev_put 9 free_fib_info_rcu
Jan 11 16:17:43 <hostname> kernel: [ 2197.414900] lodebug: dev_hold 9 dst_init
Jan 11 16:17:43 <hostname> kernel: [ 2197.418634] lodebug: dev_hold 10 dst_init
Jan 11 16:17:43 <hostname> kernel: [ 2197.418638] lodebug: dev_hold 11 dst_init
Jan 11 16:17:43 <hostname> kernel: [ 2197.445496] lodebug: dev_put 11 dst_destroy
Jan 11 16:17:44 <hostname> kernel: [ 2197.614240] lodebug: dev_hold 11 dst_init

(after the __neigh_create function call, the reference count for the loopback device is incorrectly increased to 13)

Jan 11 16:20:41 <hostname> kernel: [ 2374.898917] lodebug: dev_hold 13 __neigh_create
Jan 11 16:23:42 <hostname> kernel: [ 2555.900160] lodebug: dev_hold 14 dst_init
Jan 11 16:24:10 <hostname> kernel: [ 2583.573193] lodebug: dev_hold 15 dst_init
Jan 11 16:26:09 <hostname> kernel: [ 2702.992174] lodebug: dev_hold 16 dst_init
Jan 11 16:27:15 <hostname> kernel: [ 2769.188044] lodebug: dev_hold 17 dst_init
Jan 11 16:31:57 <hostname> kernel: [ 3050.821593] lodebug: dev_hold 18 dst_init
Jan 11 16:37:41 <hostname> kernel: [ 3394.962714] lodebug:...

Revision history for this message
Timo Furrer (tuxtimo) wrote :

I've hit the same issue today with 4.4.0-87-generic on a xenial.
Is there a confirmed fix for a particular released kernel for Ubuntu?

Revision history for this message
Timo Furrer (tuxtimo) wrote :

I've hit the same issue today with 4.4.0-87-generic on a xenial.
Is there a confirmed fix for a particular released kernel for Ubuntu?

And same for 4.10.0-28-generic

Revision history for this message
Dan Streetman (ddstreet) wrote :

Since this bug has long since been marked 'Fix Released', I opened a new bug, 1711407. Let's all please use that one for continued discussion of this problem.

Revision history for this message
In , stephen (stephen-linux-kernel-bugs) wrote :

(In reply to pegerto from comment #13)
> Hi,
>
> According to the docker issue is raise by creating destroying multiple
> containers, form my undertanding is a raise condition as this is not
> replicable with a single cpu vm.
>
> I reproduced the issue with CoreOS 1068.8.0 over VirtualBox, I can take a
> kernel dump from the hypervisor if this helps.
>
> Regards.

Remember that this is the upstream kernel bugzilla.
If any out-of-tree modules such as VirtualBox are present, then the bug report is suspect and will be marked as INVALID.

Revision history for this message
In , de.techno (de.techno-linux-kernel-bugs) wrote :

Happens with 4.9 Debian kernel too.

Changed in linux:
importance: Unknown → Medium
status: Unknown → Confirmed
Revision history for this message
In , maitoor (maitoor-linux-kernel-bugs) wrote :

The system is Ubuntu 16.04 and it is not fixed there either:
$ uname -a
Linux hostname-test 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ docker version
Client:
 Version: 18.06.1-ce
 API version: 1.38
 Go version: go1.10.3
 Git commit: e68fc7a
 Built: Tue Aug 21 17:24:56 2018
 OS/Arch: linux/amd64
 Experimental: false

Server:
 Engine:
  Version: 18.06.1-ce
  API version: 1.38 (minimum version 1.12)
  Go version: go1.10.3
  Git commit: e68fc7a
  Built: Tue Aug 21 17:23:21 2018
  OS/Arch: linux/amd64
  Experimental: false
$ sudo dmesg |tail -20
[268985.837007] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[268995.856958] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269005.868864] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269015.876471] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269026.128173] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269036.143902] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269046.387621] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269056.411322] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269066.431065] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269076.438783] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269086.446489] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269096.490222] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269106.505820] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269116.545378] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269126.564839] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269136.568241] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269146.571673] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269156.827151] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269167.090594] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1
[269177.330099] unregister_netdevice: waiting for veth34afdb4 to become free. Usage count = 1

Revision history for this message
In , jdelvare (jdelvare-linux-kernel-bugs) wrote :

(In reply to Kevin from comment #15)
> Jan 11 16:17:44 <hostname> kernel: [ 2197.614240] lodebug: dev_hold 11
> dst_init
>
> (after __neigh_create function call, the reference count for loopback device
> is incorrectly increased to 13)
>
> Jan 11 16:20:41 <hostname> kernel: [ 2374.898917] lodebug: dev_hold 13
> __neigh_create

This is hard to believe or explain. The only way to increase the counter is to call dev_hold(), and whenever dev_hold() is called, a message saying so would be logged. The only explanation I can think of would be if the memory area holding the per-CPU counters had been accidentally overwritten (say, by an overrun of an array in nearby memory).

But a more likely explanation is that there is a missing line in the log, either due to a copy-and-paste error, or because the message somehow never made it to the log buffer (are there still contexts in which printk doesn't work?).

Question: do you see the same apparent +2 counter increase whenever you hit the "waiting for <dev> to become free" error, or was it a one-time thing?
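For reference, the "Usage count" printed by the unregister_netdevice message is the sum of those per-CPU counters, read via netdev_refcnt_read(); the function looked roughly like this in kernels of this era (net/core/dev.c):

```c
/* net/core/dev.c (kernels of this era): the logged usage count is the sum of
 * the per-CPU reference counters, i.e. the memory area referred to above. A
 * corrupted per-CPU slot would skew this total without a matching dev_hold
 * line ever appearing in the instrumented log. */
int netdev_refcnt_read(const struct net_device *dev)
{
	int i, refcnt = 0;

	for_each_possible_cpu(i)
		refcnt += *per_cpu_ptr(dev->pcpu_refcnt, i);
	return refcnt;
}
```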

Changed in linux:
status: Confirmed → Incomplete
Dan Streetman (ddstreet)
Changed in linux-lts-xenial (Ubuntu Trusty):
status: Confirmed → Won't Fix
Changed in linux-lts-utopic (Ubuntu):
status: Confirmed → Won't Fix
Changed in linux-lts-xenial (Ubuntu):
status: Confirmed → Won't Fix
Brad Figg (brad-figg)
tags: added: cscc
Revision history for this message
Andreas Schildbach (schildbach) wrote :

I've got this issue on
```
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal
# uname -a
Linux ultima 5.4.0-80-generic #90-Ubuntu SMP Fri Jul 9 22:49:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
```

The message is

```
kernel:[864004.039343] unregister_netdevice: waiting for lo to become free. Usage count = 1
```

tags: added: sts