Comment 34 for bug 1403152

Chris J Arges (arges) wrote :

The result of the reverse bisect between v3.17..v3.18 with the reproducer was:
# first bad commit: [34666d467cbf1e2e3c7bb15a63eccfb582cdd71f] netfilter: bridge: move br_netfilter out of the core
If I backport this patch plus 7276ca3f on top of v3.17, I no longer get the hang with the simple reproducer (although I suspect a more elaborate reproducer would still trigger the issue).

This isn't a fix because we obviously have issues in later kernels. The 'unregister_netdevice' message could occur for different code paths, since there could potentially be many paths that modify the refcnt for the net_device. I'm going to track who is calling 'dev_put' and 'dev_hold' and figure out which code path is not releasing its reference before we start unregistering the device. I'll focus on the reproducer I have for now, since this seems similar to the 'shutdown containers with network connections' case.
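
As a rough illustration of the tracking idea above, here is a minimal Python sketch that tallies hold/put events per device and per caller. It assumes a purely hypothetical trace format of "<op> <dev> <caller>" per line (for example, emitted by ad hoc debug prints around dev_hold/dev_put); the format, the device names, and the caller names are all made up for illustration and are not taken from the kernel or from this bug.

```python
from collections import Counter, defaultdict


def summarize_refs(lines):
    """Tally dev_hold/dev_put events from a hypothetical trace.

    Each line is assumed to look like "<op> <dev> <caller>", where <op>
    is "dev_hold" or "dev_put". This format is an assumption made for
    this sketch, not something the kernel prints by default.
    """
    net = Counter()                 # net reference change per device
    callers = defaultdict(Counter)  # net change per device, per caller
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip lines that do not match the assumed format
        op, dev, caller = fields
        if op == "dev_hold":
            delta = 1
        elif op == "dev_put":
            delta = -1
        else:
            continue
        net[dev] += delta
        callers[dev][caller] += delta
    # Devices whose holds and puts do not cancel out are the suspects.
    leaked = {dev: n for dev, n in net.items() if n != 0}
    return leaked, callers


# Made-up sample data; caller_a/b/c are placeholder names.
sample = [
    "dev_hold veth0 caller_a",
    "dev_hold veth0 caller_b",
    "dev_put veth0 caller_c",
    "dev_hold veth1 caller_a",
    "dev_put veth1 caller_c",
]

leaked, callers = summarize_refs(sample)
for dev, count in leaked.items():
    print(dev, count, dict(callers[dev]))
```

A nonzero net count for a device at unregister time would point at an unbalanced hold; the per-caller breakdown then narrows down which path took the reference that was never dropped.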

In addition, I'm doing a bisect between earlier versions to see where the 'regression' occurred. 3.11 seems to pass without regression for 5 threads / 50 iterations.