Comment 32 for bug 1711407

Jiri Horky (jiri-horky) wrote :

This information might be relevant.

We are able to reproduce the problem ("unregister_netdevice: waiting for lo to become free. Usage count = 1") with a 4.14.0-rc3 kernel built with CONFIG_PREEMPT_NONE=y and running on only one CPU, with the following kernel boot options:

BOOT_IMAGE=/boot/vmlinuz-4.14.0-rc3 root=/dev/mapper/vg0-root ro quiet vsyscall=emulate nosmp

Once the machine hits this state, it stays in it and a reboot is needed. No more containers can be spawned. We reproduce it by running images that make ipsec/openvpn connections and download a small file inside the tunnels. Then the instances exit (they usually run for less than 10 seconds). We run tens of such containers per minute on one machine. With the above-mentioned settings (only 1 CPU), the machine hits the problem in ~2 hours.
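For anyone who wants to check a machine for this state automatically, here is a minimal sketch that scans kernel log text for the message quoted above. The function name is hypothetical, and matching any device name and any usage count (not just "lo" and 1) is an assumption made for convenience.

```python
import re

# Pattern for the kernel message quoted in this comment. Generalizing the
# device name and the count is an assumption, so other interfaces match too.
PATTERN = re.compile(
    r"unregister_netdevice: waiting for (\S+) to become free\. "
    r"Usage count = (\d+)"
)


def find_stuck_devices(log_text):
    """Return (device, usage_count) pairs found in kernel log text."""
    return [(m.group(1), int(m.group(2))) for m in PATTERN.finditer(log_text)]
```

It can be fed the output of dmesg, for example find_stuck_devices(subprocess.run(["dmesg"], capture_output=True, text=True).stdout). A non-empty result would suggest the machine is in the stuck state described above.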

Another reproducer with the same kernel, but without limiting the number of CPUs, is to just run iperf in UDP mode for 3 seconds inside the container (so there is no TCP communication at all). If we run 10 such containers in parallel, wait for all of them to finish, and then do it again, we hit the problem in less than 10 minutes (on a 40-core machine).
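The iperf reproducer could be scripted roughly as follows. This is only a sketch under several assumptions: that the containers are started with "docker run", that an image with iperf installed exists (the image name below is hypothetical), and that an iperf server is reachable at the placeholder address. None of these details come from our actual setup.

```python
import subprocess

IMAGE = "iperf-test-image"  # hypothetical image with iperf installed
SERVER = "192.0.2.10"       # hypothetical iperf server (placeholder address)


def build_command(image=IMAGE, server=SERVER, seconds=3):
    """Build the argv for one container that runs iperf in UDP mode."""
    inner = f"iperf -c {server} -u -t {seconds}"
    return ["docker", "run", "--rm", image, "sh", "-c", inner]


def run_round(parallel=10):
    """Start `parallel` containers at once and wait for all of them to exit."""
    procs = [subprocess.Popen(build_command()) for _ in range(parallel)]
    return [p.wait() for p in procs]


def run_rounds(rounds, parallel=10):
    """Repeat the parallel round `rounds` times, one round after another."""
    for _ in range(rounds):
        run_round(parallel)
```

Calling run_rounds(100), for example, would repeat the 10-container round a hundred times. How many rounds are needed to trigger the problem will vary by machine.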

In both of our reproducers, we added "ip route flush table all; ifconfig down; sleep 10" before exiting the containers. It does not seem to have any effect.