Comment 8 for bug 1616268

Ross Krumbeck (rkrum) wrote :

As per the man page:

http://man7.org/linux/man-pages/man8/ip-netns.8.html

A named network namespace is an object at /var/run/netns/NAME that can be opened.

When calling:

ip netns list

it opens each file in /var/run/netns as a namespace object. The errors occur when an entry in /var/run/netns is just an ordinary file rather than a namespace. For example:

[root@network01 netns]# ip netns
qdhcp-48073cb3-df15-41ed-b625-ad893155668e
qdhcp-b4510557-fd11-4ea9-bcc1-e5ff92806a5c
qdhcp-d304d7f1-6a7c-4dee-8541-c1c97e182223
qdhcp-114ffc50-28ea-4fad-8f6b-298d63313ed6
snat-374de265-8def-4ecd-88e0-16a6ba172501
snat-3f92788c-ab75-4f19-a1eb-162b40635a7f
qrouter-3f92788c-ab75-4f19-a1eb-162b40635a7f
qrouter-374de265-8def-4ecd-88e0-16a6ba172501
[root@network01 netns]# touch /var/run/netns/test
[root@network01 netns]# ip netns
RTNETLINK answers: Invalid argument
RTNETLINK answers: Invalid argument
test
qdhcp-48073cb3-df15-41ed-b625-ad893155668e
qdhcp-b4510557-fd11-4ea9-bcc1-e5ff92806a5c
qdhcp-d304d7f1-6a7c-4dee-8541-c1c97e182223
qdhcp-114ffc50-28ea-4fad-8f6b-298d63313ed6
snat-374de265-8def-4ecd-88e0-16a6ba172501
snat-3f92788c-ab75-4f19-a1eb-162b40635a7f
qrouter-3f92788c-ab75-4f19-a1eb-162b40635a7f
qrouter-374de265-8def-4ecd-88e0-16a6ba172501

In Jeffrey's example, the problematic files are identifiable by their 000 permissions. They appear to be plain empty files, like those you would create with the touch command.

A typical namespace file also looks like an empty file, but it is actually a handle that keeps the namespace open, and it can't be read like a normal file:

[root@network01 netns]# ls -la /var/run/netns/qdhcp-114ffc50-28ea-4fad-8f6b-298d63313ed6
-r--r--r-- 1 root root 0 Aug 30 04:36 /var/run/netns/qdhcp-114ffc50-28ea-4fad-8f6b-298d63313ed6
[root@network01 netns]# file /var/run/netns/qdhcp-114ffc50-28ea-4fad-8f6b-298d63313ed6
/var/run/netns/qdhcp-114ffc50-28ea-4fad-8f6b-298d63313ed6: empty
[root@network01 netns]# cat /var/run/netns/qdhcp-114ffc50-28ea-4fad-8f6b-298d63313ed6
cat: /var/run/netns/qdhcp-114ffc50-28ea-4fad-8f6b-298d63313ed6: Invalid argument
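The two kinds of entry can also be told apart without running ip netns. Below is a minimal sketch, assuming that a real named namespace is a bind mount from the kernel's nsfs and therefore reports a different device number than the directory that contains it, while a file made with touch reports the same one. The function name find_stale_netns_entries and its netns_dir parameter are made up for illustration and are not part of iproute2 or neutron.

```python
import os


def find_stale_netns_entries(netns_dir="/var/run/netns"):
    """Return names in netns_dir that look like plain files, not namespaces.

    Assumption: a mounted namespace lives on nsfs, so its st_dev differs
    from the st_dev of netns_dir. A plain file shares the directory's st_dev.
    """
    if not os.path.isdir(netns_dir):
        return []
    dir_dev = os.stat(netns_dir).st_dev
    stale = []
    for name in sorted(os.listdir(netns_dir)):
        path = os.path.join(netns_dir, name)
        # stat() does not need read permission on the file itself,
        # so entries with 000 permissions are handled as well.
        if os.stat(path).st_dev == dir_dev:
            stale.append(name)
    return stale
```

Calling find_stale_netns_entries() on the network node should, under the assumption above, list entries such as the touch-created "test" file while leaving the qdhcp, snat and qrouter namespaces out.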

So I think the problem comes down to the way a namespace is being deleted: due to a race condition, instead of being completely removed, the namespace is replaced by a standard empty file with 000 permissions.

In testing, I have found that the problematic namespaces appear to be locked by one of the neutron agent docker containers and can't be removed without shutting the containers down.
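One way to look for the holder is to check which processes still list the entry in their mount table. This is only a sketch. It assumes the container keeps the namespace mounted in its own mount namespace, so that the mount shows up in /proc/<pid>/mountinfo for the container's processes. The helper names mountinfo_has_netns and pids_holding_netns are hypothetical, and the script needs to run as root on the host to read other processes' mountinfo.

```python
import os


def mountinfo_has_netns(mountinfo_text, name):
    """Return True if a mountinfo dump lists a mount point ending in /netns/<name>."""
    suffix = "/netns/" + name
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # The fifth field (index 4) of a mountinfo line is the mount point.
        if len(fields) > 4 and fields[4].endswith(suffix):
            return True
    return False


def pids_holding_netns(name, proc_dir="/proc"):
    """Return PIDs whose mount table still contains the named netns mount."""
    pids = []
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue
        mountinfo = os.path.join(proc_dir, entry, "mountinfo")
        try:
            with open(mountinfo, errors="replace") as fh:
                text = fh.read()
        except OSError:
            # The process exited or is not readable; skip it.
            continue
        if mountinfo_has_netns(text, name):
            pids.append(int(entry))
    return sorted(pids)
```

If the PIDs returned for a stale name all belong to one neutron agent container, that would be consistent with the container holding the namespace. An empty result would suggest that this assumption does not apply.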