Looked at this: https://github.com/projectcalico/calicoctl/issues/310
It appears that some code paths (probably only relevant for worker nodes) change networking sysctls via /proc/sys/net/ipv4/conf/<interface>/<config_key>:
https://github.com/projectcalico/felix/blob/v3.6.0/dataplane/linux/endpoint_mgr.go#L862-L879
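For illustration, the kind of write involved can be sketched in shell. This is not Felix's code (Felix is written in Go); the helper name `write_conf_sysctl`, the interface `eth0` and the key `rp_filter` are only assumptions for the example. The root directory is a parameter so the sketch can be tried against a scratch directory instead of the real /proc/sys.

```shell
# Hypothetical helper, not Felix's implementation: write one per-interface
# IPv4 sysctl by path.
# $1 = sysctl root (normally /proc/sys), $2 = interface, $3 = key, $4 = value
write_conf_sysctl() {
    path="$1/net/ipv4/conf/$2/$3"
    # If /proc/sys is mounted read-only, this redirection fails no matter
    # which capabilities the process holds.
    if printf '%s\n' "$4" > "$path"; then
        echo "set $path = $4"
    else
        echo "failed to write $path" >&2
        return 1
    fi
}
```

Example call (interface and key are illustrative): `write_conf_sysctl /proc/sys eth0 rp_filter 1`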
And Docker without --privileged sets up /proc/sys as read-only even with additional capabilities:
$ docker exec -it calico-node /bin/sh
/ # mount | grep /proc/sys
proc on /proc/sys type proc (ro,relatime)
/ # capsh --print | grep net_admin
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_admin,cap_net_raw,cap_sys_chroot,cap_sys_admin,cap_mknod,cap_audit_write,cap_setfcap+eip
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_admin,cap_net_raw,cap_sys_chroot,cap_sys_admin,cap_mknod,cap_audit_write,cap_setfcap
/ # capsh --print | grep sys_admin
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_admin,cap_net_raw,cap_sys_chroot,cap_sys_admin,cap_mknod,cap_audit_write,cap_setfcap+eip
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_admin,cap_net_raw,cap_sys_chroot,cap_sys_admin,cap_mknod,cap_audit_write,cap_setfcap
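The same check can be scripted instead of eyeballing the mount output. A minimal sketch, assuming the usual /proc/mounts layout (device, mount point, type, options); the function name `mount_mode` is made up for this example.

```shell
# Print "ro" or "rw" for the given mount point, or "none" if it is not a
# separate mount. If several entries match, the last one wins.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts)
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")
            mode = "rw"
            for (i = 1; i <= n; i++) if (opts[i] == "ro") mode = "ro"
        }
        END { print (mode == "" ? "none" : mode) }
    ' "${2:-/proc/mounts}"
}
```

Run inside the container as `mount_mode /proc/sys`. A result of "ro" matches the situation above, where the capabilities are present but the sysctl files still cannot be written.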
# LXD
ubuntu@juju-fa887c-11-lxd-2:~$ mount | grep /proc
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
lxcfs on /proc/cpuinfo type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/diskstats type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/meminfo type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/stat type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/swaps type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/uptime type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
proc on /dev/.lxc/proc type proc (rw,relatime)
There was some work to address this:
https://github.com/moby/moby/issues/21649
https://github.com/moby/moby/pull/21751
https://github.com/moby/moby/issues/36597
https://github.com/moby/moby/pull/36644
https://docs.docker.com/engine/release-notes/#18060-ce
"RawAccess allows a set of paths to be not set as masked or readonly. moby/moby#36644"
CLI integration:
https://github.com/docker/cli/pull/1347 (per-path config support)
https://github.com/docker/cli/pull/1808 (--security-opt systempaths=unconfined)
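A rough sketch of how the new option could be tried. It assumes a Docker CLI that already includes docker/cli#1808. The function name, the `alpine` default image and the `DOCKER` override variable (handy for a dry run) are not from the PRs, just choices made for this example.

```shell
# Sketch: start a throwaway container with the default masked/read-only
# system paths lifted and list the /proc mounts seen inside it.
# $1 = image (defaults to alpine, purely as an illustration)
# DOCKER = hypothetical override for the docker binary, e.g. DOCKER=echo
show_proc_mounts_unconfined() {
    "${DOCKER:-docker}" run --rm \
        --cap-add NET_ADMIN \
        --security-opt systempaths=unconfined \
        "${1:-alpine}" \
        sh -c 'grep " /proc" /proc/mounts'
}
```

If the option behaves as described in the PRs, /proc/sys should no longer show up as a separate read-only mount in the output. `DOCKER=echo show_proc_mounts_unconfined` prints the command line without running anything.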