[Caracal][Debian 12] IPs missing in k8s-container with "sys:rw"

Bug #2070281 reported by Mariusz Karpiarz
This bug affects 1 person
Affects: OpenStack-Ansible
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

When "sys:rw" is set in `lxc_container_mount_auto`, as per instructions in the docs (https://docs.openstack.org/openstack-ansible-ops/2024.1/mcapi.html#openstack-ansible-configuration-for-magnum-cluster-api-driver), the deployed `k8s-container` comes up with no IPs on its network interfaces:

```
# lxc-attach -n aio1-k8s-container-XXXXXXXX ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if90: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:b0:18:7c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::216:3eff:feb0:187c/64 scope link
       valid_lft forever preferred_lft forever
3: eth1@if91: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:92:d6:80 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::216:3eff:fe92:d680/64 scope link
       valid_lft forever preferred_lft forever
```
I can see that both veths are attached to their bridges as expected (so L2 connectivity seems to be fine). If I switch back to the default "sys:ro" and recreate the container, the interfaces get IPs.
Do you know why a read-write `/sys` affects DHCP this way, and what adverse effects should be expected if this mount point is left read-only?
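
To confirm which mode `/sys` actually ended up with inside the container, the mount table can be inspected. The helper below is only a sketch: the function name `sys_mount_mode` is made up for illustration, and it assumes the input is in the standard `/proc/mounts` format.

```shell
# Print "rw" or "ro" for the sysfs mount at /sys, reading a mount table
# in /proc/mounts format (defaults to /proc/mounts of the current namespace).
# sys_mount_mode is a hypothetical helper name, not part of LXC or OSA.
sys_mount_mode() {
    mounts_file="${1:-/proc/mounts}"
    # Fields: device mountpoint fstype options dump pass
    awk '$2 == "/sys" && $3 == "sysfs" {
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++)
                 if (opts[i] == "rw" || opts[i] == "ro") mode = opts[i]
         }
         END { if (mode) print mode; else exit 1 }' "$mounts_file"
}
```

From the host it could be fed the container's table with something like `lxc-attach -n aio1-k8s-container-XXXXXXXX -- cat /proc/mounts | sys_mount_mode /dev/stdin`.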

Debian 12.5 (minbase), kernel 6.1.76-1 (2024-02-01) x86_64, LXC version 5.0.2, bridge-utils version 1.7.1, OSA release: 2024.1 (commit 729a95e90329f1754431afb82c9c5d4e7446a6bf).

Also, please keep in mind that I disable the osbpo.debian.net repository on the host. Could this be related?

Jonathan Rosser (jrosser) wrote :

Hi Mariusz,

So far the mcapi driver has only been tested on Ubuntu, though it would be great if it also worked on Debian, so thanks for the bug report.

I don't know why the network interfaces are affected, but if I remember correctly sys:rw was needed to allow the Cilium CNI to use eBPF inside the LXC container.

I think the IPs for the container interfaces should be statically configured with systemd-networkd, apart from eth0, which should use DHCP. For comparison, this is what a working container looks like:

```
root@aio1-cinder-api-container-c5bc4396:/# networkctl -l
IDX LINK TYPE     OPERATIONAL SETUP
  1 lo   loopback carrier     unmanaged
  2 eth0 ether    routable    configured
  3 eth1 ether    routable    configured
  4 eth2 ether    routable    configured

4 links listed.
root@aio1-cinder-api-container-c5bc4396:/# networkctl status
● State: routable
  Online state: online
       Address: 10.255.255.119 on eth0
                172.29.237.98 on eth1
                172.29.245.6 on eth2
                fe80::216:3eff:fee4:617 on eth0
                fe80::216:3eff:fe8c:503c on eth1
                fe80::216:3eff:fe1d:7a0b on eth2
       Gateway: 10.255.255.1 on eth0
           DNS: 10.255.255.1

Jun 20 09:34:35 aio1-cinder-api-container-c5bc4396 systemd-networkd[27]: eth0: Gained carrier
Jun 20 09:34:35 aio1-cinder-api-container-c5bc4396 systemd-networkd[27]: lo: Link UP
Jun 20 09:34:35 aio1-cinder-api-container-c5bc4396 systemd-networkd[27]: lo: Gained carrier
Jun 20 09:34:35 aio1-cinder-api-container-c5bc4396 systemd-networkd[27]: Enumeration completed
Jun 20 09:34:35 aio1-cinder-api-container-c5bc4396 systemd[1]: Started Network Configuration.
Jun 20 09:34:35 aio1-cinder-api-container-c5bc4396 systemd-networkd[27]: eth0: DHCPv4 address 10.255.255.119/24 via 10.255.255.1
Jun 20 09:34:35 aio1-cinder-api-container-c5bc4396 systemd-networkd[27]: Could not set hostname: Access denied
Jun 20 09:34:36 aio1-cinder-api-container-c5bc4396 systemd-networkd[27]: eth0: Gained IPv6LL
Jun 20 09:34:36 aio1-cinder-api-container-c5bc4396 systemd-networkd[27]: eth1: Gained IPv6LL
Jun 20 09:34:36 aio1-cinder-api-container-c5bc4396 systemd-networkd[27]: eth2: Gained IPv6LL
```
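
A quick way to compare the broken container against this output is to list only the links that systemd-networkd has not finished setting up. This is a rough sketch rather than an established tool: `unconfigured_links` is a hypothetical helper, and it assumes the five-column `networkctl list` layout shown above.

```shell
# Read `networkctl list` output on stdin and print "LINK SETUP" for every
# link whose SETUP column is neither "configured" nor "unmanaged".
# unconfigured_links is a hypothetical helper name.
unconfigured_links() {
    # NF == 5 skips the trailing "N links listed." summary line;
    # the numeric IDX test skips the column header if it is present.
    awk 'NF == 5 && $1 ~ /^[0-9]+$/ &&
         $5 != "configured" && $5 != "unmanaged" { print $2, $5 }'
}
```

Inside the affected container this might be run as `networkctl --no-pager --no-legend list | unconfigured_links`; empty output means every link is either configured or deliberately unmanaged.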

Mariusz Karpiarz (mkarpiarz) wrote :

Thanks Jon,
I'll have another look and try to get to the bottom of this.
I'll also try rebuilding my AIO with Ubuntu 22.04 using the same config in case this is a hardware-related issue.

Jonathan Rosser (jrosser) wrote :

The first part of this issue on debian-12 is because the minimal image we build with debootstrap does not include udev.

This is a back-portable patch to fix that.

https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/923167
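
Until that patch is in place, one way to check whether a given container image is affected is to look for the `udevadm` binary in its root filesystem. The sketch below is an assumption-laden illustration: `rootfs_has_udev` is a made-up name, the binary locations are the usual Debian ones, and `/var/lib/lxc/<name>/rootfs` is only the common default LXC path.

```shell
# Return 0 if the given root filesystem tree contains an executable udevadm.
# rootfs_has_udev is a hypothetical helper; the paths checked are assumed
# from the usual Debian layout (with or without merged /usr).
rootfs_has_udev() {
    rootfs="${1:-/}"
    [ -x "${rootfs%/}/usr/bin/udevadm" ] || [ -x "${rootfs%/}/bin/udevadm" ]
}
```

For example, `rootfs_has_udev /var/lib/lxc/aio1-k8s-container-XXXXXXXX/rootfs || echo "udev missing"` on the host, or simply `rootfs_has_udev` inside the container.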

The next issue is that debian-12 observes PEP 668 (externally managed Python environments), so we run into trouble here: https://github.com/vexxhost/ansible-collection-kubernetes/blob/e3216da75348d167c4c6ac78bc7aaae744f6f19b/roles/kubernetes/tasks/control-plane.yml#L46-L48

It is no longer possible to mix pip packages with the system python install from apt.

I will work on a fix for this too.
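
For anyone wanting to check whether a host or container is subject to this restriction, the marker file that tells pip to refuse system-wide installs lives in the interpreter's stdlib directory. A minimal sketch, assuming `python3` is on the PATH (`is_externally_managed` is a hypothetical helper name):

```shell
# Return 0 if the Python stdlib directory carries the EXTERNALLY-MANAGED
# marker file. With no argument, the directory is looked up via sysconfig.
# is_externally_managed is a hypothetical helper name.
is_externally_managed() {
    stdlib_dir="${1:-$(python3 -c 'import sysconfig; print(sysconfig.get_path("stdlib"))')}"
    [ -e "$stdlib_dir/EXTERNALLY-MANAGED" ]
}
```

Where the marker is present, installing into a virtual environment (`python3 -m venv <dir>`) is the usual way to keep pip packages separate from the apt-managed ones; whether the eventual fix takes that route is not stated here.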
