Strange internal IP is reported when a worker node has multiple NICs

Bug #1976318 reported by Nikolay Vinogradov
This bug affects 1 person
Affects                         Status   Importance  Assigned to  Milestone
Kubernetes Control Plane Charm  Triaged  Medium      Unassigned
Openstack Integrator Charm      Triaged  Medium      Unassigned

Bug Description

Hi all,

I'm running Charmed Kubernetes 1.24.1 on top of OpenStack Focal Ussuri. My control-plane node and worker nodes are Juju manual machines. The control-plane node has a single network interface connected to a Neutron network, with an IP address from the 192.168.150.0/24 subnet. The worker nodes have 4 network interfaces: one connected to a Neutron overlay network and configured with an IP address, and three that are up but not configured.

After the deployment has settled (see the bundle attached), from time to time I see the following picture: worker-3 reports an internal IP from the subnet that belongs to an unconfigured interface (192.168.140.45), while I expect it to be 192.168.150.53, see below:

        $ kubectl get nodes -A -o wide
        NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
        master-1 Ready <none> 125m v1.24.1 192.168.150.50 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9
        worker-1 Ready <none> 125m v1.24.1 192.168.150.51 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9
        worker-2 Ready <none> 125m v1.24.1 192.168.150.52 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9
        worker-3 Ready <none> 125m v1.24.1 192.168.140.45 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9

192.168.140.45 is indeed a valid address of the Neutron port on one of the VM's NICs, but it is commented out in the netplan config:

        $ juju ssh -m k8s kubernetes-worker/2 'grep 192.168.140 /etc/netplan/* -C 2'
                ens7:
                    addresses: []
                            #- 192.168.140.45/24
                    match:
                        macaddress: fa:16:3e:3b:6f:38
        --
        ## routes:
        ## - to: 0.0.0.0/0
        ## via: 192.168.140.1
                    set-name: ens7
                ens8:
                    addresses: []
                    #- 192.168.140.46/24
                    match:
                        macaddress: fa:16:3e:fe:46:fa
        --
        ## routes:
        ## - to: 0.0.0.0/0
        ## via: 192.168.140.1
                    set-name: ens8
        Connection to 192.168.150.53 closed.

and the address is deconfigured on the host side:

        $ juju ssh -m k8s kubernetes-worker/2 ip -4 a s
        1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
            inet 127.0.0.1/8 scope host lo
               valid_lft forever preferred_lft forever
        2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8942 qdisc fq_codel state UP group default qlen 1000
            inet 192.168.150.53/24 brd 192.168.150.255 scope global dynamic ens3
               valid_lft 41677sec preferred_lft 41677sec
        6: fan-252: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
            inet 252.53.0.1/8 scope global fan-252
               valid_lft forever preferred_lft forever
        10: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8892 qdisc noqueue state UNKNOWN group default
            inet 192.168.200.128/32 scope global vxlan.calico
               valid_lft forever preferred_lft forever
        Connection to 192.168.150.53 closed.

I suspect openstack-integrator / openstack-cloud-controller-manager is to blame here, but so far I haven't found a configuration for it that stops it from reporting deconfigured interfaces as the node's primary private IP:

    $ kubectl edit node worker-3   # similar output for worker-[12] as well
...
    spec:
      providerID: openstack:///6022776c-fcdc-45d3-87e8-41c6c096c7f2
...

    status:
      addresses:
      - address: 192.168.140.45
        type: InternalIP
      - address: 192.168.141.41
        type: InternalIP
      - address: 192.168.140.46
        type: InternalIP
      - address: 192.168.150.53   # this is the only correct IP address
        type: InternalIP
      - address: worker-3
        type: Hostname
...

If I then do

    $ juju remove-relation -m k8s kubernetes-worker openstack-integrator

then, after the necessary hooks have executed, I get the expected view of the nodes:

        $ kubectl get nodes -A -o wide
        NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
        master-1 Ready <none> 117m v1.24.1 192.168.150.50 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9
        worker-1 Ready <none> 116m v1.24.1 192.168.150.51 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9
        worker-2 Ready <none> 116m v1.24.1 192.168.150.52 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9
        worker-3 Ready <none> 116m v1.24.1 192.168.150.53 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9

I'm still looking for the root cause of this behavior; please let me know if I'm missing anything here.

Revision history for this message
George Kraft (cynerva) wrote :

Can you run `juju run --unit kubernetes-worker/2 -- network-get kube-control`? This should show you the ingress addresses that the charm sees. The charm chooses the first ingress address and passes it to kubelet via the `--node-ip` config option. I don't think this is the issue, but it would help if you could rule it out.

I suspect you're right that openstack-cloud-controller-manager is responsible for adding the other addresses. It looks like this behavior can be controlled in openstack-cloud-controller-manager by setting the internal-network-name option[1]. Sadly, our charms don't support this in the generated config[2], so there's no way for you to set it. You won't be able to manually edit the cloud config; cdk-addons will revert your changes every 5 minutes.
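For reference, per the upstream docs [1], supporting this would mean the generated cloud config gains a stanza roughly like the following (the network name below is only an example):

    [Networking]
    internal-network-name = my-internal-net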

I can't think of a good workaround to recommend, but I'm guessing it could be fixed by adding support for the internal-network-name option to our charms. It would probably need to be a config option on the openstack-integrator charm that gets passed through to kubernetes-control-plane.

[1]: https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/openstack-cloud-controller-manager/using-openstack-cloud-controller-manager.md#networking
[2]: https://github.com/charmed-kubernetes/layer-kubernetes-common/blob/1df6fc7fd08b14324d993e64f9b6459020229df0/lib/charms/layer/kubernetes_common.py#L550-L606

Changed in charm-kubernetes-master:
importance: Undecided → Medium
Changed in charm-openstack-integrator:
importance: Undecided → Medium
Changed in charm-kubernetes-master:
status: New → Triaged
Changed in charm-openstack-integrator:
status: New → Triaged
Revision history for this message
Nikolay Vinogradov (nikolay.vinogradov) wrote :

Hi George, thank you for looking into this.

I reproduced the issue by re-adding the relation between the worker charm and the openstack-integrator charm.

        $ kubectl get nodes -o wide
        NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
        master-1 Ready <none> 27h v1.24.1 192.168.150.50 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9
        worker-1 Ready <none> 27h v1.24.1 192.168.150.51 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9
        worker-2 Ready <none> 27h v1.24.1 192.168.150.52 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9
        worker-3 Ready <none> 27h v1.24.1 192.168.141.41 <none> Ubuntu 20.04.4 LTS 5.4.0-109-generic containerd://1.5.9

Here is the requested output:

        $ juju run -m k8s --unit kubernetes-worker/2 -- network-get kube-control
        bind-addresses:
        - mac-address: fa:16:3e:e2:d7:70
          interface-name: ens3
          addresses:
          - hostname: ""
            address: 192.168.150.53
            cidr: 192.168.150.0/24
          macaddress: fa:16:3e:e2:d7:70
          interfacename: ens3
        - mac-address: 7e:54:d9:66:e6:20
          interface-name: fan-252
          addresses:
          - hostname: ""
            address: 252.53.0.1
            cidr: 252.0.0.0/8
          macaddress: 7e:54:d9:66:e6:20
          interfacename: fan-252
        egress-subnets:
        - 192.168.150.53/32
        ingress-addresses:
        - 192.168.150.53
        - 252.53.0.1

So it seems that the charm doesn't see the other OpenStack networks. By the way, --node-ip in the kubelet's command-line arguments is also correct:

        $ juju ssh -m k8s kubernetes-worker/2 'ps aux | grep kubelet | grep node-ip'
        root 554863 1.3 0.3 3038536 102496 ? Ssl 01:28 0:06 /snap/kubelet/2423/kubelet --kubeconfig=/root/cdk/kubeconfig --v=0 --logtostderr=true --node-ip=192.168.150.53 --container-runtime=remote --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --cloud-provider=external --config=/root/cdk/kubelet/config.yaml --pod-infra-container-image=rocks.canonical.com:443/cdk/pause:3.6
        ubuntu 560091 0.0 0.0 8616 3176 pts/0 Ss+ 01:36 0:00 bash -c ps aux | grep kubelet | grep node-ip
        Connection to 192.168.150.53 closed

Revision history for this message
George Kraft (cynerva) wrote :

If this is causing `kubectl exec` and `kubectl logs` commands to fail, you might be able to work around that by setting the kubelet-preferred-address-types config of kube-apiserver[1] to prefer Hostname first:

juju config kubernetes-control-plane api-extra-args="kubelet-preferred-address-types=Hostname,InternalIP,InternalDNS,ExternalDNS,ExternalIP"

But that will only work if hostnames like worker-3 are resolvable to good IPs.
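A quick way to check that from a control-plane node, for example:

    $ getent hosts worker-3

which should resolve to the node's 192.168.150.x address.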

[1]: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/

Revision history for this message
Nikolay Vinogradov (nikolay.vinogradov) wrote :

Thanks George, I checked that and the suggested workaround helped:

ubuntu@jumphost:~$ kubectl exec -it -n kube-system openstack-cloud-controller-manager-rg2l8 cat /etc/config/cloud.conf
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
[Global]
auth-url = https://keystone.pc1.canonical.com:5000/v3
region = PartnerCloud1
username = sriov-in-k8s-user
password = quodAjAs2
tenant-name = sriov-in-k8s
domain-name = admin_domain
tenant-domain-name = admin_domain
ca-file = /etc/config/endpoint-ca.cert

[LoadBalancer]
use-octavia = true
lb-method = ROUND_ROBIN

Without the workaround I couldn't run kubectl exec, because it hung trying to connect to an incorrect node IP address:

ubuntu@jumphost:~$ kubectl exec -it -n kube-system openstack-cloud-controller-manager-rg2l8 cat /etc/config/cloud.conf
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Error from server: error dialing backend: dial tcp 192.168.190.31:10250: i/o timeout

Also, from what I read in the OpenStack Cloud Controller Manager docs [1] and the addon template [2], a few other options are worth checking: public-network-name and internal-network-name. Currently they're not exposed as charm configuration options and are not propagated to the OpenStack cloud controller manager, as mentioned in the previous comments.

We know from the initial description of this bug that the problem is that multiple internal IPs are assigned to nodes, and based on [3] the internal-network-name option directly affects which IP address is selected as the node's internal IP. So internal-network-name seems to be the fix for this issue if we need nodes to be accessible via their internal IPs, not their hostnames.
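Once such an option is wired through, an easy way to verify the result would be to confirm that only the expected address remains in the node status, e.g.:

    $ kubectl get node worker-3 -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}'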

[1] https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/openstack-cloud-controller-manager/using-openstack-cloud-controller-manager.md#networking
[2] https://github.com/charmed-kubernetes/cdk-addons/blob/main/get-addon-templates#L86
[3] https://github.com/kubernetes/cloud-provider-openstack/blob/master/pkg/openstack/instances.go#L551
