workloads not getting IP from Octavia LB when using provider networking

Bug #1877692 reported by Jeff Hillman
This bug affects 2 people
| Affects | Status | Importance | Assigned to | Milestone |
| Kubernetes Control Plane Charm | Fix Released | High | Samuel Allan | |
| Openstack Integrator Charm | Fix Released | High | Samuel Allan | |

Bug Description

Kubernetes 1.17.5
Openstack Integrator charm at stable on jaas.ai
Openstack Train (Stein for Octavia)

When openstack-integrator is configured with Charmed Kubernetes to provide the load-balancer endpoint both for k8s-master/worker and for workloads, the workloads do not get their assigned IP addresses from the LoadBalancer created in Octavia.

The k8s-master integrates just fine: Octavia is used as a load balancer in place of kube-api-lb.

However, workloads are not getting their IP addresses from the Octavia LoadBalancer.

---
$ openstack loadbalancer amphora list
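+--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+--------------+
| id | loadbalancer_id | status | role | lb_network_ip | ha_ip |
+--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+--------------+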
| 9f43a456-20ee-4d53-b73e-f490eb5f5bda | 9b4a5796-4e49-49fd-acbb-742cbc50ea46 | ALLOCATED | BACKUP | fc00:5819:c3a2:8856:f816:3eff:feb7:86f4 | 172.16.7.180 |
| a852569f-5eb2-4b57-8f8b-eddb8e4e5795 | 9b4a5796-4e49-49fd-acbb-742cbc50ea46 | ALLOCATED | MASTER | fc00:5819:c3a2:8856:f816:3eff:fea3:90d4 | 172.16.7.180 |
| 8c1c5353-f5b5-41b9-9243-fea58a2d65f0 | dbfacebd-7aa3-4be6-8264-3a4c53136434 | ALLOCATED | BACKUP | fc00:5819:c3a2:8856:f816:3eff:fe59:9e6a | 172.16.7.201 |
| 390a8a5b-bc3d-451f-966d-7aa72d30239d | dbfacebd-7aa3-4be6-8264-3a4c53136434 | ALLOCATED | MASTER | fc00:5819:c3a2:8856:f816:3eff:fe29:3776 | 172.16.7.201 |
| b13c22aa-99c1-419d-9a9a-dab4211572cc | None | READY | None | fc00:5819:c3a2:8856:f816:3eff:feae:caf5 | None |
| 142cfba0-c1e9-4d65-a4cc-372fc2a62bcc | None | BOOTING | None | fc00:5819:c3a2:8856:f816:3eff:feba:ef83 | None |
| 3a279cba-b902-4a9d-9c7e-fe03e9f8863f | None | BOOTING | None | fc00:5819:c3a2:8856:f816:3eff:feba:9f6f | None |
| 9e3fd179-997e-4a59-8438-063b7b4f3be0 | None | BOOTING | None | fc00:5819:c3a2:8856:f816:3eff:fe63:68fb | None |
+--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+--------------+

$ openstack loadbalancer list
+--------------------------------------+---------------------------------------------------------------------------+----------------------------------+--------------+---------------------+----------+
| id | name | project_id | vip_address | provisioning_status | provider |
+--------------------------------------+---------------------------------------------------------------------------+----------------------------------+--------------+---------------------+----------+
| 9b4a5796-4e49-49fd-acbb-742cbc50ea46 | openstack-integrator-96aa51bbf042-kubernetes-master | 80971bff31dd49bcb91e718e895b8b31 | 172.16.7.180 | ACTIVE | amphora |
| dbfacebd-7aa3-4be6-8264-3a4c53136434 | kube_service_kubernetes-inl7iln9zcumkgzr7xinq34rhuwcg2c4_default_cdk-cats | 80971bff31dd49bcb91e718e895b8b31 | 172.16.7.201 | ACTIVE | amphora |
+--------------------------------------+---------------------------------------------------------------------------+----------------------------------+--------------+---------------------+----------+

---

In the above output, 172.16.7.180 is the VIP for the k8s-master cluster. 172.16.7.201 is the VIP for the workload.

---
$ kubectl get all
NAME READY STATUS RESTARTS AGE
pod/cdk-cats-5fcdf89599-6xbph 1/1 Running 0 9m8s
pod/cdk-cats-5fcdf89599-7ml6p 1/1 Running 0 9m9s
pod/cdk-cats-5fcdf89599-dh2hb 1/1 Running 0 9m8s
pod/cdk-cats-5fcdf89599-jfq2t 1/1 Running 0 9m9s
pod/cdk-cats-5fcdf89599-q2vgn 1/1 Running 0 9m9s

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cdk-cats LoadBalancer 10.152.183.38 <pending> 80:30836/TCP 9m8s
service/kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 127m

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/cdk-cats 5/5 5 5 9m9s

NAME DESIRED CURRENT READY AGE
replicaset.apps/cdk-cats-5fcdf89599 5 5 5 9m9s

---

The cdk-cats service is waiting on the 172.16.7.201 IP address. A describe of that service shows the following:

---

$ kubectl describe service/cdk-cats
Name: cdk-cats
Namespace: default
Labels: <none>
Annotations:
Selector: app=cdk-cats
Type: LoadBalancer
IP: 10.152.183.38
Port: cdk-cats 80/TCP
TargetPort: 80/TCP
NodePort: cdk-cats 30836/TCP
Endpoints: 10.1.78.12:80,10.1.78.13:80,10.1.78.14:80 + 2 more...
Session Affinity: None
External Traffic Policy: Cluster
Events:
  Type Reason Age From Message
  ---- ------ ---- ---- -------
  Normal EnsuringLoadBalancer 3m1s (x7 over 9m40s) service-controller Ensuring load balancer
  Warning SyncLoadBalancerFailed 2m58s (x7 over 8m31s) service-controller Error syncing load balancer: failed to ensure load balancer: error creating LB floatingip {Description:Floating IP for Kubernetes external service default/cdk-cats from cluster kubernetes-inl7iln9zcumkgzr7xinq34rhuwcg2c4 FloatingNetworkID:68b55143-efce-449c-8879-d40f3a8cc42e FloatingIP: PortID:fcd1203f-b456-4deb-bb5c-e495dbeea638 FixedIP: SubnetID: TenantID: ProjectID:}: Resource not found

---

As the output above shows, the cloud provider sets a FloatingNetworkID and attempts to create a floating IP. There are no floating IPs in this scenario: there is only one network, which sits directly on a VLAN and is configured as a flat provider network going to physnet1. There are no internal networks or floating IP ranges.

Bundle config for openstack-integrator is as follows.

---

  openstack-integrator:
    charm: cs:~containers/openstack-integrator
    num_units: 1
    trust: true
    options:
      manage-security-groups: true
      subnet-id: 78114d37-a175-46f9-a451-78a5e101db0f

---

The subnet-id above is the subnet for the provider network.

---

$ openstack subnet list
+--------------------------------------+------------------+--------------------------------------+--------------------------+
| ID | Name | Network | Subnet |
+--------------------------------------+------------------+--------------------------------------+--------------------------+
| 713a43d7-81ca-4d0b-9b45-c66107a84e31 | lb-mgmt-subnetv6 | 76bb992e-914f-40b6-b921-5819c3a28856 | fc00:5819:c3a2:8856::/64 |
| 78114d37-a175-46f9-a451-78a5e101db0f | prov-sub | 68b55143-efce-449c-8879-d40f3a8cc42e | 172.16.7.0/24 |
+--------------------------------------+------------------+--------------------------------------+--------------------------+

---

Full openstack-integrator config:

---

$ juju config openstack-integrator
application: openstack-integrator
application-config:
  trust:
    default: false
    description: Does this application have access to trusted credentials
    source: user
    type: bool
    value: true
charm: openstack-integrator
settings:
  auth-url:
    default: ""
    description: |
      The URL of the keystone API used to authenticate. On OpenStack control panels,
      this can be found at Access and Security > API Access > Credentials.
    source: default
    type: string
    value: ""
  bs-version:
    description: |
      Used to override automatic version detection for block storage usage.
      Valid values are v1, v2, v3 and auto. When auto is specified automatic
      detection will select the highest supported version exposed by the
      underlying OpenStack cloud. If not set, will use the upstream default.
    source: unset
    type: string
  credentials:
    default: ""
    description: |
      The base64-encoded contents of a JSON file containing OpenStack credentials.

      The credentials must contain the following keys: auth-url, username, password,
      project-name, user-domain-name, and project-domain-name.

      It could also contain a base64-encoded CA certificate in endpoint-tls-ca key value.

      This can be used from bundles with 'include-base64://' (see
      https://jujucharms.com/docs/stable/charms-bundles#setting-charm-configurations-options-in-a-bundle),
      or from the command-line with 'juju config openstack credentials="$(base64 /path/to/file)"'.

      It is strongly recommended that you use 'juju trust' instead, if available.
    source: default
    type: string
    value: ""
  endpoint-tls-ca:
    default: ""
    description: |
      A CA certificate that can be used to verify the target cloud API endpoints.
      Use 'include-base64://' in a bundle to include a certificate. Otherwise,
      pass a base64-encoded certificate (base64 of "-----BEGIN" to "-----END")
      as a config option in a Juju CLI invocation.
    source: default
    type: string
    value: ""
  floating-network-id:
    default: ""
    description: |
      If set, it will be passed to integrated workloads to indicate that
      floating IPs should be created in the given network for load balancers
      that those workloads manage. For example, this will determine whether and
      where FIPs will be created by Kubernetes for LoadBalancer type services
      in the cluster.
    source: default
    type: string
    value: ""
  ignore-volume-az:
    description: |
      Used to influence availability zone use when attaching Cinder volumes.
      When Nova and Cinder have different availability zones, this should be
      set to true. This is most commonly the case where there are many Nova
      availability zones but only one Cinder availability zone. If not set,
      will use the upstream default.
    source: unset
    type: boolean
  lb-floating-network:
    default: ""
    description: |
      If set, this charm will assign a floating IP in this network (name or ID)
      for load balancers created for other charms related on the loadbalancer
      endpoint.
    source: default
    type: string
    value: ""
  lb-method:
    default: ROUND_ROBIN
    description: |
      Algorithm that will be used by load balancers, which must be one of:
      ROUND_ROBIN, LEAST_CONNECTIONS, SOURCE_IP. This applies both to load
      balancers managed by this charm for applications related via the
      loadbalancer endpoint, as well as to load balancers managed by integrated
      workloads, such as Kubernetes.
    source: default
    type: string
    value: ROUND_ROBIN
  lb-port:
    default: 443
    description: |
      Port to use for load balancers created by this charm for other charms
      related on the loadbalancer endpoint.
    source: default
    type: int
    value: 443
  lb-subnet:
    default: ""
    description: |
      Override the subnet (name or ID) in which this charm will create load
      balancers for other charms related on the loadbalancer endpoint. If not
      set, the subnet over which the requesting application is related will be
      used.
    source: default
    type: string
    value: ""
  manage-security-groups:
    default: false
    description: |
      Whether or not each load balancer should have its own security group, or
      if all load balancers should use the default security group for the
      project. This applies both to load balancers managed by this charm for
      applications related via the loadbalancer endpoint, as well as to load
      balancers managed by integrated workloads, such as Kubernetes.
    source: user
    type: boolean
    value: true
  password:
    default: ""
    description: Password of a valid user set in keystone.
    source: default
    type: string
    value: ""
  project-domain-name:
    default: ""
    description: Name of the project domain where you want to create your resources.
    source: default
    type: string
    value: ""
  project-name:
    default: ""
    description: Name of project where you want to create your resources.
    source: default
    type: string
    value: ""
  region:
    default: ""
    description: Name of the region where you want to create your resources.
    source: default
    type: string
    value: ""
  snap_proxy:
    default: ""
    description: |
      DEPRECATED. Use snap-http-proxy and snap-https-proxy model configuration settings. HTTP/HTTPS web proxy for Snappy to use when accessing the snap store.
    source: default
    type: string
    value: ""
  snap_proxy_url:
    default: ""
    description: |
      DEPRECATED. Use snap-store-proxy model configuration setting. The address of a Snap Store Proxy to use for snaps e.g. http://snap-proxy.example.com
    source: default
    type: string
    value: ""
  snapd_refresh:
    default: ""
    description: |
      How often snapd handles updates for installed snaps. The default (an empty string) is 4x per day. Set to "max" to check once per month based on the charm deployment date. You may also set a custom string as described in the 'refresh.timer' section here:
        https://forum.snapcraft.io/t/system-options/87
    source: default
    type: string
    value: ""
  subnet-id:
    default: ""
    description: |
      If set, it will be passed to integrated workloads to indicate in what
      subnet load balancers should be created. For example, this will determine
      what subnet Kubernetes uses for LoadBalancer type services in the
      cluster.
    source: user
    type: string
    value: 78114d37-a175-46f9-a451-78a5e101db0f
  trust-device-path:
    description: |
      In most scenarios the block device names provided by Cinder (e.g.
      /dev/vda) can not be trusted. This boolean toggles this behavior. Setting
      it to true results in trusting the block device names provided by Cinder.
      The value of false results in the discovery of the device path
      based on its serial number and /dev/disk/by-id mapping and is the
      recommended approach. If not set, will use the upstream default.
    source: unset
    type: boolean
  user-domain-name:
    default: ""
    description: Name of the user domain where you want to create your resources.
    source: default
    type: string
    value: ""
  username:
    default: ""
    description: Username of a valid user set in keystone.
    source: default
    type: string
    value: ""

---

Tags: cpe-onsite
Jeff Hillman (jhillman)
description: updated
George Kraft (cynerva)
Changed in charm-openstack-integrator:
status: New → Triaged
importance: Undecided → Critical
Changed in charm-kubernetes-master:
status: New → Triaged
importance: Undecided → Critical
George Kraft (cynerva) wrote :

I wonder if cloud-provider-openstack supports this use case.

Jeff, can you try cloud-provider-openstack service annotations[1] on the cdk-cats service and see if any of them help you? In particular, it looks like this config will make the cloud provider skip the Floating IP step:

service.beta.kubernetes.io/openstack-internal-load-balancer: 'true'

[1]: https://github.com/kubernetes/cloud-provider-openstack/blob/v1.17.0/docs/expose-applications-using-loadbalancer-type-service.md#service-annotations
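For anyone wanting to try the annotation, here is a minimal sketch that builds a Service manifest carrying it. The helper name `internal_lb_service` is made up for illustration; the service name, selector and port simply mirror the cdk-cats service from the bug description, and the rest of the manifest is an assumption about how that service was defined.

```python
import json

# Annotation named in the comment above; per that comment it should make
# cloud-provider-openstack skip the floating IP step for this Service.
INTERNAL_LB_ANNOTATION = "service.beta.kubernetes.io/openstack-internal-load-balancer"


def internal_lb_service(name, app_label, port):
    """Build a LoadBalancer Service manifest carrying the internal-LB annotation.

    Hypothetical helper for illustration only; it is not part of any charm.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            # Annotation values must be strings, hence "true" rather than True.
            "annotations": {INTERNAL_LB_ANNOTATION: "true"},
        },
        "spec": {
            "type": "LoadBalancer",
            "selector": {"app": app_label},
            "ports": [{"port": port, "targetPort": port, "protocol": "TCP"}],
        },
    }


if __name__ == "__main__":
    # Name, selector and port mirror the cdk-cats service described earlier.
    print(json.dumps(internal_lb_service("cdk-cats", "cdk-cats", 80), indent=2))
```

Since kubectl accepts JSON manifests, the printed output could be piped to `kubectl apply -f -`.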

Jeff Hillman (jhillman) wrote :

I did, and that appears to work.

---

$ openstack loadbalancer amphora list; openstack loadbalancer list
+--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+--------------+
| id | loadbalancer_id | status | role | lb_network_ip | ha_ip |
+--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+--------------+
| 958bdcb4-a977-4214-bfe9-db9dbaca4c75 | 62e8f78b-6480-4b8c-944b-72554947905c | ALLOCATED | BACKUP | fc00:db05:2f69:60f7:f816:3eff:fe8d:bf9b | 172.16.7.191 |
| 2beb7dc7-5447-470e-abb6-e4616e2a2e44 | 76a20534-9399-4d58-a457-ab0c78a40e74 | ALLOCATED | MASTER | fc00:db05:2f69:60f7:f816:3eff:fe35:71cd | 172.16.7.205 |
| 5d97f6eb-a19c-4e4b-aea0-27fc8e9b8ff1 | 76a20534-9399-4d58-a457-ab0c78a40e74 | ALLOCATED | BACKUP | fc00:db05:2f69:60f7:f816:3eff:fe81:5347 | 172.16.7.205 |
| 5ebd2fe6-e1ee-42cb-8826-77d00b4f72f5 | 62e8f78b-6480-4b8c-944b-72554947905c | ALLOCATED | MASTER | fc00:db05:2f69:60f7:f816:3eff:fe6a:7338 | 172.16.7.191 |
| 3ac0f9c9-f971-492b-ac26-45b539461d37 | None | READY | None | fc00:db05:2f69:60f7:f816:3eff:fe82:2c18 | None |
| 973e075f-8e60-4459-9fa2-0796cabea19d | None | READY | None | fc00:db05:2f69:60f7:f816:3eff:fe5c:63ba | None |
| 9e746b61-80d5-4f8d-9d67-72691937c7ab | None | READY | None | fc00:db05:2f69:60f7:f816:3eff:fe82:201b | None |
+--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+--------------+
$ openstack loadbalancer list
+--------------------------------------+---------------------------------------------------------------------------+----------------------------------+--------------+---------------------+----------+
| id | name | project_id | vip_address | provisioning_status | provider |
+--------------------------------------+---------------------------------------------------------------------------+----------------------------------+--------------+---------------------+----------+
| 62e8f78b-6480-4b8c-944b-72554947905c | openstack-integrator-aa2c28472727-kubernetes-master | 17d25fe98bf5406982f645043c727fe4 | 172.16.7.191 | ACTIVE | amphora |
| 76a20534-9399-4d58-a457-ab0c78a40e74 | kube_service_kubernetes-wcovhst42yf6tcdx6qe72supw3apczr1_default_cdk-cats | 17d25fe98bf5406982f645043c727fe4 | 172.16.7.205 | ACTIVE | amphora |
+--------------------------------------+---------------------------------------------------------------------------+----------------------------------+--------------+---------------------+----------+

---

As shown above, the load balancers come up fine, and with this kubectl output we see that the service gets the correct IP....


George Kraft (cynerva) wrote :

> Is this a mandatory setting? or can the charm be configured to detect that no floating-ip-subnet was given and skip the need for this?

I think setting the annotation on services is a workaround, but not a good long-term solution. The charm should probably set internal-lb=true in cloud-provider-openstack config[1] for your use case so that the annotation is not needed.

[1]: https://github.com/kubernetes/cloud-provider-openstack/blob/v1.17.0/docs/provider-configuration.md#load-balancer-optional-parameters

George Kraft (cynerva)
Changed in charm-kubernetes-master:
importance: Critical → High
Changed in charm-openstack-integrator:
importance: Critical → High
Nobuto Murata (nobuto) wrote :

Hello, we were hit by this issue in the field and had to spend some time finding our way to this report.

> I think setting the annotation on services is a workaround, but not a good long-term solution. The charm should probably set internal-lb=true in cloud-provider-openstack config[1] for your use case so that the annotation is not needed.

This still sounds like a good addition to the charm config for openstack-integrator.

Samuel Allan (samuelallan) wrote :

Here is a dump of my notes from reading code to discover where this value can be updated. May or may not be accurate, but this is my understanding of the system right now.

This is where the openstack cloud.conf lives:

```
juju ssh kubernetes-master/0
cat /var/snap/cdk-addons/current/config/openstack-cloud-conf | base64 -d
```

The live value applied to k8s can be retrieved with:

```
kubectl -n kube-system describe secret cloud-config
# to actually get the contents of cloud.conf:
kubectl -n kube-system get secret cloud-config -o jsonpath='{.data}' | jq -r '."cloud.conf"' | base64 -d
```
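The same decoding step can be sketched in Python, which may be handy where `jq` is not installed. This assumes the secret is fetched as JSON (for example with `-o json`) and that the payload sits under the `cloud.conf` key as described above; the function name `decode_cloud_conf` and the sample data are made up for illustration.

```python
import base64
import json


def decode_cloud_conf(secret_json, key="cloud.conf"):
    """Return the decoded text stored under `key` in a Secret's JSON document."""
    secret = json.loads(secret_json)
    # Secret values are base64-encoded in the `data` map.
    return base64.b64decode(secret["data"][key]).decode("utf-8")


if __name__ == "__main__":
    # Made-up sample standing in for the output of:
    #   kubectl -n kube-system get secret cloud-config -o json
    sample = json.dumps({
        "kind": "Secret",
        "data": {
            "cloud.conf": base64.b64encode(
                b"[LoadBalancer]\nuse-octavia = true\n"
            ).decode("ascii")
        },
    })
    print(decode_cloud_conf(sample))
```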

Data flow:
layer-k8s-common:generate_openstack_cloud_config() ->
charm-kubernetes-control-plane:configure_cdk_addons() ->
snap set cdk-addons <args>

Then cdk-addons uses this value when rendering templates for k8s.
The value passed to the template context is under the key `cloud_conf`.
I think the templates come from cloud-provider-openstack.
None of these templates reference `cloud_conf` though.

The template that uses `cloud_conf` is in the cdk-addons repo:
https://github.com/charmed-kubernetes/cdk-addons/blob/63a7eaff01da0718c3c19e012cc089948263b764/bundled-templates/cloud-config-secret-openstack.yaml#L12
(cloud-config-secret-openstack.yaml)
This creates the secret that contains the `cloud.conf` key,
which is consumed by cloud-provider-openstack.

To update `internal-lb` conditionally,
it would seem that the place is layer-k8s-common:generate_openstack_cloud_config()
https://github.com/charmed-kubernetes/layer-kubernetes-common/blob/4b91682011a33aee3c20acbb5870f59ad8f8b0c2/lib/charms/layer/kubernetes_common.py#L550
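A rough sketch of what a conditional `internal-lb` line could look like. This is not the actual body of generate_openstack_cloud_config(): the function name `render_loadbalancer_section` and the boolean `internal_lb` flag are hypothetical, and the key names are taken from the cloud.conf output quoted elsewhere in this report.

```python
def render_loadbalancer_section(lb_method="ROUND_ROBIN", internal_lb=False):
    """Render a [LoadBalancer] section, adding internal-lb only when requested.

    Hypothetical illustration; the real charm code may build this differently.
    """
    lines = [
        "[LoadBalancer]",
        "use-octavia = true",
        f"lb-method = {lb_method}",
    ]
    if internal_lb:
        # Only emitted when the flag is set, so existing deployments
        # keep the same rendered section as before.
        lines.append("internal-lb = true")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_loadbalancer_section())
    print(render_loadbalancer_section(internal_lb=True))
```

Leaving the line out entirely when the flag is false keeps the rendered file unchanged for clusters that do use floating IPs.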

Samuel Allan (samuelallan) wrote :

However, I'm not sure exactly what the condition should be to set the internal-lb to false. What condition should trigger this?

Samuel Allan (samuelallan) wrote :

> However, I'm not sure exactly what the condition should be to set the internal-lb to false. What condition should trigger this?

Follow up: Nobuto discussed with me, and the new plan is to expose `internal-lb` as a config option in the openstack-integrator charm.

This value will be passed through to the k8s master charm, so fixing this will require patching charm-openstack-integrator, charm-kubernetes-control-plane, and interface-openstack-integration.

Samuel Allan (samuelallan) wrote :
Changed in charm-openstack-integrator:
assignee: nobody → Samuel Walladge (swalladge)
Changed in charm-kubernetes-master:
assignee: nobody → Samuel Walladge (swalladge)
status: Triaged → In Progress
Changed in charm-openstack-integrator:
status: Triaged → In Progress
Samuel Allan (samuelallan) wrote :

There will probably be some discussion about backwards compatibility, but I've done some initial testing using my WIP patches and have confirmed that this successfully passes through the internal-lb value. Example:

```
$ juju run --unit kubernetes-control-plane/0 'cat /var/snap/cdk-addons/current/config/openstack-cloud-conf | base64 -d'
[Global]
auth-url = https://192.168.151.136:5000/v3
region = RegionOne
username = admin
password = Ciiphe6queC1ziej
tenant-name = admin
domain-name = admin_domain
tenant-domain-name = admin_domain
ca-file = /etc/config/endpoint-ca.cert

[LoadBalancer]
use-octavia = true
lb-method = ROUND_ROBIN
$ juju config openstack-integrator internal-lb=true
$ juju run --unit kubernetes-control-plane/0 'cat /var/snap/cdk-addons/current/config/openstack-cloud-conf | base64 -d'
[Global]
auth-url = https://192.168.151.136:5000/v3
region = RegionOne
username = admin
password = Ciiphe6queC1ziej
tenant-name = admin
domain-name = admin_domain
tenant-domain-name = admin_domain
ca-file = /etc/config/endpoint-ca.cert

[LoadBalancer]
use-octavia = true
lb-method = ROUND_ROBIN
internal-lb = true
```
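The before/after check above can also be scripted. The following is a minimal sketch, assuming the decoded cloud.conf parses as INI (as the output above suggests); the helper name `internal_lb_enabled` and the trimmed sample text are made up for illustration.

```python
import configparser


def internal_lb_enabled(cloud_conf_text):
    """Return True if [LoadBalancer] internal-lb is set to a true value."""
    # interpolation=None so values containing '%' (e.g. passwords) are read as-is;
    # '=' is the only delimiter so URLs containing ':' stay in the value.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(cloud_conf_text)
    # fallback covers both a missing option and a missing section.
    return parser.getboolean("LoadBalancer", "internal-lb", fallback=False)


if __name__ == "__main__":
    # Trimmed sample modelled on the output above (credentials omitted).
    sample = (
        "[Global]\n"
        "region = RegionOne\n"
        "\n"
        "[LoadBalancer]\n"
        "use-octavia = true\n"
        "lb-method = ROUND_ROBIN\n"
        "internal-lb = true\n"
    )
    print(internal_lb_enabled(sample))
```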

Nobuto Murata (nobuto) wrote :

> There will probably be some discussion about backwards compatibility

This is just my guess, but adding more properties to an interface should be fine? Existing properties are untouched and nothing is removed. My 2 cents.

Samuel Allan (samuelallan) wrote :

> but adding more properties in an interface should be fine?

Yes, that part seems to be fine, but I mean modifying one of the function signatures in the interface library - set_lbaas_config[1]. What is the usual process here when we need to make a change to one of these functions?

[1]: https://github.com/juju-solutions/interface-openstack-integration/pull/12#discussion_r844713418

Nobuto Murata (nobuto) wrote :
Samuel Allan (samuelallan) wrote :

Related requests have been merged! :)

Changed in charm-openstack-integrator:
status: In Progress → Fix Committed
Changed in charm-kubernetes-master:
status: In Progress → Fix Committed
George Kraft (cynerva)
Changed in charm-kubernetes-master:
milestone: none → 1.24
Changed in charm-openstack-integrator:
milestone: none → 1.24
Changed in charm-kubernetes-master:
status: Fix Committed → Fix Released
Changed in charm-openstack-integrator:
status: Fix Committed → Fix Released