On a customer cloud, we're seeing a problem where the openstack-cloud-controller-manager pods appear to be trying to update load balancers to target the current set of running K8s workers, but are failing to do so.
Here is a concrete example, with customer-identifying information removed:
$ kubectl get events -n customer-namespace | grep UpdateLoadBalancerFailed | tail -n1
3m8s Warning UpdateLoadBalancerFailed service/customer-service Error updating load balancer with new hosts map[juju-123456-customer-k8s-1:{} juju-123456-customer-k8s-2:{} juju-123456-customer-k8s-3:{} juju-123456-customer-k8s-4:{} juju-123456-customer-k8s-5:{} juju-123456-customer-k8s-6:{} juju-123456-customer-k8s-7:{} juju-123456-customer-k8s-8:{} juju-123456-customer-k8s-9:{} juju-123456-customer-k8s-10:{} juju-123456-customer-k8s-11:{} juju-123456-customer-k8s-12:{}]: failed to find object
A "kubectl describe service <service>" shows the service is of type LoadBalancer, with something like this under Events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning UpdateLoadBalancerFailed 3m33s (x4686 over 5d19h) service-controller Error updating load balancer with new hosts map[juju-123456-customer-k8s-1:{} juju-123456-customer-k8s-2:{} juju-123456-customer-k8s-3:{} juju-123456-customer-k8s-4:{} juju-123456-customer-k8s-5:{} juju-123456-customer-k8s-6:{} juju-123456-customer-k8s-7:{} juju-123456-customer-k8s-8:{} juju-123456-customer-k8s-9:{} juju-123456-customer-k8s-10:{} juju-123456-customer-k8s-11:{} juju-123456-customer-k8s-12:{}]: failed to find object
It appears that one of the openstack-cloud-controller-manager pods is generating this message.
"kubectl logs" output from the pod in question looks something like this:
E0929 19:41:26.551170 1 service_controller.go:667] failed to update load balancer hosts for service customer-namespace/interruption-service: failed to find object
I0929 19:41:26.551337 1 event.go:258] Event(v1.ObjectReference{Kind:"Service", Namespace:"customer-namespace", Name:"interruption-service", UID:"deadbeef-dead-beef-dead-beefdeadbeef", APIVersion:"v1", ResourceVersion:"87136827", FieldPath:""}): type: 'Warning' reason: 'UpdateLoadBalancerFailed' Error updating load balancer with new hosts map[juju-123456-customer-k8s-1:{} juju-123456-customer-k8s-2:{} juju-123456-customer-k8s-3:{} juju-123456-customer-k8s-4:{} juju-123456-customer-k8s-5:{} juju-123456-customer-k8s-6:{} juju-123456-customer-k8s-7:{} juju-123456-customer-k8s-8:{} juju-123456-customer-k8s-9:{} juju-123456-customer-k8s-10:{} juju-123456-customer-k8s-11:{} juju-123456-customer-k8s-12:{}]: failed to find object
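To pinpoint which controller pod is emitting the error, something like the following should work (the kube-system namespace and the pod-name match are assumptions and may differ on this deployment):

```shell
# Count occurrences of the error per openstack-cloud-controller-manager pod.
# Namespace (kube-system) and pod-name pattern are assumptions for this deployment.
for pod in $(kubectl -n kube-system get pods -o name | grep openstack-cloud-controller-manager); do
  echo "$pod: $(kubectl -n kube-system logs "$pod" --tail=10000 | grep -c 'failed to find object')"
done
```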
"kubectl version" reports the following:
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.8", GitCommit:"9f2892aab98fe339f3bd70e3c470144299398ace", GitTreeState:"clean", BuildDate:"2020-08-26T20:32:49Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.14", GitCommit:"d2a081c8e14e21e28fe5bdfa38a817ef9c0bb8e3", GitTreeState:"clean", BuildDate:"2020-08-26T20:35:13Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
The customer is running OpenStack Rocky with Octavia.
The K8s model is running cs:~containers/openstack-integrator-81.
> failed to find object
Looks like this error is defined here: https://github.com/kubernetes/cloud-provider-openstack/blob/v1.15.0/pkg/cloudprovider/providers/openstack/openstack.go#L67
I suspect it is being raised somewhere in here: https://github.com/kubernetes/cloud-provider-openstack/blob/v1.15.0/pkg/cloudprovider/providers/openstack/openstack_loadbalancer.go
Looking through the code for where that error can be raised, I have the following questions:
1. Can you confirm that the load balancer exists in OpenStack? It should have a name like kube_service_kubernetes-********************************_<namespace>_<service> and status=ACTIVE.
2. Does the OpenStack LB have a provisioning_status of ACTIVE?
3. Does the OpenStack LB have a floating IP?
4. Does the OpenStack LB have attached listeners? Do the listeners each have a single pool?
5. Does the OpenStack LB have members attached?
6. Are all master/worker instances located in the same OpenStack region?
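The questions above could be checked from the OpenStack side with something like the following sketch. The LB and pool IDs are placeholders, and some filter flags (e.g. --loadbalancer) may not be available on older python-octaviaclient versions:

```shell
# Sketch: verify the Octavia LB backing the service (IDs are placeholders).
# 1/2. Find the LB and check its status fields.
openstack loadbalancer list | grep kube_service
openstack loadbalancer show <lb-id>   # inspect provisioning_status / operating_status

# 3. Check for an associated floating IP on the LB's VIP.
openstack floating ip list | grep <lb-vip-address>

# 4. Listeners and their pools.
openstack loadbalancer listener list --loadbalancer <lb-id>
openstack loadbalancer pool list --loadbalancer <lb-id>

# 5. Members attached to each pool.
openstack loadbalancer member list <pool-id>
```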