Metallb assigns an IP but it is unreachable

Bug #1919479 reported by Camille Rodriguez
This bug affects 1 person
Affects: MetalLB Operator
Status: Fix Released
Importance: High
Assigned to: Camille Rodriguez
Milestone: 1.21

Bug Description

I deployed the metallb operator and assigned a range that is available in my network. I then deployed microbot to test the service. It gets an external IP, but the service isn't reachable on it.

$ kubectl apply -n microbot -f https://raw.githubusercontent.com/charmed-kubernetes/metallb-operator/master/docs/example-microbot-lb.yaml
deployment.apps/microbot-lb created
service/microbot-lb created

$ kubectl get all -n microbot
NAME                               READY   STATUS    RESTARTS   AGE
pod/microbot-lb-57db6678d5-cbrkj   1/1     Running   0          5s
pod/microbot-lb-57db6678d5-mxnt6   1/1     Running   0          5s
pod/microbot-lb-57db6678d5-xlcgj   1/1     Running   0          5s

NAME                  TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)        AGE
service/microbot-lb   LoadBalancer   10.152.183.143   172.24.70.220   80:30217/TCP   5s

NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/microbot-lb   3/3     3            3           5s

NAME                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/microbot-lb-57db6678d5   3         3         3       5s

$ curl 172.24.70.220
curl: (7) Failed to connect to 172.24.70.220 port 80: No route to host

The logs in one of the metallb-speaker pods show an error:

$ kubectl get pods -n metallb-system
NAME                                  READY   STATUS    RESTARTS   AGE
metallb-controller-7fb4d6984b-6bqvn   1/1     Running   0          13m
metallb-controller-operator-0         1/1     Running   0          15m
metallb-speaker-72zlv                 1/1     Running   0          14m
metallb-speaker-operator-0            1/1     Running   0          15m
metallb-speaker-rfvq9                 1/1     Running   0          14m
metallb-speaker-wj54v                 1/1     Running   0          14m
modeloperator-5f7c78db58-kzhz6        1/1     Running   0          19m

{"caller":"main.go:267","event":"startUpdate","msg":"start of service update","service":"microbot/microbot-lb","ts":"2021-03-17T15:58:38.497253916Z"}
{"caller":"main.go:343","event":"serviceAnnounced","ip":"172.24.70.220","msg":"service has IP, announcing","pool":"default","protocol":"layer2","service":"microbot/microbot-lb","ts":"2021-03-17T15:58:38.497451117Z"}
E0317 15:58:38.497547 1 event.go:296] Could not construct reference to: '&v1.Service{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"microbot-lb", GenerateName:"", Namespace:"microbot", SelfLink:"", UID:"0b81bb45-5e45-4e7f-88c6-9d9850c01ee1", ResourceVersion:"334080", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63751593516, loc:(*time.Location)(0x1f50960)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{},\"name\":\"microbot-lb\",\"namespace\":\"microbot\"},\"spec\":{\"ports\":[{\"name\":\"microbot-lb\",\"port\":80,\"protocol\":\"TCP\",\"targetPort\":80}],\"selector\":{\"app\":\"microbot-lb\"},\"type\":\"LoadBalancer\"}}\n"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"controller", Operation:"Update", APIVersion:"v1", Time:(*v1.Time)(0xc0004efae0), Fields:(*v1.Fields)(nil)}, v1.ManagedFieldsEntry{Manager:"kubectl-client-side-apply", Operation:"Update", APIVersion:"v1", Time:(*v1.Time)(0xc0004efb00), Fields:(*v1.Fields)(nil)}}}, Spec:v1.ServiceSpec{Ports:[]v1.ServicePort{v1.ServicePort{Name:"microbot-lb", Protocol:"TCP", Port:80, TargetPort:intstr.IntOrString{Type:0, IntVal:80, StrVal:""}, NodePort:30217}}, Selector:map[string]string{"app":"microbot-lb"}, ClusterIP:"10.152.183.143", Type:"LoadBalancer", ExternalIPs:[]string(nil), SessionAffinity:"None", LoadBalancerIP:"", LoadBalancerSourceRanges:[]string(nil), ExternalName:"", ExternalTrafficPolicy:"Cluster", HealthCheckNodePort:0, PublishNotReadyAddresses:false, SessionAffinityConfig:(*v1.SessionAffinityConfig)(nil)}, 
Status:v1.ServiceStatus{LoadBalancer:v1.LoadBalancerStatus{Ingress:[]v1.LoadBalancerIngress{v1.LoadBalancerIngress{IP:"172.24.70.220", Hostname:""}}}}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'nodeAssigned' 'announcing from node "juju-f20297-9-kvm-0"'
{"caller":"main.go:346","event":"endUpdate","msg":"end of service update","service":"microbot/microbot-lb","ts":"2021-03-17T15:58:38.497767368Z"}
{"caller":"main.go:267","event":"startUpdate","msg":"start of service update","service":"microbot/microbot-lb","ts":"2021-03-17T15:58:38.508612389Z"}
{"caller":"main.go:343","event":"serviceAnnounced","ip":"172.24.70.220","msg":"service has IP, announcing","pool":"default","protocol":"layer2","service":"microbot/microbot-lb","ts":"2021-03-17T15:58:38.508728795Z"}
E0317 15:58:38.508814 1 event.go:296] Could not construct reference to: '&v1.Service{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"microbot-lb", GenerateName:"", Namespace:"microbot", SelfLink:"", UID:"0b81bb45-5e45-4e7f-88c6-9d9850c01ee1", ResourceVersion:"334080", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63751593516, loc:(*time.Location)(0x1f50960)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{},\"name\":\"microbot-lb\",\"namespace\":\"microbot\"},\"spec\":{\"ports\":[{\"name\":\"microbot-lb\",\"port\":80,\"protocol\":\"TCP\",\"targetPort\":80}],\"selector\":{\"app\":\"microbot-lb\"},\"type\":\"LoadBalancer\"}}\n"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"controller", Operation:"Update", APIVersion:"v1", Time:(*v1.Time)(0xc0004efae0), Fields:(*v1.Fields)(nil)}, v1.ManagedFieldsEntry{Manager:"kubectl-client-side-apply", Operation:"Update", APIVersion:"v1", Time:(*v1.Time)(0xc0004efb00), Fields:(*v1.Fields)(nil)}}}, Spec:v1.ServiceSpec{Ports:[]v1.ServicePort{v1.ServicePort{Name:"microbot-lb", Protocol:"TCP", Port:80, TargetPort:intstr.IntOrString{Type:0, IntVal:80, StrVal:""}, NodePort:30217}}, Selector:map[string]string{"app":"microbot-lb"}, ClusterIP:"10.152.183.143", Type:"LoadBalancer", ExternalIPs:[]string(nil), SessionAffinity:"None", LoadBalancerIP:"", LoadBalancerSourceRanges:[]string(nil), ExternalName:"", ExternalTrafficPolicy:"Cluster", HealthCheckNodePort:0, PublishNotReadyAddresses:false, SessionAffinityConfig:(*v1.SessionAffinityConfig)(nil)}, 
Status:v1.ServiceStatus{LoadBalancer:v1.LoadBalancerStatus{Ingress:[]v1.LoadBalancerIngress{v1.LoadBalancerIngress{IP:"172.24.70.220", Hostname:""}}}}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'nodeAssigned' 'announcing from node "juju-f20297-9-kvm-0"'
{"caller":"main.go:346","event":"endUpdate","msg":"end of service update","service":"microbot/microbot-lb","ts":"2021-03-17T15:58:38.509050613Z"}

More info about the whole setup:

This is a Charmed Kubernetes 1.20 deployment with 3 worker nodes running in virtual machines. The charm deployed without issue; status is green/idle.

Camille Rodriguez (camille.rodriguez) wrote :

Additionally, I tested deploying the upstream metallb manifests, and I was able to reach the microbot service at its external IP.

Cory Johns (johnsca) wrote :

It looks like selfLink was dropped in 1.20 and this error probably means the metallb-system image needs to be updated to include newer K8s libraries.

https://github.com/kubernetes/kubernetes/issues/94660

Camille Rodriguez (camille.rodriguez) wrote :

I tested changing the image path in the metallb charm to point to metallb/speaker:v0.9.5, rebuilt the charms, and the same issue occurs again. It must be something else in the implementation of the spec that leads to that issue.

E0318 19:36:30.916504 1 event.go:296] Could not construct reference to: '&v1.Service{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"microbot-lb", GenerateName:"", Namespace:"microbot", SelfLink:"", UID:"e596a273-47c2-4767-8a66-757a9797b929", ResourceVersion:"513625", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63751692988, loc:(*time.Location)(0x1f589e0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{},\"name\":\"microbot-lb\",\"namespace\":\"microbot\"},\"spec\":{\"ports\":[{\"name\":\"microbot-lb\",\"port\":80,\"protocol\":\"TCP\",\"targetPort\":80}],\"selector\":{\"app\":\"microbot-lb\"},\"type\":\"LoadBalancer\"}}\n"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"controller", Operation:"Update", APIVersion:"v1", Time:(*v1.Time)(0xc00043fba0), Fields:(*v1.Fields)(nil)}, v1.ManagedFieldsEntry{Manager:"kubectl-client-side-apply", Operation:"Update", APIVersion:"v1", Time:(*v1.Time)(0xc00043fbc0), Fields:(*v1.Fields)(nil)}}}, Spec:v1.ServiceSpec{Ports:[]v1.ServicePort{v1.ServicePort{Name:"microbot-lb", Protocol:"TCP", Port:80, TargetPort:intstr.IntOrString{Type:0, IntVal:80, StrVal:""}, NodePort:31752}}, Selector:map[string]string{"app":"microbot-lb"}, ClusterIP:"10.152.183.120", Type:"LoadBalancer", ExternalIPs:[]string(nil), SessionAffinity:"None", LoadBalancerIP:"", LoadBalancerSourceRanges:[]string(nil), ExternalName:"", ExternalTrafficPolicy:"Cluster", HealthCheckNodePort:0, PublishNotReadyAddresses:false, SessionAffinityConfig:(*v1.SessionAffinityConfig)(nil)}, 
Status:v1.ServiceStatus{LoadBalancer:v1.LoadBalancerStatus{Ingress:[]v1.LoadBalancerIngress{v1.LoadBalancerIngress{IP:"172.24.70.25", Hostname:""}}}}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'nodeAssigned' 'announcing from node "juju-f20297-7-kvm-0"'
{"caller":"main.go:278","event":"endUpdate","msg":"end of service update","service":"microbot/microbot-lb","ts":"2021-03-18T19:36:30.916769016Z"}

Confirmation that the right image was in use:
$ kubectl describe pod/metallb-speaker-djcgs -n metallb-system
[..]
Containers:
  speaker:
    Container ID: containerd://c7d1916c9f0aa16a2bacfc44c5682c025a6c81297fd7d613b28b8e90059c2843
    Image: metallb/speaker:v0.9.5
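
The image check above can be reduced to a one-line filter. This is a minimal sketch, not part of the charm or of MetalLB: `pod_images` is a hypothetical helper name, and it only assumes that `kubectl describe pod` prints each container image on a line starting with "Image:".

```shell
# Hypothetical helper: print the value of every "Image:" field read from stdin.
# "Image ID:" lines are skipped because they do not match "Image:" exactly.
pod_images() {
    sed -n 's/^[[:space:]]*Image:[[:space:]]*//p'
}

# Example usage against a live cluster (pod name taken from the output above):
#   kubectl describe pod/metallb-speaker-djcgs -n metallb-system | pod_images
# An equivalent query without text parsing:
#   kubectl get pod metallb-speaker-djcgs -n metallb-system \
#     -o jsonpath='{.spec.containers[*].image}'
```

Either form shows which image tag the running speaker pod actually uses, which is what matters when comparing the charm deployment with the upstream manifest.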

kubectl describe pod/metallb...


Camille Rodriguez (camille.rodriguez) wrote :

I'm going to tag this field-high as I'm supposed to use the charms now for the deployment of metallb, but this bug prevents that. One workaround is to deploy the upstream version directly, but that isn't ideal for the UA team.

Cory Johns (johnsca) wrote :

This definitely seems to be an upstream metallb bug [1] but I'm still unclear as to why we might see different behavior when deploying metallb 0.9.5 directly vs the same container via a charm.

[1]: https://github.com/metallb/metallb/issues/794

Cory Johns (johnsca) wrote :

For reference: In chat, Camille indicated that she was deploying manually using the manifest from [1] which should also be metallb/speaker:v0.9.5 and which is 4 months old and definitely won't have that fix. There is a newer image available at quay.io/metallb/speaker:v0.9 which seems like it should have the upstream fix [2] (or quay.io/metallb/speaker:main, which is a daily build and definitely would have the fix), but we're still confused as to why the manual test with v0.9.5 seemed to work.

[1]: https://metallb.universe.tf/installation/#installation-by-manifest
[2]: https://github.com/metallb/metallb/pull/812

Camille Rodriguez (camille.rodriguez) wrote :

From discussion on Mattermost, the selfLink issue seems to be linked to https://github.com/metallb/metallb/issues/794, which was fixed in the main build. I tested with the image quay.io/metallb/speaker:main and the error no longer appears in the speaker logs (yay), but the external IP is still unreachable. So there's another problem in the charm causing this issue. I think the selfLink issue was a distraction and not the actual problem.

The actual problem is that the speakers cannot ARP on the network because "hostNetwork: true" is missing from the pod spec. This patch proposes a fix for it: https://github.com/charmed-kubernetes/metallb-operator/pull/17
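
To check whether a speaker pod runs in the node's network namespace, the pod manifest can be inspected for the hostNetwork field. The following is a rough sketch under stated assumptions: `uses_host_network` is a hypothetical helper, not something shipped with the charm or MetalLB, and it assumes the manifest is YAML as printed by `kubectl get pod -o yaml`.

```shell
# Hypothetical helper: read a pod manifest (YAML) on stdin and report whether
# the pod spec enables hostNetwork. Returns 0 if it does, 1 otherwise.
uses_host_network() {
    if grep -Eq '^[[:space:]]*hostNetwork:[[:space:]]*true[[:space:]]*$'; then
        echo "hostNetwork: true"
    else
        echo "hostNetwork not set"
        return 1
    fi
}

# Example usage against a live cluster (substitute a real speaker pod name):
#   kubectl get pod metallb-speaker-72zlv -n metallb-system -o yaml | uses_host_network
# The field can also be queried directly:
#   kubectl get pod metallb-speaker-72zlv -n metallb-system \
#     -o jsonpath='{.spec.hostNetwork}'
```

In layer2 mode the speaker announces the service IP by answering ARP requests, so a speaker pod without host networking would explain why an assigned IP gets "No route to host" from outside the cluster.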

Camille Rodriguez (camille.rodriguez) wrote :

Fix released in cs:~containers/metallb-speaker-15 and cs:~containers/metallb-controller-16.

Camille Rodriguez (camille.rodriguez) wrote :

Actually, let's call this a partial fix, since getting rid of the error message still requires the images to be updated to the latest release (which should be 0.9.6 once released; the fix for the SelfLink issue is currently only in main).

Changed in operator-metallb:
assignee: nobody → Camille Rodriguez (camille.rodriguez)
status: New → Confirmed
George Kraft (cynerva) wrote :

I've split the SelfLink error out into a separate bug: https://bugs.launchpad.net/operator-metallb/+bug/1920216

That will allow us to track the release of the hostNetwork fix and the SelfLink fix separately. We'll keep this bug focused on the connectivity problem and the hostNetwork fix.

summary: - Metallb assigns an IP but it unable to construct the reference to it
+ Metallb assigns an IP but it is unreachable
tags: added: backport-needed
Changed in operator-metallb:
milestone: none → 1.20+ck2
importance: Undecided → High
status: Confirmed → Fix Committed
Kevin W Monroe (kwmonroe) wrote :

1.20+ck2 is no longer planned; this will go out with the upcoming 1.21 release.

Changed in operator-metallb:
milestone: 1.20+ck2 → 1.21
Changed in operator-metallb:
status: Fix Committed → Fix Released