Cannot ping pods from the gateway peered over BGP with Calico nodes

Bug #1944552 reported by Nikolay Vinogradov
This bug affects 1 person
Affects: Calico Charm
Status: Incomplete
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Hi all,

I have a Calico deployment on top of OpenStack; the bundle used to deploy the cluster is attached.
The bundle has advertisement of the service and external service CIDRs enabled, as per [1].

The tenant networking topology is described in network-topo.txt (ASCII diagram).

Calico is configured without an overlay, with a BGP full mesh, and with an external peer: my Linux router running BIRD v2 on Ubuntu 20.04. In the output below, 192.168.101.1 is the router and the other peers are kubernetes-worker nodes; see also the attached juju_status.log:

$ juju ssh kubernetes-master/0 sudo calicoctl node status
Calico process is running.

IPv4 BGP status
+-----------------+-------------------+-------+----------+-------------+
|  PEER ADDRESS   |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+-----------------+-------------------+-------+----------+-------------+
| 192.168.101.62  | node-to-node mesh | up    | 03:15:08 | Established |
| 192.168.101.251 | node-to-node mesh | up    | 03:15:07 | Established |
| 192.168.101.172 | node-to-node mesh | up    | 03:15:07 | Established |
| 192.168.101.1   | node specific     | up    | 08:57:57 | Established |
+-----------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.

Connection to 192.168.101.184 closed.

The problem is that when I try to reach the pod CIDR from the router, I get inconsistent results depending on whether the packet first hits the node that is actually running the pod or a neighboring Calico node.

Calico advertises the following routes to my BGP router: birdc_show_route.txt

As you can see in the routes output, the /26 pod prefixes are advertised from all Calico nodes. This means that a Calico node receiving a packet not intended for it should forward that packet to the correct node, but I don't see this happening when I run tcpdump on the interfaces of the node that is supposed to forward.

[1] https://github.com/charmed-kubernetes/layer-calico/commit/c6f2af819cd8e0c42e05b3823e817a0dc7f5d300

Nikolay Vinogradov (nikolay.vinogradov) wrote :

Adding the diagram as an image in case of font problems in the browser.

Nikolay Vinogradov (nikolay.vinogradov) wrote :

The issue seems to occur only when node-to-node-mesh=true is set in the Calico charm and Calico is peered with an external router. Setting node-to-node-mesh to false resolves it. This issue is probably relevant: https://github.com/projectcalico/calico/issues/3246 (see the last comment there as well).

George Kraft (cynerva) wrote :

I'm unable to reproduce or otherwise confirm this issue. Can you share output of `ip route` from your kubernetes-master and kubernetes-worker units? Can you show me what you are seeing with tcpdump?
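
One possible way to gather the requested output, as a rough sketch only. The unit names in UNITS (in particular kubernetes-worker/0), the helper names run, collect_routes and capture_icmp, and the 20-packet capture limit are assumptions for illustration, not taken from the attached juju status. With DRY_RUN=1 the commands are printed instead of executed, so they can be reviewed first.

```shell
# Units to inspect. These names are assumptions; check `juju status` for the real ones.
UNITS="${UNITS:-kubernetes-master/0 kubernetes-worker/0}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Dump the kernel routing table of every unit listed in UNITS.
collect_routes() {
    for unit in $UNITS; do
        run juju ssh "$unit" ip route
    done
}

# Capture up to 20 ICMP packets on all interfaces of one unit.
capture_icmp() {
    run juju ssh "$1" sudo tcpdump -ni any -c 20 icmp
}
```

For example, `DRY_RUN=1 collect_routes` lists the commands without running them, and `capture_icmp kubernetes-worker/0` could be left running on the node that is expected to forward while pinging a pod from the router.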

Changed in charm-calico:
status: New → Incomplete
George Kraft (cynerva) wrote :

Can you give the router and the Calico nodes the same AS number to do iBGP instead of eBGP? I believe that would allow you to enable node-to-node-mesh=true without getting the 2-hop routes on your router.
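
The reasoning behind this suggestion can be sketched as a toy model. It rests on the standard BGP rule that a route learned from an iBGP peer is not re-advertised to another iBGP peer; it is a simplification and does not reproduce Calico's actual BIRD export filters. The function names and the AS numbers used in the example are hypothetical.

```shell
# Simplified model of standard BGP behaviour, not Calico's real configuration.

# session_type LOCAL_AS PEER_AS
# Prints "ibgp" when both AS numbers match, otherwise "ebgp".
session_type() {
    if [ "$1" -eq "$2" ]; then
        echo ibgp
    else
        echo ebgp
    fi
}

# readvertises_mesh_routes NODE_AS ROUTER_AS
# Succeeds (exit 0) when a node would pass routes learned from its iBGP mesh
# peers on to the router. That only happens when the router session is eBGP,
# because iBGP-learned routes are not re-advertised to another iBGP peer.
readvertises_mesh_routes() {
    [ "$(session_type "$1" "$2")" = "ebgp" ]
}
```

With hypothetical private AS numbers, `readvertises_mesh_routes 64512 64513` succeeds, which corresponds to the router hearing each /26 from every node. `readvertises_mesh_routes 64512 64512` fails, which corresponds to the router hearing each /26 only from the node that owns it.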
