2020-06-29 14:59:55 |
Ghada Khalil |
bug |
|
|
added bug |
2020-06-29 15:01:41 |
Ghada Khalil |
starlingx: assignee |
|
Joseph Richard (josephrichard) |
|
2020-06-29 15:07:58 |
Ghada Khalil |
bug |
|
|
added subscriber Matt Peters |
2020-06-29 15:09:16 |
Ghada Khalil |
summary |
calico binds to the floating IP causing failures on swact |
calico binds to the floating IP after pod restart, causing failures on swact |
|
2020-06-29 15:17:15 |
Ghada Khalil |
description |
Brief Description
-----------------
It was observed that occasionally after a swact, the calico BGP peering is failing.
This is the result of Calico choosing the floating IP on the cluster-host. The unit IP should be used instead. This happens if the calico-node pod restarts on the same host that currently has the floating IP.
If a system is in this condition, a swact results in the floating IP moving, so Calico loses communication with the BGP peers.
Severity
--------
Major - calico issues after swact
Steps to Reproduce
------------------
- Bring up system
- Check address calico is using >> should be the unit IP
- Restart the calico pod on the host w/ the floating IP
- Check the address calico is using >> will not be the floating IP
- Perform a swact
- Verify that calico loses peering with the BGP peers
Expected Behavior
------------------
calico should always use the unit IP address
Actual Behavior
----------------
calico uses the floating IP address if the calico pod is restarted
Reproducibility
---------------
Reproducible given the steps above
System Configuration
--------------------
Any 2-node system
Branch/Pull Time/Commit
-----------------------
Seen on stx master 2020-06-27, but is a day 1 issue
Last Pass
---------
Unknown
Timestamp/Logs
--------------
N/A
Test Activity
-------------
Regression Testing
Workaround
---------- |
Brief Description
-----------------
It was observed that occasionally after a swact, the calico BGP peering is failing.
This is the result of Calico choosing the floating IP on the cluster-host. The unit IP should be used instead. This happens if the calico-node pod restarts on the same host that currently has the floating IP.
If a system is in this condition, a swact results in the floating IP moving, so Calico loses communication with the BGP peers.
Severity
--------
Major - calico issues after swact
Steps to Reproduce
------------------
- Bring up system
- Check address calico is using >> should be the unit IP
- Restart the calico pod on the host w/ the floating IP
- Check the address calico is using >> will now be the floating IP
- Perform a swact
- Verify that calico loses peering with the BGP peers
Expected Behavior
------------------
calico should always use the unit IP address
Actual Behavior
----------------
calico uses the floating IP address if the calico pod is restarted
Reproducibility
---------------
Reproducible given the steps above
System Configuration
--------------------
Any 2-node system
Branch/Pull Time/Commit
-----------------------
Seen on stx master 2020-06-27, but is a day 1 issue
Last Pass
---------
Unknown
Timestamp/Logs
--------------
N/A
Test Activity
-------------
Regression Testing
Workaround
---------- |
|
2020-06-29 16:53:40 |
Ghada Khalil |
tags |
|
stx.containers stx.networking |
|
2020-06-29 20:05:15 |
Ghada Khalil |
summary |
calico binds to the floating IP after pod restart, causing failures on swact |
IPv6: calico binds to the floating IP after pod restart, causing failures on swact |
|
2020-06-29 20:05:33 |
Ghada Khalil |
description |
Brief Description
-----------------
It was observed that occasionally after a swact, the calico BGP peering is failing.
This is the result of Calico choosing the floating IP on the cluster-host. The unit IP should be used instead. This happens if the calico-node pod restarts on the same host that currently has the floating IP.
If a system is in this condition, a swact results in the floating IP moving, so Calico loses communication with the BGP peers.
Severity
--------
Major - calico issues after swact
Steps to Reproduce
------------------
- Bring up system
- Check address calico is using >> should be the unit IP
- Restart the calico pod on the host w/ the floating IP
- Check the address calico is using >> will now be the floating IP
- Perform a swact
- Verify that calico loses peering with the BGP peers
Expected Behavior
------------------
calico should always use the unit IP address
Actual Behavior
----------------
calico uses the floating IP address if the calico pod is restarted
Reproducibility
---------------
Reproducible given the steps above
System Configuration
--------------------
Any 2-node system
Branch/Pull Time/Commit
-----------------------
Seen on stx master 2020-06-27, but is a day 1 issue
Last Pass
---------
Unknown
Timestamp/Logs
--------------
N/A
Test Activity
-------------
Regression Testing
Workaround
---------- |
Brief Description
-----------------
It was observed that occasionally after a swact, the calico BGP peering is failing.
This is the result of Calico choosing the floating IP on the cluster-host. The unit IP should be used instead. This happens if the calico-node pod restarts on the same host that currently has the floating IP.
If a system is in this condition, a swact results in the floating IP moving, so Calico loses communication with the BGP peers.
Severity
--------
Major - calico issues after swact
Steps to Reproduce
------------------
- Bring up system
- Check address calico is using >> should be the unit IP
- Restart the calico pod on the host w/ the floating IP
- Check the address calico is using >> will now be the floating IP
- Perform a swact
- Verify that calico loses peering with the BGP peers
Expected Behavior
------------------
calico should always use the unit IP address
Actual Behavior
----------------
calico uses the floating IP address if the calico pod is restarted
Reproducibility
---------------
Reproducible given the steps above
System Configuration
--------------------
Any 2-node system w/ IPv6 configured
Branch/Pull Time/Commit
-----------------------
Seen on stx master 2020-06-27, but is a day 1 issue
Last Pass
---------
Unknown
Timestamp/Logs
--------------
N/A
Test Activity
-------------
Regression Testing
Workaround
---------- |
|
2020-06-29 20:58:34 |
Ghada Khalil |
bug |
|
|
added subscriber Daniel Badea |
2020-06-29 20:58:44 |
Ghada Khalil |
starlingx: importance |
Undecided |
High |
|
2020-06-29 20:58:49 |
Ghada Khalil |
starlingx: status |
New |
Triaged |
|
2020-06-29 20:59:04 |
Ghada Khalil |
tags |
stx.containers stx.networking |
stx.4.0 stx.containers stx.networking |
|
2020-06-29 20:59:16 |
Ghada Khalil |
removed subscriber Daniel Badea |
|
|
|
2020-06-29 20:59:25 |
Ghada Khalil |
bug |
|
|
added subscriber Allain Legacy |
2020-07-01 01:41:15 |
OpenStack Infra |
starlingx: status |
Triaged |
In Progress |
|
2020-07-01 17:19:05 |
Stefan Dinescu |
bug |
|
|
added subscriber Stefan Dinescu |
2020-07-03 00:24:40 |
OpenStack Infra |
starlingx: status |
In Progress |
Fix Released |
|
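
Note on the expected behavior recorded above ("calico should always use the unit IP address"): the log does not describe how the fix was implemented, so the following is only a minimal sketch of the selection rule, not the actual change. It assumes the floating IP is known and shares the cluster-host interface with the unit IP. The function name `pick_unit_address` and all addresses (taken from the 2001:db8::/32 documentation prefix) are hypothetical.

```python
import ipaddress


def pick_unit_address(interface_addresses, floating_address):
    """Return the first interface address that is not the floating IP.

    Hypothetical helper: addresses are compared as parsed objects so that
    different textual spellings of the same IPv6 address still match.
    """
    floating = ipaddress.ip_address(floating_address)
    for candidate in interface_addresses:
        if ipaddress.ip_address(candidate) != floating:
            return candidate
    raise ValueError("no unit address found on the interface")


# Hypothetical cluster-host addresses on the controller that currently
# holds the floating IP; the floating IP happens to be listed first.
cluster_host_addresses = ["2001:db8::2", "2001:db8::3"]
floating_ip = "2001:db8::2"

unit_ip = pick_unit_address(cluster_host_addresses, floating_ip)
print(unit_ip)
```

A rule of this kind keeps the chosen address stable across a swact, because the unit IP stays on the host when the floating IP moves to the other controller.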