Activity log for bug #1946326

Date Who What changed Old value New value Message
2021-10-07 08:15:49 Bogdan Dobrelya bug added bug
2021-10-07 08:16:08 Bogdan Dobrelya tripleo: status New Triaged
2021-10-07 08:16:11 Bogdan Dobrelya tripleo: importance Undecided Medium
2021-10-07 08:27:02 Bogdan Dobrelya description Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call..
2021-10-07 08:30:56 Bogdan Dobrelya description Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. If we leave that as is, for Nova example, concurrent POST /v2.1/servers/{server}/os-interface synchronous calls to n-api might instantly mark all its backends down, what would look like a full cloud outage, while it is not. The number of such concurrent POST calls should exceed the number of nova API workers (usually is CPU count) multiplied by the number of server beckends (usually 3).
2021-10-07 08:31:31 Bogdan Dobrelya description Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. If we leave that as is, for Nova example, concurrent POST /v2.1/servers/{server}/os-interface synchronous calls to n-api might instantly mark all its backends down, what would look like a full cloud outage, while it is not. The number of such concurrent POST calls should exceed the number of nova API workers (usually is CPU count) multiplied by the number of server beckends (usually 3). Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. If we leave that as is, for Nova example, concurrent POST /v2.1/servers/{server}/os-interface synchronous calls to n-api might instantly mark all its backends down, what would look like a full cloud outage, while it is not. The number of such concurrent POST calls should exceed the number of nova API workers (usually is CPU count) multiplied by the number of server backends (usually 3).
2021-10-07 08:36:36 Bogdan Dobrelya description Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. If we leave that as is, for Nova example, concurrent POST /v2.1/servers/{server}/os-interface synchronous calls to n-api might instantly mark all its backends down, what would look like a full cloud outage, while it is not. The number of such concurrent POST calls should exceed the number of nova API workers (usually is CPU count) multiplied by the number of server backends (usually 3). Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. If we leave that as is, for Nova example, concurrent POST /v2.1/servers/{server}/os-interface synchronous calls to n-api might instantly mark all its backends down, what would look like a full cloud outage, while it is not. The number of such concurrent POST calls should exceed the number of nova API workers (usually is CPU count) multiplied by the number of server backends (usually 3), and also be long enought to accomplish, like vif-pluggings.
2021-10-07 08:36:45 Bogdan Dobrelya description Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. If we leave that as is, for Nova example, concurrent POST /v2.1/servers/{server}/os-interface synchronous calls to n-api might instantly mark all its backends down, what would look like a full cloud outage, while it is not. The number of such concurrent POST calls should exceed the number of nova API workers (usually is CPU count) multiplied by the number of server backends (usually 3), and also be long enought to accomplish, like vif-pluggings. Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. If we leave that as is, for Nova example, concurrent POST /v2.1/servers/{server}/os-interface synchronous calls to n-api might instantly mark all its backends down, what would look like a full cloud outage, while it is not. The number of such concurrent POST calls should exceed the number of nova API workers (usually is CPU count) multiplied by the number of server backends (usually 3), and also be long enough to accomplish, like vif-pluggings.
2021-10-07 08:42:51 Bogdan Dobrelya description Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. If we leave that as is, for Nova example, concurrent POST /v2.1/servers/{server}/os-interface synchronous calls to n-api might instantly mark all its backends down, what would look like a full cloud outage, while it is not. The number of such concurrent POST calls should exceed the number of nova API workers (usually is CPU count) multiplied by the number of server backends (usually 3), and also be long enough to accomplish, like vif-pluggings. Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. If we leave that as is, for Nova example, concurrent POST /v2.1/servers/{server}/os-interface synchronous calls to n-api might instantly mark all its backends down, what would look like a full cloud outage, while it is not. The number of such concurrent POST calls should exceed the number of nova API workers (usually is CPU count) multiplied by the number of server backends (usually 3), and also be long enough (longer than tcp-check timeout) to accomplish. While in the source bug the problem is in the unexpectedly long vif-plugging blocking calls, this issue targets "valid" long blocking calls in general.
2021-10-07 08:43:16 Bogdan Dobrelya description Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. If we leave that as is, for Nova example, concurrent POST /v2.1/servers/{server}/os-interface synchronous calls to n-api might instantly mark all its backends down, what would look like a full cloud outage, while it is not. The number of such concurrent POST calls should exceed the number of nova API workers (usually is CPU count) multiplied by the number of server backends (usually 3), and also be long enough (longer than tcp-check timeout) to accomplish. While in the source bug the problem is in the unexpectedly long vif-plugging blocking calls, this issue targets "valid" long blocking calls in general. Source: https://bugs.launchpad.net/tripleo/+bug/1895248/comments/10 for such cases, we might want to tweak haproxy to avoid false down events, causing unnecessary HA failovers. That could by L5 send/expect scripts. OR instead try to make such backends smarter to reply its aliveness over L7 checks, even while in a blocking call.. If we leave that as is, for Nova example, concurrent POST /v2.1/servers/{server}/os-interface synchronous calls to n-api might eventually mark all of its backends down, what would look like a full cloud outage, while it is not. The number of such concurrent POST calls should exceed the number of nova API workers (usually is CPU count) multiplied by the number of server backends (usually 3), and also be long enough (longer than tcp-check timeout) to accomplish. While in the source bug the problem is in the unexpectedly long vif-plugging blocking calls, this issue targets "valid" long blocking calls in general.
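
Note on the condition stated in the latest description: the arithmetic can be sketched as below. This is only an illustration under assumptions, not anything taken from nova, haproxy, or tripleo. The function and parameter names are hypothetical, and the example figures (8 workers per backend, a 10 second check timeout) are made up. It follows the description's wording literally: blocking calls must exceed workers times backends, and each must last longer than the tcp-check timeout.

```python
def saturating_call_count(workers_per_backend: int, backends: int = 3) -> int:
    """Smallest number of concurrent blocking calls that exceeds the total
    API worker capacity (workers per backend multiplied by backends)."""
    if workers_per_backend < 1 or backends < 1:
        raise ValueError("workers_per_backend and backends must be positive")
    return workers_per_backend * backends + 1


def all_backends_may_go_down(
    concurrent_calls: int,
    call_duration_s: float,
    check_timeout_s: float,
    workers_per_backend: int,
    backends: int = 3,
) -> bool:
    """True when both conditions from the description hold: the blocking
    calls exceed total worker capacity, and each call outlasts the
    health-check timeout."""
    exceeds_capacity = concurrent_calls > workers_per_backend * backends
    outlasts_check = call_duration_s > check_timeout_s
    return exceeds_capacity and outlasts_check


if __name__ == "__main__":
    # Hypothetical deployment: 8 API workers on each of 3 backends.
    print(saturating_call_count(8))
    print(all_backends_may_go_down(25, 30.0, 10.0, workers_per_backend=8))
```

With these made-up figures, 25 concurrent calls of 30 seconds each would meet both conditions, while 24 calls, or calls shorter than the check timeout, would not.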