Comment 3 for bug 2019978

Michal Arbet (michalarbet) wrote : Re: [Bug 2019978] Re: oslo_messaging kombu strategy round-robin not working correctly

Hi Andrew,

Yes, you are right. We have already confirmed that the patch you mentioned
fixes the issue, but the unit tests in yoga are failing. Do you know why?

Thanks
Michal Arbet
OpenStack Engineer

Ultimum Technologies a.s.
Na Poříčí 1047/26, 11000 Praha 1
Czech Republic

+420 604 228 897
<email address hidden>
https://ultimum.io

On Mon, 22 May 2023 at 15:10, Andrew Bogott <email address hidden>
wrote:

> This seems similar to
> https://bugs.launchpad.net/charm-rabbitmq-server/+bug/1993149
> which is now fixed with
> https://review.opendev.org/c/openstack/oslo.messaging/+/866617
>
> If you don't want to wait for the backport, you can implement a similar
> fix in config by setting kombu_reconnect_delay to 0.5.
>
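A minimal sketch of that workaround, assuming the option lives in the [oslo_messaging_rabbit] group of the affected service's configuration file (the file itself depends on the service and the deployment, so no path is given here):

```ini
[oslo_messaging_rabbit]
# Workaround suggested above: shorten the delay before reconnecting
# after an AMQP consumer cancel notification.
kombu_reconnect_delay = 0.5
```

The service would presumably need a restart before the new value takes effect.
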
> Title:
> oslo_messaging kombu strategy round-robin not working correctly
>
> Status in oslo.messaging:
> New
>
> Bug description:
> Hi,
>
> We were running HA tests against our OpenStack cluster to see what
> happens when one node of a 3-node RabbitMQ cluster is turned off, and
> found that the default oslo.messaging setting
> kombu_failover_strategy = round-robin, introduced in
>
> https://github.com/openstack/oslo.messaging/commit/6ae46796a61fc97467450b5bdd51dc6a0c86f9f4
> is probably not working as expected.
>
> We turned off 10.157.106.71 and the clients didn't reconnect. When I
> grepped the logs for occurrences of each RabbitMQ server, I found that
> the clients keep trying the host that is turned off.
>
>
> root@controller0:/home/ubuntu# grep -Ri '2023-05-17.*Trying again in' /var/log/kolla | awk '{print $11}' | sort | uniq -c
> 5 10.157.106.136:5672
> 12 10.157.106.6:5672
> 50381 10.157.106.71:5672
>
> root@controller13:/home/ubuntu# grep -Ri '2023-05-17.*Trying again in' /var/log/kolla | awk '{print $11}' | sort | uniq -c
> 2 -]
> 6 10.157.106.136:5672
> 4 10.157.106.6:5672
> 41996 10.157.106.71:5672
>
> I also checked TCP SYN via netstat and saw that it was always trying
> to connect to the RabbitMQ server that was down.