Comment 6 for bug 1828841

OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (stable/stein)

Reviewed: https://review.opendev.org/660277
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=7b8fd6370c8b04c3836ee4f9d06eaa90c7be5197
Submitter: Zuul
Branch: stable/stein

commit 7b8fd6370c8b04c3836ee4f9d06eaa90c7be5197
Author: Hervé Beraud <email address hidden>
Date: Fri May 3 00:55:56 2019 +0200

    Fix switch connection destination when a rabbitmq cluster node disappears

    In a clustered RabbitMQ deployment, when a node disappears we get a
    ConnectionRefusedError because the socket gets disconnected.

    The socket access yields an OSError because the heartbeat
    tries to reach an unreachable host (No route to host).

    Catch these exceptions to ensure that we call ensure_connection for switching
    the connection destination.

    POC is available at github.com:4383/rabbitmq-oslo_messging-error-poc

    Example:
        $ git clone git@github.com:4383/rabbitmq-oslo_messging-error-poc
        $ cd rabbitmq-oslo_messging-error-poc
        $ python -m virtualenv .
        $ source bin/activate
        $ pip install -r requirements.txt
        $ sudo podman run -d --hostname my-rabbit --name rabbit rabbitmq:3
        $ python poc.py $(sudo podman inspect rabbit | niet '.[0].NetworkSettings.IPAddress')

    And in parallel, in another shell or tmux pane:
        $ sudo podman stop rabbit
        $ # observe the output of the poc.py script: ensure_connection is now called

    You can now observe output showing the connection being switched. Before
    these changes, the underlying exceptions were not caught.

    Related to: https://bugzilla.redhat.com/show_bug.cgi?id=1665399

    Closes-Bug: #1828841

    Change-Id: I9dc1644cac0e39eb11bf05f57bde77dcf6d42ed3
    (cherry picked from commit 9d8b1430e5c081b081c0e3c0b5f12f744dc7809d)
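
The behaviour described in the commit message can be sketched roughly as below. This is only an illustration of the idea, not the oslo.messaging code: `FakeConnection`, its `down` set, and `heartbeat_once` are hypothetical names invented for this example, and the real driver relies on its own connection object. Only the exception types and the `ensure_connection` name come from the commit message.

```python
import errno


class FakeConnection:
    """Hypothetical stand-in for a connection to a multi-node cluster."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.current = 0      # index of the host currently in use
        self.down = set()     # hosts simulated as unreachable

    def heartbeat_check(self):
        # Simulate the heartbeat touching a socket whose peer has vanished.
        if self.hosts[self.current] in self.down:
            raise OSError(errno.EHOSTUNREACH, "No route to host")

    def ensure_connection(self):
        # Switch to the next host that is not marked as down.
        for offset in range(1, len(self.hosts) + 1):
            candidate = (self.current + offset) % len(self.hosts)
            if self.hosts[candidate] not in self.down:
                self.current = candidate
                return self.hosts[candidate]
        raise ConnectionRefusedError("no reachable node left")


def heartbeat_once(conn):
    """Run one heartbeat; on a socket-level failure, switch destination."""
    try:
        conn.heartbeat_check()
    except (ConnectionRefusedError, OSError):
        # In Python 3 ConnectionRefusedError is a subclass of OSError; both
        # are listed here only to mirror the wording of the commit message.
        conn.ensure_connection()
    return conn.hosts[conn.current]


conn = FakeConnection(["rabbit-1", "rabbit-2"])
conn.down.add("rabbit-1")     # the node in use disappears
print(heartbeat_once(conn))   # prints "rabbit-2"
```

Without the `except` clause, the `OSError` would propagate out of the heartbeat and the connection would keep pointing at the missing node, which matches the symptom reported in this bug.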