Inter cell communication doesn't support multiple rabbit servers / HA

Bug #1178541 reported by Sam Morrison on 2013-05-10
This bug affects 8 people
Affects: OpenStack Compute (nova)
Importance: Medium
Assigned to: RedBaron

Bug Description

When nova-cells talks to another cell's RabbitMQ, there is no way to specify multiple servers and so make use of RabbitMQ's HA / mirrored queues.

tags: added: cells
Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
tinytmy (tangmeiyan77) on 2014-03-25
Changed in nova:
assignee: nobody → tinytmy (tangmeiyan77)
assignee: tinytmy (tangmeiyan77) → nobody
Changed in nova:
assignee: nobody → Dheeraj Gupta (dheeraj-gupta4)
RedBaron (dheeraj-gupta4) wrote :

nova-cells uses the `transport_url` field in the `cells` table to talk to other cells. The `transport_url` is a string representation of an `oslo.messaging.transport.TransportURL`, which supports multiple hosts. The field is written to the DB when a new cell is created using `nova-manage cell create`.
IMO, support for multiple rabbit servers can be added by modifying the `nova-manage` script so that it accepts a comma-separated list of host:port pairs, instead of or in addition to the separate hostname and port parameters it currently accepts.
I have tested this by manually modifying the transport URL in the DB to include multiple host names. It worked in my test setup for both parent and child cells.

MariaDB [nova]> select id, name, is_parent, transport_url from cells where id=2;
+----+-------+-----------+--------------------------------------------------------------------+
| id | name  | is_parent | transport_url                                                      |
+----+-------+-----------+--------------------------------------------------------------------+
|  2 | cell1 |         0 | rabbit://guest:devstack@x.x.x.x:5672,guest:devstack@y.y.y.y:5672// |
+----+-------+-----------+--------------------------------------------------------------------+
1 row in set (0.00 sec)
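
For anyone editing the DB by hand as described above, here is a minimal sketch of how a multi-host URL of that shape could be assembled. It uses only the Python standard library and plain string handling; it is not the oslo.messaging code. The function name `build_transport_url` and the example addresses are hypothetical.

```python
from urllib.parse import quote


def build_transport_url(hosts, virtual_host="/", scheme="rabbit"):
    """Join several brokers into one comma-separated transport URL.

    `hosts` is a list of (username, password, hostname, port) tuples.
    This helper is hypothetical and only illustrates the URL layout
    shown in the comment above.
    """
    netlocs = []
    for username, password, hostname, port in hosts:
        # Percent-encode credentials so ':' or '@' inside them cannot
        # be mistaken for URL delimiters.
        creds = "%s:%s" % (quote(username, safe=""), quote(password, safe=""))
        netlocs.append("%s@%s:%d" % (creds, hostname, port))
    # A virtual host of "/" yields the trailing "//" seen in the DB row.
    return "%s://%s/%s" % (scheme, ",".join(netlocs), virtual_host)


url = build_transport_url([
    ("guest", "devstack", "10.0.0.1", 5672),  # example addresses only
    ("guest", "devstack", "10.0.0.2", 5672),
])
print(url)
```

The resulting string could then be written to the `transport_url` column of the relevant row in `cells`, as was done manually in the test above.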

Sam Morrison (sorrison) wrote :

I can confirm this works; we've been running this way in production for a while now. We just had to edit the DB manually, because nova-manage does not yet support multiple hosts.

Fix proposed to branch: master
Review: https://review.openstack.org/113558

Changed in nova:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/113558
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=cc17991d1a5a703d486735b902da8902641dad83
Submitter: Jenkins
Branch: master

commit cc17991d1a5a703d486735b902da8902641dad83
Author: Dheeraj Gupta <email address hidden>
Date: Tue Aug 12 15:12:52 2014 +0000

    Support message queue clusters in inter-cell communication

    Since cells use oslo.messaging to specify and store the message queue URL,
    multiple hosts can be specified by manually modifying that URL in the database.
    However, there is no way to specify multiple hosts during cell creation phase.
    This patch adds a --broker_hosts option to `nova-manage cell create` command,
    which is analogous to the rabbit_hosts option in nova.conf and can be used to
    specify multiple message queue servers as a comma separated list. Each server
    is specified using hostname:port with both being mandatory. The existing
    --hostname and --port options continue to remain but are only considered if no
    --broker_hosts is specified.
    Internally, each host is converted to an oslo.messaging.TransportHost
    and added to the generated TransportURL.
    This patch also adds unit tests for creation of the TransportHosts from
    user given input.

    Change-Id: I14de860b1d12f3e2c0169b58651d580792d6ce0e
    Closes-Bug: 1178541
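
The commit message above describes `--broker_hosts` as a comma-separated list of `hostname:port` entries with both parts mandatory. The following is a rough sketch of that parsing rule in plain Python, not the code from the patch itself. The helper name `parse_broker_hosts` is hypothetical, and the sketch returns simple tuples rather than `TransportHost` objects.

```python
def parse_broker_hosts(broker_hosts):
    """Split 'host1:5672,host2:5672' into [(hostname, port), ...].

    Hypothetical helper: hostname and port are both required for
    every entry, mirroring the rule stated in the commit message.
    """
    hosts = []
    for entry in broker_hosts.split(","):
        entry = entry.strip()
        # Split on the last ':' so the port is always the final field.
        hostname, sep, port = entry.rpartition(":")
        if not sep or not hostname or not port.isdigit():
            raise ValueError("expected hostname:port, got %r" % entry)
        hosts.append((hostname, int(port)))
    return hosts


print(parse_broker_hosts("host1:5672,host2:5672"))
```

An entry without a port, such as `host1`, raises `ValueError` in this sketch. How the real command reports such input is not stated in the commit message.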

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2014-12-18
Changed in nova:
milestone: none → kilo-1
status: Fix Committed → Fix Released
Mike Dorman (mdorman-m) wrote :

Re: comment #2. I'm not able to get the failover working with multiple hosts listed in the URL. This is under Icehouse.

I've tried both of the following formats:

rabbit://user:password@host1:5672,host2:5672/vhost
rabbit://user:password@host1:5672,user:password@host2:5672/vhost

My particular situation is that host1 is down. When nova-cells starts up, it only ever tries to connect to host1, which never succeeds. It never moves on to try host2.

Is there something in oslo.messaging that prevents a failover to the other host(s) in the list if the first host is never connected to?

Thierry Carrez (ttx) on 2015-04-30
Changed in nova:
milestone: kilo-1 → 2015.1.0