rmq-server endpoints mixed although access-network specified

Bug #1605314 reported by Alvaro Uria
Affects: rabbitmq-server (Juju Charms Collection)
Status: Fix Released
Importance: Medium
Assigned to: James Page

Bug Description

Hi,

Ubuntu Trusty
Juju 1.25.5
Liberty
rmq-server revno=110

RMQ LXCs have eth0 (on 192.168.17.0/24) and eth1 (on 192.168.21.0/24 netblock).

access-network='192.168.21.0/24'

All services (cinder, nova, etc.) are configuring their .conf file as:
"""
# 2 hosts on netblock#1 and 3rd host (correctly) on netblock#2
rabbit_hosts = 192.168.17.56,192.168.17.81,192.168.21.56
"""

Cluster status is:
"""
root@juju-machine-0-lxc-12:~# rabbitmqctl cluster_status
Cluster status of node 'rabbit@192-168-17-56' ...
[{nodes,[{disc,['rabbit@192-168-17-56','rabbit@192-168-17-76',
                'rabbit@192-168-17-81']}]},
 {running_nodes,['rabbit@192-168-17-56']},
 {cluster_name,<<"rabbit@juju-machine-0-lxc-12">>},
 {partitions,[{'rabbit@192-168-17-56',['rabbit@192-168-17-76',
                                       'rabbit@192-168-17-81']}]}]
"""

Shouldn't the "rabbit_hosts" parameter in the .conf files point only to IPs on the netblock that access-network describes? I understand the other addresses might be a fallback, but all LXCs have both IPs (i.e. the second one should have been detected).
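
A quick way to check the symptom is to test every rabbit_hosts entry against the access-network CIDR. This is only a minimal sketch using the Python standard library; the helper name hosts_outside_network is made up for illustration and is not part of the charm.

```python
import ipaddress


def hosts_outside_network(rabbit_hosts, access_network):
    """Return rabbit_hosts entries that are NOT inside access_network.

    rabbit_hosts is the comma-separated value from the service .conf file;
    access_network is the CIDR given to the charm's access-network option.
    """
    network = ipaddress.ip_network(access_network)
    hosts = [h.strip() for h in rabbit_hosts.split(",") if h.strip()]
    return [h for h in hosts if ipaddress.ip_address(h) not in network]


# Values taken from the report above.
rabbit_hosts = "192.168.17.56,192.168.17.81,192.168.21.56"
print(hosts_outside_network(rabbit_hosts, "192.168.21.0/24"))
# → ['192.168.17.56', '192.168.17.81']
```

An empty list would mean every endpoint is on the access network, which is the expected result here.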

Please let me know if you would need any further detail.

Thank you.

James Page (james-page) wrote :

Yes - the RMQ units should be presenting access details on the:

  access-network='192.168.21.0/24'

configuration; it looks like one of them is, and the others are falling back to private-address, which is the behaviour when no address in the subnet can be resolved for a given unit.

Can you ensure that all NICs are fully configured within each container?

FWIW the cluster status looks OK - that will currently always bind to private-address to ensure host resolvability, otherwise RMQ just explodes.
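
To confirm that the cluster itself is bound to the private network, the node names from cluster_status can be mapped back to addresses. This is a rough sketch that assumes the node names encode the unit's IPv4 address with dashes, as the output in the report suggests; node_ip is a hypothetical helper, not a RabbitMQ or charm API.

```python
import ipaddress
import re


def node_ip(node_name):
    """Convert a node name such as 'rabbit@192-168-17-56' to an IPv4 address.

    Assumes the 'rabbit@a-b-c-d' naming seen in the cluster_status output.
    """
    match = re.fullmatch(r"rabbit@(\d+)-(\d+)-(\d+)-(\d+)", node_name)
    if match is None:
        raise ValueError("unexpected node name: %r" % node_name)
    return ipaddress.ip_address(".".join(match.groups()))


# Node names from the cluster_status output in the bug description.
nodes = ["rabbit@192-168-17-56", "rabbit@192-168-17-76", "rabbit@192-168-17-81"]
private_network = ipaddress.ip_network("192.168.17.0/24")  # eth0 netblock
print(all(node_ip(n) in private_network for n in nodes))
# → True
```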

Changed in rabbitmq-server (Juju Charms Collection):
status: New → Incomplete
Alvaro Uria (aluria) wrote :

Hi James,

The LXC interfaces are correctly configured, although the ones in the access-network block might not have DNS resolution. Would that be fine?

iproute output: http://pastebin.ubuntu.com/20885949/

Alvaro Uria (aluria)
Changed in rabbitmq-server (Juju Charms Collection):
status: Incomplete → New
James Page (james-page)
Changed in rabbitmq-server (Juju Charms Collection):
importance: Undecided → Low
James Page (james-page) wrote :

Alvaro

Reading the code in the current development version of the charm: when configuration options are used for network binding and no appropriate address can be found in the configured access-network, the charm will fall back to 'private-address', which is possibly what you are seeing here.
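
The fallback described above can be sketched roughly as follows. This is not the charm-helpers implementation; address_in_network and its parameters are hypothetical names, and the list of unit addresses is passed in explicitly instead of being read from the network interfaces.

```python
import ipaddress


def address_in_network(access_network, unit_addresses, fallback):
    """Pick the first unit address inside access_network, else the fallback.

    access_network: CIDR string, or None/'' when the option is unset.
    unit_addresses: addresses configured on the unit's interfaces.
    fallback: value to use when nothing matches (e.g. private-address).
    """
    if access_network:
        network = ipaddress.ip_network(access_network)
        for addr in unit_addresses:
            if ipaddress.ip_address(addr) in network:
                return addr
    return fallback


# Unit with both NICs configured: the access-network address is chosen.
print(address_in_network("192.168.21.0/24",
                         ["192.168.17.56", "192.168.21.56"],
                         "192.168.17.56"))
# → 192.168.21.56

# Unit where no address in the subnet is found: private-address is used.
print(address_in_network("192.168.21.0/24",
                         ["192.168.17.81"],
                         "192.168.17.81"))
# → 192.168.17.81
```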

James Page (james-page) wrote :

I think you're using bzr revno 110 from lp:charms/trusty/rabbitmq-server; which has:

        # NOTE(jamespage)
        # override private-address settings if access-network is
        # configured and an appropriate network interface is configured.
        relation_settings['hostname'] = \
            relation_settings['private-address'] = \
            get_address_in_network(config('access-network'),
                                   unit_get('private-address'))

so although the code has been refactored a bit to accommodate network spaces in Juju 2.0, it should be doing much the same thing.

James Page (james-page) wrote :

OK I see the issue; the follower units do not correctly set the hostname and private-address values, so what you are seeing is the lead unit presenting the access-network address correctly while the followers still present only private-address. This is resolved in the current tip of the master branch; checking the stable version.

James Page (james-page) wrote :

Current stable version still has this problem - I think we can do a minor stable fix to resolve this.

Changed in rabbitmq-server (Juju Charms Collection):
status: New → Confirmed
importance: Low → Medium
assignee: nobody → James Page (james-page)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-rabbitmq-server (stable/16.10)

Fix proposed to branch: stable/16.10
Review: https://review.openstack.org/428065

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-rabbitmq-server (stable/16.10)

Reviewed: https://review.openstack.org/428065
Committed: https://git.openstack.org/cgit/openstack/charm-rabbitmq-server/commit/?id=8f99e64b886718b8b53cca2e561ccf61b0138331
Submitter: Jenkins
Branch: stable/16.10

commit 8f99e64b886718b8b53cca2e561ccf61b0138331
Author: James Page <email address hidden>
Date: Thu Feb 2 09:52:43 2017 +0000

    networking: Fix amqp relation network binding

    When configuration for 'access-network' is supplied, only the
    lead unit in a cluster will present the correct network address
    for access to the RMQ cluster; followers always present 'private-address'
    as they echo out data from the peer relation or leader storage
    on amqp relations, overriding the hostname and private-address
    values with the return value of get_host_ip.

    Pass 'amqp_relation=True' to get_host_ip to ensure that any supplied
    configuration option is correctly used, or that for Juju 2.0, any
    network space binding for the amqp relation is used.

    This is resolved in master branch by a larger refactoring of the
    resolution of addressing for the amqp and cluster relations.

    Change-Id: Ia4921becf149243fab40f9f21a26bfbe9d3da332
    Closes-Bug: 1605314
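
The effect of the change described in the commit message can be modelled with a small simulation. This is only a sketch under stated assumptions: the get_host_ip below is a simplified stand-in and not the charm's real function (only the amqp_relation keyword comes from the commit message), follower_amqp_settings is a hypothetical name, the peer data is illustrative, the follower's eth1 address 192.168.21.81 is assumed, and Juju 2.0 network space bindings are not modelled.

```python
import ipaddress


def get_host_ip(config, private_address, unit_addresses, amqp_relation=False):
    """Simplified stand-in: honour access-network only for amqp relations."""
    access_network = config.get("access-network")
    if amqp_relation and access_network:
        network = ipaddress.ip_network(access_network)
        for addr in unit_addresses:
            if ipaddress.ip_address(addr) in network:
                return addr
    return private_address


def follower_amqp_settings(peer_data, config, private_address,
                           unit_addresses, fixed):
    """Model a follower echoing peer data onto the amqp relation."""
    settings = dict(peer_data)  # data echoed from the peer relation
    host = get_host_ip(config, private_address, unit_addresses,
                       amqp_relation=fixed)
    # Follower overrides both keys with the get_host_ip return value.
    settings["hostname"] = settings["private-address"] = host
    return settings


config = {"access-network": "192.168.21.0/24"}
peer_data = {"password": "example"}              # illustrative only
addresses = ["192.168.17.81", "192.168.21.81"]   # eth1 address is assumed

before = follower_amqp_settings(peer_data, config, "192.168.17.81",
                                addresses, fixed=False)
after = follower_amqp_settings(peer_data, config, "192.168.17.81",
                               addresses, fixed=True)
print(before["hostname"])  # → 192.168.17.81
print(after["hostname"])   # → 192.168.21.81
```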

James Page (james-page)
Changed in rabbitmq-server (Juju Charms Collection):
status: In Progress → Fix Released