Comment 2 for bug 1486615

Daniel Manrique (roadmr) wrote :

This looks like some evil dance between prodstack host naming and the rabbitmq charm.

See in the log how it does this:

2015-08-20 01:55:46 INFO juju-log local nodename: ps45-10-25-62-20
2015-08-20 01:55:46 INFO juju-log configuring nodename
2015-08-20 01:55:46 INFO juju-log forcing nodename=ps45-10-25-62-20
2015-08-20 01:55:46 INFO juju-log Stopping rabbitmq-server.
2015-08-20 01:55:46 INFO config-changed * Stopping message broker rabbitmq-server
2015-08-20 01:55:49 INFO config-changed ...done.
2015-08-20 01:55:49 INFO juju-log Updating /etc/rabbitmq/rabbitmq-env.conf, RABBITMQ_NODENAME=rabbit@ps45-10-25-62-20
2015-08-20 01:55:49 INFO juju-log Starting rabbitmq-server.
2015-08-20 01:55:49 INFO config-changed * Restarting message broker rabbitmq-server
2015-08-20 01:55:51 INFO config-changed ...fail!

The first three lines come from the rabbitmq charm's configure_nodename function (the very first one is logged by get_local_nodename, which it calls):

def configure_nodename():
    '''Set RABBITMQ_NODENAME to something that's resolvable by my peers'''
    nodename = get_local_nodename()
    log('configuring nodename', level=INFO)
    if (nodename and
            rabbit.get_node_name() != 'rabbit@%s' % nodename):
        log('forcing nodename=%s' % nodename, level=INFO)
        # would like to have used the restart_on_change decorator, but
        # need to stop it under current nodename prior to updating env
        log('Stopping rabbitmq-server.')
        service_stop('rabbitmq-server')
        rabbit.update_rmq_env_conf(hostname='rabbit@%s' % nodename,
                                   ipv6=config('prefer-ipv6'))
        log('Starting rabbitmq-server.')
        service_restart('rabbitmq-server')

get_local_nodename, in turn, looks like this:

def get_local_nodename():
    '''Resolve local nodename into something that's universally addressable'''
    ip_addr = get_host_ip(unit_get('private-address'))
    log('getting local nodename for ip address: %s' % ip_addr, level=INFO)
    try:
        nodename = get_hostname(ip_addr, fqdn=False)
    except:
        log('Cannot resolve hostname for %s using DNS servers' % ip_addr,
            level='WARNING')
        log('Falling back to use socket.gethostname()',
            level='WARNING')
        # If the private-address is not resolvable using DNS
        # then use the current hostname
        nodename = socket.gethostname()
    log('local nodename: %s' % nodename, level=INFO)
    return nodename
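
To check whether the forced nodename is really what trips up the restart, something like the following could be run on the affected unit. This is only a rough sketch: it does not use the charm's own helpers (get_host_ip, get_hostname), it assumes Python 3, and the names short_name, reverse_nodename, resolves and report are made up for illustration. The log only shows that the restart fails, not why, so the idea that the reverse-DNS name does not resolve locally is an assumption to be tested, not a confirmed cause.

```python
import socket


def short_name(fqdn):
    """Drop the domain part, roughly what fqdn=False asks for in the charm."""
    return fqdn.split('.')[0]


def reverse_nodename(ip_addr, lookup=socket.gethostbyaddr):
    """Mimic get_local_nodename: reverse DNS first, local hostname as fallback.

    `lookup` is injectable so the logic can be exercised without real DNS.
    """
    try:
        return short_name(lookup(ip_addr)[0])
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses
        return socket.gethostname()


def resolves(name, lookup=socket.gethostbyname):
    """Return True if `name` can be resolved from this host."""
    try:
        lookup(name)
        return True
    except OSError:
        return False


def report(ip_addr):
    """Print the name the charm would likely force and whether it resolves."""
    forced = reverse_nodename(ip_addr)
    print('local hostname : %s' % socket.gethostname())
    print('forced nodename: %s' % forced)
    print('resolves here  : %s' % resolves(forced))
```

Calling report() with the unit's private address (presumably 10.25.62.20 here, going by the nodename in the log, though that is a guess) would show whether the name the charm forces differs from the machine's own hostname and whether it resolves from the unit itself.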

The changes that broke nodename configuration were introduced in revno 100 of the rabbitmq charm. Our runs started failing the same day that revno appeared, which is no coincidence: our spec doesn't pin a specific revno, so:

1- We got revno 100 19 days ago
2- Our runs broke

So the quick fix is to pin revno 99 in our spec. I'll post a link to the bug against the charm once I file it.