Comment 1 for bug 1710247

Drew Freiberger (afreiberger) wrote :

I found that the charm upgrade (and other hooks that call the status check) would complete after creating a symlink /<email address hidden> to <email address hidden> files on all rabbit units. Obviously, this is not a scalable solution.

How best can we handle this corner case, where there are multiple reverse DNS entries, in a repeatable manner pre and post 16.10? I also checked the 17.02 code, and skipping a revision won't help with this issue. It seems odd to look up the hostname to build a pid filename instead of checking config files or rabbitmqctl command output. For instance, the rabbitmqctl wait <pidfile> command shows "Waiting for 'rabbit@ip-ad-dr-es'" in the log file (and when run manually), as you can see in Peter's log.

# rabbitmqctl wait /var/lib/rabbitmq/mnesia/rabbit\@10-76-13-12.pid
Waiting for 'rabbit@10-76-13-12' ...
pid is 16537 ...
(exit code 0)

From what I can tell following the code:

 - in 16.07 wait_app uses get_local_nodename() to determine PID filename
   which in turn calls get_host_ip(unit_get('private-address')) which in turn calls
   get_node_hostname that either uses get_hostname(ip_addr) (coming from
   charmhelpers.contrib.openstack.utils) or falls back to socket.gethostname()
 - charmhelpers.contrib.openstack.utils.get_hostname calls
   charmhelpers.contrib.network.ip.get_hostname which in turn either
   runs dns.reversename.from_address(address) or falls back to
   socket.gethostbyaddr(address)[0]
 - Note, from the reference in lp:1710247 to lp:1484902, that this is
   intentional for maas2 support.
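
To illustrate why that lookup chain is fragile, here is a minimal sketch of a reverse-lookup-with-fallback that builds a pid filename. This is not the charm's code: resolve_short_hostname, pid_file_for_address, the reverse_lookup parameter, the short-name truncation, and the example hostname are all assumptions for illustration. Only socket.gethostbyaddr and socket.gethostname are real stdlib calls.

```python
import socket

MNESIA_DIR = '/var/lib/rabbitmq/mnesia'


def resolve_short_hostname(address, reverse_lookup=socket.gethostbyaddr):
    """Reverse-resolve an address, falling back to the local hostname.

    reverse_lookup is injectable (hypothetical parameter) so the behaviour
    can be exercised without real DNS.
    """
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist); only the
        # first name is used, so which PTR record "wins" decides the result.
        fqdn = reverse_lookup(address)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return socket.gethostname()
    # Assumption: the node name uses the short (unqualified) hostname.
    return fqdn.split('.')[0]


def pid_file_for_address(address, reverse_lookup=socket.gethostbyaddr):
    """Build the pid file path a hostname-based lookup would expect."""
    host = resolve_short_hostname(address, reverse_lookup)
    return '{}/rabbit@{}.pid'.format(MNESIA_DIR, host)
```

If the resolver returns a different PTR name than the one the node was originally started with, the computed path no longer matches the file on disk (for example rabbit@10-76-13-12.pid), and the wait on the pid file never finds it.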

Perhaps in upgrade-charm, if the wait on the pid file derived from the hostname code fails, the return code should be checked and the command output used to find the previous pid file name. A name change routine could then be added to re-configure the server and cluster relationships.
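
A rough sketch of the "use the command output" part of that idea, assuming the output format matches the transcript above ("Waiting for 'rabbit@...' ..."). The helper names and the mnesia_dir default are hypothetical, and the format may differ between RabbitMQ versions, so treat this as a starting point rather than a fix.

```python
import re

# Matches the first line of the `rabbitmqctl wait` output pasted above.
WAIT_LINE = re.compile(r"Waiting for '([^']+)'")


def nodename_from_wait_output(output):
    """Return the node name rabbitmqctl reported, or None if not found."""
    match = WAIT_LINE.search(output)
    return match.group(1) if match else None


def pid_file_for_node(node, mnesia_dir='/var/lib/rabbitmq/mnesia'):
    """Build the pid file path for a node name such as rabbit@10-76-13-12."""
    return '{}/{}.pid'.format(mnesia_dir, node)


def previous_pid_file(output):
    """Derive the pid file from command output; None if it cannot be parsed."""
    node = nodename_from_wait_output(output)
    return pid_file_for_node(node) if node else None
```

The caller would only reach for previous_pid_file() after the hostname-derived path has failed, and would still need to handle the None case.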