The config-changed hook fails if the machine's public-address is an IPv6 one

Bug #1622780 reported by Dominique Poulain
This bug affects 3 people
Affects                                   Status        Importance  Assigned to  Milestone
OpenStack Percona Cluster Charm           Fix Released  Medium      James Page
percona-cluster (Juju Charms Collection)  Invalid       Medium      Unassigned

Bug Description

A percona-cluster unit whose public-address is an IPv6 one will throw the following exception when running the config-changed hook:

Traceback (most recent call last):
  File "hooks/config-changed", line 722, in <module>
    main()
  File "hooks/config-changed", line 715, in main
    hooks.execute(sys.argv)
  File "/var/lib/juju/agents/unit-percona-cluster-1/charm/hooks/charmhelpers/core/hookenv.py", line 715, in execute
    self._hooks[hook_name]()
  File "/var/lib/juju/agents/unit-percona-cluster-1/charm/hooks/charmhelpers/contrib/hardening/harden.py", line 79, in _harden_inner2
    return f(*args, **kwargs)
  File "hooks/config-changed", line 264, in config_changed
    hosts = get_cluster_hosts()
  File "/var/lib/juju/agents/unit-percona-cluster-1/charm/hooks/percona_utils.py", line 192, in get_cluster_hosts
    hosts.append(get_host_ip(cluster_address))
  File "/var/lib/juju/agents/unit-percona-cluster-1/charm/hooks/percona_utils.py", line 123, in get_host_ip
    answers = dns.resolver.query(hostname, 'A')
  File "/usr/lib/python2.7/dist-packages/dns/resolver.py", line 981, in query
    raise_on_no_answer, source_port)
  File "/usr/lib/python2.7/dist-packages/dns/resolver.py", line 910, in query
    raise NXDOMAIN
dns.resolver.NXDOMAIN

juju status output from a minimal repro environment:

MODEL CONTROLLER CLOUD/REGION VERSION
repro lxd lxd/localhost 2.0-beta18

APP VERSION STATUS SCALE CHARM STORE REV OS NOTES
percona-cluster error 1 percona-cluster jujucharms 2 ubuntu

RELATION PROVIDES CONSUMES TYPE
cluster percona-cluster percona-cluster peer

UNIT WORKLOAD AGENT MACHINE PUBLIC-ADDRESS PORTS MESSAGE
percona-cluster/0 error idle 0 fd4b:abcb:4ec5:bf59:216:3eff:feb3:ff75 hook failed: "config-changed"

MACHINE STATE DNS INS-ID SERIES AZ
0 started fd4b:abcb:4ec5:bf59:216:3eff:feb3:ff75 juju-6538a2-0 xenial

Note that in this case, the DNS name is set to a string representation of the IP address. Regardless, line 123 of percona_utils.py can't work for IPv6, because it only queries for A (IPv4) records:

answers = dns.resolver.query(hostname, 'A')
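One way to avoid the failing lookup is to check whether the value is already an IP literal before querying DNS at all. The following is only a minimal sketch under stated assumptions: it uses Python 3's stdlib `ipaddress` module (the charm in the traceback runs on Python 2.7), and `lookup` is a hypothetical callable standing in for the charm's DNS query. It is not the charm's actual code.

```python
import ipaddress


def is_ip_literal(value):
    """Return True if value is already an IPv4 or IPv6 address string."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def get_host_ip(hostname, lookup):
    """Return hostname unchanged if it is an IP literal, else resolve it.

    `lookup` is a hypothetical callable (hostname -> address string) that
    stands in for the DNS query; it is only invoked for real hostnames.
    """
    if is_ip_literal(hostname):
        return hostname
    return lookup(hostname)
```

With this shape, the IPv6 public-address from the juju status output above would be returned as-is and no A-record query would be attempted for it.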

summary: - The config-changed hook fails if the machine's public-address is IPv6
+ The config-changed hook fails if the machine's public-address is an IPv6
+ one
Changed in percona-cluster (Juju Charms Collection):
assignee: nobody → Dominique Poulain (dominique-poulain)
description: updated
Changed in percona-cluster (Juju Charms Collection):
status: New → Confirmed
Mario Splivalo (mariosplivalo) wrote :

I have confirmed this: it errors out as Dominique explained, unless the 'prefer-ipv6' option is selected. In that case Percona won't start, as it won't parse IPv6 addresses properly.

This only manifests when the host is IPv6-only. In a mixed IPv4/IPv6 setup, dns.resolver.query won't error out and the configuration will be created properly.

However, this can all be mitigated by using IPs instead of hostnames, so that the Percona configuration (specifically the 'wsrep_cluster_address' option) always lists IP addresses with port numbers, like this:

wsrep_cluster_address=gcomm://a1:b1:c1:d1:4567,a2:b2:c2:d2:4567

That way the parser will first find the last colon, take the string to the right of it as the port number and the string to the left of it as the IP address (regardless of whether it is IPv4 or IPv6).

With this, the percona-cluster charm no longer needs to put hostnames in the configuration and add them to /etc/hosts to make sure they resolve properly.
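The last-colon parsing described in this comment can be illustrated with a short sketch. This is only an illustration of the idea, not the parser Galera or Percona actually uses; the function names are hypothetical.

```python
def split_host_port(spec):
    """Split 'address:port' on the LAST colon, so IPv6 addresses survive."""
    host, sep, port = spec.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected address:port, got %r" % spec)
    return host, int(port)


def parse_cluster_address(value):
    """Parse a 'gcomm://addr:port,addr:port' string into (host, port) pairs."""
    prefix = "gcomm://"
    if not value.startswith(prefix):
        raise ValueError("expected a gcomm:// URL, got %r" % value)
    specs = value[len(prefix):].split(",")
    return [split_host_port(spec) for spec in specs if spec]
```

Note that this only works if every entry carries a port: an IPv6 address without one would have its final group mistaken for the port whenever that group happens to consist of digits.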

James Page (james-page) wrote :

My memory says that the issue was less with pxc itself, and more with SST resyncs not understanding IPv6 addresses?

James Page (james-page) wrote :

Apparently not any longer - see bug 1380747

James Page (james-page) wrote :

OK - it feels like we can just drop the part of get_cluster_hosts that does:

    hosts.append(get_host_ip(cluster_address))

It is not IPv6 aware, so it will explode nastily in IPv6-only environments where prefer-ipv6 is not supported.

cluster-address is set based on either the bound interface of the cluster-network subnet configuration or the juju 2.0 bound network space binding - let's try passing that directly into the configuration.

Changed in percona-cluster (Juju Charms Collection):
status: Confirmed → Triaged
importance: Undecided → Medium
Dominique Poulain (dominique-poulain) wrote :

Unfortunately I haven't had much time to work on this, and just when I thought I had a little window, the fact that Juju 2 GA doesn't yet support IPv6 complicated things a bit, as I now get this when trying to spin up my test environment:

ERROR creating LXD client: /etc/default/lxd-bridge has IPv6 enabled.
Juju doesn't currently support IPv6.

IPv6 can be disabled by running:

       sudo dpkg-reconfigure -p medium lxd

and then bootstrap again.

Regardless, it looks like I won't have much time available in the near future, so I'll unassign myself and include brief notes below in case they can be of help.

The main issues to address in order to write a fix, AFAICS, are these:

- As Mario wrote, a way around the issue is to set the value of wsrep_cluster_address in my.cnf to a list of address specs (regardless of IP version) that include the port number, like so for IPv6:

wsrep_cluster_address=gcomm://a1:b1:c1::d1:4567,a2:b2:c2::d2:4567

The gcomm port (4567 by default) can be tweaked using wsrep_provider_options (per <https://www.percona.com/doc/percona-xtradb-cluster/5.6/faq.html#what-tcp-ports-are-used-by-percona-xtradb-cluster>; this applies to 5.5 as well). The charm doesn't expose an option to do this at present.

Modifying `get_cluster_hosts()`, defined in hooks/percona_utils.py, to return an array of addresses instead of hostnames when `prefer-ipv6` isn't set, still leaves open the question of how to add the port specs to the addresses, so that nothing else in the charm breaks. get_cluster_hosts() is called in a number of places:

$ grep -R get_cluster_hosts\(\) *
cluster-relation-changed: hosts = get_cluster_hosts()
cluster-relation-departed: hosts = get_cluster_hosts()
cluster-relation-joined: hosts = get_cluster_hosts()
config-changed: hosts = get_cluster_hosts()
db-admin-relation-changed: hosts = get_cluster_hosts()
db-relation-changed: hosts = get_cluster_hosts()
ha-relation-changed: hosts = get_cluster_hosts()
ha-relation-joined: hosts = get_cluster_hosts()
install.real: hosts = get_cluster_hosts()
leader-deposed: hosts = get_cluster_hosts()
leader-elected: hosts = get_cluster_hosts()
leader-settings-changed: hosts = get_cluster_hosts()
nrpe-external-master-relation-changed: hosts = get_cluster_hosts()
nrpe-external-master-relation-joined: hosts = get_cluster_hosts()
percona_hooks.py: hosts = get_cluster_hosts()
percona_utils.py:def get_cluster_hosts():
shared-db-relation-changed: hosts = get_cluster_hosts()
start: hosts = get_cluster_hosts()
stop: hosts = get_cluster_hosts()
update-status: hosts = get_cluster_hosts()
upgrade-charm: hosts = get_cluster_hosts()

And at least unit_tests.test_percona_utils.UtilsTests.test_get_cluster_hosts will need to be changed accordingly.

IMHO the likeliest site for a mod ATM seems to be `render_config()` in hooks/config-changed, so that the value associated with the `'cluster_hosts'` key of the `context` dictionary is changed from `",".join(hosts)` to e.g. `':4567,'.join(hosts) + ':4567'` (conceptually).
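The conceptual join above could be written a little more defensively. This is a minimal sketch, assuming `hosts` is a plain list of address strings; `join_cluster_hosts` is a hypothetical helper, not a function in the charm, and 4567 is the default gcomm port mentioned earlier.

```python
def join_cluster_hosts(hosts, port=4567):
    """Append the gcomm port to each address and join with commas."""
    return ",".join("{}:{}".format(host, port) for host in hosts)


# Example with the placeholder addresses used in this report.
cluster_hosts = join_cluster_hosts(["a1:b1:c1::d1", "a2:b2:c2::d2"])
```

Unlike `':4567,'.join(hosts) + ':4567'`, this yields an empty string rather than a bare `:4567` when `hosts` is empty.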

Changed in percona-cluster (Juju Charms Collection):
assignee: Dominique Poulain (dominique-poulain) → nobody
James Page (james-page)
Changed in charm-percona-cluster:
importance: Undecided → Medium
status: New → Triaged
Changed in percona-cluster (Juju Charms Collection):
status: Triaged → Invalid
James Page (james-page) wrote :

I've put up:

  https://review.openstack.org/#/c/443073

which will make the charm more IPv6 aware, but I suspect it's not the complete fix.

Changed in charm-percona-cluster:
status: Triaged → In Progress
assignee: nobody → James Page (james-page)
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-percona-cluster (master)

Reviewed: https://review.openstack.org/443073
Committed: https://git.openstack.org/cgit/openstack/charm-percona-cluster/commit/?id=bb30f468aabbfa5a2b791c576326514d12e39294
Submitter: Jenkins
Branch: master

commit bb30f468aabbfa5a2b791c576326514d12e39294
Author: James Page <email address hidden>
Date: Wed Mar 8 11:08:13 2017 +0000

    Correctly detect IPv6 addresses

    Use the is_ip function from charmhelpers to correctly
    detect and return IPv4 and IPv6 addresses.

    Support DNS querying for IPv6 addresses using ipv6
    argument (defaults to false).

    Resync tox.ini from release tools to resolve libcharmstore
    compatibility issues for 1.25.x testing.

    Change-Id: I719ac7db350b2b257ae057acc4299a8e97501a7b
    Partial-Bug: 1622780

James Page (james-page)
Changed in charm-percona-cluster:
status: In Progress → Fix Committed
James Page (james-page)
Changed in charm-percona-cluster:
milestone: none → 17.08
James Page (james-page)
Changed in charm-percona-cluster:
status: Fix Committed → Fix Released