percona cluster getting wrong private ip

Bug #1657305 reported by Narinder Gupta
This bug affects 2 people
Affects                                    Status        Importance  Assigned to
Canonical Juju                             Invalid       Undecided   Unassigned
OPNFV                                      New           Undecided   Unassigned
OpenStack Nova Compute Charm               Invalid       High        Unassigned
OpenStack Percona Cluster Charm            Fix Released  High        David Ames
juju (Ubuntu)                              Invalid       Undecided   Unassigned
nova-compute (Juju Charms Collection)      Invalid       High        Unassigned
percona-cluster (Juju Charms Collection)   Invalid       Critical    James Page

Bug Description

The percona-cluster charm from the master branch is not getting the correct private IP when using network spaces.

internal CIDR: 192.0.5.0/24
public CIDR: 172.21.159.0/24

wsrep_node_address=172.21.159.137 (an address on the public subnet, where an internal-subnet address was expected)

Below are the logs requested.

root@juju-5f683e-1-lxd-2:/var/log/juju# ls
machine-1-lxd-2.log unit-mysql-0.log
root@juju-5f683e-1-lxd-2:/var/log/juju# pastebinit unit-mysql-0.log
http://paste.ubuntu.com/23819314/
root@juju-5f683e-1-lxd-2:/var/log/juju# pastebinit machine-1-lxd-2.log
http://paste.ubuntu.com/23819315/
root@juju-5f683e-1-lxd-2:/var/log/juju#
root@juju-5f683e-1-lxd-2:/var/log/juju#
root@juju-5f683e-1-lxd-2:/var/log/juju# cd ../mysql
root@juju-5f683e-1-lxd-2:/var/log/mysql# ls
error.log
root@juju-5f683e-1-lxd-2:/var/log/mysql# pastebinit error.log
http://paste.ubuntu.com/23819317/
root@juju-5f683e-1-lxd-2:/var/log/mysql# cd /etc/mysql/
root@juju-5f683e-1-lxd-2:/etc/mysql# ls
conf.d debian.cnf my.cnf my.cnf.fallback percona-xtradb-cluster percona-xtradb-cluster.cnf percona-xtradb-cluster.conf.d
root@juju-5f683e-1-lxd-2:/etc/mysql# cd percona-xtradb-cluster.conf.d/
root@juju-5f683e-1-lxd-2:/etc/mysql/percona-xtradb-cluster.conf.d# ls
client.cnf isamchk.cnf mysqld.cnf mysqld_safe.cnf
root@juju-5f683e-1-lxd-2:/etc/mysql/percona-xtradb-cluster.conf.d# pastebinit mysqld.cnf
http://paste.ubuntu.com/23819320/

Tags: oil oil-2.0
David Ames (thedac)
affects: charms → percona-cluster (Juju Charms Collection)
Changed in percona-cluster (Juju Charms Collection):
importance: Undecided → Critical
milestone: none → 17.01
status: New → Confirmed
Revision history for this message
David Ames (thedac) wrote :

According to John Meinel:

'The charms should be updated to use "network-get <bindname> --preferred-address" instead of just "unit-get private-address". unit-get doesn't pass the information to Juju for us to know which bit of the configuration we're supposed to be reporting.'

In the OpenStack charms we mostly use charmhelpers.contrib.openstack.ip resolve_address(), which does the right thing by checking network-get first.

However, we have some isolated instances where unit_get('private-address') is used as the default, which according to John can be unpredictable.

Specifically, calls to get_host_ip() functions in nova-compute and percona-cluster return the private-address by default, leading to unpredictable IPs in configuration files.

We also need to check other charms for this issue.

Nova-compute's HostIPContext returns the private-address:
# Imports required by this snippet (not shown in the original source):
#   from charmhelpers.contrib.openstack import context
#   from charmhelpers.core.hookenv import config, unit_get
#   from charmhelpers.contrib.network.ip import get_host_ip, get_ipv6_addr
class HostIPContext(context.OSContextGenerator):
    def __call__(self):
        ctxt = {}
        if config('prefer-ipv6'):
            host_ip = get_ipv6_addr()[0]
        else:
            host_ip = get_host_ip(unit_get('private-address'))

        if host_ip:
            # NOTE: do not format this even for ipv6 (see bug 1499656)
            ctxt['host_ip'] = host_ip

        return ctxt
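
A minimal sketch of a space-aware variant, assuming charmhelpers' network_get_primary_address() helper (it wraps `network-get <binding> --primary-address` and raises NotImplementedError on Juju versions without the hook tool); the 'internal' binding name here is illustrative, not the charm's actual code:

from charmhelpers.contrib.network.ip import get_host_ip, get_ipv6_addr
from charmhelpers.core.hookenv import (
    config,
    network_get_primary_address,
    unit_get,
)

def resolve_host_ip(binding='internal'):
    # Prefer IPv6 when the charm is configured for it.
    if config('prefer-ipv6'):
        return get_ipv6_addr()[0]
    try:
        # Ask Juju for the address of the space bound to this endpoint.
        return network_get_primary_address(binding)
    except NotImplementedError:
        # Older Juju: fall back to the old, unpredictable lookup.
        return get_host_ip(unit_get('private-address'))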

Similarly, percona-cluster's render_config builds its template context from the private-address:
    context = {
        'cluster_name': 'juju_cluster',
        'private_address': get_host_ip(),
        ...
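
A hedged sketch of the direction the eventual fix takes (see the merged commit further below): resolve the address from the 'cluster' peer relation's binding rather than the private-address; the helper name is illustrative, not the charm's actual code:

from charmhelpers.contrib.network.ip import get_host_ip
from charmhelpers.core.hookenv import network_get_primary_address, unit_get

def get_cluster_host_ip():
    # Illustrative helper: wsrep_node_address should come from the
    # network space bound to the 'cluster' peer relation.
    try:
        return network_get_primary_address('cluster')
    except NotImplementedError:
        # Juju without network-get: keep the old behaviour.
        return get_host_ip(unit_get('private-address'))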

Changed in nova-compute (Juju Charms Collection):
status: New → Confirmed
importance: Undecided → Critical
milestone: none → 17.01
Revision history for this message
David Ames (thedac) wrote :

According to John Meinel:
'The charms should be updated to use "network-get <bindname> --preferred-address" instead of just "unit-get private-address". unit-get doesn't pass the information to Juju for us to know which bit of the configuration we're supposed to be reporting.'

I would make the argument that private-address *should* be predictable. It *should* be the PXE boot IP if not configurable, as expressed in this bug:
https://bugs.launchpad.net/juju/+bug/1591962

These two bugs express the same issue. They say they are fix-released, but the fixes only deal with the symptom, not the underlying problem:
https://bugs.launchpad.net/juju/+bug/1616098/
https://bugs.launchpad.net/juju/+bug/1603473/

Revision history for this message
Narinder Gupta (narindergupta) wrote :

Here is the comment from John Meinel of the Juju team:

There is nothing to say that the PXE address is better than the other address for any given *user's* deployment. If a charm isn't updated to support network-get, even if we did make private-address stable, there would be no way to put the application on the second interface when you really wanted it. And if a charm is updated to support network-get, then the stability of private-address is not as important.
FWIW, I believe we actually default private-address to the first address that MAAS returns. I haven't found any particular ordering to their value, as while it seems to be stable per node, it is not stable between nodes. (If I configure 2 nodes with similar network devices, named the same, in similar address ranges, one Node has a 172.* address first, and the second node has a 10.* address first, etc.)

I'm not against making private-address more stable, but I do feel like it is completely papering over the real issue, which is being able to allow a user to specify where they want the application deployed by making the charm ask Juju where it has been configured to run.

John

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in juju-core (Ubuntu):
status: New → Confirmed
Larry Michel (lmic)
tags: added: oil oil-2.0
Revision history for this message
Larry Michel (lmic) wrote :

I was able to get percona-cluster to fetch the correct private-address by using the access-network option to specify the network on which I expected this private-address to be.

Note that I was also using the access binding to specify the space associated with that subnet, so I am not sure whether both the binding and access-network are needed. However, the access binding by itself did not work.

For example, with the space 'oil' mapped to 192.0.5.0/24:

  percona-cluster:
    charm: cs:trusty/percona-cluster
    num_units: 1
    options:
      sst-password: root
      root-password: root
      max-connections: 1500
      access-network: 192.0.5.0/24
      source: cloud:trusty-mitaka
    bindings:
      access: oil
    to:
    - lxc:3

Revision history for this message
James Page (james-page) wrote :

Larry - for percona-cluster, you'll need to provide a binding for the shared-db relation to oil as well; there is no need to use the access-network configuration option (in fact, that prevents the use of Juju network space binding in the charm).
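
For example, a binding-only variant of Larry's bundle snippet (an untested sketch; it binds both the access extra-binding and the shared-db relation to the oil space):

  percona-cluster:
    charm: cs:trusty/percona-cluster
    num_units: 1
    options:
      sst-password: root
      root-password: root
      source: cloud:trusty-mitaka
    bindings:
      access: oil
      shared-db: oil
    to:
    - lxc:3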

Revision history for this message
James Page (james-page) wrote :

I appreciate the need to move charms to the network-get --primary-address charm hook tool to support the use of network spaces, but I still feel that there might be an 'undefined behaviour' use case in this code path.

If a user does not bind a relation or extra-binding, then what gets returned from network-get --primary-address <relation-name>? Does the unbound (default) behaviour mimic what unit-get private-address does?
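
Until that is pinned down, a charm can at least guard the call so the unbound case degrades to the old behaviour; a short sketch (the exact semantics for unbound endpoints are precisely the open question here):

from charmhelpers.core.hookenv import network_get_primary_address, unit_get

def binding_address(endpoint):
    # Treat any absence of network-get as "behave like unit-get".
    try:
        return network_get_primary_address(endpoint)
    except NotImplementedError:
        return unit_get('private-address')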

affects: juju-core (Ubuntu) → juju (Ubuntu)
Revision history for this message
James Page (james-page) wrote :

In the scope of the percona-cluster charm, we see two problems:

a) the network space binding for the cluster relation is not being used for the local wsrep address for mysql (although the cluster relation space binding is used to build the list of hosts participating in the cluster). This might not actually impact function.

b) for the db and db-admin relations, network space binding is not supported - however, this is relatively easy to switch, given that the shared-db relation already does the right thing.

James Page (james-page)
Changed in percona-cluster (Juju Charms Collection):
assignee: nobody → James Page (james-page)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-percona-cluster (master)

Reviewed: https://review.openstack.org/424911
Committed: https://git.openstack.org/cgit/openstack/charm-percona-cluster/commit/?id=881591871513489a67733d593cdb0721d41b5012
Submitter: Jenkins
Branch: master

commit 881591871513489a67733d593cdb0721d41b5012
Author: David Ames <email address hidden>
Date: Tue Jan 24 16:14:23 2017 -0800

    Fix support for network-spaces

    Fix misc issues with use of Juju 2.0 network spaces:

     - Ensure that wsrep address for local unit is correctly
       set using the binding for the cluster peer relation.

     - Correctly set the DB access hostname for db and db-admin
       relation types.

    This includes a little refactoring to support reuse within
    the charm.

    Closes-Bug: 1657305

    Change-Id: Id1a800e2ada6fd196422b003fd8e251cab5ad724

Changed in percona-cluster (Juju Charms Collection):
status: Confirmed → Fix Committed
James Page (james-page)
Changed in nova-compute (Juju Charms Collection):
importance: Critical → High
status: Confirmed → Triaged
James Page (james-page)
Changed in charm-nova-compute:
importance: Undecided → High
status: New → Triaged
Changed in nova-compute (Juju Charms Collection):
status: Triaged → Invalid
James Page (james-page)
Changed in percona-cluster (Juju Charms Collection):
status: Fix Committed → Invalid
James Page (james-page)
Changed in charm-percona-cluster:
status: New → Fix Released
milestone: none → 17.02
Changed in charm-nova-compute:
milestone: none → 17.05
Changed in charm-percona-cluster:
importance: Undecided → High
assignee: nobody → David Ames (thedac)
Ryan Beisner (1chb1n)
Changed in juju (Ubuntu):
status: Confirmed → Invalid
Revision history for this message
Anastasia (anastasia-macmood) wrote :

Based on Ryan's assessment and his marking of this as 'Invalid' for juju (Ubuntu), I am marking this as 'Invalid' for the "juju" project too.

Changed in juju:
status: New → Invalid
James Page (james-page)
Changed in charm-nova-compute:
milestone: 17.05 → 17.08
James Page (james-page)
Changed in charm-nova-compute:
milestone: 17.08 → none
Revision history for this message
Liam Young (gnuoy) wrote :

I'm going to mark this as invalid against nova-compute, since nova-compute no longer has a relation with percona (Icehouse onwards, I believe).

Changed in charm-nova-compute:
status: Triaged → Invalid